After my last blog post about Logstash, Elasticsearch, and Kibana, I wanted to investigate something else I kept coming across during my Logstash research: Elastic Beats. Beats initially appeared to me to be a way to send data to Elasticsearch, the same as Logstash, leading me to wonder how Beats is different and where it fits in the ELK stack. In this blog, I'll take a deeper look at Beats to understand how they work, what you might use them for, and how they compare with Logstash.

What Is Elastic Beats (or More Correctly, What Are Elastic Beats)?

Elastic Beats are a series of data shippers that are set up and configured to send data from a server or computer into Elasticsearch, either directly or via Logstash.

Parsing doesn't have to happen in Logstash, though. Unless you are using a very old version of Elasticsearch, you can define pipelines within Elasticsearch itself and have those pipelines process your data in the same way you'd normally do it with something like Logstash. We decided to take this functionality (called Ingest) for a spin and see how it compares with Logstash filters in both performance and functionality. Is it worth sending data directly to Elasticsearch, or should we keep Logstash?

Specifically, we tested the grok processor on Apache common logs (we love logs here), which can be parsed with a single rule, and on Cisco ASA firewall logs, for which we have 23 rules. This way we could also check how both Ingest's grok processors and Logstash's grok filter scale when you start adding more rules.

Baseline Performance: Shipping Raw and JSON Logs with Filebeat

To get a baseline, we pushed logs with Filebeat 5.0alpha1 directly to Elasticsearch, without parsing them in any way. We used an AWS c3.large for Filebeat (2 vCPUs) and a c3.xlarge for Elasticsearch (4 vCPUs), and we also installed Sematext agent to monitor Elasticsearch performance. It turned out that the network was the bottleneck, which is why pushing raw logs didn't saturate the CPU, even though we got a healthy throughput rate of 12-14K EPS.

But raw, unparsed logs are rarely useful. Ideally, you'd log in JSON and push directly to Elasticsearch; conveniently, Filebeat can parse JSON since 5.0. That said, throughput dropped to about 4K EPS, because JSON logs are bigger and saturate the network. CPU usage dropped as well, but not by as much, because now Elasticsearch has to do more work (more fields to index). Still, this 4K EPS throughput at 40 percent CPU is the most efficient way to send logs to Elasticsearch, if you can log in JSON.

Elasticsearch Ingest Node vs Logstash Performance

So we added another c3.xlarge instance (4 vCPUs) to do the parsing, first with Logstash, then with a separate, dedicated Elasticsearch Ingest node. With Logstash 5.0 in place, we pointed Filebeat to it while tailing the raw Apache logs file. On the Logstash side, we have a Beats listener, a grok filter, and an Elasticsearch output.

With the Cisco ASA logs, there are up to 23 rules to evaluate, so throughput goes down to about 10K EPS and the CPU bottleneck shifts to the Ingest node. Overall, the throughput-to-CPU ratio of the Ingest node dropped by a factor of 9 compared to the Apache logs scenario.

Performance Conclusions: Logstash vs Elasticsearch Ingest Node

- Ingest node is lighter across the board: for a single grok rule, it was about 10x faster than Logstash.
- Logstash is easier to configure, at least for now, and its performance didn't deteriorate as much when adding rules.

Whichever you choose, a few tips from these tests:

- Define the grok rules matching most logs first, because both Ingest and Logstash exit the chain on the first match by default.
- If parsing is simple, Logstash's Dissect filter might be a good replacement for grok.
- Make sure Logstash's pipeline batch size and number of threads are configured to make the best use of your hardware: use all the CPU, but don't spend too much time on context switching.
- For Ingest, it's best to have dedicated Ingest nodes. By default, all nodes can perform Ingest tasks (node.ingest: true in elasticsearch.yml), and Ingest work is almost all CPU, so you can choose compute-optimized machines for the job. Ingest nodes can also act as "client" nodes.
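The original config snippet for the Logstash side was cut off, but the setup described (a Beats listener, a grok filter, and an Elasticsearch output) can be sketched as a minimal pipeline config. The port, pattern, and host below are illustrative placeholders, not the exact values used in the tests:

```
input {
  beats {
    port => 5044
  }
}

filter {
  grok {
    match => { "message" => "%{COMMONAPACHELOG}" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```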
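The equivalent Ingest setup defines the grok rule as a pipeline inside Elasticsearch itself. A minimal sketch (the pipeline name here is made up for illustration):

```
PUT _ingest/pipeline/apache-common
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{COMMONAPACHELOG}"]
      }
    }
  ]
}
```

Documents indexed with `?pipeline=apache-common` on the request then go through this processor before being indexed.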
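Filebeat's JSON parsing, available since 5.0 as noted above, is enabled per prospector. A sketch of the relevant filebeat.yml options (paths and hosts are placeholders):

```
filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/myapp/*.json
    json.keys_under_root: true
    json.add_error_key: true

output.elasticsearch:
  hosts: ["localhost:9200"]
```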
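To make an Elasticsearch 5.x node a dedicated Ingest node, as suggested above, you'd turn the other roles off in elasticsearch.yml:

```
node.master: false
node.data: false
node.ingest: true
```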
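Grok rules are essentially named regular expressions. As a rough illustration of what the %{COMMONAPACHELOG} pattern does (simplified; real grok patterns are more permissive), here is a Python sketch that parses an Apache common log line:

```python
import re

# Simplified equivalent of the %{COMMONAPACHELOG} grok pattern.
COMMON_LOG = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+)(?: (?P<httpversion>[^"]+))?" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-)'
)

def parse_common_log(line):
    """Parse one Apache common log line into a dict of fields, or None."""
    m = COMMON_LOG.match(line)
    return m.groupdict() if m else None

line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326')
doc = parse_common_log(line)
print(doc["clientip"], doc["verb"], doc["response"])
```

Evaluating a chain of such patterns against every log line is where the CPU goes in both Logstash and Ingest, which is why the rule count and rule order matter.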