Fluentd vs. Logstash: The Ultimate Log Agent Battle

Logging and log management are critical aspects of modern IT management, observability, and security.

Modern IT systems, whether microservice architectures or monoliths, are complex.

Log management systems allow DevOps teams and developers to keep an eye on performance, investigate errors, or visualize events as they occur.

Log collectors, or aggregators, are critical components of the log management infrastructure.

With millions of downloads, two log collectors have risen to widespread acclaim — Fluentd and Logstash.

Which one is best for your log management system — Fluentd vs. Logstash?

What are Log Collectors

As the name suggests, a log collector pulls in log data, then parses, normalizes, and enriches it before forwarding it to other programs such as Elasticsearch or databases such as PostgreSQL.
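
The parse, normalize, and enrich steps can be sketched in a few lines of Python. This is only an illustration of the idea, not how Fluentd or Logstash are implemented: the log line format, the field names, and the `host` value are all assumptions made up for this example.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical access-log format, invented for this illustration.
LINE_PATTERN = re.compile(
    r'(?P<ip>\S+) \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)" (?P<status>\d{3})'
)


def parse(line):
    """Turn a raw text line into a dict of string fields, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None


def normalize(event):
    """Convert fields to consistent names and types (UTC timestamp, integer status)."""
    ts = datetime.strptime(event["time"], "%d/%b/%Y:%H:%M:%S %z")
    return {
        "client_ip": event["ip"],
        "timestamp": ts.astimezone(timezone.utc).isoformat(),
        "method": event["method"].upper(),
        "path": event["path"],
        "status": int(event["status"]),
    }


def enrich(event, host):
    """Add context that was not present in the original line."""
    enriched = dict(event)
    enriched["host"] = host
    enriched["level"] = "error" if event["status"] >= 500 else "info"
    return enriched


def collect(lines, host="web-1"):
    """Run every line through parse -> normalize -> enrich, skipping unparseable lines."""
    for line in lines:
        parsed = parse(line)
        if parsed is None:
            continue
        yield enrich(normalize(parsed), host)


if __name__ == "__main__":
    raw = [
        '203.0.113.7 [10/Oct/2023:13:55:36 +0200] "get /cart" 200',
        "not a log line",
        '198.51.100.4 [10/Oct/2023:13:55:40 +0200] "POST /checkout" 503',
    ]
    for event in collect(raw):
        # A real collector would forward this to a store such as Elasticsearch.
        print(json.dumps(event))
```

In a real deployment the final step would forward each structured event to a backend rather than print it.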

Logs themselves are descriptive data or content that record events or errors. These logs are generally stored for later analysis and are often monitored in real-time.

These logs might represent click streams according to customer location for an online business, error events, or intrusions in your cloud-hosted architecture.

What is a Log Management System

Log collectors do not exist in isolation — they are components of a more extensive system.

This system is generally known as the log stack and consists of software to store and visualize logs forwarded from the log collector or aggregator.

For example, log collectors such as Fluentd and Logstash are often combined with Elasticsearch and Kibana to form the ELK stack when Logstash is used or the EFK stack when Fluentd is used.

Elasticsearch is a text search and analytics engine that lets users store their data centrally for fast search and powerful analytics, and it scales with ease.

Kibana allows users to visualize and navigate Elasticsearch data simply and intuitively.

Together, these components are most commonly used to monitor, troubleshoot, and secure large distributed systems.

While system monitoring is a primary use case for logging systems, there are many other ELK or EFK stack use cases, such as business intelligence and web analytics. Click stream analysis, for example, is another common use of the ELK and EFK stacks.

Fluentd vs. Logstash: Key Differences

Open Source

Both Fluentd and Logstash are open source.

Fluentd is an Apache 2.0 Licensed, fully open-source software. The source code is available on GitHub.

Logstash is also fully open source under the Apache 2 license.

Treasure Data built, manages, and maintains Fluentd, which is also a Cloud Native Computing Foundation (CNCF) project.

Elastic built, manages, and maintains Logstash and also developed Elasticsearch and Kibana.

Platform

Both Fluentd and Logstash run on Windows and Linux.

Originally, Logstash had a platform advantage as it was written in JRuby, which runs on the JVM and is naturally cross-platform.

Fluentd, written in Ruby and C, was not available on Windows until 2015. Today Fluentd is fully cross-platform.

Ecosystem and Plugins

Both Fluentd and Logstash have active and expansive plugin ecosystems.

Plugins allow developers and DevOps teams to configure logging systems by input, parser, filter, output, formatter, storage, and buffer.

Logstash has a centralized plugin ecosystem managed in a single place on GitHub. This allows for a simple, one-stop location for all plugins.

Fluentd has a decentralized plugin ecosystem. Fluentd does provide an official repository, but a vast majority of its plugins are hosted on individual repositories.

This means that the Logstash ecosystem is more easily navigated.

Development Language

Fluentd uses CRuby as a development language, while Logstash uses JRuby.

As you would expect, JRuby uses the JVM, and as a result, Logstash is a more memory-expensive log collector.

Event Routing

Event routers in a log stack send messages and events between applications and systems.

In a logging system, how event routing is handled is a critical consideration.

Logstash and Fluentd differ considerably in this regard, and you should consider this when deciding which logging stack to choose.

Fluentd routes events based on tags. Each event, or new log entry, in a Fluentd system carries a tag that tells Fluentd where to route it. Each event source, or input, assigns a tag, and that tag is essentially an instruction telling Fluentd where to output the event.
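
Tag-based routing can be pictured as a lookup table from tag patterns to outputs. The Python sketch below is a rough model of that idea only. The tags, patterns, and output names are hypothetical, and it uses Python's `fnmatch` wildcards as a stand-in, which do not behave exactly like Fluentd's own match patterns.

```python
from fnmatch import fnmatchcase

# Hypothetical routing table: (tag pattern, output name).
# Order matters in this sketch: the first matching pattern wins.
ROUTES = [
    ("app.payments.*", "alerts"),
    ("app.*", "elasticsearch"),
    ("*", "archive"),
]


def route_by_tag(tag, routes=ROUTES):
    """Return the output for the first pattern that matches the event's tag."""
    for pattern, output in routes:
        if fnmatchcase(tag, pattern):
            return output
    return None


if __name__ == "__main__":
    for tag in ("app.payments.error", "app.web.access", "system.kernel"):
        print(tag, "->", route_by_tag(tag))
```

The routing decision depends only on the tag, so adding a new destination means adding a row to the table rather than editing branching logic.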

Logstash employs if-then logic. All data in a Logstash system arrives on a single stream, and Logstash then uses if-then statements to determine how to forward the log algorithmically.
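
The if-then style can be modeled as a single function that inspects each event on the shared stream and branches on its fields. This is a conceptual Python sketch, not Logstash configuration syntax, and the field names (`type`, `status`, `message`) and output names are assumptions chosen for illustration.

```python
def route_by_conditions(event):
    """Decide where an event goes by testing its fields in order."""
    if event.get("type") == "nginx" and event.get("status", 0) >= 500:
        return "alerts"
    elif event.get("type") == "nginx":
        return "elasticsearch"
    elif "payment" in event.get("message", ""):
        return "alerts"
    else:
        return "archive"


if __name__ == "__main__":
    stream = [
        {"type": "nginx", "status": 502},
        {"type": "nginx", "status": 200},
        {"type": "app", "message": "payment declined"},
        {"type": "app", "message": "user logged in"},
    ]
    for event in stream:
        print(event, "->", route_by_conditions(event))
```

Every event passes through the same chain of conditions, so the order of the branches is part of the routing behavior.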

The difference in event routing is not to be overlooked. Using algorithmic if-then statements for event routing makes a Logstash system procedural. The tag system employed for Fluentd is more declarative.

In many cases, the tag event routing system employed by Fluentd is better for complex logic.

Transport

Log systems involve inputs such as files or data stores to get data into the log collector.

Logstash lacks a persistent internal message queue. Currently, Logstash’s in-memory queue holds just 20 events. Resiliency in a Logstash system therefore requires pairing Logstash with an external queue like Redis.

For DevOps teams and developers, this means more configuration and another dependency to manage.

Fluentd handles resiliency with an internal buffering system that can be configured to buffer in memory or on disk.
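
To make the in-memory versus on-disk trade-off concrete, here is a toy buffer in Python. It is a sketch of the general buffering idea under simplifying assumptions, not Fluentd's actual buffer implementation. The class name, the `chunk_limit` parameter, and the `send` callback are all hypothetical.

```python
import json
import os
import tempfile


class Buffer:
    """Toy buffer: holds events in memory, or in a file when a path is given."""

    def __init__(self, chunk_limit=3, path=None):
        self.chunk_limit = chunk_limit
        self.path = path  # None means in-memory buffering
        self._memory = []

    def _load(self):
        if self.path is None:
            return list(self._memory)
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as handle:
            return [json.loads(line) for line in handle]

    def _store(self, events):
        if self.path is None:
            self._memory = list(events)
            return
        with open(self.path, "w", encoding="utf-8") as handle:
            for event in events:
                handle.write(json.dumps(event) + "\n")

    def append(self, event, send):
        """Buffer the event and try to flush once the chunk is full."""
        events = self._load() + [event]
        self._store(events)
        if len(events) >= self.chunk_limit:
            self.flush(send)

    def flush(self, send):
        """Deliver buffered events; keep them if the output is unreachable."""
        events = self._load()
        if not events:
            return True
        try:
            send(events)
        except ConnectionError:
            return False  # events stay buffered so a later flush can retry
        self._store([])
        return True


if __name__ == "__main__":
    delivered = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "buffer.jsonl")
        Buffer(chunk_limit=10, path=path).append({"id": 1}, delivered.extend)
        # A new Buffer object stands in for a restarted process.
        Buffer(chunk_limit=10, path=path).flush(delivered.extend)
    print(delivered)
```

With `path=None` the buffered events live only in the process and are lost on a crash. With a file path they survive a restart, at the cost of disk I/O on every append.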

Memory

Logstash, running on the JVM, uses more memory.

Both Fluentd and Logstash have lightweight, low-resource counterparts.

For Fluentd, this is Fluent Bit. Fluent Bit is written in C and has a pluggable architecture supporting more than 500 plugins.

For Logstash, Elastic provides Beats as a lightweight alternative.

When to Use Fluentd

Fluentd has long been preferred by teams using Docker. As a result, Docker has native, built-in support for Fluentd through its Fluentd logging driver, but no equivalent driver for Logstash.

Similarly, native Docker support means that Fluentd is often the best option when monitoring Kubernetes environments.

If you are using Docker, Fluentd might be best.

When memory is critical, for example in embedded software, Fluentd is probably the better choice because it avoids the overhead of the JVM.

Finally, if your system does not already involve the JVM, choosing Fluentd avoids adding it as a new requirement.

When to Use Logstash

Logstash was built with Elasticsearch and Kibana in mind. If you are looking for a log collector to work with a system involving Elasticsearch and Kibana, then Logstash is likely your best bet.

Generally speaking, when you prefer a more managed, supported system, Logstash wins out. By way of a simple example, the managed plugin ecosystem provided by Logstash is an indicator of a more managed product.

Logging at Scale

The Fluentd vs. Logstash consideration is only one of many challenges when you look to monitor logs at scale. If you are looking for a comprehensive IT infrastructure and application monitoring solution, consider logiq.ai’s integrated monitoring solution. Logiq.ai is committed to delivering turnkey monitoring products to manage your logging needs.

Originally published at https://logiq.ai.
