How To Set Up Fluentd With High Availability
Set Up Fluentd With a Forwarder-Aggregator Architecture
Fluentd is one of the most popular alternatives to Logstash because it offers features that Logstash lacks. Before setting up Fluentd, let's compare the two:
- Fluentd has a built-in architecture for high availability (there can be more than one aggregator)
- Fluentd consumes less memory than Logstash
- Log parsing and tagging are easier
- Tag-based log routing is possible (see the sketch after this list)
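As a quick illustration of tag-based routing, here is a minimal sketch of match sections that route events by tag. The tags and outputs are hypothetical examples, not part of the setup below; Fluentd evaluates match sections in order and the first matching one wins:

# route payment-service logs to their own file output (hypothetical tag)
<match myorg.payments.**>
  @type file
  path /var/log/td-agent/payments.log
</match>

# everything else tagged under myorg. is dumped to the console
<match myorg.**>
  @type stdout
</match>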
Let's start building our centralized logging system using Elasticsearch, Fluentd and Kibana (EFK).
We will follow an architecture consisting of a fluentd-forwarder (td-agent), a fluentd-aggregator, Elasticsearch, and Kibana. The fluentd-forwarder (the agent) reads logs from a file and forwards them to the aggregator. The aggregator decides the index_name and which Elasticsearch host to send the logs to. Elasticsearch runs on a separate instance to receive the logs, and Kibana is set up on the same instance to visualise the Elasticsearch data.
# Architecture
The following is the architecture for high availability. Each application node runs a log-forwarder that forwards logs to the log-aggregators. The architecture below shows two aggregators; if one fails, the forwarders start sending logs to the second one.
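On the forwarder side, failover between two aggregators is expressed with multiple <server> sections in the forward output, where the second server is marked standby so it receives logs only while the first is unreachable. A minimal sketch; 192.168.1.86 is the aggregator IP used later in this post, and 192.168.0.12 stands in for the second aggregator:

<match myorg.**>
  @type forward
  heartbeat_type tcp
  # primary aggregator
  <server>
    name aggregator1
    host 192.168.1.86
    port 24224
  </server>
  # standby aggregator, used only while the primary is down
  <server>
    name aggregator2
    host 192.168.0.12
    port 24224
    standby
  </server>
</match>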
# Video Tutorial
The video tutorial walking through these steps is available here: https://www.youtube.com/watch?v=USCSpeQrVZM
# Set Up Elasticsearch & Kibana
We have already covered the setup of Elasticsearch and Kibana in one of our tutorials. Please follow that post to install Elasticsearch and Kibana.
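Before moving on, you can confirm Elasticsearch is reachable with a quick curl (assuming 192.168.1.4 is your Elasticsearch instance and 9200 the default port, as used in the aggregator config below):

curl -s 'http://192.168.1.4:9200/'
# should return a small JSON document with the cluster name and version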
# Log Pattern
We are considering the log format shown below:
INFO [2018-02-17 17:14:55,827 +0530] [pool-5-thread-4] [] com.amazon.sqs.javamessaging.AmazonSQSExtendedClient: S3 object deleted, Bucket name: sqs-bucket, Object key: 63c1a5b8-4ddc-4136-b086-df6a8486414a.
INFO [2018-02-17 17:14:56,124 +0530] [pool-5-thread-9] [] com.amazon.sqs.javamessaging.AmazonSQSExtendedClient: S3 object read, Bucket name: sqs-bucket, Object key: 2cc06f96-283f-4da7-9402-f08aab2df999.
# Log Regex
This regex is based on the logs above and needs to be specified in the source section of the forwarder's td-agent.conf file.
/^(?<level>[^ ]*)[ \t]+\[(?<time>[^\]]*)\] \[(?<thread>[^\]]*)\] \[(?<request>[^\]]*)\] (?<class>[^ ]*): (?<message>.*)$/
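If you want to sanity-check the regex against your own logs before wiring it into td-agent, one rough option is to run it through td-agent's embedded Ruby, since Fluentd patterns are Ruby regexes. This sketch assumes the default td-agent 2 install path and a log file at /var/log/myapp.log:

# print the named captures for the first line of the log, or nil if the regex does not match
/opt/td-agent/embedded/bin/ruby -e 're = /^(?<level>[^ ]*)[ \t]+\[(?<time>[^\]]*)\] \[(?<thread>[^\]]*)\] \[(?<request>[^\]]*)\] (?<class>[^ ]*): (?<message>.*)$/; p re.match(ARGF.readline)' /var/log/myapp.log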
# Set Up fluentd-aggregator
We will set up only one aggregator for this tutorial; however, you may set up two aggregators for high availability. On the aggregator instance, run the following commands:
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent2.sh | sh
sudo apt-get install make libcurl4-gnutls-dev --yes
sudo apt-get install build-essential
sudo /opt/td-agent/embedded/bin/fluent-gem install fluent-plugin-elasticsearch
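You can verify the plugin landed in td-agent's embedded gem set (the version shown will vary):

sudo /opt/td-agent/embedded/bin/fluent-gem list | grep fluent-plugin-elasticsearch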
# After setup edit conf file and customise configuration
sudo vi /etc/td-agent/td-agent.conf
Content of /etc/td-agent/td-agent.conf. Replace the host IP with your Elasticsearch instance IP. The index_name expression below derives the index from the second tag part and the local date, so logs tagged myorg.myapp go to an index like fluentd-myapp-2018.02.12.
<source>
  @type forward
  port 24224
</source>

<match myorg.**>
  @type copy
  <store>
    @type file
    path /var/log/td-agent/forward.log
  </store>
  <store>
    @type elasticsearch_dynamic
    # elasticsearch host IP/domain
    host 192.168.1.4
    port 9200
    index_name fluentd-${tag_parts[1]+ "-" + Time.at(time).getlocal("+05:30").strftime(@logstash_dateformat)}
    #logstash_format true
    #logstash_prefix fluentd
    time_format %Y-%m-%dT%H:%M:%S
    #timezone +0530
    include_timestamp true
    flush_interval 10s
  </store>
</match>
Restart the fluentd-aggregator process and check its logs with the following commands:
sudo service td-agent restart
# check logs
tail -f /var/log/td-agent/td-agent.log
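You can also push a hand-crafted test event into the aggregator's forward input using fluent-cat, which ships with td-agent. A small sketch, run on the aggregator itself, with a hypothetical tag that matches the myorg.** pattern above:

echo '{"message":"test event from fluent-cat"}' | /opt/td-agent/embedded/bin/fluent-cat -h localhost -p 24224 myorg.test

After the flush interval, the event should appear in the file store under /var/log/td-agent/ and in Elasticsearch.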
# Set Up fluentd-forwarder
To set up the forwarder, run the following command on the application instance.
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent2.sh | sh
# customise config in file td-agent.conf
sudo vi /etc/td-agent/td-agent.conf
Content of /etc/td-agent/td-agent.conf. Replace the path with the path of your application log, and the aggregator IP with the IP of your aggregator instance. You may use domain names instead of IPs.
<match td.*.*>
  @type tdlog
  apikey YOUR_API_KEY
  auto_create_table
  buffer_type file
  buffer_path /var/log/td-agent/buffer/td
  <secondary>
    @type file
    path /var/log/td-agent/failed_records
  </secondary>
</match>

## match tag=debug.** and dump to console
<match debug.**>
  @type stdout
</match>

## built-in TCP input
## @see http://docs.fluentd.org/articles/in_forward
<source>
  @type forward
  port 24224
</source>

<source>
  @type http
  port 8888
</source>

## live debugging agent
<source>
  @type debug_agent
  bind 127.0.0.1
  port 24230
</source>

<source>
  @type tail
  path /var/log/myapp.log
  pos_file /var/log/td-agent/myorg.log.pos
  tag myorg.myapp
  format /^(?<level>[^ ]*)[ \t]+\[(?<time>[^\]]*)\] \[(?<thread>[^\]]*)\] \[(?<request>[^\]]*)\] (?<class>[^ ]*): (?<message>.*)$/
  time_format %Y-%m-%d %H:%M:%S,%L %z
  timezone +0530
  time_key time
  keep_time_key true
  types time:time
</source>

<match myorg.**>
  @type copy
  <store>
    @type file
    path /var/log/td-agent/forward.log
  </store>
  <store>
    @type forward
    heartbeat_type tcp
    # aggregator IP
    host 192.168.1.86
    flush_interval 30s
  </store>
  # secondary host is optional
  # <secondary>
  #   host 192.168.0.12
  # </secondary>
</match>
Restart the fluentd-forwarder process and check its logs with the following commands:
sudo service td-agent restart
# check logs
tail -f /var/log/td-agent/td-agent.log
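To confirm the tail source is picking up new lines, append a line in the expected format to the application log (the class name here is made up, and you may need sudo depending on the file's permissions):

echo 'INFO [2018-02-17 18:00:00,000 +0530] [main] [] com.example.TestLogger: EFK pipeline test' >> /var/log/myapp.log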
Now, after restarting td-agent on both the forwarder and the aggregator, you should see data being stored in Elasticsearch. Once Elasticsearch starts receiving data from the aggregator, you can create an index pattern in Kibana and start visualising the logs.
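To confirm this on the Elasticsearch side, list the indices and look for the fluentd-* entries (again assuming 192.168.1.4 is your Elasticsearch host):

curl 'http://192.168.1.4:9200/_cat/indices?v'
# expect an index such as fluentd-myapp-2018.02.12 in the output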
# Create Index Pattern In Kibana
Once you start getting logs in Elasticsearch, you can create an index pattern in Kibana to visualise the logs. We have specified the index_name in Fluentd to be of the format fluentd-myapp-2018.02.12, so we will create the index pattern fluentd-*. Follow the steps shown in the pictures below to create an index pattern. Finally, after creating the index pattern, logs will start appearing in the Discover tab of the dashboard.
Hurray!!! You have successfully set up the EFK stack to centralise your logging.