Saturday 13 February 2016

Monitor Nginx and Send a Mail if Stopped

Monitor Nginx Up-time

Upstart script or job to monitor Nginx 

Send a mail if Nginx is down


Downtime is a nightmare for any DevOps engineer or sysadmin. Sometimes Nginx stops working without anyone noticing, which results in downtime.
I wrote an Upstart script that checks Nginx at a regular interval and sends a mail if Nginx is not running.
The method followed in this tutorial is as follows:
If you haven't set up postfix to send mail when Nginx goes down, then follow this tutorial to set it up.
Create a file in the /etc/init/ directory, say check_nginx.conf

sudo vim /etc/init/check_nginx.conf

Add the following content to the file

# Just a custom description for our Job
description "Nginx monitoring job"

# On which conditions the job should start. In this case: once the system reaches a normal runlevel (2 to 5) during boot
#start on startup
start on runlevel [2345]

# On which conditions the job should stop. In this case when the system halts (runlevel 0) or reboots (runlevel 6) (http://upstart.ubuntu.com/cookbook/#runlevels)
stop on runlevel [06]

# These are the user and group that the job runs as.
# The script below calls sudo without a terminal, so this user needs passwordless sudo for the service command.

setuid ubuntu
setgid ubuntu

# Restart the job if it crashes, but give up if it respawns more than 5 times within 10 seconds
respawn
respawn limit 5 10

#Without declaring these as normal exit codes, it just respawns.
normal exit 0 TERM

script
# this script runs in /bin/sh by default
# re-exec as bash so that bash-only syntax such as &> is available
exec /bin/bash <<EOT
  # send all output to the kernel log (/dev/kmsg), readable with dmesg
  exec &> /dev/kmsg

  while true
  do
    # restart Nginx first, then report, so a failing mail command cannot block the restart
    if sudo service nginx status | grep -qi "not running"; then
      sudo service nginx start
      echo "Nginx was not running and has been started by the script. Check the issue" | mail -s "[ALERT] Nginx Down on MyServer" myemail@domain.com
    fi

    sleep 10s
  done

EOT

end script  

The script is simple enough to follow. The script section of this Upstart job checks the status of Nginx in a loop; if Nginx is not running, it starts Nginx and sends a mail to myemail@domain.com. You can specify multiple email addresses here as a comma-separated list without any spaces. The loop sleeps for 10s between checks; you can change this as per your requirement.
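
To make the check easier to follow, here is a minimal sketch of the same logic split into small shell functions. The function names nginx_is_down and join_recipients are my own (hypothetical) and are not part of the job above. The sample status line is also an assumption, since the exact wording printed by service nginx status depends on your init script.

```shell
#!/bin/bash

# Hypothetical helper: succeeds when the status text on stdin contains
# "not running" (case-insensitive), the same test the job uses.
nginx_is_down() {
  grep -qi "not running"
}

# Hypothetical helper: joins its arguments with commas and no spaces,
# the recipient form described above.
join_recipients() {
  local IFS=,
  echo "$*"
}

# Assumed sample status line, for illustration only.
if echo " * nginx is not running" | nginx_is_down; then
  echo "would alert: $(join_recipients myemail@domain.com other@domain.com)"
fi
```

In the real job you would pipe sudo service nginx status into nginx_is_down instead of the sample line. If your status output uses different wording, adjust the grep pattern accordingly.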

Now we need to start this upstart job. Run the following command to start the job.

sudo service check_nginx start

This will start the job, and our script now checks the status of Nginx continuously. We can stop Nginx manually and see whether the script starts it again and sends the mail to the specified email address.


sudo service nginx stop

Now you should get a mail at myemail@domain.com. If you haven't received it, then please go through the tutorial again; you might have configured something wrong.
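
If you would rather not wait at the inbox, the restart can also be checked from the shell. The following is a small sketch and not part of the original job: wait_until and nginx_running are hypothetical helper names, and nginx_running assumes that service nginx status prints "not running" when Nginx is down, just as the job does.

```shell
#!/bin/bash

# Hypothetical helper: re-runs a command once per second until it succeeds.
# Returns 1 if it has still not succeeded after the given number of seconds.
wait_until() {
  local timeout=$1
  shift
  local waited=0
  until "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# Hypothetical helper, same assumption as the job: "not running" in the
# status output means Nginx is down. Anything else counts as running.
nginx_running() {
  ! sudo service nginx status 2>&1 | grep -qi "not running"
}
```

After sudo service nginx stop, running wait_until 30 nginx_running && echo "Nginx is back" should print the message once the job has restarted Nginx. The 30 second timeout is an arbitrary choice that leaves room for the 10s check interval.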

3 comments:

  1. On starting the service using the following command:
    sudo service check_nginx start
    Output:
    Failed to start check_nginx.service: Unit check_nginx.service not found.

    1. does it have any updated snippet..??

  2. getting same as in local


 

Copyright @ 2013 Appychip.
