Monitor Nginx Up-time
Upstart script or job to monitor Nginx
Send a mail if Nginx is down
Downtime is a nightmare for any DevOps engineer or sysadmin. Sometimes Nginx stops working without anyone noticing, which results in downtime.
I wrote an Upstart script that checks Nginx at a regular interval and sends a mail if Nginx is not running.
The method followed in this tutorial is as follows:
- Set up postfix and SMTP to send mail from the server.
- Create the upstart job to check the status of Nginx.
- Automatic restart of Nginx by the upstart job.
- Send a mail to the Sysadmin.
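Before creating the job, it can help to confirm that the server is able to send mail at all. The following is only a sketch: send_test_mail is a hypothetical helper name, and it assumes the same mail command that the job uses later is installed.

```shell
# Hypothetical helper: send a one-line test message to the given address.
# Assumes the `mail` command (the one the job uses later) is installed.
send_test_mail() {
    recipient=$1
    if ! command -v mail >/dev/null 2>&1; then
        echo "mail command not found; install a mail client first" >&2
        return 1
    fi
    # uname -n prints the host name, so the message shows where it came from.
    echo "Test message from $(uname -n)" | mail -s "Test mail" "$recipient"
}
```

Calling send_test_mail myemail@domain.com should deliver a short message; if nothing arrives, the mail setup needs fixing before the monitoring job can alert anyone.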
Create a file in the /etc/init/ directory, for example check_nginx.conf:
sudo vim /etc/init/check_nginx.conf
Add the following content to the file:
# Just a custom description for our job
description "Nginx monitoring job"

# On which conditions the job should start. In this case it's very simple: on system startup (this is basically when the system is booted)
#start on startup
start on runlevel [2345]

# On which conditions the job should stop. In this case when the system reboots (http://upstart.ubuntu.com/cookbook/#runlevels)
stop on runlevel [06]

# These are the user and user group that will be used to run the job. In our case it should be the user that we have set in our capistrano script, for instance.
# You can check at `config/deploy/<environment>.rb` on this line `server <some_ip_address>, user: <deploy_user>`
setuid ubuntu
setgid ubuntu

# This indicates that we want to restart the job if it crashes
respawn
respawn limit 5 10

# Without declaring these as normal exit codes, it just respawns.
normal exit 0 TERM

script
# this script runs in /bin/sh by default
# respawn as bash so we can source in RVM
exec /bin/bash <<EOT
    # use syslog for logging
    exec &> /dev/kmsg
    while true
    do
        sudo service nginx status | grep -qi "not running" && echo "Nginx is not running, It has been started by the script. Check the issue" | mail -s "[ALERT] Nginx Down on MyServer" myemail@domain.com && sudo service nginx start
        sleep 10s
    done
EOT
end script
The script is simple enough to follow. The script section of this Upstart job checks the status of Nginx; if Nginx is not running, it sends a mail to myemail@domain.com and then starts Nginx. Because the commands are chained with &&, Nginx is started only if the mail command succeeds. The job runs as the ubuntu user and calls sudo, so that user needs to be able to run sudo without a password.
You can specify multiple email addresses as a comma-separated list without any spaces. The loop checks the Nginx status every 10 seconds (sleep 10s); you can change this as per your requirement.
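In the job above, the && chain means Nginx is only started when the mail command exits successfully. Here is a minimal sketch of one pass of the loop that starts Nginx first, so a mail failure cannot block the restart. It is a variant, not the job's original logic, and the helper names nginx_status, nginx_start, send_alert and check_once are hypothetical, introduced only for this example.

```shell
# Hypothetical wrappers around the commands used in the job.
# Override them with stubs to dry-run the logic without touching Nginx.
nginx_status() { sudo service nginx status 2>&1; }
nginx_start()  { sudo service nginx start; }
send_alert()   { mail -s "[ALERT] Nginx Down on MyServer" myemail@domain.com; }

# One pass of the monitoring loop.
# Returns 0 if Nginx was running, 1 if it had to be started.
check_once() {
    # Same test as the job: look for "not running" in the status output.
    if nginx_status | grep -qi "not running"; then
        # Start Nginx first so that a failing mail command cannot block it.
        nginx_start
        echo "Nginx was not running and has been started by the script. Check the issue." | send_alert
        return 1
    fi
    return 0
}
```

Inside the job, the body of the while loop would then be a call to check_once followed by sleep 10s.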
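Now we need to start this Upstart job. This only works on a system that uses Upstart as its init system (for example Ubuntu 14.04); on a systemd-based system the job file in /etc/init/ is not picked up. Run the following command to start the job: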
sudo service check_nginx start
This starts the job, and the script now checks the status of Nginx continuously. To test it, stop Nginx manually and see whether the script starts it again and sends a mail to the specified address:
sudo service nginx stop
You should now get a mail at myemail@domain.com. If you haven't received one, please go through the tutorial again; something is probably misconfigured.
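If no mail arrives, two quick checks can narrow things down: whether the host actually runs Upstart, and whether the job is running. The helpers below are a sketch with hypothetical names (uses_upstart, job_is_running); they assume that initctl version mentions "upstart" on an Upstart host and that initctl status prints a line such as "check_nginx start/running, process 1234" for a running job.

```shell
# Hypothetical troubleshooting helpers; the names are illustrative only.

# Succeeds if initctl identifies itself as Upstart.
uses_upstart() {
    initctl version 2>/dev/null | grep -qi "upstart"
}

# Succeeds if the given `initctl status` line reports a running job,
# e.g. "check_nginx start/running, process 1234".
job_is_running() {
    case "$1" in
        *start/running*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, after uses_upstart succeeds, passing the output of sudo initctl status check_nginx to job_is_running tells you whether the monitoring loop is alive. If uses_upstart fails, the host most likely uses a different init system and the job will not be found.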
On starting the service using the following command:
sudo service check_nginx start
Output:
Failed to start check_nginx.service: Unit check_nginx.service not found.
Is there an updated snippet?
I'm getting the same error on my local machine.