I am researching monitoring tools for our internal and external services. A suitable tool should:
- Allow custom (language-agnostic) check scripts
- Be able to recover (restart) a failed process
- Be able to send alerts to multiple channels
Other features that might be useful:
- Exporting check results to a collector or database
- A dashboard UI
The tool can be free or paid, and ideally it is an open source project. If you have any suggestions, please leave a comment below.
Here are the results of my investigation so far:
Monit is a lightweight utility for managing and monitoring processes, programs, files, directories and filesystems on a UNIX system. It also has a basic web dashboard that shows everything being monitored. It’s written in C.
Its connection testing supports the HTTP GET method. For other HTTP methods, the available option is to write a custom script and run it through Monit’s program status testing. The script is expected to return a proper status (exit code).
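As a rough illustration of such a custom script, here is a minimal sketch that sends a POST request and turns the response into an exit code. The URL, the payload, and the function names are assumptions made up for this example, and treating any 2xx response as healthy is my own choice, not something Monit prescribes.

```python
import sys
import urllib.error
import urllib.request

# Hypothetical endpoint and payload; replace them with your own service.
URL = "http://localhost:8080/health"
PAYLOAD = b'{"ping": true}'


def exit_code_for_status(status):
    """Map an HTTP status code to an exit code (0 means healthy)."""
    return 0 if 200 <= status < 300 else 1


def check(url=URL, payload=PAYLOAD, timeout=5):
    """Send a POST request and return the exit code the monitor should see."""
    request = urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return exit_code_for_status(response.status)
    except urllib.error.HTTPError as error:
        # The server answered, but with a 4xx/5xx status.
        return exit_code_for_status(error.code)
    except OSError:
        # Connection refused, DNS failure, timeout, and similar errors.
        return 1
```

To use it as a real check script, end the file with `sys.exit(check())` so that the process exit code carries the result.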
Using a similar approach, we can monitor cron jobs easily. Alternatively, you can follow a tutorial that checks a cron job by inspecting its log.
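One way to apply the script approach to cron jobs is to test whether the job's log file was written recently. The sketch below assumes that the job appends to a log on every run, so a stale modification time means the job stopped running. The function names and the argument layout are hypothetical.

```python
import os
import sys
import time


def log_is_fresh(path, max_age_seconds, now=None):
    """Return True if the file at `path` was modified within `max_age_seconds`."""
    now = time.time() if now is None else now
    try:
        modified = os.path.getmtime(path)
    except OSError:
        # A missing or unreadable log counts as a failed check.
        return False
    return (now - modified) <= max_age_seconds


def main(argv):
    """Entry point; `argv` is [script, log_path, max_age_seconds]."""
    if len(argv) != 3:
        print("usage: check_cron_log LOG_PATH MAX_AGE_SECONDS", file=sys.stderr)
        return 2
    return 0 if log_is_fresh(argv[1], float(argv[2])) else 1
```

Saved as a script that ends with `sys.exit(main(sys.argv))`, it exits with 0 when the log is fresh and with a non-zero code otherwise.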
Monit supports sending its check data and events to an external service through its collector. You enable it with a `set mmonit` statement in the Monit configuration:
set mmonit http://monit:email@example.com:8080/collector
For alerting, Monit natively supports sending alerts by email. For other channels, you can use third-party scripts. For SMS, for example, there is Monit2Twilio. You can follow the tutorial for HipChat integration and also the one for Slack integration. Basically, you can write a custom script to support any alerting channel you want.
Monit can easily start, stop, or restart processes in response to events. You can also configure the process initialisation script by specifying `start program` and `stop program` in the check configuration.
When you install the Sensu server, the client is also installed by default, but it is not active. You can activate the client by adding a configuration file at `/etc/sensu/conf.d/client.json`. If you want to run the client on a separate machine, you can install just the Sensu client and give it the RabbitMQ configuration used by the Sensu server. See the details here.
The Sensu client works by running checks and sending the results to an AMQP broker; the server then processes the results from the broker.
You can write any script as a Sensu check, as long as it returns a proper exit code: 0 for `OK`, 1 for `WARNING`, 2 for `CRITICAL`, and 3 or greater for `UNKNOWN`.
There is no difference between HTTP, filesystem, directory, or any other kind of check: you need to write a custom script for each of them.
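To make the exit code convention concrete, here is a minimal sketch of a disk usage check. The thresholds, the function names, and the message format are assumptions for illustration. Only the 0/1/2/3 exit codes come from the convention described above.

```python
import shutil

# Exit codes following the OK / WARNING / CRITICAL / UNKNOWN convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(used_percent, warn=80.0, crit=90.0):
    """Translate a usage percentage into an exit code (thresholds are assumed)."""
    if used_percent >= crit:
        return CRITICAL
    if used_percent >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=80.0, crit=90.0):
    """Return (exit_code, message) for the filesystem that contains `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as error:
        return UNKNOWN, f"cannot read disk usage for {path}: {error}"
    if usage.total == 0:
        return UNKNOWN, f"filesystem at {path} reports zero size"
    used_percent = usage.used / usage.total * 100
    return classify(used_percent, warn, crit), f"{path} is {used_percent:.1f}% full"
```

A script built on this would print the message and pass the code to `sys.exit`, for example `status, message = check_disk("/")`, then `print(message)` and `sys.exit(status)`.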
You can also have Sensu send alerts through its handler plugins. As with checks, you write a custom script to send alerts to your desired channels. A `pipe` handler reads event data from `STDIN`, which you can parse for more information about the event. Fortunately, Sensu has built-in support for email, HipChat, Slack and many more.
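Here is a minimal sketch of a handler script that parses the event from `STDIN`. It assumes the event arrives as JSON with `client.name`, `check.name`, `check.status` and `check.output` fields. Verify those field names against the Sensu documentation for your version; the function names are hypothetical.

```python
import json
import sys

STATUS_LABELS = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def summarize(event):
    """Build a one-line alert message from an event dictionary."""
    client = event.get("client", {}).get("name", "unknown-client")
    check = event.get("check", {})
    label = STATUS_LABELS.get(check.get("status"), "UNKNOWN")
    name = check.get("name", "unknown-check")
    output = str(check.get("output", "")).strip()
    return f"[{label}] {client}/{name}: {output}"


def main(stream=None):
    """Read one JSON event from the stream (STDIN by default) and print a summary."""
    stream = sys.stdin if stream is None else stream
    event = json.load(stream)
    print(summarize(event))
    return 0
```

In a real handler, the `print` call would be replaced by whatever delivers the message to your channel, and the script would end with `sys.exit(main())`.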
With a Sensu handler, you can also write a script or pass a command to start, stop, or restart a process in addition to sending alerts. Another possibility is to write a script that sends the event data to a database or any collector, so that you keep a copy of that information.
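As a sketch of the "copy to a database" idea, the function below stores one event in SQLite. The table layout and the event field names (`client.name`, `check.name`, `check.status`, `check.output`) are assumptions chosen for this example, and SQLite merely stands in for whatever database or collector you actually use.

```python
import sqlite3

# Hypothetical table layout for archived events.
SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    client TEXT,
    check_name TEXT,
    status INTEGER,
    output TEXT
)
"""


def store_event(connection, event):
    """Insert one event dictionary into the `events` table."""
    check = event.get("check", {})
    connection.execute(SCHEMA)
    connection.execute(
        "INSERT INTO events (client, check_name, status, output) VALUES (?, ?, ?, ?)",
        (
            event.get("client", {}).get("name"),
            check.get("name"),
            check.get("status"),
            check.get("output"),
        ),
    )
    connection.commit()
```

A handler script would parse the event with `json.load(sys.stdin)` and call `store_event(sqlite3.connect(path), event)`, where `path` is a database file of your choosing.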
Sensu has a pretty dashboard called Uchiwa, which can be used to monitor all Sensu instances along with their checks and event data.
Sensu also comes in an enterprise version, which starts at $50 for a maximum of 50 Sensu clients. This version has a built-in dashboard.
Inspeqtor supports sending alerts via email, configured in its global configuration at `/etc/inspeqtor/inspeqtor.conf`. Currently that’s the only alert channel supported. If you want Slack or HipChat integration, you’ll be disappointed for now, because that exists only in the PRO version.
Inspeqtor can monitor processes using its INQ configuration. It can restart a process as long as the process follows the restart verb, i.e. `/etc/init.d/<service> restart`. It is opinionated software, so you must follow its recommendations.
While it’s good at monitoring PID-based processes, filesystems, and so on, you can’t write a custom script as a check. So you can’t test HTTP requests or anything else that isn’t supported out of the box.
I already mentioned the PRO version above. It also has other functionality, such as sending event metrics to a database, monitoring cron jobs, and real-time monitoring for Go-based daemons. The PRO version starts at $25/month for 0-20 machines.
There is a repo that maintains and tracks all monitoring and metrics tools. I am still digging through some of them to find better candidates for monitoring our services.