Mon is considered king of all monitoring tools.

Note: This article obsoletes my previous article on MON.

First of all, install the following perl modules:-

perl -MCPAN -e "install Time::HiRes"
perl -MCPAN -e "install Time::Period"
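Before moving on, it can help to confirm that Perl can actually load both modules. The following is only a minimal sketch: the helper name have_perl_module is my own invention and is not part of Mon or CPAN.

```shell
# Return 0 if perl can load the named module, non-zero otherwise.
# have_perl_module is a hypothetical helper name used for illustration.
have_perl_module() {
    perl -M"$1" -e 1 2>/dev/null
}

# Report on the two modules this article asks you to install.
for module in Time::HiRes Time::Period; do
    if have_perl_module "$module"; then
        echo "$module: installed"
    else
        echo "$module: MISSING"
    fi
done
```

If either module is reported as missing, re-run the corresponding CPAN install command before starting Mon.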

Download the Mon and mon-client software:-
cd /root
wget http://kernel.org/pub/software/admin/mon/mon-1.2.0.tar.bz2
wget http://kernel.org/pub/software/admin/mon/mon-client-1.2.0.tar.bz2

cd /usr/local/

[root@www local]# tar xjf /root/mon-1.2.0.tar.bz2

Rename the directory:-

[root@www local]# mv mon-1.2.0 mon

cd /usr/local/mon/etc

cp example.cf mon.cf

Edit mon.cf and change the paths from /usr/lib/mon to /usr/local/mon.

[root@www etc]# vi mon.cf
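If you would rather not change every path by hand in vi, the same substitution can be done with sed. This is a sketch under the assumption that every occurrence of /usr/lib/mon in the file should become /usr/local/mon; the function name rewrite_mon_paths is made up for this example.

```shell
# Rewrite Mon's default install prefix to the one used in this article.
# Reads configuration text on stdin and writes the result to stdout.
# rewrite_mon_paths is a hypothetical helper name.
rewrite_mon_paths() {
    sed 's|/usr/lib/mon|/usr/local/mon|g'
}

# Example with a single sample line:
echo 'alertdir = /usr/lib/mon/alert.d' | rewrite_mon_paths
```

To apply it to the real file, something like "rewrite_mon_paths < mon.cf > mon.cf.new" followed by a review of mon.cf.new before moving it into place is the safer route.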


NOTE:

A “watch” definition (a line which begins with the word “watch” and is
followed by “service” definitions) is terminated by an
empty line, or by a subsequent definition. You may not put blank lines
inside of your watch definitions.
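A stray blank line inside a watch definition is easy to miss, so here is a rough checker. It is only a sketch: the function name check_watch_blank_lines and the list of keywords are my own choices, it assumes comment lines starting with # can be ignored, and it does not replace Mon's own parser.

```shell
# Report lines that look like part of a watch definition but appear
# after a blank line has already terminated that definition.
# check_watch_blank_lines is a hypothetical helper name.
check_watch_blank_lines() {
    awk '
        $1 ~ /^#/         { next }                 # skip comment lines
        NF == 0           { in_watch = 0; next }   # a blank line ends a definition
        $1 == "watch"     { in_watch = 1; next }   # a new watch definition starts
        $1 == "hostgroup" { in_watch = 0; next }   # a hostgroup is not a watch
        $1 ~ /^(service|interval|monitor|period|alert|upalert|alertevery|alertafter|depend|description)$/ && !in_watch {
            printf "line %d: \"%s\" is outside any watch definition\n", NR, $1
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Try it on a small sample that contains the mistake.
sample=$(mktemp)
cat > "$sample" <<'EOF'
watch webservers
service freespace
alert mail.alert mis@domain.com

alert delete.snapshot
EOF
check_watch_blank_lines "$sample" || echo "blank line found inside a watch definition"
rm -f "$sample"
```

Running it against /usr/local/mon/etc/mon.cf before starting the daemon should point at the line number of any alert, period or service line that has been cut off from its watch.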

# global options

cfbasedir = /usr/local/mon/etc
alertdir = /usr/local/mon/alert.d
mondir = /usr/local/mon/mon.d
maxprocs = 20
histlength = 100
randstart = 60s

dtlogfile = /var/log/mon-downtime.log
dtlogging = yes

# authentication types:
#   getpwnam   standard Unix passwd, NOT for shadow passwords
#   shadow     Unix shadow passwords (not implemented)
#   userfile   "mon" user file

authtype = getpwnam

# NB: hostgroup and watch entries are terminated with a blank line (or
# end of file). Don't forget the blank lines between them or you lose.

# group definitions (hostnames or IP addresses)

hostgroup webservers www.example.com

hostgroup mailservers mail.example.com

hostgroup dbservers db.example.com

watch mailservers
service ping
description ping servers
interval 5m
monitor fping.monitor
depend routers:ping
period wd {Mon-Fri} hr {7am-10pm}
alert mail.alert mis@domain.com
alert page.alert mis-pagers@domain.com
alertevery 1h
period wd {Sat-Sun}
alert mail.alert bofh@domain.com
service fping
period wd {Mon-Fri} hr {7am-10pm}
alert mail.alert mis@domain.com
alert page.alert mis-pagers@domain.com
alertevery 1h
service smtp
interval 10m
monitor smtp.monitor
period wd {Mon-Fri} hr {7am-10pm}
alertevery 1h
alertafter 2 30m
alert page.alert mis-pagers@domain.com
service imap
interval 10m
monitor imap.monitor
period wd {Mon-Fri} hr {7am-10pm}
alertevery 1h
alertafter 2 30m
alert page.alert mis-pagers@domain.com
service pop
interval 10m
monitor pop3.monitor
period wd {Mon-Fri} hr {7am-10pm}
alertevery 1h
alertafter 2 30m
alert page.alert mis-pagers@domain.com

watch webservers
service fping
period wd {Mon-Fri} hr {7am-10pm}
alert mail.alert mis@domain.com
alert page.alert mis-pagers@domain.com
alertevery 1h
service ping
interval 2m
monitor fping.monitor
allow_empty_group
period wd {Sun-Sat}
alert qpage.alert mis-pagers
alertevery 45m
service http
interval 4m
monitor http.monitor
allow_empty_group
period wd {Sun-Sat}
alert qpage.alert mis-pagers
upalert mail.alert -S "web server is back up" mis
alertevery 45m
service freespace
interval 15m
monitor freespace.monitor /f330:5000 /f540:5000 ;;
period wd {Sun-Sat}
alert mail.alert mis@domain.com
alert delete.snapshot
alertevery 1h
service ftp
interval 5m
monitor ftp.monitor
period wd {Sun-Sat}
alert mail.alert mis@domain.com
alertevery 1h

watch dbservers
service ping
description ping servers
interval 5m
monitor fping.monitor
depend routers:ping
period wd {Mon-Fri} hr {7am-10pm}
alert mail.alert mis@domain.com
alert page.alert mis-pagers@domain.com
alertevery 1h
period wd {Sat-Sun}
alert mail.alert bofh@domain.com
service fping
period wd {Mon-Fri} hr {7am-10pm}
alert mail.alert mis@domain.com
alert page.alert mis-pagers@domain.com
alertevery 1h
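The interval, alertevery, alertafter and randstart values in the configuration above are written as a number followed by a unit suffix (s, m, h, and Mon also accepts d for days). To compare such values, a small conversion helper can be handy. This is a sketch that assumes whole numbers only; the name to_seconds is hypothetical.

```shell
# Convert a Mon-style time value such as 60s, 5m, 1h or 1d to seconds.
# Assumes the numeric part is a whole number. to_seconds is a made-up name.
to_seconds() {
    value=${1%[smhd]}        # strip one trailing unit letter, if any
    unit=${1#"$value"}       # whatever was stripped is the unit
    case "$unit" in
        s|"") echo "$value" ;;
        m)    echo $((value * 60)) ;;
        h)    echo $((value * 3600)) ;;
        d)    echo $((value * 86400)) ;;
    esac
}

# The values used in the watch definitions above:
for t in 60s 2m 5m 45m 1h; do
    echo "$t = $(to_seconds "$t") seconds"
done
```

For example, "alertafter 2 30m" combined with "interval 10m" means a service is checked every 600 seconds, and two failures inside an 1800 second window are needed before an alert goes out.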

Next, download and install fping:

fping is a ping(1) like program which uses the Internet Control
Message Protocol (ICMP) echo request to determine if a host is
up. fping is different from ping in that you can specify any
number of hosts on the command line, or specify a file containing
the list of hosts to ping. Instead of trying one host until it
times out or replies, fping will send out a ping packet and move
on to the next host in a round-robin fashion. If a host replies,
it is noted and removed from the list of hosts to check. If a host
does not respond within a certain time limit and/or retry limit it
will be considered unreachable.
Checking 2500 hosts (99% of which are unreachable) via ping can take hours.
fping was written to solve the problem of pinging N number of hosts
in an efficient manner. By sending out pings in a round-robin fashion
and checking on responses as they come in at random, a large number of
hosts can be checked at once.

Unlike ping, fping is meant to be used in scripts and its
output is easy to parse.
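To show what "easy to parse" means in practice, here is a small sketch that picks the unreachable hosts out of fping's output. It assumes the default per-host lines of the form "host is alive" and "host is unreachable"; the function name unreachable_hosts and the sample host names are my own.

```shell
# Print the names of hosts that fping reported as unreachable.
# Reads fping output on stdin. unreachable_hosts is a hypothetical name.
unreachable_hosts() {
    awk '$2 == "is" && $3 == "unreachable" { print $1 }'
}

# Sample input in the assumed fping output format:
printf '%s\n' \
    'www.example.com is alive' \
    'mail.example.com is unreachable' \
    'db.example.com is alive' | unreachable_hosts
```

Once fping is installed, the same filter can be fed from a real run, for example "fping www.example.com mail.example.com 2>/dev/null | unreachable_hosts".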


cd /root

wget http://fping.sourceforge.net/download/fping.tar.gz

tar xzf fping.tar.gz
cd fping-2.4b2_to/

./configure
make
make install

cd /root

fping is installed in /usr/local/sbin as a result of "make install". /usr/local/sbin is in the search path by default. If it is not, you can put the absolute path to the fping program in the mon.d/fping.monitor file by manually editing the following line (line 53):-

vi /usr/local/mon/mon.d/fping.monitor

my $CMD = "fping -e -r $RETRIES -t $TIMEOUT";
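To find out whether you need to edit that line at all, check whether fping is found on the search path. The sketch below falls back to /usr/local/sbin/fping, which is where this article's "make install" puts it; the helper name resolve_cmd is made up.

```shell
# Print the path of a command if it is on PATH, otherwise a fallback path.
# resolve_cmd is a hypothetical helper name.
resolve_cmd() {
    command -v "$1" 2>/dev/null || echo "$2"
}

FPING=$(resolve_cmd fping /usr/local/sbin/fping)
echo "fping binary: $FPING"
```

If the printed path is only the fallback and the Mon daemon runs with a PATH that lacks /usr/local/sbin, that is the path to put in front of "fping" in the $CMD line.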

Add the following lines to /etc/services:

mon 2583/tcp # MON
mon 2583/udp # MON traps
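If you script this step, it is worth making it safe to run twice so the entries are not duplicated. The following sketch takes the file name as an argument so that it can be tried on a scratch copy first; the function name add_mon_services is my own.

```shell
# Append the two Mon entries to a services file unless they already exist.
# The file path is an argument so the sketch can be tried on a copy.
# add_mon_services is a hypothetical helper name.
add_mon_services() {
    services_file=$1
    grep -Eq '^mon[[:space:]]+2583/tcp' "$services_file" ||
        printf 'mon\t\t2583/tcp\t# MON\n' >> "$services_file"
    grep -Eq '^mon[[:space:]]+2583/udp' "$services_file" ||
        printf 'mon\t\t2583/udp\t# MON traps\n' >> "$services_file"
}

# Try it on a scratch file, not on /etc/services itself.
scratch=$(mktemp)
add_mon_services "$scratch"
add_mon_services "$scratch"   # the second call adds nothing
cat "$scratch"
rm -f "$scratch"
```

When you are happy with the result, run it as root with /etc/services as the argument.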

Copy the mon startup script to /etc/init.d/ :-

cp /usr/local/mon/etc/S99mon /etc/init.d/mon

vi /etc/init.d/mon


#!/bin/sh
#
# start/stop the mon server
#
# You probably want to set the path to include
# nothing but local filesystems.
#
# chkconfig: 2345 99 10
# description: mon system monitoring daemon
# processname: mon
# config: /usr/local/mon/etc/mon.cf
# pidfile: /var/run/mon.pid
#
PATH=/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/mon
export PATH

# Source function library.
. /etc/rc.d/init.d/functions

# The following two variables are introduced by Kamran
MONCONFIGFILE=/usr/local/mon/etc/mon.cf
MON=/usr/local/mon/mon

# See how we were called.
case "$1" in
start)
    echo -n "Starting mon daemon: "
    # The following line is edited by Kamran. Replaced absolute paths with variable names.
    daemon $MON -f -l -c $MONCONFIGFILE
    echo
    touch /var/lock/subsys/mon
    ;;
stop)
    echo -n "Stopping mon daemon: "
    killproc mon
    echo
    rm -f /var/lock/subsys/mon
    ;;
status)
    status mon
    ;;
restart)
    killall -HUP mon
    ;;
*)
    echo "Usage: mon {start|stop|status|restart}"
    exit 1
    ;;
esac

exit 0


[root@www mon]# chmod +x /etc/init.d/mon
[root@www mon]# chkconfig --level 35 mon on

service mon start

[root@www mon-client-1.2.0]# service mon status
mon (pid 20609) is running…
[root@www mon-client-1.2.0]#


Check:-

[root@www mon]# ps aux | grep mon
root 2134 0.0 0.0 3788 284 ? S Jun27 0:00 /usr/sbin/courierlogger -pid=/var/spool/authdaemon/pid -start /usr/libexec/courier-authlib/authdaemond
root 2135 0.0 0.0 52496 436 ? S Jun27 0:00 /usr/libexec/courier-authlib/authdaemond
root 2148 0.0 0.0 54708 728 ? S Jun27 0:01 /usr/libexec/courier-authlib/authdaemond
root 2149 0.0 0.0 54708 732 ? S Jun27 0:01 /usr/libexec/courier-authlib/authdaemond
root 2150 0.0 0.0 54708 728 ? S Jun27 0:01 /usr/libexec/courier-authlib/authdaemond
root 2151 0.0 0.0 54708 732 ? S Jun27 0:01 /usr/libexec/courier-authlib/authdaemond
root 2152 0.0 0.0 54708 728 ? S Jun27 0:01 /usr/libexec/courier-authlib/authdaemond
dbus 2153 0.0 0.0 21256 344 ? Ss Jun27 0:00 dbus-daemon --system
qscand 2432 0.0 0.0 21564 976 ? Ss Jun27 0:16 /usr/bin/freshclam -d -c 24 --quiet -p /var/run/clamav/freshclam.pid --daemon-notify=/etc/clamd.conf
root 20354 0.0 0.8 106076 8984 ? S 21:56 0:00 /usr/bin/perl /usr/local/mon/mon -f -l -c /usr/local/mon/etc/mon.cf
root 20364 0.0 0.0 61148 680 pts/0 S+ 21:57 0:00 grep mon

That was a lot of output. Let's filter out the lines containing the word courier.

[root@www mon]# ps aux | grep mon | grep -v courier
dbus 2153 0.0 0.0 21256 344 ? Ss Jun27 0:00 dbus-daemon --system
qscand 2432 0.0 0.0 21564 976 ? Ss Jun27 0:16 /usr/bin/freshclam -d -c 24 --quiet -p /var/run/clamav/freshclam.pid --daemon-notify=/etc/clamd.conf
root 20354 0.0 0.8 106076 9004 ? S 21:56 0:00 /usr/bin/perl /usr/local/mon/mon -f -l -c /usr/local/mon/etc/mon.cf
root 20372 0.0 0.3 85284 3244 ? S 21:57 0:00 /usr/bin/perl /usr/local/mon/mon.d/smtp.monitor mail.example.com
root 20377 0.1 0.3 87352 3304 ? S 21:57 0:00 /usr/bin/perl /usr/local/mon/mon.d/imap.monitor mail.example.com
root 20380 0.0 0.0 61148 680 pts/0 S+ 21:57 0:00 grep mon
[root@www mon]#
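Even after removing courier, "grep mon" still matches anything with "mon" in it, such as dbus-daemon and --daemon-notify. A narrower pattern that matches only the Mon install directory gives cleaner output. This is a sketch that assumes the /usr/local/mon layout used in this article; the function name mon_processes is made up.

```shell
# Keep only ps lines whose command is the Mon daemon or one of its monitors.
# Reads "ps aux" style output on stdin. mon_processes is a hypothetical name.
mon_processes() {
    grep -E '/usr/local/mon/(mon |mon\.d/)'
}

# Sample lines taken from the output above:
printf '%s\n' \
    'dbus 2153 0.0 0.0 21256 344 ? Ss Jun27 0:00 dbus-daemon --system' \
    'root 20354 0.0 0.8 106076 9004 ? S 21:56 0:00 /usr/bin/perl /usr/local/mon/mon -f -l -c /usr/local/mon/etc/mon.cf' \
    'root 20372 0.0 0.3 85284 3244 ? S 21:57 0:00 /usr/bin/perl /usr/local/mon/mon.d/smtp.monitor mail.example.com' |
    mon_processes
```

On a live system the equivalent would be "ps aux | mon_processes", which should list only the daemon and whichever monitors happen to be running at that moment.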

Time to copy the client CGI program to the proper location:-

mkdir /var/www/cgi-bin/mon

cp /usr/local/mon/clients/mon.cgi /var/www/cgi-bin/mon/

[root@www mon]# chmod +x /var/www/cgi-bin/mon/mon.cgi

vi /var/www/cgi-bin/mon/mon.cgi
. . .
$organization = "TestSite"; # Organization name.
$monadmin = "kamran\@example.com"; # Your e-mail address. Make sure the backslash is present.
$reload_time = 30; # Seconds for page reload.
. . .

Try accessing this page from the web browser :-

http://www.example.com/cgi-bin/mon/mon.cgi

If you get a blank page, check your apache error log for your site:-

[Thu Jul 30 22:08:01 2009] [error] [client 76.74.237.16] Can’t locate Mon/Client.pm in @INC (@INC contains: /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.7/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.6/x86_64-linux-thread-multi /usr/lib64/perl5/site_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl/5.8.7 /usr/lib/perl5/site_perl/5.8.6 /usr/lib/perl5/site_perl/5.8.5 /usr/lib/perl5/site_perl /usr/lib64/perl5/vendor_perl/5.8.8/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.7/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.6/x86_64-linux-thread-multi /usr/lib64/perl5/vendor_perl/5.8.5/x86_64-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl/5.8.7 /usr/lib/perl5/vendor_perl/5.8.6 /usr/lib/perl5/vendor_perl/5.8.5 /usr/lib/perl5/vendor_perl /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi /usr/lib/perl5/5.8.8) at /var/www/cgi-bin/mon/mon.cgi line 138.
[Thu Jul 30 22:08:01 2009] [error] [client 76.74.237.16] BEGIN failed--compilation aborted at /var/www/cgi-bin/mon/mon.cgi line 138.
[Thu Jul 30 22:08:01 2009] [error] [client 76.74.237.16] Premature end of script headers: mon.cgi

This means that the Mon::Client Perl module (Mon/Client.pm) needs to be installed:-

[root@www mon]# perl -MCPAN -e "install Mon::Client"


Running make install
Prepending /root/.cpan/build/Mon-0.11-x4te9h/blib/arch /root/.cpan/build/Mon-0.11-x4te9h/blib/lib to PERL5LIB for 'install'
Manifying blib/man3/Mon::Protocol.3pm
Manifying blib/man3/Mon::SNMP.3pm
Manifying blib/man3/Mon::Client.3pm
Installing /usr/lib/perl5/site_perl/5.8.8/Mon/SNMP.pm
Installing /usr/lib/perl5/site_perl/5.8.8/Mon/Protocol.pm
Installing /usr/lib/perl5/site_perl/5.8.8/Mon/Client.pm
Installing /usr/share/man/man3/Mon::SNMP.3pm
Installing /usr/share/man/man3/Mon::Protocol.3pm
Installing /usr/share/man/man3/Mon::Client.3pm
Appending installation info to /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/perllocal.pod
TROCKIJ/Mon-0.11.tar.gz
/usr/bin/make install -- OK
[root@www mon]#

If this doesn’t work for you, then you can also untar the mon-client package and install these modules from there.

Note: In my experience, installing the Perl modules that come with mon-client was the better option. With the CPAN version I was getting the following error, and googling for it did not help:-

[Thu Jul 30 22:18:55 2009] [error] [client 76.74.237.16] Can't locate object method "list_views" via package "Mon::Client" at /var/www/cgi-bin/mon/mon.cgi line 2175, line 1.
[Thu Jul 30 22:18:55 2009] [error] [client 76.74.237.16] Premature end of script headers: mon.cgi

As you can see below, the mon-client tarball contains the same three Perl modules plus one additional module (Config.pm):-

cd /root

tar xjf mon-client-1.2.0.tar.bz2

cd mon-client-1.2.0

ls
CHANGES COPYING COPYRIGHT Makefile.PL MANIFEST Mon README test.pl VERSION

ls Mon/
Client.pm Config.pm Protocol.pm SNMP.pm

To actually install them, use:

perl Makefile.PL
make
make test
make install
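Since the "list_views" error comes from an installed Mon::Client that lacks that method, you can check for the method directly after installing. This is only a sketch; the helper name perl_module_can is my own, and it relies on nothing more than Perl's built-in can() method.

```shell
# Return 0 if the given Perl module loads and provides the given method.
# perl_module_can is a hypothetical helper name.
perl_module_can() {
    perl -M"$1" -e 'exit($ARGV[0]->can($ARGV[1]) ? 0 : 1)' "$1" "$2" 2>/dev/null
}

if perl_module_can Mon::Client list_views; then
    echo "Mon::Client provides list_views"
else
    echo "Mon::Client is missing or has no list_views method"
fi
```

If the second message appears, the copy of Mon::Client that Perl finds first is probably not the one from the mon-client tarball.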

Try reloading the page : http://www.example.com/cgi-bin/mon/mon.cgi

This time, you should be able to see the page showing some statistics.

Remember to relax your firewall to allow outgoing traffic for the protocols / monitors you are using for different servers.
Similarly the servers you are monitoring should also have a relaxed firewall to allow incoming connections from the monitoring server.
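When working out which firewall rules are needed, a quick lookup of the port behind each monitor helps. The sketch below assumes every service runs on its standard well-known TCP port; the function name monitor_port is hypothetical. The ping and fping checks use ICMP and therefore have no port.

```shell
# Map a service name from mon.cf to its standard TCP port.
# Assumes default ports. monitor_port is a hypothetical helper name.
monitor_port() {
    case "$1" in
        smtp) echo 25 ;;
        imap) echo 143 ;;
        pop3) echo 110 ;;
        http) echo 80 ;;
        ftp)  echo 21 ;;
        *)    return 1 ;;   # unknown service, or one without a TCP port
    esac
}

for svc in smtp imap pop3 http ftp; do
    echo "$svc -> tcp/$(monitor_port "$svc")"
done
```

Each port in that list must be allowed outbound on the monitoring server and inbound on the monitored server, along with ICMP echo for the ping checks.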

Securing access to mon.cgi :-

In your Apache config file, add the following code:-

vi /etc/httpd/conf/httpd.conf

<Directory "/var/www/cgi-bin/mon">
AllowOverride AuthConfig
Options None
Order allow,deny
Allow from all
</Directory>

service httpd reload

Create a .htaccess file in mon’s cgi directory:-

[root@www mon]# vi /var/www/cgi-bin/mon/.htaccess
AuthName "Authorization Required"
AuthType Basic
AuthUserFile /var/www/vhosts/.htpasswd
Require valid-user
[root@www mon]#

Change permissions and ownership of the .htaccess file:-

chown siteftpuser:apache /var/www/cgi-bin/mon/.htaccess
chmod 640 /var/www/cgi-bin/mon/.htaccess

Create the password file and add a user (named monitor here):-

[root@www mon]# htpasswd -c /var/www/vhosts/.htpasswd monitor

Read the MON documentation in the doc directory to learn how to write monitors and alerts:-

[root@www mon]# ls doc/
CHANGES.mon.cgi monshow.1 README.msql-mysql.monitor README.software
globals README.alerts README.paging README.syslog.monitor
how-to-write-a-monitor.txt README.cgi-bin README.protocol README.traps
how-to-write-an-alert.txt README.hints README.rpc.monitor README.variables
mon.8 README.mon.cgi README.snmpdiskspace.monitor
moncmd.1 README.monitors README.snmpvar.monitor

The following articles are quite helpful, having sample MON configurations, etc.

Regards,

Kamran