Monitoring with MON
This is an extract from an implementation I did on a heartbeat-based system. Mon is an excellent monitoring and reporting tool.
It is assumed that there are two servers connected to each other over multiple heartbeat links.
MON for heartbeat monitoring:
Both servers:
wget ftp://ftp.kernel.org/pub/software/admin/mon/mon-1.2.0.tar.gz
wget ftp://ftp.kernel.org/pub/software/admin/mon/mon-client-1.2.0.tar.gz
tar xzf mon-1.2.0.tar.gz
tar xzf mon-client-1.2.0.tar.gz
cd mon-client-1.2.0
perl Makefile.PL
make && make install
[root@gateway2 mon-client-1.2.0]# make && make install
Manifying blib/man3/Mon::Protocol.3pm
Manifying blib/man3/Mon::SNMP.3pm
Manifying blib/man3/Mon::Client.3pm
Manifying blib/man3/Mon::Config.3pm
Manifying blib/man3/Mon::Protocol.3pm
Manifying blib/man3/Mon::SNMP.3pm
Installing /usr/lib/perl5/site_perl/5.8.8/Mon/Protocol.pm
Installing /usr/lib/perl5/site_perl/5.8.8/Mon/SNMP.pm
Installing /usr/lib/perl5/site_perl/5.8.8/Mon/Config.pm
Installing /usr/lib/perl5/site_perl/5.8.8/Mon/Client.pm
Installing /usr/share/man/man3/Mon::SNMP.3pm
Installing /usr/share/man/man3/Mon::Config.3pm
Installing /usr/share/man/man3/Mon::Protocol.3pm
Installing /usr/share/man/man3/Mon::Client.3pm
Writing /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/Mon/.packlist
Appending installation info to /usr/lib/perl5/5.8.8/i386-linux-thread-multi/perllocal.pod
[root@gateway2 mon-client-1.2.0]#
Mon requires that the *.ph files be generated from the system header files:
cd /usr/include/
h2ph -r -l .
# Let's copy the entire mon distribution to /usr/local
mkdir /usr/local/mon/
cp -r /root/mon-1.2.0/* /usr/local/mon/
cd /usr/local/mon/etc
cp example.cf mon.cf
Note: Do not put blank lines inside a watch definition in this config file; a blank line ends the definition.
vi /usr/local/mon/etc/mon.cf
#
# Example “mon.cf” configuration for “mon”.
#
# $Id: example.cf,v 1.1.1.1.4.1 2007/06/25 13:10:08 trockij Exp $
#
# Please read the mon.8 manual page!
#
# NOTE:
#
# A “watch” definition (a line which begins with the word “watch” and is
# followed by “service” definitions) is terminated by an
# empty line, or by a subsequent definition. You may not put blank lines
# inside of your watch definitions.
#
#
# global options
#
cfbasedir = /usr/local/mon/etc
alertdir = /usr/local/mon/alert.d
mondir = /usr/local/mon/mon.d
maxprocs = 20
histlength = 100
randstart = 60s
#
# authentication types:
# getpwnam standard Unix passwd, NOT for shadow passwords
# shadow Unix shadow passwords (not implemented)
# userfile “mon” user file
#
authtype = getpwnam
#
# NB: hostgroup and watch entries are terminated with a blank line (or
# end of file). Don’t forget the blank lines between them or you lose.
#
#
# group definitions (hostnames or IP addresses)
#
hostgroup servers 192.168.0.251 192.168.0.252

hostgroup mailhosts 192.168.0.251 192.168.0.252

hostgroup routers 192.168.1.254 192.168.2.254 192.168.3.254

#
# For the servers in building 1, monitor ping and telnet
# BOFH is on weekend call :)
#
watch servers
    service ping
        description ping servers in servers group
        interval 1m
        monitor fping.monitor
        period wd {Mon-Fri} hr {7am-10pm}
            alert mail.alert kamran@wbitt.com
            alert page.alert mis-pagers@domain.com
            alertevery 1h
        period NOALERTEVERY: wd {Mon-Fri} hr {7am-10pm}
            alert mail.alert kamran@wbitt.com
            alert page.alert mis-pagers@domain.com
        period wd {Sat-Sun}
            alert mail.alert kamran@wbitt.com
            alert page.alert bofh@domain.com

watch mailhosts
    service fping
        period wd {Mon-Fri} hr {7am-10pm}
            alert mail.alert kamran@wbitt.com
            alert page.alert mis-pagers@domain.com
            alertevery 1h
    service smtp
        interval 1m
        monitor smtp.monitor
        period wd {Mon-Fri} hr {7am-10pm}
            alertevery 1h
            alertafter 2 30m
            alert page.alert mis-pagers@domain.com

#
# If the routers aren’t pingable, send a page using
# a phone line and the IXO protocol, which doesn’t
# rely on the network. Failure of a router is pretty serious,
# so check every minute.
#
# Send out one page every 45 minutes, but log the failure
# to a file every time.
#
watch routers
    service ping
        description routers which connect bd1 and bd2
        interval 1m
        monitor fping.monitor
        period wd {Sun-Sat}
            alert qpage.alert mis-pagers
            alertevery 45m
        period LOGFILE: wd {Sun-Sat}
            alert file.alert -d /usr/lib/mon/log.d routers.log
[root@gateway2 etc]#
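Because a stray blank line silently ends a watch definition, it can be worth checking mon.cf before starting mon. The following is only a minimal sketch and is not part of mon: the function name `lint_mon_cf` and the list of keywords it looks for are my own assumptions, and it assumes that comment lines do not end a definition.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with mon): flag service-level keywords
# that appear outside a watch definition, e.g. after an accidental blank line.
lint_mon_cf() {
    awk '
        NF == 0       { in_watch = 0; next }   # a blank line ends the current definition
        $1 ~ /^#/     { next }                 # assumption: comments are simply skipped
        $1 == "watch" { in_watch = 1; next }
        $1 ~ /^(service|description|interval|monitor|period|alert|upalert|alertevery|alertafter)$/ {
            if (!in_watch) {
                printf "%s:%d: \"%s\" is outside any watch definition\n", FILENAME, NR, $1
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Usage (path taken from the steps above):
#   lint_mon_cf /usr/local/mon/etc/mon.cf && echo "mon.cf looks sane"
```

The check is a heuristic: it only reports keywords that have lost their enclosing watch line, so a clean run does not prove the file is valid.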
A few of the monitors are written in C and must be compiled before use; the others are scripts. To build the C monitors:
cd /usr/local/mon/mon.d/
vi Makefile
…
MONPATH=/usr/local/mon
…
# make
gcc -o rpc.monitor -O2 -Wall -g rpc.monitor.c
gcc -o dialin.monitor.wrap -O2 -Wall -g \
-DREAL_DIALIN_MONITOR=\"/usr/local/mon/mon.d/dialin.monitor\" \
dialin.monitor.wrap.c
# make install
install -d /usr/lib/mon/mon.d
install rpc.monitor /usr/lib/mon/mon.d/
install -g uucp -m 02555 dialin.monitor.wrap /usr/lib/mon/mon.d/
Copy the rest of the monitors directly to /usr/lib/mon/mon.d/
cp *.monitor /usr/lib/mon/mon.d/
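Note that the mon.cf shown earlier sets mondir = /usr/local/mon/mon.d, while the commands above install monitors into /usr/lib/mon/mon.d. A small check along these lines can show which directory mon will actually search. It is only a sketch: `cf_get` is a hypothetical helper name, and it assumes the simple "key = value" layout used in the global options section.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a "key = value" global option.
cf_get() {
    awk -v key="$1" '$1 == key && $2 == "=" { print $3; exit }' "$2"
}

# Usage (paths taken from this article):
#   mondir=$(cf_get mondir /usr/local/mon/etc/mon.cf)
#   ls "$mondir"/*.monitor     # the monitors named in mon.cf should be listed here
```

If the listing comes back empty, either copy the monitors into that directory or change mondir to match.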
Add the following lines to /etc/services:
mon 2583/tcp # MON
mon 2583/udp # MON traps
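If you script this step, it helps to make it safe to run twice. The sketch below appends an entry only when it is missing; `add_service` is a hypothetical helper name of my own, not a system tool.

```shell
#!/bin/sh
# Hypothetical helper: append "name port/proto # comment" to a services file
# unless an entry for that name and port/proto is already there.
add_service() {
    file=$1 name=$2 port=$3 comment=$4
    if ! grep -Eq "^${name}[[:space:]]+${port}([[:space:]]|\$)" "$file"; then
        printf '%s\t\t%s\t# %s\n' "$name" "$port" "$comment" >> "$file"
    fi
}

# Usage (run as root):
#   add_service /etc/services mon 2583/tcp "MON"
#   add_service /etc/services mon 2583/udp "MON traps"
```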
cp /usr/local/mon/etc/S99mon /etc/init.d/mon
vi /etc/init.d/mon
#!/bin/sh
#
# start/stop the mon server
#
# You probably want to set the path to include
# nothing but local filesystems.
#
# chkconfig: 2345 99 10
# description: mon system monitoring daemon
# processname: mon
# config: /usr/local/mon/etc/mon.cf
# pidfile: /var/run/mon.pid
#
PATH=/usr/local/mon:/bin:/usr/bin:/sbin:/usr/sbin
export PATH
MON="/usr/local/mon/mon"
CONFIGFILE="/usr/local/mon/etc/mon.cf"
# Source function library.
. /etc/rc.d/init.d/functions
# See how we were called.
case "$1" in
start)
echo -n "Starting mon daemon: "
daemon $MON -f -l -c $CONFIGFILE
echo
touch /var/lock/subsys/mon
;;
stop)
echo -n "Stopping mon daemon: "
killproc mon
echo
rm -f /var/lock/subsys/mon
;;
status)
status mon
;;
restart)
killall -HUP mon
;;
*)
echo "Usage: mon {start|stop|status|restart}"
exit 1
esac
exit 0
chmod +x /etc/init.d/mon
perl -MCPAN -e "install Time::HiRes"
perl -MCPAN -e "install Time::Period"
mkdir /usr/local/mon/log.d/
chkconfig --level 35 mon on
service mon start
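If mon fails to start with an error like the following, install the missing Perl modules and start it again:
Starting mon daemon: Can't locate Time/Period.pm in @INC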
perl -MCPAN -e "install Time::HiRes"
perl -MCPAN -e "install Time::Period"
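To confirm that the modules are visible to Perl before starting mon again, a check like the following can be used. It is a sketch under the assumption that `perl` is on the PATH; `missing_perl_modules` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print each named Perl module that fails to load.
# Returns non-zero if at least one module is missing.
missing_perl_modules() {
    rc=0
    for mod in "$@"; do
        if ! perl -M"$mod" -e 1 >/dev/null 2>&1; then
            printf '%s\n' "$mod"
            rc=1
        fi
    done
    return $rc
}

# Usage: the two modules installed in this article
#   missing_perl_modules Time::HiRes Time::Period || echo "install the modules listed above"
```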
service mon start
# ps aux | grep "mon"
dbus 1928 0.0 0.3 2720 864 ? Ss Feb26 0:00 dbus-daemon --system
root 11879 0.0 2.2 10360 5732 ? S 12:32 0:00 /usr/bin/perl /usr/local/mon/mon -f -l -c /usr/local/mon/etc/mon.cf
root 12163 0.0 0.0 0 0 ? Z 12:39 0:00 [mon]
root 12199 0.0 0.2 3892 656 pts/0 R+ 12:39 0:00 grep mon
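As the output shows, grepping for "mon" also matches dbus-daemon and the grep command itself. A narrower filter is sketched below; `filter_mon` is a hypothetical helper name, and its default path is the one used in this article.

```shell
#!/bin/sh
# Hypothetical helper: read ps output on stdin and keep only lines whose
# command line contains the mon script path as a separate word.
filter_mon() {
    awk -v mon="${1:-/usr/local/mon/mon}" '
        { for (i = 1; i <= NF; i++) if ($i == mon) { print; next } }
    '
}

# Usage:
#   ps aux | filter_mon
```

Defunct children that ps lists as [mon] are not matched by this filter, since their command line no longer contains the script path.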
Mon now sends alert mails reporting that fping cannot be found:
Subject: ALERT servers/ping: could not open pipe to fping: No such file or directory (Thu Feb 28 12:38:08)
Summary output : could not open pipe to fping: No such file or directory
Download and install fping:
wget http://fping.sourceforge.net/download/fping.tar.gz
tar xzf fping.tar.gz
cd fping-2.4b2_to/
./configure
make
make install
Edit the monitor so that it calls the fping binary just installed (here /usr/local/sbin/fping):
vi /usr/local/mon/mon.d/fping.monitor
my $CMD = "/usr/local/sbin/fping -e -r $RETRIES -t $TIMEOUT";
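If you are not sure where fping ended up, a search over a few likely directories gives the path to put into $CMD. This is a sketch only: `find_binary` is a hypothetical helper name and the candidate directories in the usage line are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print the first directory/name that is an executable file.
find_binary() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Usage: the candidate directories are assumptions, adjust to your system
#   find_binary fping /usr/local/sbin /usr/local/bin /usr/sbin /usr/bin
```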
cp clients/mon.cgi /var/www/cgi-bin/
chmod +x /var/www/cgi-bin/mon.cgi
vi /var/www/cgi-bin/mon.cgi
. . .
$organization = "TestSite"; # Organization name.
$monadmin = "kamran\@wbitt.com"; # Your e-mail address. Make sure the backslash is present.
$reload_time = 30; # Seconds for page reload.
. . .
Check in a browser: http://192.168.0.251/cgi-bin/mon.cgi
Alhumdulillah.