
FreeBSD: Fixing Nagios Not Sending Alert SMS Messages

Cause: the path to the mail program is wrong

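  The other lines are similar to this one. I marked the most useful part in red; roughly, it says that the binary or executable named above could not be executed. This entry involves only two executables: printf and mail. I took each one out and ran it on its own, exactly as written, as follows: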

  (1) /usr/bin/printf "%b" "***** Nagios 2.9 *****\n" prints ***** Nagios 2.9 *****, which is the expected result.

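  (2) /bin/mail -s "Host DOWN alert for mail-server!" sery@163.com prints su: /bin/mail: No such file or directory, meaning the path does not exist. Yet I had sent mail by hand earlier, so the mail client is clearly installed. The path itself is probably wrong: /bin/mail is the Linux location. Searching with find / -name shows that on FreeBSD mail is at /usr/bin/mail.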
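
  The manual check in step (2) can also be scripted. The following is a minimal sketch, not part of the original setup: find_mail_binary is a hypothetical helper name, and the candidate list is an assumption based on the two paths mentioned above.

```shell
# Hypothetical helper: print the first argument that is an executable file.
# Returns 0 if one is found, 1 otherwise.
find_mail_binary() {
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidate paths are assumptions: /bin/mail is the path from the failing
# command, /usr/bin/mail is where the article found mail on FreeBSD.
find_mail_binary /bin/mail /usr/bin/mail || echo "mail not found in the listed paths"
```

  Checking a short list of known locations is much faster than walking the whole filesystem with find, at the cost of missing installations in unusual places.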

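  At this point we know the root cause of the missing mail. Next, in the Nagios configuration file commands.cfg, I replaced "/bin/mail" with "/usr/bin/mail" in both the host-notify-by-email and service-notify-by-email commands. The complete definitions are: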

  # 'host-notify-by-email' command definition
  define command{
  command_name    host-notify-by-email
  command_line    /usr/bin/printf "%b" "***** Nagios 2.9 *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/bin/mail -s "Host $HOSTSTATE$ alert for $HOSTNAME$!" $CONTACTEMAIL$
  }
  # 'service-notify-by-email' command definition
  define command{
  command_name    service-notify-by-email
  command_line    /usr/bin/printf "%b" "***** Nagios 2.9 *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ alert - $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
  }
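
  The same replacement can be made with sed instead of editing by hand. This is only a sketch: fix_mail_path is a hypothetical helper name, and the commands.cfg location in the commented usage line is an assumption about this installation.

```shell
# Hypothetical helper: rewrite "| /bin/mail " to "| /usr/bin/mail " in a
# Nagios command file, keeping a backup copy with a .bak suffix.
fix_mail_path() {
    cfg="$1"
    # In a basic regular expression "|" is a literal pipe character, so the
    # pattern only matches /bin/mail directly after a pipe. Lines that
    # already use /usr/bin/mail are left alone, so rerunning is harmless.
    sed -i.bak 's#| /bin/mail #| /usr/bin/mail #g' "$cfg"
}

# Assumed location; adjust to wherever commands.cfg lives on your system.
# fix_mail_path /usr/local/nagios/etc/commands.cfg
```

  The -i.bak form, with the suffix attached directly to the option, is used because it is accepted by both FreeBSD sed and GNU sed.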

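  After editing commands.cfg, restart Nagios and check the Nagios log again. The "Make sure the script or binary you are trying to execute actually exists..." error no longer appears, and the log now records an alert notification being sent: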

  [root@nagios /usr/local/nagios/var]# tail -f nagios.log
  [1217170467] SERVICE ALERT: mail-server;check_tcp 995;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
  [1217170534] Auto-save of retention data completed successfully.
  [1217170577] HOST ALERT: mail-server;DOWN;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
  [1217170587] HOST ALERT: mail-server;DOWN;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
  [1217170597] HOST ALERT: mail-server;DOWN;SOFT;3;CRITICAL - Plugin timed out after 10 seconds
  [1217170607] HOST ALERT: mail-server;DOWN;SOFT;4;CRITICAL - Plugin timed out after 10 seconds
  [1217170607] HOST ALERT: mail-server;UP;SOFT;5;PING OK - Packet loss = 0%, RTA = 111.63 ms
  [1217170607] SERVICE ALERT: mail-server;check_tcp 995;CRITICAL;SOFT;2;CRITICAL - Socket timeout after 10 seconds
  [1217170687] SERVICE ALERT: mail-server;check_tcp 995;OK;SOFT;3;TCP OK - 3.137 second response time on port 995
  [1217171057] SERVICE NOTIFICATION: sery;fav-0;check_tcp 443;CRITICAL;service-notify-by-email;CRITICAL - Socket timeout after 10 seconds
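
  Rather than reading the tail output by eye, the two things to look for can be counted with grep. This is a rough sketch: check_nagios_log is a hypothetical helper name, and the search strings are taken from the error message and log lines quoted in this article.

```shell
# Hypothetical helper: count exec errors and notification records in a Nagios log.
check_nagios_log() {
    log="$1"
    # grep -c prints the number of matching lines but exits non-zero when
    # that number is 0, so "|| true" keeps a zero count from looking like a failure.
    errors=$(grep -c 'Make sure the script or binary' "$log" || true)
    sent=$(grep -c -E '(HOST|SERVICE) NOTIFICATION:' "$log" || true)
    printf 'exec errors: %s, notifications: %s\n' "$errors" "$sent"
}

# Log path as shown in the shell prompt above; adjust for your installation.
# check_nagios_log /usr/local/nagios/var/nagios.log
```

  Note that the log may still contain error lines written before the restart, so a non-zero error count is only a concern if it keeps growing afterwards.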

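  I could hardly wait to check my mail, and sure enough, the long-awaited alert had arrived in my 163 mailbox. A look back at the mail log /var/log/maillog shows that the delivery was recorded there as well.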

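  Lesson learned: logs are invaluable when troubleshooting. In day-to-day work we should regularly check the system logs and the logs of the applications involved, and look for clues in their entries to work out the fix.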

  2008/7/27
  Wuzhen Ge, Fuyuanmen, Haidian
 
