Cloud Monitoring: Nagios Installation Steps

2022-10-23

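Preface

I have recently been looking into cloud monitoring tools. Having already written up the installation steps for Ganglia, this time I am recording the installation steps for Nagios.

This article does not cover the underlying principles; please consult other resources for those.

Goal: even if you have never worked with Nagios before, you should be able to build your own Nagios monitoring cluster by following the steps here.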

@Author duangr

@Website http://my.oschina.net/duangr/blog/183160


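1. Introduction to Nagios

Nagios is an open-source monitoring system that runs on Linux/Unix platforms and can monitor system status and network information. It monitors the local or remote hosts and services you specify and provides alert notifications: when a system or service enters an abnormal state, it sends an email or SMS alert so that the site's operations staff are notified right away, and when the state recovers it sends a recovery notification by email or SMS.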


2. Environment

Host Name   IP              OS           Arch
duangr-1    192.168.56.10   CentOS 6.4   x86_64
duangr-2    192.168.56.11   CentOS 6.4   x86_64
duangr-3    192.168.56.12   CentOS 6.4   x86_64

3. Deployment Plan

The Nagios master node needs:

  • nagios
  • nagios-plugin
  • nrpe
  • php
  • apache

The Nagios slave (monitored) nodes need:

  • nagios-plugin
  • nrpe

Installation path plan

nagios install path   /usr/local/nagios
php install path      /usr/local/php
apache install path   /usr/local/apache2

4. Getting the Source

  • nagios-4.0.2.tar.gz
  • nagios-plugins-1.5.tar.gz
  • nrpe-2.15.tar.gz
  • httpd-2.2.23.tar.gz
  • php-5.4.10.tar.gz

5. Prerequisites

5.1 Host environment check (all nodes)

# rpm -q gcc glibc glibc-common gd gd-devel xinetd openssl-devel
gcc-4.4.7-3.el6.x86_64
glibc-2.14.1-6.x86_64
glibc-common-2.14.1-6.x86_64
gd-2.0.35-11.el6.x86_64
package gd-devel is not installed
package xinetd is not installed
openssl-devel-1.0.0-27.el6.x86_64

If any package is missing, install it first. The packages can be downloaded from the following mirror sites:

  • http://rpm.pbone.net/
  • http://mirrors.163.com/centos/6.4/os/x86_64/Packages/
  • http://mirrors.sohu.com/centos/6.4/os/x86_64/Packages/

After installation, run the check again:

# rpm -q gcc glibc glibc-common gd gd-devel xinetd openssl-devel
gcc-4.4.7-3.el6.x86_64
glibc-2.14.1-6.x86_64
glibc-common-2.14.1-6.x86_64
gd-2.0.35-11.el6.x86_64
gd-devel-2.0.35-11.el6.x86_64
xinetd-2.3.14-38.el6.x86_64
openssl-devel-1.0.0-27.el6.x86_64
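
The dependency check in 5.1 can also be scripted so that only the missing packages are listed. The sketch below is a minimal example, not part of the original procedure: the function name missing_packages is hypothetical, and it assumes rpm prints its messages in English (for example when run with LANG=C).

```shell
#!/bin/sh
# Print the names of packages that `rpm -q` reports as not installed.
# Reads `rpm -q` output on stdin, one line per package.
# Assumption: rpm messages are in English ("package X is not installed").
missing_packages() {
    sed -n 's/^package \(.*\) is not installed$/\1/p'
}

# Usage on a real host (package list taken from section 5.1):
#   LANG=C rpm -q gcc glibc glibc-common gd gd-devel xinetd openssl-devel | missing_packages
```

With the first rpm output shown in 5.1 as input, this would print gd-devel and xinetd, one per line.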


6. Compile and Install

6.1 Create the nagios user (all nodes)

useradd nagios -d /usr/local/nagios
passwd nagios    # choose your own password

6.2 Install the Nagios core (master node only)

tar -zxf nagios-4.0.2.tar.gz
cd nagios-4.0.2
./configure --prefix=/usr/local/nagios
make all
make install && make install-init && make install-commandmode && make install-config

Register nagios as a service:

chkconfig --add nagios
chkconfig nagios off
chkconfig --level 35 nagios on
chkconfig --list nagios
nagios    0:off  1:off  2:off  3:on  4:off  5:on  6:off

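6.3 Install the Nagios plugins (all nodes)

tar -zxf nagios-plugins-1.5.tar.gz
cd nagios-plugins-1.5
./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios
make && make install

If you hit MySQL-related compile errors, the cause is that MySQL is installed somewhere other than its default path. Point --with-mysql at the actual MySQL install path and run make again:

./configure --prefix=/usr/local/nagios --with-mysql=/usr/local/mysql
make && make install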

6.4 Install NRPE (all nodes)

tar -zxf nrpe-2.15.tar.gz
cd nrpe-2.15
./configure --enable-command-args
make all
make install-plugin

The following step is only needed on the monitored nodes:

make install-daemon && make install-daemon-config && make install-xinetd

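6.4.1 Monitored node configuration

On a monitored node, NRPE must be configured to run as a daemon (started through xinetd).

1. Edit /etc/xinetd.d/nrpe to allow connections from the Nagios master node:

vi /etc/xinetd.d/nrpe
only_from = 127.0.0.1 192.168.56.10

2. Append the following line to the end of /etc/services:

nrpe 5666/tcp # NRPE

3. Enable support for command arguments:

vi /usr/local/nagios/etc/nrpe.cfg
dont_blame_nrpe=1

4. Restart xinetd:

service xinetd restart

5. Verify that nrpe is listening:

netstat -at | grep nrpe

6. Test that nrpe responds:

/usr/local/nagios/libexec/check_nrpe -H localhost
NRPE v2.15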
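
When the steps above are repeated on several monitored nodes, step 2 is easy to apply twice by accident. The following is a minimal sketch of an idempotent version of that step; the function name ensure_nrpe_service is hypothetical and the file argument exists only so the function can be tried on a copy before touching /etc/services.

```shell
#!/bin/sh
# Append the NRPE entry to a services file unless a line for the
# nrpe service already exists. $1 is the file to edit (default /etc/services).
ensure_nrpe_service() {
    file=${1:-/etc/services}
    if grep -q '^nrpe[[:space:]]' "$file"; then
        echo "nrpe entry already present in $file"
    else
        printf 'nrpe\t\t5666/tcp\t\t\t# NRPE\n' >> "$file"
        echo "nrpe entry added to $file"
    fi
}

# Usage (as root on a monitored node):
#   ensure_nrpe_service /etc/services
```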

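6.4.2 Master node configuration

On the monitoring master node, once NRPE has been configured on all monitored nodes, check each of them in turn:

/usr/local/nagios/libexec/check_nrpe -H 192.168.56.11
NRPE v2.15
/usr/local/nagios/libexec/check_nrpe -H 192.168.56.12
NRPE v2.15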
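
With more than a couple of monitored nodes, the same check can be run in a loop. This is a small sketch under stated assumptions: the function name check_nodes and the CHECK_NRPE variable are made up for this example, and the default path is the plugin location used in this article.

```shell
#!/bin/sh
# Path to check_nrpe as installed in section 6.4; set CHECK_NRPE to override.
CHECK_NRPE=${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}

# Run check_nrpe against every address given and report OK or FAILED per host.
# Returns non-zero if at least one host failed.
check_nodes() {
    failed=0
    for host in "$@"; do
        if output=$("$CHECK_NRPE" -H "$host" 2>&1); then
            echo "$host OK: $output"
        else
            echo "$host FAILED: $output"
            failed=1
        fi
    done
    return "$failed"
}

# Usage on the master node:
#   check_nodes 192.168.56.11 192.168.56.12
```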

6.5 Install Apache (master node only)

tar -zxf httpd-2.2.23.tar.gz
cd httpd-2.2.23
./configure --prefix=/usr/local/apache2
make && make install

6.6 Install PHP (master node only)

cd /export/home/tools/soft/php
tar -zxf php-5.4.10.tar.gz
cd php-5.4.10
./configure --prefix=/usr/local/php --with-apxs2=/usr/local/apache2/bin/apxs
make && make install

6.7 Serve the PHP web interface with Apache

vi /usr/local/apache2/conf/httpd.conf

....
Listen 80
....
<IfModule dir_module>
    DirectoryIndex index.html index.php
    AddType application/x-httpd-php .php
</IfModule>
....
# setting for nagios
ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
<Directory "/usr/local/nagios/sbin">
    AuthType Basic
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "NagiosAccess"
    AuthUserFile /usr/local/nagios/etc/htpasswd
    Require valid-user
</Directory>
Alias /nagios "/usr/local/nagios/share"
<Directory "/usr/local/nagios/share">
    AuthType Basic
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "nagiosAccess"
    AuthUserFile /usr/local/nagios/etc/htpasswd
    Require valid-user
</Directory>

Add a user name and password for web access (the user name here is admin; choose your own if you prefer):

/usr/local/apache2/bin/htpasswd -c /usr/local/nagios/etc/htpasswd admin

Start Apache:

/usr/local/apache2/bin/apachectl start

Open the page: http://192.168.56.10/nagios/


7. Configure Nagios

7.1 Configure the remote monitored nodes

7.1.1 Edit the configuration file

# su - nagios
$ vi /usr/local/nagios/etc/nrpe.cfg

Change the command definitions to the following:

command[check_users]=/usr/local/nagios/libexec/check_users -w $ARG1$ -c $ARG2$
command[check_load]=/usr/local/nagios/libexec/check_load -w $ARG1$ -c $ARG2$
command[check_disk]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
command[check_procs]=/usr/local/nagios/libexec/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
command[check_procs_args]=/usr/local/nagios/libexec/check_procs $ARG1$
command[check_swap]=/usr/local/nagios/libexec/check_swap -w $ARG1$ -c $ARG2$

What these monitoring commands do:

  • check_users: number of logged-in users
  • check_load: CPU load
  • check_disk: disk usage
  • check_procs: number of processes; the states counted include RSZDT
  • check_swap: swap usage

7.1.2 Restart the xinetd service

After configuring the commands above, restart the xinetd service:

service xinetd restart

7.1.3 Verify the configuration

Check that the monitoring commands are configured correctly:

/usr/local/nagios/libexec/check_nrpe -H localhost -c check_users -a 5 10
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_load -a 15,10,5 30,25,20
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_disk -a 20% 10% /
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_procs -a 200 400 RSZDT
/usr/local/nagios/libexec/check_nrpe -H localhost -c check_swap -a 20% 10%
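
To see how the values after -a line up with the $ARG1$, $ARG2$, ... placeholders in nrpe.cfg, the substitution can be imitated in a few lines of shell. This is only an illustrative sketch of positional macro substitution, not NRPE's actual implementation, and the function name expand_args is hypothetical.

```shell
#!/bin/sh
# Substitute $ARG1$, $ARG2$, ... in a command template with the given
# arguments, in order. Simplified illustration of what happens to the
# values passed with "check_nrpe ... -a".
expand_args() {
    template=$1
    shift
    i=1
    for arg in "$@"; do
        # Escape characters that are special in a sed replacement string.
        escaped=$(printf '%s' "$arg" | sed 's/[\\&|]/\\&/g')
        template=$(printf '%s' "$template" | sed "s|\\\$ARG$i\\\$|$escaped|g")
        i=$((i + 1))
    done
    printf '%s\n' "$template"
}

# Usage:
#   expand_args 'check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$' '20%' '10%' /
```

Each whitespace-separated value after -a fills one placeholder, which is why check_load receives 15,10,5 as $ARG1$ and 30,25,20 as $ARG2$.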

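7.2 Configure the monitoring master node

7.2.1 cgi.cfg (configuration file that controls CGI access)

(as the nagios user)

vi /usr/local/nagios/etc/cgi.cfg

Change the following settings to grant permissions to the admin user: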

default_user_name=admin
authorized_for_system_information=nagiosadmin,admin
authorized_for_configuration_information=nagiosadmin,admin
authorized_for_system_commands=nagiosadmin,admin
authorized_for_all_services=nagiosadmin,admin
authorized_for_all_hosts=nagiosadmin,admin
authorized_for_all_service_commands=nagiosadmin,admin
authorized_for_all_host_commands=nagiosadmin,admin

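7.2.2 nagios.cfg (main Nagios configuration file)

(as the nagios user)

vi /usr/local/nagios/etc/nagios.cfg

#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg    (comment this line out)
cfg_dir=/usr/local/nagios/etc/servers

The main configuration file now declares the servers directory as the location of the monitoring definitions. This directory does not exist by default and must be created by hand.

Nagios reads every file with a .cfg suffix in the servers directory as a configuration file.

cd /usr/local/nagios/etc
mkdir servers
cd servers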

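7.2.3 Define the monitored host group

Declare a host group and add all three hosts from the environment section to it.

vi /usr/local/nagios/etc/servers/group.cfg

This is a new file with the following content:

define hostgroup{
    hostgroup_name duangr-server
    alias duangrServer
    members duangr-1,duangr-2,duangr-3
}

Explanation of the settings above:

  • hostgroup_name: name of the host group; any name may be chosen
  • alias: alias of the host group; any value may be chosen
  • members: members of the host group, with multiple host names separated by commas. Each host name must match the host_name of a define host block.

Host definitions are covered below.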
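
Because every entry in members must match a host_name from a define host block, a quick cross-check before starting Nagios can save a round trip. The sketch below is one possible way to do that with awk; the function name undefined_members is hypothetical, and it only understands the simple one-directive-per-line layout used in this article. The full validation is still the nagios -v check in 7.2.4.4.

```shell
#!/bin/sh
# List hostgroup members that have no matching host_name in any
# "define host" block of the given .cfg files.
# Prints nothing when every member resolves.
undefined_members() {
    awk '
        /^[ \t]*define[ \t]+host[ \t]*[{]/      { in_host = 1 }
        /^[ \t]*define[ \t]+hostgroup[ \t]*[{]/ { in_group = 1 }
        /[}]/ { in_host = 0; in_group = 0 }
        in_host  && $1 == "host_name" { defined[$2] = 1 }
        in_group && $1 == "members" {
            line = $0
            sub(/^[ \t]*members[ \t]+/, "", line)
            sub(/[ \t]+$/, "", line)
            n = split(line, names, /[ \t]*,[ \t]*/)
            for (i = 1; i <= n; i++) wanted[names[i]] = 1
        }
        END { for (m in wanted) if (!(m in defined)) print m }
    ' "$@" | sort
}

# Usage on the master node:
#   undefined_members /usr/local/nagios/etc/servers/*.cfg
```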

7.2.4 Define the monitored hosts

Now define the individual hosts.

7.2.4.1 Local host monitoring configuration

First define the local host, duangr-1.

vi /usr/local/nagios/etc/servers/duangr-1.cfg

This is a new file with the following content:

define host{
    use linux-server
    host_name duangr-1
    alias duangr-1
    address 192.168.56.10
}

define service{
    use local-service
    host_name duangr-1
    service_description HostAlive
    check_command check-host-alive
}
define service{
    use local-service
    host_name duangr-1
    service_description Users
    check_command check_local_users!20!50
}
define service{
    use local-service
    host_name duangr-1
    service_description CPU
    check_command check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
}
define service{
    use local-service
    host_name duangr-1
    service_description DiskRoot
    check_command check_local_disk!20%!10%!/
}
define service{
    use local-service
    host_name duangr-1
    service_description DiskHome
    check_command check_local_disk!20%!10%!/export/home
}
define service{
    use local-service
    host_name duangr-1
    service_description ZombieProcs
    check_command check_local_procs!5!10!Z
}
define service{
    use local-service
    host_name duangr-1
    service_description TotalProcs
    check_command check_local_procs!250!400!RSZDT
}
define service{
    use local-service
    host_name duangr-1
    service_description SwapUsage
    check_command check_local_swap!20!10
}

Note: because this host is also where the monitoring master runs, the check_local_* commands can be used to monitor it.

This file already includes the commonly used checks.

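7.2.4.2 Remote host monitoring configuration

Next define the remote hosts duangr-2 and duangr-3.

Before defining checks for remote hosts, the check_nrpe command must be defined:

vi /usr/local/nagios/etc/objects/commands.cfg

Add the following at the end of the file:

# 'check_nrpe' command definition
define command{
    command_name check_nrpe
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$
}
define command{
    command_name check_nrpe_args
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$ -a $ARG2$
}

Define the monitoring configuration for host duangr-2:

$ vi /usr/local/nagios/etc/servers/duangr-2.cfg

This is a new file with the following content: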

define host{
    use linux-server
    host_name duangr-2
    alias duangr-2
    address 192.168.56.11
}
define service{
    use local-service
    host_name duangr-2
    service_description HostAlive
    check_command check-host-alive
}
define service{
    use local-service
    host_name duangr-2
    service_description Users
    check_command check_nrpe_args!check_users!5 10
}
define service{
    use local-service
    host_name duangr-2
    service_description CPU
    check_command check_nrpe_args!check_load!15,10,5 30,25,20
}
define service{
    use local-service
    host_name duangr-2
    service_description DiskRoot
    check_command check_nrpe_args!check_disk!20% 10% /
}
define service{
    use local-service
    host_name duangr-2
    service_description Disk/export/home
    check_command check_nrpe_args!check_disk!20% 10% /export/home
}
define service{
    use local-service
    host_name duangr-2
    service_description ProcsZombie
    check_command check_nrpe_args!check_procs!5 10 Z
}

define service{
    use local-service
    host_name duangr-2
    service_description ProcsTotal
    check_command check_nrpe_args!check_procs_args!"-w 400 -c 600"
}
define service{
    use local-service
    host_name duangr-2
    service_description SwapUsage
    check_command check_nrpe_args!check_swap!20% 10%
}

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Checks for some common processes, mainly those of the cloud platform
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Monitor the crond process
define service{
    use local-service
    host_name duangr-2
    service_description PS:crond
    check_command check_nrpe_args!check_procs_args!"-c 1:1 -C crond"
}
;; Monitor the zookeeper process
define service{
    use local-service
    host_name duangr-2
    service_description PS:QuorumPeerMain
    check_command check_nrpe_args!check_procs_args!"-c 1:1 -C java -a server.quorum.QuorumPeerMain"
}
;; Monitor the storm worker-node (supervisor) process
define service{
    use local-service
    host_name duangr-2
    service_description PS:supervisor
    check_command check_nrpe_args!check_procs_args!"-c 1:1 -C java -a daemon.supervisor"
}
;; Monitor the storm master-node (nimbus) process
define service{
    use local-service
    host_name duangr-2
    service_description PS:nimbus
    check_command check_nrpe_args!check_procs_args!"-c 1:1 -C java -a daemon.nimbus"
}
;; Monitor the MetaQ process
define service{
    use local-service
    host_name duangr-2
    service_description PS:MetaQ
    check_command check_nrpe_args!check_procs_args!"-c 1:1 -C java -a metamorphosis-server-w"
}
;; Monitor the Redis process
define service{
    use local-service
    host_name duangr-2
    service_description PS:redis-server
    check_command check_nrpe_args!check_procs_args!"-c 1:1 -C redis-server"
}
;; Monitor the hadoop master-node NameNode process
define service{
    use local-service
    host_name duangr-2
    service_description PS:NameNode
    check_command check_nrpe_args!check_procs_args!"-c 1:1 -C java -a server.namenode.NameNode"
}
;; Monitor the hadoop master-node SecondaryNameNode process
define service{
    use local-service
    host_name duangr-2
    service_description PS:SecondaryNameNode
    check_command check_nrpe_args!check_procs_args!"-c 1:1 -C java -a server.namenode.SecondaryNameNode"
}
;; Monitor the hadoop master-node ResourceManager process
define service{
    use local-service
    host_name duangr-2
    service_description PS:ResourceManager
    check_command check_nrpe_args!check_procs_args!"-c 1:1 -C java -a server.resourcemanager.ResourceManager"
}
;; Monitor the hadoop worker-node DataNode process
define service{
    use local-service
    host_name duangr-2
    service_description PS:DataNode
    check_command check_nrpe_args!check_procs_args!"-c 1:1 -C java -a server.datanode.DataNode"
}
;; Monitor the hadoop worker-node NodeManager process
define service{
    use local-service
    host_name duangr-2
    service_description PS:NodeManager
    check_command check_nrpe_args!check_procs_args!"-c 1:1 -C java -a server.nodemanager.NodeManager"
}

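Note: because duangr-2 is a remote host, the check_nrpe_args command is used to monitor it.

This file already includes the commonly used checks, plus process checks for hadoop, storm, zookeeper, metaq and redis. The basic approach for those is to test whether the process exists.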
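
The "-c 1:1" threshold used in the process checks means the result is critical unless exactly one matching process is found. The sketch below imitates that rule in plain shell so the logic can be tried without the plugin. The function names procs_status and count_procs are hypothetical, the messages are simplified rather than the plugin's real output, and count_procs assumes the procps version of ps found on CentOS.

```shell
#!/bin/sh
# Return a Nagios-style status for a process count checked against a
# min:max range: exit 0 (OK) inside the range, exit 2 (CRITICAL) outside it.
procs_status() {
    count=$1 min=$2 max=$3
    if [ "$count" -ge "$min" ] && [ "$count" -le "$max" ]; then
        echo "PROCS OK: $count processes"
        return 0
    fi
    echo "PROCS CRITICAL: $count processes"
    return 2
}

# Count processes whose command name is exactly $1 (procps ps).
count_procs() {
    ps -C "$1" -o pid= | wc -l
}

# Usage, equivalent in spirit to "check_procs -c 1:1 -C crond":
#   procs_status "$(count_procs crond)" 1 1
```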

Define the monitoring configuration for host duangr-3:

vi duangr-3.cfg

The content is similar to duangr-2.cfg; only host_name, alias and address need to change.
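
Rather than editing a copy by hand, the new file can be derived from duangr-2.cfg with sed. This is a minimal sketch with a hypothetical function name, clone_host_cfg. It replaces every occurrence of the old host name, which covers both host_name and alias here because the two are identical in this article's files.

```shell
#!/bin/sh
# Write a copy of a host config to stdout with the host name and address replaced.
# Arguments: source file, old host name, old address, new host name, new address.
clone_host_cfg() {
    src=$1 old_name=$2 old_addr=$3 new_name=$4 new_addr=$5
    # Escape the dots in the old address so sed matches them literally.
    old_addr_re=$(printf '%s' "$old_addr" | sed 's/\./\\./g')
    sed -e "s/$old_name/$new_name/g" -e "s/$old_addr_re/$new_addr/g" "$src"
}

# Usage, run inside /usr/local/nagios/etc/servers:
#   clone_host_cfg duangr-2.cfg duangr-2 192.168.56.11 duangr-3 192.168.56.12 > duangr-3.cfg
```

Process checks that do not apply to duangr-3 still need to be removed from the generated file by hand.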

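7.2.4.3 Email notification

Define the email address of the contact who receives alerts:

vi /usr/local/nagios/etc/objects/contacts.cfg

define contact{
    contact_name nagiosadmin ; Short name of user
    use generic-contact ; Inherit default values from generic-contact template (defined above)
    alias Nagios Admin ; Full name of user
    email yourname@domain.com ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
}

Besides configuring the recipient of alert emails, also make sure that:

  • this host can reach the mail server
  • SendMail on this host can send mail through an external SMTP service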
7.2.4.4 Verify the configuration

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

7.2.4.5 Start

/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg

nagios has already been registered as a service, so it can also be controlled with:

service nagios start|stop|restart|status
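
After later configuration changes it is convenient to combine the verification and the restart, so that a broken configuration never takes the running service down. A minimal sketch follows; the function name safe_restart and the two variables are assumptions made for this example, and it relies on nagios -v exiting with a non-zero status when the check finds errors.

```shell
#!/bin/sh
# Paths from this article; set the variables to override them.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Run the configuration check and restart the service only when it succeeds.
safe_restart() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" > /dev/null; then
        service nagios restart
    else
        echo "configuration check failed; nagios not restarted" >&2
        return 1
    fi
}

# Usage (as root on the master node):
#   safe_restart
```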


8. Monitoring Page

http://192.168.56.10/nagios
