- Ansible/Nagios install/configure/deploy
- Purpose: Quick how-to that combines steps from several other sources (see References).
- Environment:
- - NB master server: nb801d2c4u.lab.krt / 192.168.1.28, on RHEL7.x; will be our Ansible server and Nagios monitoring server.
- - NB media server: nb801d2cmed / 192.168.1.26, on RHEL7.x; will be a remote server to be monitored.
- - NB clients: nbclient1 / 192.168.1.33, on RHEL7.x; nbclient2 / 192.168.1.36, on Debian; will be remote servers to be monitored.
- Apps:
- - Ansible. The config stuff is all in /etc/ansible after install.
- - Nagios:
- - /etc/nagios is config stuff for RHEL
- - /usr/lib64/nagios is where all the executables and plugins live.
- From the server that will run 'Nagios server' (from which we will monitor remote servers):
- 1. Install Ansible:
- yum --nogpgcheck install ansible
- 2. Configure passwordless SSH to remote hosts to be monitored:
- # cat id_rsa.pub | ssh root@192.168.1.26 'cat >> .ssh/authorized_keys'
- ....repeat for all to-be-monitored remote hosts
- ....or we could have used:
- # ssh-copy-id root@192.168.1.26
- # ssh-copy-id root@192.168.1.27
- # ssh-copy-id root@192.168.1.29
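- With more than a few hosts, the ssh-copy-id lines above are easier to manage as a loop. The following is a minimal sketch, not part of the original procedure: the function name push_keys and the DRY_RUN switch are made up for illustration, and the host list simply repeats the three addresses used above. ssh-copy-id assumes a key pair already exists on the Nagios server (ssh-keygen creates one).

```shell
#!/bin/sh
# Hosts that should accept the Nagios server's SSH key. These are the three
# addresses from the ssh-copy-id lines above; replace them with your own.
HOSTS="192.168.1.26 192.168.1.27 192.168.1.29"

# push_keys: print the ssh-copy-id command for each host, or run it when
# DRY_RUN=0. DRY_RUN is a hypothetical switch used only in this sketch.
push_keys() {
    for host in $HOSTS; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh-copy-id root@$host"
        else
            ssh-copy-id "root@$host"
        fi
    done
}

# Default is a dry run, so this only prints the commands.
push_keys
```

- Review the printed commands first, then set DRY_RUN=0 to actually copy the key; each host still prompts for the root password once.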
- 3. Install Nagios, Nagios Remote Plugin Executor, Nagios plugins, Nagios nrpe plugin, httpd (webserver) and PHP:
- # yum --nogpgcheck --enablerepo=epel -y install nagios nrpe nagios-plugins.x86_64 nagios-plugins-all.x86_64 nagios-plugins-nrpe httpd php
- 4. On the local Nagios server, edit /etc/nagios/nrpe.cfg: add the local IP to 'allowed_hosts=' and define the check commands you will use (see the example nrpe.cfg contents further below).
- # vi /etc/nagios/nrpe.cfg
- 5. Test local disk monitor command:
- # /usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /
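- The -w 20% -c 10% options mean "warn when less than 20% is free, go critical when less than 10% is free", and Nagios reads the result from the plugin's exit code (0 OK, 1 WARNING, 2 CRITICAL). The sketch below only imitates that decision with whole-number percentages so the mapping is easy to see; disk_status is a hypothetical helper, not something shipped with the plugins.

```shell
#!/bin/sh
# disk_status FREE_PCT WARN_PCT CRIT_PCT
# Imitates how check_disk -w 20% -c 10% classifies free space. Prints the state
# and returns the standard Nagios plugin exit code (0 OK, 1 WARNING, 2 CRITICAL).
# Integer percentages only; the real plugin measures the filesystem itself.
disk_status() {
    free=$1 warn=$2 crit=$3
    if [ "$free" -lt "$crit" ]; then
        echo "CRITICAL"; return 2
    elif [ "$free" -lt "$warn" ]; then
        echo "WARNING"; return 1
    fi
    echo "OK"; return 0
}

disk_status 35 20 10 || echo "exit code: $?"   # plenty of free space
disk_status 15 20 10 || echo "exit code: $?"   # below the warning threshold
disk_status 5 20 10  || echo "exit code: $?"   # below the critical threshold
```

- After running the real check_disk command, echo $? shows the same exit code that Nagios would act on.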
- 6. Enable httpd, set the password for nagiosadmin user and start httpd and nagios:
- # systemctl enable httpd
- # htpasswd -c /etc/nagios/passwd nagiosadmin
- # systemctl start httpd
- # systemctl start nagios
- ...test login: http://192.168.1.28/nagios/
- At this point our Nagios server is only monitoring itself. Next we'll use Ansible to push install Nagios Remote Plugin Executor (NRPE) and the nrpe.cfg config file to the remote Nagios clients to be monitored.
- 1. On the Ansible/Nagios server, populate the Ansible 'hosts' inventory file with the list of remote hosts. In this example nb801d2c4u.lab.krt is our local Nagios server, the RHEL7 media servers are in a group called 'nbrhelsvrs', the RHEL7 client is in 'nbrhelclnts', and the Debian system is in the 'nbdebianhosts' group:
- -
- /etc/ansible/hosts contents:
- all:
-   hosts:
-     nb801d2c4u.lab.krt:
-   children:
-     nbrhelsvrs:
-       hosts:
-         nb801d2cmed.lab.krt:
-         nb811d2cmed.lab.krt:
-         nb812d2cmed.lab.krt:
-         nb812ad2cmed.lab.krt:
-         nb812bd2cmed.lab.krt:
-         nb812cd2cmed.lab.krt:
-     nbrhelclnts:
-       hosts:
-         nbclient1.lab.krt:
-     nbdebianhosts:
-       hosts:
-         nbclient2.lab.krt:
- Note: add more clients to your own 'hosts' configuration as needed.
- 2. Test Ansible 'hosts' file and functionality/connectivity:
- # ansible all -m ping
- # ansible nbrhelsvrs -m ping
- 3. Next we create an Ansible playbook YAML file named nrpe-deploy.yaml to install NRPE and the Nagios plugins. We'll 'accidentally' leave out the nrpe.cfg configuration file to demonstrate that we can update the same playbook later and re-run it from the Ansible server: tasks that already succeeded report 'ok' and change nothing, while new tasks (and tasks that previously failed on certain remote hosts) are applied:
- a. Create the nrpe-deploy.yaml on the Ansible server, nrpe-deploy.yaml contents:
- ---
- - hosts: nbrhelsvrs
-   remote_user: root
-   tasks:
-     - name: install epel
-       yum:
-         name: epel-release
-         state: latest
-     - name: install nrpe
-       yum:
-         name: nrpe
-         state: latest
-     - name: install nagios plugins
-       yum:
-         name: nagios-plugins-all
-         state: latest
- ....example execution:
- [root@nb801d2c4u ansible]# ansible-playbook nrpe-deploy.yaml
- PLAY [nbrhelsvrs] *******************************************************************************************************
- TASK [Gathering Facts] ***************************************************************************************************
- ok: [nb812cd2cmed.lab.krt]
- ok: [nb812bd2cmed.lab.krt]
- ok: [nb812ad2cmed.lab.krt]
- ok: [nb811d2cmed.lab.krt]
- ok: [nb801d2cmed.lab.krt]
- TASK [install epel] ******************************************************************************************************
- fatal: [nb812cd2cmed.lab.krt]: FAILED! => {"changed": false, "msg": "No package matching 'epel-release' found available, installed or updated", "rc": 126, "results": ["No package matching 'epel-release' found available, installed or updated"]}
- changed: [nb801d2cmed.lab.krt]
- TASK [install nrpe] ******************************************************************************************************
- changed: [nb812bd2cmed.lab.krt]
- changed: [nb812ad2cmed.lab.krt]
- changed: [nb811d2cmed.lab.krt]
- changed: [nb801d2cmed.lab.krt]
- changed: [nb801d2cmed.lab.krt]
- TASK [install nagios plugins] ********************************************************************************************
- changed: [nb812bd2cmed.lab.krt]
- changed: [nb812ad2cmed.lab.krt]
- changed: [nb811d2cmed.lab.krt]
- changed: [nb801d2cmed.lab.krt]
- changed: [nb801d2cmed.lab.krt]
- PLAY RECAP ***************************************************************************************************************
- nb801d2cmed.lab.krt : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
- nb801d2cmed.lab.krt : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
- nb811d2cmed.lab.krt : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
- nb812ad2cmed.lab.krt : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
- nb812cd2cmed.lab.krt : ok=1 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
- [root@nb801d2c4u ansible]#
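- The PLAY RECAP block is the quickest place to see which hosts need attention before a re-run. As a rough sketch (failed_hosts is a hypothetical helper, and it assumes the recap format shown in the run above), the failed or unreachable hosts can be pulled out with awk:

```shell
#!/bin/sh
# failed_hosts: read ansible-playbook output on stdin and print every host
# whose PLAY RECAP line reports a failed or unreachable count above zero.
failed_hosts() {
    awk '
        /^PLAY RECAP/ { in_recap = 1; next }
        in_recap && /failed=[1-9]|unreachable=[1-9]/ { print $1 }
    '
}

# Two recap lines copied from the run above.
failed_hosts <<'EOF'
PLAY RECAP ***************************************************************
nb801d2cmed.lab.krt : ok=4 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
nb812cd2cmed.lab.krt : ok=1 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
EOF
```

- Once the problem on a listed host is fixed, ansible-playbook nrpe-deploy.yaml --limit <hostname> re-runs the playbook against that host only.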
- Now, on the Ansible server, we will create an nrpe.cfg file for the remote Nagios clients, update the Ansible server-side nrpe-deploy.yaml file, and re-deploy:
- 1. First create the file nrpe.cfg on the Ansible server with these contents:
- -------->start nrpe.cfg file contents<----------
- # bind to all interfaces
- server_address=0.0.0.0
- # allow access by localhost and the Nagios server:
- allowed_hosts=127.0.0.1,192.168.1.28
- # allow command args
- dont_blame_nrpe=1
- # example monitor commands
- command[check_users]=/usr/lib64/nagios/plugins/check_users -w 5 -c 10
- command[check_load]=/usr/lib64/nagios/plugins/check_load -r -w .15,.10,.05 -c .30,.25,.20
- #command[check_hda1]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1
- command[check_root]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/mapper/rhel-root
- command[check_nbvol]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/nbu/nbvol
- command[check_advdsk]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/mapper/nbu-advdsk
- command[check_msdp]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/mapper/nbu-msdpcache
- command[check_zombie_procs]=/usr/lib64/nagios/plugins/check_procs -w 5 -c 10 -s Z
- command[check_total_procs]=/usr/lib64/nagios/plugins/check_procs -w 150 -c 200
- -------->end nrpe.cfg file contents<----------
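- Every name inside command[...] is what the Nagios server later passes to check_nrpe, so it helps to list the names actually defined before writing the service definitions. A small sketch, assuming the file layout shown above (nrpe_commands is a hypothetical helper, and the temporary file stands in for the real nrpe.cfg):

```shell
#!/bin/sh
# nrpe_commands FILE: print the name of every active command[...] entry.
# Commented-out entries (lines starting with #) are skipped.
nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Demo on a temporary copy of three lines from the nrpe.cfg above.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
#command[check_hda1]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1
command[check_root]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /dev/mapper/rhel-root
command[check_total_procs]=/usr/lib64/nagios/plugins/check_procs -w 150 -c 200
EOF
nrpe_commands "$tmp"
rm -f "$tmp"
```

- Run it against the real file with nrpe_commands nrpe.cfg and compare the names with the check_nrpe!<name> entries used on the Nagios server.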
- 2. Next, append the following lines to the nrpe-deploy.yaml file (indented to line up with the existing tasks):
-     - name: deploy nrpe.cfg
-       copy:
-         src: nrpe.cfg
-         dest: /etc/nagios/nrpe.cfg
-         force: yes
-         backup: yes
-       register: deploy_nrpe
- 3. From the Ansible server, redeploy. Before this run I fixed a broken yum installation on nb812cd2cmed.lab.krt, so the re-execution completed the NRPE and Nagios plugin installs on that host and then deployed the nrpe.cfg file:
- [root@nb801d2c4u ansible]# ansible-playbook nrpe-deploy.yaml
- PLAY [nbrhelhosts] *******************************************************************************************************
- TASK [Gathering Facts] ***************************************************************************************************
- ok: [nbclient1.lab.krt]
- ok: [nb812bd2cmed.lab.krt]
- ok: [nb812ad2cmed.lab.krt]
- ok: [nb811d2cmed.lab.krt]
- ok: [nb801d2cmed.lab.krt]
- TASK [install epel] ******************************************************************************************************
- ok: [nb801d2cmed.lab.krt]
- ok: [nb812bd2cmed.lab.krt]
- ok: [nb812ad2cmed.lab.krt]
- ok: [nb811d2cmed.lab.krt]
- changed: [nb812cd2cmed.lab.krt]
- TASK [install nrpe] ******************************************************************************************************
- ok: [nb801d2cmed.lab.krt]
- ok: [nb812bd2cmed.lab.krt]
- ok: [nb812ad2cmed.lab.krt]
- ok: [nb811d2cmed.lab.krt]
- changed: [nb812cd2cmed.lab.krt]
- TASK [install nagios plugins] ********************************************************************************************
- ok: [nb801d2cmed.lab.krt]
- ok: [nb812bd2cmed.lab.krt]
- ok: [nb812ad2cmed.lab.krt]
- ok: [nb811d2cmed.lab.krt]
- changed: [nb812cd2cmed.lab.krt]
- TASK [deploy nrpe.cfg] ***************************************************************************************************
- changed: [nb812cd2cmed.lab.krt]
- changed: [nb812bd2cmed.lab.krt]
- changed: [nb812ad2cmed.lab.krt]
- changed: [nb811d2cmed.lab.krt]
- changed: [nb801d2cmed.lab.krt]
- changed: [nb801d2cmed.lab.krt]
- TASK [start/restart and enable nrpe] *************************************************************************************
- changed: [nb812cd2cmed.lab.krt]
- changed: [nb812bd2cmed.lab.krt]
- changed: [nb812ad2cmed.lab.krt]
- changed: [nb811d2cmed.lab.krt]
- changed: [nb801d2cmed.lab.krt]
- changed: [nb801d2cmed.lab.krt]
- PLAY RECAP ***************************************************************************************************************
- nb801d2cmed.lab.krt : ok=6 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
- nb801d2cmed.lab.krt : ok=6 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
- nb801d2cmed.lab.krt : ok=6 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
- nb811d2cmed.lab.krt : ok=6 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
- nb812ad2cmed.lab.krt : ok=6 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
- nb812cd2cmed.lab.krt : ok=6 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
- nb812cd2cmed.lab.krt : ok=6 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
- [root@nb801d2c4u ansible]#
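- The run above lists a 'start/restart and enable nrpe' task that the playbook listing never shows, so its exact definition is unknown. Assuming it simply restarts and enables the NRPE service, the same effect can be had with an ad-hoc Ansible command. This is a sketch only: restart_nrpe_cmd is a hypothetical helper, and the service name "nrpe" is an assumption based on the package name.

```shell
#!/bin/sh
# restart_nrpe_cmd GROUP: print an ad-hoc Ansible command that restarts and
# enables NRPE on every host in GROUP. The service name "nrpe" is an
# assumption (it matches the package installed by the playbook).
restart_nrpe_cmd() {
    echo "ansible $1 -m service -a 'name=nrpe state=restarted enabled=yes'"
}

# Print the command for review; pipe the output to sh to actually run it.
restart_nrpe_cmd nbrhelsvrs
```

- NRPE only reads nrpe.cfg at startup, so some form of restart is needed after the copy task changes the file.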
- 4. On the Nagios server, append the following to the bottom of /etc/nagios/objects/commands.cfg:
- # 'check_nrpe' command definition
- define command{
- command_name check_nrpe
- command_line /usr/lib64/nagios/plugins/check_nrpe -H $HOSTADDRESS$ -t 30 -c $ARG1$
- }
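- In this definition $HOSTADDRESS$ is filled from the host's 'address' line, and $ARG1$ is whatever follows the '!' in a service's check_command. The sketch below prints the command line Nagios would end up running for a single-argument check; expand_check_nrpe is a hypothetical helper written for illustration, and running the printed command by hand is a quick way to test NRPE connectivity.

```shell
#!/bin/sh
# expand_check_nrpe HOSTADDRESS CHECK_COMMAND
# Prints the command line built from the check_nrpe definition: the text
# after "!" in check_command becomes $ARG1$, and the host's "address"
# directive becomes $HOSTADDRESS$. Handles one argument only.
expand_check_nrpe() {
    hostaddress=$1
    arg1=${2#*!}          # strip everything up to and including the first "!"
    echo "/usr/lib64/nagios/plugins/check_nrpe -H $hostaddress -t 30 -c $arg1"
}

expand_check_nrpe 192.168.1.26 'check_nrpe!check_root'
```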
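- 5. On the Nagios server, create a <hostname>.cfg file inside /etc/nagios/servers/ for each host to be monitored. Create the directory if it does not exist, and make sure a cfg_dir=/etc/nagios/servers line is present and uncommented in /etc/nagios/nagios.cfg, otherwise Nagios will not read these files. In this example we created one for one of the remote NB media servers:
- # vi /etc/nagios/servers/nb801d2cmed.cfg
- .....add the following content: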
- ###############################################################################
- #
- # HOST DEFINITION
- #
- ###############################################################################
- # Define a host for the remote machine to be monitored
- define host {
- use linux-server ; Name of host template to use
- ; This host definition will inherit all variables that are defined
- ; in (or inherited by) the linux-server host template definition.
- host_name nb801d2cmed
- alias nb801d2cmed
- address 192.168.1.26
- register 1
- }
- ###############################################################################
- #
- # SERVICE DEFINITIONS
- #
- ###############################################################################
- # Define a service to "ping"
- define service {
- use generic-service ; Name of service template to use
- host_name nb801d2cmed
- service_description PING
- check_command check_ping!100.0,20%!500.0,60%
- }
- # Define a service to check the disk space of the root partition
- # on the remote host via NRPE. Warning if < 20% free, critical if
- # < 10% free space on partition (thresholds are set in the remote nrpe.cfg).
- define service {
- use generic-service ; Name of service template to use
- host_name nb801d2cmed
- service_description Root Partition
- check_command check_nrpe!check_root
- }
- define service {
- use generic-service ; Name of service template to use
- host_name nb801d2cmed
- service_description NB Partition
- check_command check_nrpe!check_nbvol
- }
- define service {
- use generic-service ; Name of service template to use
- host_name nb801d2cmed
- service_description Advanced Disk Partition
- check_command check_nrpe!check_advdsk
- }
- define service {
- use generic-service ; Name of service template to use
- host_name nb801d2cmed
- service_description MSDP
- check_command check_nrpe!check_msdp
- }
- # Define a service to check the number of currently logged in
- # users. Warning if > 20 users, critical if > 50 users.
- # Note: check_local_users runs on the Nagios server itself; use check_nrpe!check_users to measure the remote host.
- define service {
- use generic-service ; Name of service template to use
- host_name nb801d2cmed
- service_description Current Users
- check_command check_local_users!20!50
- }
- # Define a service to check the number of currently running procs.
- # Warning if > 250 processes, critical if > 400 processes.
- # Note: check_local_procs runs on the Nagios server itself; use check_nrpe!check_total_procs to measure the remote host.
- define service {
- use generic-service ; Name of service template to use
- host_name nb801d2cmed
- service_description Total Processes
- check_command check_local_procs!250!400!RSZDT
- }
- # Define a service to check the load.
- # Note: check_local_load runs on the Nagios server itself; use check_nrpe!check_load to measure the remote host.
- define service {
- use generic-service ; Name of service template to use
- host_name nb801d2cmed
- service_description Current Load
- check_command check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
- }
- # Define a service to check the swap usage. Note: check_local_swap runs on the Nagios server itself.
- # Critical if less than 10% of swap is free, warning if less than 20% is free
- define service {
- use generic-service ; Name of service template to use
- host_name nb801d2cmed
- service_description Swap Usage
- check_command check_local_swap!20%!10%
- }
- # Define a service to check SSH on the monitored host.
- # Disable notifications for this service by default, as not all users may have SSH enabled.
- define service {
- use generic-service ; Name of service template to use
- host_name nb801d2cmed
- service_description SSH
- check_command check_ssh
- notifications_enabled 0
- }
- .....save the /etc/nagios/servers/nb801d2cmed.cfg file and the others you create.
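- Writing one file per host by hand gets repetitive. As a rough sketch, the file can be generated from a host name and address; make_host_cfg is a hypothetical helper, and it emits only the host definition, the PING service and one NRPE check, following the layout of nb801d2cmed.cfg above. The example values for nbclient1 come from the environment list at the top.

```shell
#!/bin/sh
# make_host_cfg NAME ADDRESS: print a minimal Nagios host definition plus a
# PING service and one NRPE disk check, following the nb801d2cmed.cfg layout.
make_host_cfg() {
    name=$1 address=$2
    cat <<EOF
define host {
    use        linux-server
    host_name  $name
    alias      $name
    address    $address
}
define service {
    use                  generic-service
    host_name            $name
    service_description  PING
    check_command        check_ping!100.0,20%!500.0,60%
}
define service {
    use                  generic-service
    host_name            $name
    service_description  Root Partition
    check_command        check_nrpe!check_root
}
EOF
}

# Example: print a definition for nbclient1; redirect the output to
# /etc/nagios/servers/nbclient1.cfg to install it.
make_host_cfg nbclient1 192.168.1.33
```

- The check_root command only works if the remote nrpe.cfg defines it with a device path that exists on that host, so adjust the NRPE checks per host type.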
- 6. Make the entire servers directory and its contents owned by root:nagios:
- # chown -R root:nagios /etc/nagios/servers
- 7. Confirm the Nagios config:
- # /usr/sbin/nagios -v /etc/nagios/nagios.cfg
- Nagios Core 4.4.3
- Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
- Copyright (c) 1999-2009 Ethan Galstad
- Last Modified: 2019-01-15
- License: GPL
- Website: https://www.nagios.org
- Reading configuration data...
- Read main config file okay...
- Read object config files okay...
- Running pre-flight check on configuration data...
- Checking objects...
- Checked 8 services.
- Checked 1 hosts.
- Checked 1 host groups.
- Checked 0 service groups.
- Checked 1 contacts.
- Checked 1 contact groups.
- Checked 24 commands.
- Checked 5 time periods.
- Checked 0 host escalations.
- Checked 0 service escalations.
- Checking for circular paths...
- Checked 1 hosts
- Checked 0 service dependencies
- Checked 0 host dependencies
- Checked 5 timeperiods
- Checking global event handlers...
- Checking obsessive compulsive processor commands...
- Checking misc settings...
- Total Warnings: 0
- Total Errors: 0
- Things look okay - No serious problems were detected during the pre-flight check
- [root@nb801d2c4u ~]#
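- The two 'Total' lines at the end of the pre-flight output are what matter when scripting this step. A minimal sketch, assuming the output format shown above (preflight_ok is a hypothetical helper), that succeeds only when no errors are reported:

```shell
#!/bin/sh
# preflight_ok: read `nagios -v` output on stdin; succeed only when the
# summary reports zero errors. Warnings are printed but do not fail the check.
# Exit codes: 0 no errors, 1 errors reported, 2 no summary line found.
preflight_ok() {
    awk '
        /^Total Warnings:/ { warnings = $3 }
        /^Total Errors:/   { errors = $3; seen = 1 }
        END {
            if (!seen) { print "no summary found"; exit 2 }
            print "warnings=" warnings " errors=" errors
            exit (errors == 0 ? 0 : 1)
        }
    '
}

# Demo with the two summary lines from the output above.
printf 'Total Warnings: 0\nTotal Errors:   0\n' | preflight_ok && echo "safe to restart nagios"
```

- In practice this would be used as /usr/sbin/nagios -v /etc/nagios/nagios.cfg | preflight_ok && systemctl restart nagios. If the 'Checked ... hosts' count does not include the newly defined hosts, check that /etc/nagios/nagios.cfg has a cfg_dir line pointing at /etc/nagios/servers.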
- Summary:
- - We installed Ansible and Nagios on a central command server.
- - We remotely push-installed NRPE and the Nagios plugins, plus the NRPE configuration (nrpe.cfg), to the remote hosts to be monitored.
- - We can add new commands to nrpe.cfg, or new machines to /etc/ansible/hosts, and then run ansible-playbook nrpe-deploy.yaml again to update all remote machines or set up new ones.
- References:
- https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html
- - We actually installed from a yum repository, but this next link (a from-source install guide) has some good info:
- https://support.nagios.com/kb/article/nagios-core-installing-nagios-core-from-source-96.html#RHEL
- https://www.ansible.com/overview/how-ansible-works
- https://docs.ansible.com/ansible/latest/user_guide/intro_inventory.html
- https://docs.ansible.com/ansible/latest/user_guide/guide_rolling_upgrade.html
- https://docs.ansible.com/ansible/latest/user_guide/playbooks_best_practices.html#directory-layout
- - Excellent HowTo on NRPE install:
- https://www.neteye-blog.com/2018/04/how-to-deploy-nrpe-on-centos-7-with-ansible/
- - Nagios manual (TOC):
- https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/toc.html
- - Nagios QuickStart guide:
- https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/4/en/quickstart.html
- - NRPE install/config guide:
- https://tecadmin.net/install-nrpe-on-centos-rhel/
- - yet another Nagios tutorial w/some good info:
- https://www.edureka.co/blog/nagios-tutorial/
- - Nagios server-side remote host define/config:
- https://www.scaleway.com/en/docs/deploy-nagios-on-scaleway/