This howto explains how to check for OS updates on CentOS, Red Hat, Fedora, Debian, Ubuntu and OpenSUSE systems using Checkmk 1.6.0 / 2.0 / 2.1 / 2.2 Raw Edition. You will also learn how to set a custom interval for update checks so that you are not hammering the update servers (the WATO way doesn't work). This makes the update checks asynchronous and massively decreases resource usage. Download my patched linux-updates plugin for RHEL/CentOS/Fedora, because the YUM plugin has been broken since Checkmk 1.6.0.
This is only for advanced users. Depending on the number of servers, this is going to be quite a lot of work. I leave the task of provisioning this to the reader. One possible solution is to consider Checkmk Enterprise Edition.
The following cleanup is only relevant if you have previously installed update check plugins.
If you are updating from an earlier Checkmk version such as 1.5, you'll realize that all existing YUM/DNF plugins no longer work. If you have previously used the APT Update Check plugin, then it also needs to be removed: not only would it error out on service discovery, APT has also been integrated since Checkmk 1.6.0. Just to be on the safe side, the zypper plugin should also be removed and replaced with the latest version. If you are updating to 2.0.0, then remove the zypper plugin completely, since it is now integrated into Checkmk as well.
    su - <sitename>
    mkp remove apt
    mkp remove yum
    mkp remove linux-updates
    mkp remove zypper
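Before removing anything, it can help to see which of these packages are actually installed. The following is a minimal sketch, not part of Checkmk: filter_update_packages is a hypothetical helper name, and it assumes that mkp list prints the package name in the first column of each line (the output layout differs between Checkmk versions).

```shell
# Hypothetical helper: keep only the package names of the old update checks.
# Reads "mkp list" output on stdin; assumes the name is the first field.
filter_update_packages() {
    awk '$1 ~ /^(apt|yum|linux-updates|zypper)$/ { print $1 }'
}

# Usage as site user:
#   mkp list | filter_update_packages
```

Only the names printed here need a matching mkp remove.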
On each client (which is what I will call the monitored hosts from now on) you have to remove the agent plugins (and their caches), as we will replace them with the latest version later on. This step is very important, otherwise you might run into unforeseeable issues such as plugins not being discovered. If you are monitoring the Checkmk server using Checkmk, then this also needs to be done on the server.
    cd /usr/lib/check_mk_agent/plugins
    rm *yum*
    rm *zypper*
    rm *apt*
    rm *linux*updates*
    cd /var/lib/check_mk_agent/cache
    rm -f *yum*.cache
    rm -f *zypper*.cache
    rm -f *linux*updates*.cache
    rm -f *apt*.cache
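If you would rather look before you delete, here is a small dry-run sketch. list_old_update_files is a hypothetical helper name; the two default paths are the ones used in this howto. Unlike the rm commands, it also searches subdirectories of the plugin directory (for example a 7200 directory from an earlier setup), and it only prints what it finds.

```shell
# Hypothetical helper: list (do not delete) old update-check plugins and caches.
# $1 = plugin directory, $2 = cache directory (defaults: paths from this howto)
list_old_update_files() {
    plugin_dir="${1:-/usr/lib/check_mk_agent/plugins}"
    cache_dir="${2:-/var/lib/check_mk_agent/cache}"
    for pattern in '*yum*' '*zypper*' '*apt*' '*linux*updates*'; do
        # plugins, including those in interval subdirectories
        find "$plugin_dir" -type f -name "$pattern" 2>/dev/null
        # matching cache files
        find "$cache_dir" -type f -name "${pattern}.cache" 2>/dev/null
    done | sort -u
}

# Usage as root on the client:
#   list_old_update_files
```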
You are now safe to upgrade to Checkmk 1.6.0 / 2.x. If service discovery bails out with random errors, then you have more modules installed that are now broken. As explained above, find them using mkp list and then remove them.
During my tests I realized that quite a few plugins are broken in Checkmk 1.6.0, and even more in 2.0 and 2.1/2.2. Fortunately, the defunct APT Update Check plugin is no longer needed because APT is now integrated in Checkmk. The YUM Update Check is also defunct, so I am providing a patched agent plugin for the linux-updates check that works with all Red Hat based derivatives. The Zypper plugin for SUSE hosts still works; it is my favorite plugin because the WATO integration is really nice.
So download these to directory /opt on your Checkmk server:
Linux System Updates Check: https://checkmk.de/check_mk-exchange-file.php?&file=linux-updates-1.0.mkp
Zypper Update Check: https://checkmk.de/check_mk-exchange-file.php?&file=zypper-1.2.mkp (already integrated in Checkmk 2.0.0)
Patched System Updates Agent: mk_linuxupdates.sh.gz
On your Checkmk server, issue the following commands:
    cd /opt
    gunzip mk_linuxupdates.sh.gz
    chmod ugo+x mk_linuxupdates.sh
    su - <sitename>
    cd /opt
    mkp install linux-updates-1.0.mkp
    mkp install zypper-1.2.mkp
Leave these files here; you will need mk_linuxupdates.sh later, and you will easily remember which versions are installed. The APT and Zypper plugins can be configured using WATO, so feel free to configure how updates should be alerted and what to do when there are package locks.
This howto assumes that the Checkmk server is reachable via SSH on port 22 and that one update check is run every two hours (7200 seconds). Regardless of the service check interval configured on your Checkmk server, the actual update check then runs only once in 7200 seconds; in between, the output is returned from the cache. Not only does this save a lot of resources, it also tremendously speeds up service discovery and checks, because the actual update check becomes asynchronous. As a side effect, service discovery receives no data the first time updates are being checked. Do another full scan a minute later and you will receive proper output.
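To illustrate the idea behind the 7200 directory, here is a minimal sketch of the age comparison involved. This is not the agent's own code; cache_is_fresh is a hypothetical helper that only shows how a cache file's age relates to the interval.

```shell
# Hypothetical helper: succeed if the cache file exists and is younger
# than max_age seconds, i.e. its content would still be served from cache.
# $1 = cache file, $2 = max_age in seconds
cache_is_fresh() {
    cache_file="$1"
    max_age="$2"
    [ -e "$cache_file" ] || return 1
    now=$(date +%s)
    # GNU stat first, BSD stat as a fallback
    mtime=$(stat -c %Y "$cache_file" 2>/dev/null || stat -f %m "$cache_file")
    [ $((now - mtime)) -lt "$max_age" ]
}

# Usage:
#   cache_is_fresh /var/lib/check_mk_agent/cache/plugins_mk_apt.cache 7200 && echo "served from cache"
```

This also explains why back-dating a cache file, as done further down, forces a new check: a file dated 1975 is always older than the interval.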
On a CentOS/RHEL client, do the following (replace cmk-server with the IP address of your Checkmk server):
    cd /usr/lib/check_mk_agent/plugins/
    mkdir 7200 ; cd 7200
    scp -p root@cmk-server:/opt/mk_linuxupdates.sh .
If updates are performed on a host, the WARN and CRIT state will remain until the plugin cache expires in 7200 seconds. To invalidate the cached data and trigger the check execution immediately, you can run the following commands:
    if [ -e "/var/lib/check_mk_agent/cache/plugins_7200\mk_linuxupdates.sh.cache" ]
    then
        touch -t 197511170500 "/var/lib/check_mk_agent/cache/plugins_7200\mk_linuxupdates.sh.cache"
    fi
    if [ -e "/var/lib/check_mk_agent/cache/plugins_mk_linuxupdates.sh.cache" ]
    then
        touch -t 197511170500 "/var/lib/check_mk_agent/cache/plugins_mk_linuxupdates.sh.cache"
    fi
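Since the same two checks come back for every plugin in this howto, they can be wrapped in one function. This is a sketch under assumptions: invalidate_plugin_cache is a hypothetical helper name, and the backslash between the interval and the plugin name is taken to be part of the cache file name, as in the commands above.

```shell
# Hypothetical helper: back-date the cache of one agent plugin so that the
# next agent run executes the plugin again.
# $1 = plugin file name, $2 = interval directory, $3 = cache directory
invalidate_plugin_cache() {
    plugin="$1"
    interval="${2:-7200}"
    cache_dir="${3:-/var/lib/check_mk_agent/cache}"
    # "\\" inside double quotes yields one literal backslash in the file name
    for cache in "$cache_dir/plugins_${interval}\\${plugin}.cache" \
                 "$cache_dir/plugins_${plugin}.cache"; do
        if [ -e "$cache" ]; then
            touch -t 197511170500 "$cache"
        fi
    done
}

# Usage as root on the client:
#   invalidate_plugin_cache mk_linuxupdates.sh
```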
If you don't want to invalidate the cache manually, please see yum-plugin-post-transaction-actions or write a wrapper that you will execute on the client for manually installing updates. The wrapper would run the actual update command, followed by the command seen above.
Please note that CentOS does not flag any updates as security updates unless they come from the EPEL repository. RHEL flags them correctly. Thanks to the patched linux-updates plugin, you will now see the names of installable updates in the long output, which can be very helpful to determine whether immediate action needs to be taken.
On a Fedora client, do the following (replace cmk-server with the IP address of your Checkmk server):
    cd /usr/lib/check_mk_agent/plugins/
    mkdir 7200 ; cd 7200
    scp -p root@cmk-server:/opt/mk_linuxupdates.sh .
    sed -i -e 's/yum.log/dnf.log/g' mk_linuxupdates.sh
Please note how we are replacing yum.log with dnf.log in mk_linuxupdates.sh since Fedora uses DNF.
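If you want to confirm that the replacement took effect, a small sketch like the following can help. patch_log_name is a hypothetical helper; it performs the same substitution as the sed command above, writes the result back without touching the executable bit, and fails if any yum.log reference is left.

```shell
# Hypothetical helper: replace yum.log with dnf.log in a plugin file and
# verify that no reference to yum.log remains.
# $1 = path to the plugin file
patch_log_name() {
    plugin_file="$1"
    tmp_file="${plugin_file}.tmp"
    # write via "cat >" so that the original file keeps its permissions
    sed -e 's/yum\.log/dnf.log/g' "$plugin_file" > "$tmp_file" &&
        cat "$tmp_file" > "$plugin_file" &&
        rm -f "$tmp_file"
    # succeed only if nothing is left to replace
    ! grep -q 'yum\.log' "$plugin_file"
}

# Usage as root on the client:
#   patch_log_name /usr/lib/check_mk_agent/plugins/7200/mk_linuxupdates.sh
```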
If updates are performed on a host, the WARN and CRIT state will remain until the plugin cache expires in 7200 seconds. To invalidate the cached data and trigger the check execution immediately, you can run the following commands:
    if [ -e "/var/lib/check_mk_agent/cache/plugins_7200\mk_linuxupdates.sh.cache" ]
    then
        touch -t 197511170500 "/var/lib/check_mk_agent/cache/plugins_7200\mk_linuxupdates.sh.cache"
    fi
    if [ -e "/var/lib/check_mk_agent/cache/plugins_mk_linuxupdates.sh.cache" ]
    then
        touch -t 197511170500 "/var/lib/check_mk_agent/cache/plugins_mk_linuxupdates.sh.cache"
    fi
If you don't want to invalidate the cache manually, please see yum-plugin-post-transaction-actions or write a wrapper that you will execute on the client for manually installing updates. The wrapper would run the actual update command, followed by the command seen above.
On an OpenSUSE client, do the following (replace cmk-server with the IP address of your Checkmk server and cmk-version with the version currently installed, e.g. 1.6.0p1.cre):
    cd /usr/lib/check_mk_agent/plugins/
    mkdir 7200 ; cd 7200
    scp -p root@cmk-server:/opt/omd/versions/cmk-version/share/check_mk/agents/plugins/mk_zypper .
If you are using package locks (zypper al), make sure to edit the WATO options for this check, otherwise the locks will be reported as WARN.
Checkmk 2.0.0 users: Currently there are no configurable WATO options for the integrated zypper check. To fix this, edit the agent on your clients, then find and remove 'zypper ll ;'.
If updates are performed on a host, the WARN and CRIT state will remain until the plugin cache expires in 7200 seconds. To invalidate the cached data and trigger the check execution immediately, you can run the following commands:
    if [ -e "/var/lib/check_mk_agent/cache/plugins_7200\mk_zypper.cache" ]
    then
        touch -t 197511170500 "/var/lib/check_mk_agent/cache/plugins_7200\mk_zypper.cache"
    fi
    if [ -e "/var/lib/check_mk_agent/cache/plugins_mk_zypper.cache" ]
    then
        touch -t 197511170500 "/var/lib/check_mk_agent/cache/plugins_mk_zypper.cache"
    fi
If you don't want to invalidate the cache manually, please write a wrapper that you will execute on the client for manually installing updates. The wrapper would run the actual update command, followed by the command seen above. I'm not aware of any post transaction hooks for zypper.
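Such a wrapper could look roughly like this. It is a minimal sketch: update_and_invalidate is a hypothetical name, the 7200 interval matches this howto, and the cache is back-dated regardless of whether the update command succeeded, so the check always reflects the state after the attempt.

```shell
# Hypothetical wrapper: run an update command, then expire the plugin cache.
# $1 = plugin file name, $2 = cache directory, remaining args = update command
update_and_invalidate() {
    plugin="$1"
    cache_dir="$2"
    shift 2
    status=0
    "$@" || status=$?
    # "\\" inside double quotes yields one literal backslash in the file name
    for cache in "$cache_dir/plugins_7200\\${plugin}.cache" \
                 "$cache_dir/plugins_${plugin}.cache"; do
        if [ -e "$cache" ]; then
            touch -t 197511170500 "$cache"
        fi
    done
    # hand the exit status of the update command back to the caller
    return "$status"
}

# Usage as root on an OpenSUSE client:
#   update_and_invalidate mk_zypper /var/lib/check_mk_agent/cache zypper update
```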
On a Debian/Ubuntu client, do the following (replace cmk-server with the IP address of your Checkmk server and cmk-version with the version currently installed, e.g. 1.6.0p1.cre):
    cd /usr/lib/check_mk_agent/plugins/
    mkdir 7200 ; cd 7200
    scp -p root@cmk-server:/opt/omd/versions/cmk-version/share/check_mk/agents/plugins/mk_apt .
If updates are performed on a host, the WARN and CRIT state will remain until the plugin cache expires in 7200 seconds. To invalidate the cached data and trigger the check execution immediately, you can run the following commands:
    if [ -e "/var/lib/check_mk_agent/cache/plugins_7200\mk_apt.cache" ]
    then
        touch -t 197511170500 "/var/lib/check_mk_agent/cache/plugins_7200\mk_apt.cache"
    fi
    if [ -e "/var/lib/check_mk_agent/cache/plugins_mk_apt.cache" ]
    then
        touch -t 197511170500 "/var/lib/check_mk_agent/cache/plugins_mk_apt.cache"
    fi
If you don't want to invalidate the cache manually you can copy the APT hook under ~/share/doc/check_mk/treasures/mk_apt_hook (on the Checkmk server) to /etc/apt/apt.conf.d/98mk-apt on the client or write a wrapper that you will execute on the client for manually installing updates. The wrapper would run the actual update command, followed by the command seen above.
Once all agent plugins are installed on the clients, go to WATO → Hosts and edit each host that is checked for updates. Under each host click on Services and do a Full Scan. The first time this is run you will either not find any update service, or the check will return empty data and go to UNK state. Simply run Full Scan again after a minute and you should see it properly. Click on Fix all missing/vanished. Repeat on all required hosts and do not forget to Activate Changes afterwards.
Does one of your update checks turn critical and suggest that your CentOS/Fedora host needs to be rebooted even though you already did (or no updates were installed that would require a reboot)? It took me a little while to debug this issue in the linux-updates plugin; it appears to be caused by the rescue kernel being installed on CentOS/Red Hat/Fedora. The easiest fix is to remove the rescue kernel and keep it from getting installed again, followed by invalidating the linux-updates cached data.
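If a Full Scan keeps coming back empty, it can be useful to look at what the agent actually sends for the update check. The following sketch assumes the usual agent output layout, where each section starts with a header line of the form <<<name>>> or <<<name:options>>>. agent_section is a hypothetical helper, and the section name you pass depends on the plugin.

```shell
# Hypothetical helper: print one section of the agent output read from stdin.
# $1 = section name without angle brackets and options, e.g. "apt" or "zypper"
agent_section() {
    awk -v name="$1" '
        /^<<<.*>>>$/ {
            hdr = substr($0, 4)        # drop the leading "<<<"
            sub(/[:>].*/, "", hdr)     # drop options and the closing ">>>"
            show = (hdr == name)
        }
        show { print }
    '
}

# Usage as root on the client:
#   check_mk_agent | agent_section zypper
```

An empty result right after installing the plugin is expected on the first run; a minute later the section should contain data.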
Find the rescue kernel installed on your system:
    # ls -lrt /boot/vmlinuz*rescue*
    -rwxr-xr-x. 1 root root 5392080 Sep 23 20:00 vmlinuz-0-rescue-9cdb9ab3246a4b3f9c0849ecd597f25e
Remove the rescue image from the /boot directory first.
rm -f /boot/vmlinuz-0-rescue-9cdb9ab3246a4b3f9c0849ecd597f25e
Remove the rescue image from Grub2 using the grubby command with the --remove-kernel option.
grubby --remove-kernel=/boot/vmlinuz-0-rescue-9cdb9ab3246a4b3f9c0849ecd597f25e
Verify that the rescue image menuentry is now removed from the grub2 configuration file.
cat /boot/grub2/grub.cfg | grep rescue
Let's make sure the rescue-kernel doesn't get installed again during the next kernel update.
yum remove dracut-config-rescue
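Because the rescue image name contains a machine-specific ID, the commands differ on every host. Here is a dry-run sketch that prints the rm and grubby commands from above for each rescue image it finds, without executing anything. rescue_kernel_commands is a hypothetical helper name.

```shell
# Hypothetical helper: print (do not run) the cleanup commands for every
# rescue kernel image found.
# $1 = boot directory, defaults to /boot
rescue_kernel_commands() {
    boot_dir="${1:-/boot}"
    for image in "$boot_dir"/vmlinuz*rescue*; do
        # skip the unexpanded pattern when nothing matches
        [ -e "$image" ] || continue
        printf 'rm -f %s\n' "$image"
        printf 'grubby --remove-kernel=%s\n' "$image"
    done
}

# Usage as root on the client; review the output before running it:
#   rescue_kernel_commands
```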
Now invalidate the cached data.
    if [ -e "/var/lib/check_mk_agent/cache/plugins_7200\mk_linuxupdates.sh.cache" ]
    then
        touch -t 197511170500 "/var/lib/check_mk_agent/cache/plugins_7200\mk_linuxupdates.sh.cache"
    fi
    if [ -e "/var/lib/check_mk_agent/cache/plugins_mk_linuxupdates.sh.cache" ]
    then
        touch -t 197511170500 "/var/lib/check_mk_agent/cache/plugins_mk_linuxupdates.sh.cache"
    fi
That's it. Recheck the update service now. Wait a minute, recheck again, you should be set to green now.
Occasionally the update check reports available updates although there are none. This rarely ever happens. It is mostly caused by yum/dnf delivering odd, unexpected output, be that due to errors or because Celine Dion never stops singing. Cache files get polluted with excess lines, so the update check assumes that there are available updates.
On the checked host, remove the plugin cache and invalidate the cached data.
    rm -f /var/cache/mk_linux*.stat
    rm -f /var/lib/check_mk_agent/cache/plugins_mk_linuxupdates*
    if [ -e "/var/lib/check_mk_agent/cache/plugins_7200\mk_linuxupdates.sh.cache" ]
    then
        touch -t 197511170500 "/var/lib/check_mk_agent/cache/plugins_7200\mk_linuxupdates.sh.cache"
    fi
    if [ -e "/var/lib/check_mk_agent/cache/plugins_mk_linuxupdates.sh.cache" ]
    then
        touch -t 197511170500 "/var/lib/check_mk_agent/cache/plugins_mk_linuxupdates.sh.cache"
    fi
That's it. Recheck the update service now. Wait a minute, recheck again, you should be set to green now.