Sanoid monitoring bash script advice

I’m using sanoid and syncoid to take snapshots and sync them to backup pools. I want to monitor those snapshots, and I see that sanoid has built-in monitoring functions that work with Nagios. I use healthchecks.io to monitor most of my homelab, so I wrote this bash script to monitor my pools and snapshots. Any suggestions?

#!/bin/bash

# Capture the status line printed by each sanoid monitoring check.
output_snapshots=$(/usr/sbin/sanoid --monitor-snapshots)
output_capacity=$(/usr/sbin/sanoid --monitor-capacity)
output_health=$(/usr/sbin/sanoid --monitor-health)

# Ping success only if all three checks report OK; otherwise ping the /fail endpoint.
if [[ $output_snapshots == OK* ]] && [[ $output_capacity == OK* ]] && [[ $output_health == OK* ]]; then
  curl -m 10 --retry 5 https://healthchecks.io/ping/xxxxxxxxxxxxx
else
  echo "One or more checks did not return OK."
  curl -m 10 --retry 5 https://healthchecks.io/ping/xxxxxxxxxxxxx/fail
fi

If you’d like to make the script more robust, I suggest adding some edge-case coverage.

  • Check for and handle curl failures. As written, sanoid can return OK while curl fails (say, the network is down), and healthchecks.io has no way to tell that apart from the script not running at all.

  • Log errors to a local file.
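A minimal sketch of both suggestions, in case it helps. The `LOG_FILE` path and the `log_error` and `send_ping` names are made up for illustration; adjust them to your setup. The curl flags used are `-f` (treat HTTP errors as failures), `-sS` (quiet, but still print errors), and `-o /dev/null` (discard the response body).

```shell
#!/bin/bash
# Sketch only: LOG_FILE, log_error and send_ping are illustrative names,
# not part of sanoid or healthchecks.io.
LOG_FILE="${LOG_FILE:-/var/log/sanoid-healthcheck.log}"

# Append a timestamped message to the local log file.
log_error() {
  printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$LOG_FILE"
}

# Ping a URL. If curl fails, record its exit code locally and return it,
# so the caller can tell "check failed" apart from "ping never arrived".
send_ping() {
  local url=$1 rc=0
  curl -fsS -m 10 --retry 5 -o /dev/null "$url" || rc=$?
  if (( rc != 0 )); then
    log_error "curl exited with code $rc while pinging $url"
  fi
  return "$rc"
}
```

In the script above you would then call `send_ping https://healthchecks.io/ping/xxxxxxxxxxxxx` in place of the bare curl line, and could also call `log_error` in the else branch to keep a local record of the failing check output.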

Consider using the friendly output of each sanoid check, not just its OK/not-OK status. Send that output, concatenated from all non-OK checks, to healthchecks.io instead of just “One or more checks did not return OK.”

That way you’ll be able to see what is wrong, rather than merely that something is wrong, when you get your alerts from the healthchecks.io service.
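Something along these lines might work as a starting point. It assumes healthchecks.io stores the body of a POST ping as a log attached to that ping, which is how I understand its documentation; `collect_failures` and `report` are hypothetical helper names, not anything sanoid provides.

```shell
#!/bin/bash
# Sketch only: collect_failures and report are illustrative helpers.

# Print every argument that does not start with "OK", one per line.
collect_failures() {
  local out
  for out in "$@"; do
    [[ $out == OK* ]] || printf '%s\n' "$out"
  done
}

# Send a plain success ping when there is nothing to report; otherwise
# POST the failure text to the /fail endpoint as the request body.
report() {
  local base_url=$1 failures=$2
  if [[ -z $failures ]]; then
    curl -fsS -m 10 --retry 5 -o /dev/null "$base_url"
  else
    curl -fsS -m 10 --retry 5 -o /dev/null --data-raw "$failures" "$base_url/fail"
  fi
}
```

With the three `output_*` variables from the original script, usage would look like `failures=$(collect_failures "$output_snapshots" "$output_capacity" "$output_health")` followed by `report https://healthchecks.io/ping/xxxxxxxxxxxxx "$failures"`.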
