Via /etc/cron.d I’m running the following script:
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
# Run every hour.
0 */1 * * * root /root/monitor-snapshots.sh
/root/monitor-snapshots.sh:
#!/bin/bash
SANOID=/usr/sbin/sanoid
if ! MONRESULT=$("$SANOID" --monitor-snapshots); then
    echo "$MONRESULT" | mutt -s "Error: Snapshot policy problem." -- postmaster@mymachine.local
fi
On the physical box which runs this script (a Proxmox host) I’ve installed sSMTP with the following config:
root=postmaster
mailhub=mailrise.local:8025
hostname=storagebox.local
UseTLS=No
UseSTARTTLS=No
FromLineOverride=YES
And mailrise.local is a separate VM running mailrise, which acts as an SMTP gateway for Discord.
Randomly, I get the following errors:
Cron <root@pve> /root/monitor-snapshots.sh (root)
ERROR: No valid lockfile found - Did a rogue process or user update or delete it?
sendmail: 450 failed to send notification
Error sending message, child exited 1 ().
Could not send the message.
After getting such an alert, if I ssh into the machine and run sanoid --monitor-snapshots manually, there is no error.
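Because the manual run comes back clean, one way to see what the cron run actually hit is to record the exit status and output of every check. Below is a minimal sketch of that idea, not something I have running: the run_and_log helper and the /var/log/monitor-snapshots.log path are hypothetical names, and it assumes bash and date are available in the cron environment.

```shell
#!/bin/bash
# Hypothetical log location (an assumption); any writable path would do.
LOGFILE="${LOGFILE:-/var/log/monitor-snapshots.log}"

# Run the given command, capture stdout and stderr plus the exit status,
# and append both to LOGFILE with a timestamp.
# Prints the captured output and returns the command's exit status.
run_and_log() {
    local output status
    output=$("$@" 2>&1)
    status=$?
    printf '%s exit=%d output=%s\n' \
        "$(date '+%Y-%m-%dT%H:%M:%S')" "$status" "$output" >> "$LOGFILE"
    printf '%s\n' "$output"
    return "$status"
}

# Possible use inside monitor-snapshots.sh (left commented out here):
# if ! MONRESULT=$(run_and_log /usr/sbin/sanoid --monitor-snapshots); then
#     echo "$MONRESULT" | mutt -s "Error: Snapshot policy problem." -- postmaster@mymachine.local
# fi
```

With this in place, the timestamps of the failing runs could be compared against whatever else touches sanoid on the box at those times. The log also survives when the mail itself fails with the 450 error.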
My /etc/sanoid/sanoid.conf has multiple datasets all configured like this:
[tank/some-dataset]
use_template = shortTemplate
[other/some-dataset]
use_template = shortTemplate
autosnap = no
hourly_warn = 1500m
hourly_crit = 1500m
daily_warn = 36h
daily_crit = 48h
[template_shortTemplate]
frequently = 0
hourly = 1
daily = 3
weekly = 3
monthly = 3
yearly = 0
autosnap = yes
autoprune = yes
tank is a 2-disk mirror and other is a 2-disk mirror. syncoid is used to send snapshots from tank → other, and I have sanoid --monitor-snapshots running to ensure other gets those snapshots properly.
I’ll admit I’m probably doing something wrong here, and this setup (with sSMTP and mailrise in the mix) isn’t terribly straightforward. I use this mailrise VM to funnel a lot of other notifications into Discord, though, and it works fine in those cases.
I’d appreciate any pointers in the right direction on how to go about tracking this down, and on whether it is a problem with how I’ve set up sanoid.