Monitoring Sanoid with a non-root user error: (Could not write to $cache!)

Hi All,

I am trying to monitor a new Sanoid setup using a non-root user from my monitoring agent and am running into a weird issue.

Here is the error I am getting:

vmhost02:~$ sudo -u zabbix /usr/sbin/sanoid --monitor-snapshots
print() on closed filehandle FH at /usr/sbin/sanoid line 1543.
print() on closed filehandle FH at /usr/sbin/sanoid line 1544.
Could not write to $cache!\n at /usr/sbin/sanoid line 829.

I’ve given the user zabbix sudo permissions to execute the /usr/sbin/sanoid file using the following in the /etc/sudoers file:

zabbix ALL=NOPASSWD: /usr/sbin/sanoid

Strangely, if I run the same command as my own user, who has full sudo rights, it works. If I then try again as the zabbix user, it works there too. Eventually, though, the command stops working for the zabbix user.

I do notice that the /var/cache/sanoid/snapshots.txt file gets updated when the issue starts.
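One way to check whether the failures line up with the cache going stale is to look at how old the cache file is each time the command fails. This is a minimal sketch, assuming (not confirmed in this thread) that sanoid only tries to rewrite the cache once it is older than some threshold. The function name `cache_age_seconds` is made up for illustration, and `stat -c %Y` assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: print the age of a file in seconds,
# based on its last-modified time. Requires GNU stat (-c %Y).
cache_age_seconds() {
    file="$1"
    now=$(date +%s)
    # %Y = last modification time as seconds since the epoch
    mtime=$(stat -c %Y "$file") || return 1
    echo $((now - mtime))
}
```

Calling `cache_age_seconds /var/cache/sanoid/snapshots.txt` right after a failure and right after a success would show whether the two cases differ in cache age.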

Here is my sanoid.conf file:

#Catch-all in case a dataset was missed
[zpool1/vm-images]
        process_children_only = yes
        recursive = yes
        frequently = 0
        hourly = 0
        daily = 30
        monthly = 3
        yearly = 0
        autosnap = yes
        autoprune = yes

[zpool1/vm-images/q36001/c-drive]
        use_template = production

[zpool1/vm-images/q36001/q-drive]
        use_template = production


#############################
#templates below this line #
#############################

[template_production]
        frequently = 0
        hourly = 36
        daily = 30
        monthly = 3
        yearly = 0
        autosnap = yes
        autoprune = yes

Sometimes it just randomly starts working again.

I’m very new to Sanoid so there may be something I am doing wrong.

Not sure if this helps, but here are the permissions on my /var/cache/sanoid/snapshots.txt file:

vmhost02:~$ ls -l /var/cache/sanoid/snapshots.txt 
-rw-r--r-- 1 root root 8794 May 27 20:45 /var/cache/sanoid/snapshots.txt
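The listing shows the cache file owned by root:root with mode 644, so only root can write to it; any other user can read it but not rewrite it. A small sketch for checking this from whichever account runs the monitor is below. The function name `check_cache_writable` is made up for illustration, the default path is the one from the listing, and `stat -c` assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: report who owns a file and whether the
# current user may write to it.
# Returns 0 if writable, 1 if not writable, 2 if the file is missing.
check_cache_writable() {
    cache="${1:-/var/cache/sanoid/snapshots.txt}"
    if [ ! -e "$cache" ]; then
        echo "missing: $cache"
        return 2
    fi
    # GNU stat: %U = owner, %G = group, %a = octal permissions
    echo "owner/mode: $(stat -c '%U:%G %a' "$cache")"
    if [ -w "$cache" ]; then
        echo "writable by $(id -un)"
        return 0
    fi
    echo "NOT writable by $(id -un)"
    return 1
}
```

Running the function as the zabbix user and again as root would show whether the two accounts differ in write access to the cache.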

Sanoid version:

vmhost02:~$ sanoid --version
/usr/sbin/sanoid version 2.2.0
(Getopt::Long::GetOptions version 2.54; Perl version 5.38.2)

Any help is greatly appreciated!

I believe the issue is similar to what is being discussed here.

I am currently on version 2.2.0 and these changes seem to affect version 2.2.1

I tried 2.2.1 and that did not fix the issue.

I updated to 2.2.1 and that seems to have fixed the issue. It didn’t at first but once I reran one more time as my sudo user (not zabbix), everything was fixed.

I will monitor for the next few hours to see if it happens again.

I will keep going at it and publish my findings here for future readers.

Edit (May 28, 9 AM EDT): This seems to have fixed the issue. Hopefully this helps someone in the future.