Is it a bad idea to run containerd using the zfs snapshotter?

My system now performs very badly on this:

~# time zfs list -t all |wc -l
26127

real	5m4.847s
user	0m2.973s
sys	5m1.770s

It was running a k8s cluster using containerd, and after several thousand container restarts it could barely create a container at all. This is zfs 0.8.3 as shipped by Ubuntu 20.04. I have since switched to another system which runs zfs 2.2.2 and uses the overlay snapshotter, and that has run fine so far.

Is this a known problem, and do I simply need to avoid the containerd zfs snapshotter?

How many snapshots do you have per dataset?
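
If it helps, here is a rough sketch of one way to get that number. It assumes the snapshot names come from `zfs list -H -t snapshot -o name` (one `dataset@snapshot` per line); the function name `count_snapshots_per_dataset` is made up for illustration. Note that on a pool in this state the listing itself will still be slow.

```shell
# Hypothetical helper: reads "dataset@snapshot" names on stdin and prints
# "<count> <dataset>" lines, largest count first.
count_snapshots_per_dataset() {
  # Split each name at '@'; the part before it is the dataset name.
  awk -F'@' 'NF == 2 { count[$1]++ } END { for (d in count) print count[d], d }' |
    sort -rn
}

# Intended usage (left as a comment because it needs a live pool):
#   zfs list -H -t snapshot -o name | count_snapshots_per_dataset
```

The dataset at the top of the output is the one holding most of the snapshots.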

I’m unable to check it now. I have only a handful of datasets, and most of them don’t have many snapshots; only the containerd-managed one does. So those ~26k snapshots should mostly belong to a single dataset.

Then that performance is as expected. Once you break 100-200 snapshots per dataset, things go downhill rapidly with zfs list -t snap operations. Performance of the dataset itself is generally fine, but attempting to list the snapshots can in fact take several minutes at a time.

The issue is most commonly seen on rust pools, but if you go too far overboard on solid state pools (and 26,000 snapshots on one dataset qualifies!) you’ll see it there also.

This is the reason sanoid caches its snapshot list! :)
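
To illustrate the general idea of caching a slow listing, here is a minimal sketch. This is not how sanoid implements its cache; the function name `cached_output`, the cache path, and the age threshold are all assumptions made up for the example.

```shell
# Hypothetical helper: run a slow command at most once per max_age_min
# minutes, serving its saved output from cache_file in between.
cached_output() {
  cache_file=$1
  max_age_min=$2
  shift 2
  # find prints the file only if it exists and was modified less than
  # max_age_min minutes ago; empty output means "missing or stale".
  if [ -z "$(find "$cache_file" -mmin "-$max_age_min" 2>/dev/null)" ]; then
    # Write to a temp file first so a failed run never clobbers the cache.
    "$@" > "$cache_file.tmp" && mv "$cache_file.tmp" "$cache_file"
  fi
  cat "$cache_file"
}

# Intended usage (left as a comment because it needs a live pool):
#   cached_output /tmp/zfs-snapshots.cache 10 zfs list -H -t snapshot -o name
```

The trade-off is that the cached list can be up to the chosen number of minutes out of date.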


I see. It seems others have experienced it too: "ZFS Snapshotter - Excessive ZFS Snapshots?" (Issue #71 in containerd/zfs on GitHub).