Quick question about snapshot memory usage

I am on Ubuntu 24.10 with a data pool (booting from something else), and I suspect that installing zfs-auto-snapshot has made my system very slow.

So the question is: does having a lot of snapshots increase memory usage, or should I look elsewhere?

Also, htop is no help in determining whether this is a good hypothesis. How can I check for myself?

thx


If the pool is pretty full, it can get slower. Years ago this kicked in at around 80% capacity, but I think the threshold is higher now. Data written while the pool was nearly full may also be slow to read later.
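If you want to check, capacity is right there in zpool list; this is the stock command, nothing pool-specific assumed:

```
# CAP shows percent full, FRAG shows free-space fragmentation
zpool list -o name,size,allocated,free,capacity,fragmentation
```

Once CAP creeps up toward that range, that's roughly where the slowdown described above tends to start.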

If you enabled dedup, that will cause problems unless you have gobs of RAM.
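Easy to verify, for what it's worth; 'tank' below is a placeholder pool name:

```
# A dedupratio of 1.00x means no deduplicated blocks on the pool
zpool list -o name,dedupratio tank

# Confirm dedup was never enabled on any dataset
zfs get -r dedup tank
```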


Snapshots don’t affect RAM usage. They do take up storage, however, as they diverge from production: if you take a snapshot and then delete a file, the file still exists in the snapshot, so you’re using all the space that production takes, plus the space of the deleted file.
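Both halves of that are easy to check for yourself. ZFS's RAM footprint is dominated by the ARC, which arc_summary (shipped with the Linux ZFS utilities) will report, and the per-dataset space properties show exactly what snapshots cost on disk; 'tank' is again a placeholder:

```
# The ARC is where ZFS memory actually goes; print its current size
arc_summary -s arc

# Per-dataset accounting of space pinned by snapshots
zfs list -r -o name,used,usedbydataset,usedbysnapshots tank
```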

They also, if you accumulate too many, will slow down some ZFS operations pretty badly. General use of the production filesystem isn’t affected enough to notice, but if you accumulate 100 or so snapshots per dataset (and this is a per-dataset issue, not really a per-pool issue) you’ll wind up with zfs list -t snap and similar operations taking for-bloody-ever to complete, until you prune the snapshots back down to a reasonable number.
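If you want numbers on your own pool, something like this will show which datasets have piled up snapshots, and timing the listing makes the slowdown visible directly:

```
# Count snapshots per dataset, worst offenders first
zfs list -H -t snapshot -o name | awk -F@ '{print $1}' | sort | uniq -c | sort -rn

# Time a full snapshot listing; this is the operation that degrades
time zfs list -t snapshot >/dev/null
```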

Aside: I would guess something like httm can provide insight into this?

Are cases of (unexpected) slowness documented somewhere, even unofficially?

Is there a good reason for this? (IOW, I don’t really understand how ZFS works, and I’m guessing not many people do.)

No dedup - I thought that at first too.

I think it was getting full.

thx.

> Are cases of (unexpected) slowness documented somewhere, even unofficially?

At the very least, there are slides from conference presentations by Allan Jude and myself. Beyond that, I’m not sure. I also discussed it once with Matthew Ahrens at the ZFS Dev Summit.

I can’t give you a DEEP insight into the issue, but essentially, iterating through large numbers of snapshots of the same dataset means following many links deep down the metadata tree, which in turn is an extremely punishing, very-small-block I/O workload.

This isn’t something you encounter in normal use of the filesystem, mind you; it’s strictly encountered when iterating its snapshots.
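A crude way to watch it happen, if you’re curious: run a request-size histogram against the pool while a snapshot listing is in flight ('tank' is a placeholder), and you should see the reads skew small:

```
# Terminal 1: per-second request-size histogram
zpool iostat -r tank 1

# Terminal 2: force a snapshot iteration
zfs list -r -t snapshot tank >/dev/null
```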

Sorry I can’t give you any more technical detail than that, but honestly that was about as much detail as I was looking for myself when I asked about it. If you’re not planning to go elbow deep in the codebase yourself, that’s probably sufficient, and I don’t plan to go elbow deep in the codebase, so… 🙂

Anything you got is fine, thanks.