Hey All,
I’m trying to understand ZVOLs a bit more and I haven’t found an answer that makes much sense in any of the openzfs or oracle docs I’ve come across. Do ZVOLs actually run on top of a zfs filesystem or are they just logical devices? I’ve generally avoided ZVOLs because I tried to avoid any potential performance impacts or complications from having nested filesystems. Thanks
Both zvols and filesystems (ZPL - zfs posix layer) are datasets that sit above the DMU (data management unit) layer. Neither is layered on the other.
There are differences in the way the IO pipeline works that tends to give filesystems better performance. Or at least that was the case several years back. I’ve not kept up with zfs development to see if zvol performance has improved.
It’s been reported to be “performance fixed! just as fast as normal datasets now!” several times over the last fifteen years or so; so far I have disagreed strongly with that assessment each time somebody presented me with it and I then tested it.
Allan (Jude) recently told me that performance problems had been addressed in zvols… but it ain’t even the first time Allan has told me that, and at this point I will only believe it when I actually test and confirm it, and not before!
So normal datasets are more performant than a zvol? I understand that the use case between zvols and normal datasets is very different but I’m a bit surprised to hear that.
When I talked about filesystems being layered, I could have added a better example. The diagram helps me somewhat in understanding how the two are connected and gives me some idea of the answer, but if any filesystem has overhead and if have a ZVOL that I format that to, ext4 or something, do I get the overhead over ext4 + the the overhead of zfs?
People misuse the term dataset. As defined in the code, filesystems, zvols, and snapshots are all datasets.
If you put ext4 or any other native Linux filesystem on top of a zvol, you will have the overhead of both. Since zfs uses the ARC for caching and ext4 uses the buffer cache, you may end up with the filesystem data and metadata being cached twice. Managing two caches has CPU overhead in addition to storing an extra worthless copy in RAM.
You get some of the overhead of ZFS in addition to all of the overhead of ext4.
Specifically, you get the overhead in the form of checksumming, compression (if any), encryption (if any), and so forth. The overhead you don’t get is the actual filesystem bits of ZFS–so, no inodes or other filesystem metadata, just the block management stuff.
You will also end up with double caching, if you don’t manually disable ZFS caching on the zvol… which would pretty much suck in a lot of circumstances, since ZFS’ cache is considerably smarter than the kernel page cache. ZFS won’t cache ext4’s inodes and other metadata as inodes and metadata, but it will cache their blocks just like it caches any other blocks… which the kernel page cache won’t know about, so it will also cache them there as metadata.
Ah, you guys are awesome, thank you for taking your time to answer that, it makes much more sense. So a ZVOL would make sense if I wanted a block device AND zfs features like checksumming/compression/encryption. Otherwise I might as well just create a partition on a disk or use LVM then, correct?
Yes, that’s exactly correct.