Benefits or drawbacks to `truncate` vs `fallocate` for file-based pool

Follow up from my question here:

Evidence says it is possible, and stable, to use a file for a production pool when your options are restricted. That question was fully answered and solved, so I'm opening this new, related question.

Doing some research, I found:

If I am going to use a file for a prod pool, it seems I have two options:

  • truncate - creates a sparse file (at least on ext4?); blocks are only allocated as data is actually written, so the data can end up fragmented on disk
  • fallocate - preallocates the file: the actual addresses/extents are reserved up front, so no fragmenting (at least until a possible expansion, though it looks like an expansion could be done without fragmenting it). Both are shown in the sketch below.
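
A minimal sketch of the two approaches on the host filesystem (the path and size here are just placeholders):

```
# Option 1: sparse file; no blocks allocated until data is actually written
truncate -s 50G /home/me/pool-file.img

# Option 2: preallocated file; ext4 reserves the extents up front
fallocate -l 50G /home/me/pool-file.img
```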

This file in which the zpool would be created is on an SSD. So I'm inclined just to use truncate, so that the controller can wear-level more freely, and fragmentation doesn't matter for speed on an SSD anyway.

Any other consideration I should make?

edit: add link, minor formatting

Never found a difference with zfs. Doesn't mean there isn't one, however. :wink:

On btrfs it matters.

You cannot preallocate on ZFS, because ZFS is copy-on-write. There is no effective difference between truncate and fallocate (or any other pre-allocation technique, including using dd) on ZFS.


With TRIM enabled, will the wear-levelling be impacted at all? Wouldn't ZFS's TRIM tell the drive which blocks are empty, even with fallocate?
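
For reference, a minimal sketch of how TRIM is typically enabled on an OpenZFS pool (the pool name `tank` here is just an assumption):

```
# Discard freed blocks automatically as ZFS releases them
zpool set autotrim=on tank

# Or run a one-off manual TRIM pass across the pool
zpool trim tank
```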

There is no pre-allocation whatsoever in ZFS, period. New data goes into currently-unused blocks. I don’t know how to put this any more clearly. :slight_smile:

I don’t think OP is preallocating in ZFS.

I thought they were debating between creating a sparse file or a pre-allocated file on a non-ZFS filesystem for a file-backed vdev.

There's no pre-allocating in ZFS. And ZFS will just compress it if you try to fill up space with zeros.


@ikabir is correct. I have an ext4 host on a laptop. I would like to zfs send | receive a dataset from my server's zfs pool onto my laptop, for offline availability. (I prefer not to use Syncthing et al.)

So on my ext4 laptop, my plan is to create a file with one of these two methods, then create a zfs pool inside of that file, so that I can receive one, relatively-small dataset from my server.
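
Roughly the plan, as a sketch (file path, size, pool name, dataset, and snapshot names are all placeholders):

```
# On the laptop: create the backing file, then a pool inside it
truncate -s 50G /home/me/pool-file.img
zpool create filepool /home/me/pool-file.img

# Pull the dataset over from the server
ssh server zfs send serverpool/mydata@snap | zfs receive filepool/mydata
```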


Ah, fair enough. In that case, fallocate will pre-allocate a mostly-contiguous (or as-contiguous-as-possible) set of physical sectors to the desired size at creation time, and those physical sectors are marked as in use right then and cannot be used for anything other than that file.

Truncate, by contrast, just sets the file's logical size to X without preallocating the sectors. This can result in fragmentation, but generally speaking you probably don't really need an entire VM image's backup to be particularly "contiguous", and fragmentation is only likely to be an issue on EXTREMELY already-full devices, e.g. a 1TB SSD with 950GB already in use when you do a `truncate -s 10G newfile.dat`.

Even though truncate doesn't preallocate the physical sectors the file can eventually use, those physical sectors do not change once they are allocated. So whether you used truncate or fallocate on a relatively-virginal filesystem with tons of free space, I would not expect a significant difference in performance when performing random-access reads and writes on the created file.

On the other hand, if your filesystem is already extremely full and there are likely not many big sources of contiguous free sectors available, fallocate can try to do a better job of reserving a sane grouping of those sectors for you than you would get when a sparse file organically expands bit by bit over time.
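
If you want to see how either approach turned out on your particular filesystem, one way (assuming the backing file lives on ext4 and e2fsprogs is installed; the path is a placeholder) is to compare allocated blocks and extent counts:

```
# Logical size vs. blocks actually allocated on disk
du -h --apparent-size /home/me/pool-file.img
du -h /home/me/pool-file.img

# Extent layout; a higher extent count means more fragmentation
filefrag -v /home/me/pool-file.img
```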

Have you considered instead shrinking the root ext4 filesystem to a small portion of the drive (I typically use 100GiB for this), then using the rest of the drive as a real no-kidding ZFS pool? That’s what I do with my laptops–a standard Ubuntu install onto a 100GiB partition, then log in, create a pool on the big partition on the rest of the drive, zfs create pool/home and zfs create pool/home/[username] and now I’ve got a nice simple ext4 laptop fully supported by the OS, but all the data I give a crap about is on proper ZFS. :slight_smile:
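
A minimal sketch of that layout (device name, pool name, and username are assumptions, and you may need to move any existing /home contents aside before setting the mountpoint):

```
# The big partition left over after the ~100GiB ext4 root
zpool create -o ashift=12 pool /dev/nvme0n1p3

# Home lives on ZFS; the OS itself stays on plain ext4
zfs create -o mountpoint=/home pool/home
zfs create pool/home/username
```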

My SSD is < 50% full. Sounds like my understanding was correct: no real difference between the two, but truncate will allow the controller to select NAND cells for wear leveling at least once, on the initial write. It also sounds like those sectors won't be deallocated from the file if the file's contents shrink.

Yes, eventually I was planning on having root on ZFS across the whole disk (minus the ESP). I practiced on my friend's laptop when he had to reinstall. Made him be my guinea pig :laughing: . But if an ext4 primary is good enough for you, it is probably good enough for me as well, haha. That is what I have on my server right now: main OS on an ext4 SSD, and a 2x HDD mirror for all my files. I will add, the distro I use (not Ubuntu) has played very nicely with ZFS, and root on ZFS also worked with hardly a hiccup, so Linux and ZFS struggling to get along is not a concern for me.

I have had a laptop SSD fail before. Luckily I was able to mount it read-only one time and clone it with fsarchiver. The next time I tried to mount it, the drive wouldn't even register as existing at all when I plugged it into a different host via an enclosure. That experience convinced me that in the future I would prefer root on ZFS, simply to save the time of reinstall and setup: just grab the whole OS with a send | receive and get back to work. I do keep a dotfiles repo, and if I cleaned it up and wrote an installation script, the time to reinstall onto ext4 would be reduced. But then again, if I don't carve out a small partition for the OS, the controller can wear-level more effectively, instead of splitting the wear-leveling area in half (from what I understand…). So I see pros and cons for both: root on ZFS, and your method of root on a small ext4 partition with a separate partition dedicated to ZFS.

For now, I don't have time to reinstall, but I will be getting a new laptop in a couple of months and will be forced to make the decision at that point. Either way, ZFS will at minimum get its own large partition. :smiley:


If you’re up for it, I recommend doing whole root on zfs.

I've been running it for years on Debian and Arch, and the only issue for me is that the installation on Debian is as manual as it is on Arch. I also run ZFS on LUKS for encryption without any issues.

I found the rEFInd bootloader to be a lot easier to deal with than GRUB for this setup.

I just mount the ESP at /boot, install rEFInd, make sure the kernel + initramfs land in the root of the ESP, and call it a day.
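
Very roughly what that ends up looking like (the dataset name and the exact root= syntax are assumptions; the parameter your initramfs expects may differ, e.g. dracut vs. mkinitcpio):

```
# ESP mounted at /boot: rEFInd, kernel, initramfs, and its options file all live here
ls /boot
# EFI/  vmlinuz-linux  initramfs-linux.img  refind_linux.conf

# refind_linux.conf sits next to the kernel and supplies the boot options
cat /boot/refind_linux.conf
# "Boot with standard options"  "root=ZFS=rpool/ROOT/default rw initrd=initramfs-linux.img"
```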

Just make sure the ESP is big enough and you'll be fine. Minimum 1GiB; 2-3GiB is nice, especially on laptops, since some vendors only release BIOS updates as Windows or UEFI binaries.
Then you can just drop the updates in the ESP and go to the UEFI shell via rEFInd to update the BIOS when the time comes.