Best Practices for ZFS on NVMe drives

Are there any important differences or things to pay special attention to when creating pools with NVMe drives compared to traditional SAS/SATA drives? With read/write speeds being so much higher, are there any constraints that may come up around the use of the ARC compared to a traditional array?

I understand my question is quite generic, but as NVMe drives come down in price (I'm seeing 4 TB NVMe drives at under $200 USD now), more and more people will probably be looking at using NVMe drives (perhaps used U.2 enterprise SSDs) for ZFS pools, and the information on best practices for these types of drives feels scattered.

You can set autotrim to on, and perhaps ashift=13. NVMe drives can be fussy about reporting their sector size. I recently asked around about this and haven't found a reliable way to figure it out, except by trial and error.
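For what it's worth, this is roughly how I poke at what a drive claims to support, assuming nvme-cli is installed; the device path is just an example. The reported LBA formats only tell you what the firmware exposes (often just 512B), not the internal page size, which is why it still comes down to trial and error:

```
# List the LBA formats the drive exposes and which one is currently in use
nvme id-ns -H /dev/nvme0n1 | grep "LBA Format"

# Some drives also expose a 4K format; switching formats wipes the namespace,
# so only consider it on an empty drive:
# nvme format /dev/nvme0n1 --lbaf=1
```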

I don’t think there is any downside to setting ashift too high, but there might be some exceptions I am not aware of. I’m no expert on the matter, but I did do some googling and reading up on it.
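For reference, this is roughly what that looks like at pool creation time; the pool name and device paths are just placeholders:

```
# Create a mirrored pool with TRIM enabled and an 8K sector assumption (ashift=13)
zpool create -o ashift=13 -o autotrim=on \
    tank mirror /dev/nvme0n1 /dev/nvme1n1

# ashift is fixed per vdev at creation time; autotrim can be changed later
zpool get ashift,autotrim tank
```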

Now, I wish it were different (silently hoping I'm doing something wrong :wink: ), but see my post here. I just spent over 2,000 euros on two sets of new SSDs, including a bunch of enterprise U.2 drives, and no matter what I try or how I test it, my random read and write speeds on a set of two mirrors on the U.2 drives with ZFS are simply abysmal: 17k IOPS and 67 MB/s. I suppose the good news (if we can call it that) is that any software RAID seems to be slower than a single disk, which suggests there is something else going on. Still, 17k IOPS with ZFS is extreme, being about 10 times slower than the fastest test I've done, ext4 on an mdadm RAID10. But even that is less than a third of a single disk with ext4.

To be fair, this is just one test with 4k random read/writes. Larger block sizes and sequential tests are either good, or good enough (for me). I don’t mind a drop in performance for the benefits zfs has to offer, but it has to be within reason.
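For anyone who wants to compare numbers, this is roughly the kind of fio run I mean; the directory, job count, and sizes here are just example values for a 4k random read/write test against a mounted ZFS dataset, not my exact job file:

```
# 4k random read/write against a directory on the ZFS dataset under test
fio --name=zfs-4k-randrw --directory=/tank/fiotest \
    --rw=randrw --rwmixread=75 --bs=4k \
    --ioengine=psync --iodepth=1 --numjobs=4 \
    --size=4G --runtime=60 --time_based --group_reporting
```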