To the point: can ZFS still provide read/write performance gains in a world of cost-effective, very fast, multi-TB NVMe SSDs (i.e. well beyond SATA HDDs), without sacrificing resiliency?
I’m shopping around to build a new workstation, and I see that even some consumer-level motherboards provide plentiful PCIe 5.0 and PCIe 4.0 lanes that can be used for NVMe storage. It would be a dream to pool 3-4 drives and benefit from the versatility and resiliency of ZFS, while holding onto, or even improving, the read/write performance of these fast drives.
But in my pre-purchase research I’ve come to understand that a central assumption under which ZFS was developed no longer holds – that storage access is very slow compared to memory and compute – and that, as a result, simple pool layouts like mirrors or raidz1, which previously boosted read/write performance, can now hinder it (or rather, trade some performance for resiliency).
A few questions:
Is the above essentially true?
If I were willing to throw resiliency out the window to maximize the read/write performance of 3-4 NVMe SSDs, what kind of ZFS topologies should I look at (e.g. a plain stripe, as sketched after these questions)?
Where on the web can I find expert guidelines for using ZFS on cheap, very fast storage (i.e. online resources that are current with the consumer storage options of, say, 2023–2025)?
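For concreteness, a plain stripe is the kind of no-redundancy layout I have in mind; the pool name, device paths, and property values below are just placeholders, not a tested configuration:

```
# Hypothetical no-redundancy layout: a plain stripe across four NVMe drives.
# With no "mirror" or "raidz" keyword, each device becomes its own top-level
# vdev and writes are spread across all of them.
zpool create -o ashift=12 fastpool \
    /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1

# Dataset properties people commonly experiment with (values are examples, not recommendations):
zfs set compression=lz4 fastpool
zfs set recordsize=1M fastpool   # larger records suit sequential/streaming I/O
zfs set atime=off fastpool
```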
I did that years ago, creating a stripe set of 4x 500GB Samsung EVO SATA SSDs. Then one day one of the SSDs started malfunctioning: it would drop out, only to return after a power cycle. That let me get one last backup before adding a fifth SSD and rebuilding the pool as RAIDZ1. (The failed drive was replaced by Samsung under warranty.)
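Roughly what that transition looked like, with illustrative pool and device names (ZFS has no in-place conversion from a striped pool to raidz1, which is why that last backup mattered):

```
# Original layout: a 4-wide stripe, no redundancy (names are illustrative).
zpool create ssdpool /dev/sda /dev/sdb /dev/sdc /dev/sdd

# A striped pool can't be reshaped into raidz1 in place, so after the final
# backup the pool was destroyed and recreated with the fifth drive added:
zpool destroy ssdpool
zpool create ssdpool raidz1 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde
# ...then restore the data from the backup.
```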
I don’t recall that I ever got good comparison numbers for performance. It was “fast enough.”
The performance of any storage setup is going to depend heavily on the workload, so I suggest trying different setups if possible. Also accept that ZFS will likely cost some performance in exchange for the security of your data (and it provides other benefits beyond that).
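If you do compare layouts, a quick fio run against a file inside each candidate pool gives numbers you can actually line up; the path, size, and job parameters here are only an illustrative starting point (Linux/libaio assumed):

```
# Random-read test against a file in the mounted dataset; adjust the path,
# size, block size, and queue depth to resemble your real workload, and
# repeat with rw=read, rw=write, rw=randwrite, larger bs, etc.
fio --name=randread --filename=/yourpool/fio.test --size=10G \
    --rw=randread --bs=4k --iodepth=32 --numjobs=4 \
    --ioengine=libaio --time_based --runtime=60 --group_reporting
```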