I’m setting up a new server with all-NVMe storage and wanted to get some baseline numbers before putting it into production.
Setup
- Drives: Samsung PM9A3 (3.8TB)
- Pool: 6 drives in ZFS RAID10 (mirror vdevs)
- Ashift: 12
- OS: Proxmox VE 9
- ZFS settings: everything left at defaults unless noted
What I’m Seeing
- Single drive with XFS → Performance looks great, right in line with the rated specs.
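- Same drives in ZFS RAID10 → Performance is far below the single-drive numbers, not just a little under expectations. For example, 4k random writes drop from about 355k IOPS on XFS to about 9k IOPS on ZFS with a 128k recordsize and caching disabled.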
Benchmarking Method
I ran all tests with FIO using this base config:
size=1G
ioengine=io_uring
iodepth=32
numjobs=1
direct=1
runtime=60
time_based
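For anyone who wants to reproduce the runs, here is a minimal sketch that turns the base config above into a complete fio job file. The job names are taken from the result tables; the rw= values they map to, the stonewall option, and the target directory are assumptions, since the post does not show the full job file. The read/write mix for rand_rw is not stated, so it is left at whatever fio defaults to.

```python
# Base options copied verbatim from the post.
BASE_OPTIONS = [
    "size=1G",
    "ioengine=io_uring",
    "iodepth=32",
    "numjobs=1",
    "direct=1",
    "runtime=60",
    "time_based",
]

# Job names match the result tables. The rw= modes are the standard fio
# values these names most likely correspond to (assumption).
JOB_MODES = {
    "seq_read": "read",
    "seq_write": "write",
    "rand_read": "randread",
    "rand_write": "randwrite",
    "rand_rw": "randrw",
}


def build_job_file(block_size: str, directory: str) -> str:
    """Return fio job-file text for one block size and one target directory."""
    lines = ["[global]", *BASE_OPTIONS, f"bs={block_size}",
             f"directory={directory}", ""]
    for name, mode in JOB_MODES.items():
        lines.append(f"[{name}]")
        # stonewall makes each job wait for the previous one to finish,
        # so the five jobs do not compete with each other (assumption
        # about how the original tests were sequenced).
        lines.append("stonewall")
        lines.append(f"rw={mode}")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # "/tank/fio-test" is a hypothetical mount point, not from the post.
    print(build_job_file("4k", "/tank/fio-test"))
```

Saving the output to a file and passing that file to fio would run the five jobs back to back with the same base settings.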
I varied the block size (bs) depending on the test. On the ZFS side I also played with:
- primarycache
- secondarycache
- recordsize
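As a sketch of how those three properties could be switched between runs, the helper below builds the zfs set command lines for one test variant. It assumes that "NO Cache" means primarycache=none and secondarycache=none and that "w/Cache" means both set to all; the post does not say which values were used. The dataset name is hypothetical.

```python
import shlex


def zfs_set_commands(dataset: str, recordsize: str, cache_enabled: bool) -> list[str]:
    """Build the `zfs set` command lines for one benchmark variant.

    cache_enabled=False maps to primarycache/secondarycache=none, which is
    an assumption about what "NO Cache" means in the result tables.
    """
    cache_value = "all" if cache_enabled else "none"
    properties = {
        "recordsize": recordsize,
        "primarycache": cache_value,
        "secondarycache": cache_value,
    }
    return [
        shlex.join(["zfs", "set", f"{name}={value}", dataset])
        for name, value in properties.items()
    ]


if __name__ == "__main__":
    # "tank/fio" is a hypothetical dataset name.
    for command in zfs_set_commands("tank/fio", "16K", cache_enabled=False):
        print(command)
```

One thing to keep in mind when using something like this: recordsize only applies to data written after the change, so the fio test file has to be recreated after each recordsize switch for the new value to take effect.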
Test tables are labeled as (FIO block size / ZFS recordsize).
Example: (4k/128k) = 4k read/write with a 128k ZFS record size.
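As a rough aid for reading these labels, the sketch below computes how many fio blocks fit into one ZFS record for each label. If ZFS has to read or rewrite a whole record to service a smaller I/O (an assumption worth verifying for this workload, not something measured here), this ratio is an upper bound on the extra data moved per request.

```python
def parse_size(text: str) -> int:
    """Convert a size label such as '4k' or '128k' to bytes (k = 1024)."""
    text = text.strip().lower()
    if text.endswith("k"):
        return int(text[:-1]) * 1024
    return int(text)


def record_to_block_ratio(label: str) -> float:
    """Return recordsize / block size for a label like '(4k/128k)'."""
    block_text, record_text = label.strip("()").split("/")
    return parse_size(record_text) / parse_size(block_text)


if __name__ == "__main__":
    for label in ("4k/128k", "4k/16k", "4k/4k"):
        print(f"{label}: {record_to_block_ratio(label):.0f} blocks per record")
```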
Question for the group:
Why would performance tank so badly when moving from single-drive XFS to a 6-disk ZFS RAID10 setup? Am I missing something obvious in ZFS tuning for NVMe, or could this be related to the way Proxmox sets up ZFS?
Any tips on what to check next (recordsize, caching, queue depths, etc.) would be much appreciated.
4k/128k record-size (NO Cache)
job         read_iops  read_MB/s  read_avg_ms  p99_read_ms  write_iops  write_MB/s  write_avg_ms  p99_write_ms
----------  ---------  ---------  -----------  -----------  ----------  ----------  ------------  ------------
seq_read        55607      227.8         0.57         1.27           0         0.0           0.0           0.0
seq_write           0        0.0          0.0          0.0      157557       645.4           0.2          0.54
rand_read       92540      379.0         0.34         0.63           0         0.0           0.0           0.0
rand_write          0        0.0          0.0          0.0        9265        38.0          3.45          11.6
rand_rw         25943      106.3         0.09          0.5       11125        45.6          2.66           9.9
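The MB/s columns can be cross-checked against the IOPS columns. The sketch below recomputes throughput for the 4k/128k (NO Cache) table above, assuming "4k" means 4096 bytes and MB/s means 10^6 bytes per second; both are assumptions, but the reported figures agree with them to within rounding.

```python
BLOCK_BYTES = 4 * 1024  # assumes fio's "4k" is 4096 bytes

# (iops, reported MB/s) copied from the 4k/128k (NO Cache) table.
REPORTED = {
    "seq_read": (55607, 227.8),
    "seq_write": (157557, 645.4),
    "rand_read": (92540, 379.0),
    "rand_write": (9265, 38.0),
}


def mb_per_s(iops: float, block_bytes: int = BLOCK_BYTES) -> float:
    """Throughput in decimal megabytes per second."""
    return iops * block_bytes / 1_000_000


def check(reported: dict, tolerance: float = 0.1) -> dict:
    """Return {job: (computed MB/s, reported MB/s, within tolerance)}."""
    result = {}
    for job, (iops, reported_mb) in reported.items():
        computed = mb_per_s(iops)
        result[job] = (computed, reported_mb, abs(computed - reported_mb) <= tolerance)
    return result


if __name__ == "__main__":
    for job, (computed, reported_mb, ok) in check(REPORTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{job:<10} computed={computed:8.1f} reported={reported_mb:8.1f} {status}")
```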
4k/4k record-size (NO Cache)
job         read_iops  read_MB/s  read_avg_ms  p99_read_ms  write_iops  write_MB/s  write_avg_ms  p99_write_ms
----------  ---------  ---------  -----------  -----------  ----------  ----------  ------------  ------------
seq_read       115068      471.3         0.28         0.56           0         0.0           0.0           0.0
seq_write           0        0.0          0.0          0.0       28073       115.0          1.14          1.27
rand_read       98233      402.4         0.32         0.48           0         0.0           0.0           0.0
rand_write          0        0.0          0.0          0.0       23679        97.0          1.35          1.63
rand_rw         55780      228.5         0.09         0.14       23891        97.9          1.12           1.4
4k/4k record-size (w/Cache)
job         read_iops  read_MB/s  read_avg_ms  p99_read_ms  write_iops  write_MB/s  write_avg_ms  p99_write_ms
----------  ---------  ---------  -----------  -----------  ----------  ----------  ------------  ------------
seq_read       212126      868.9         0.15         0.32           0         0.0           0.0           0.0
seq_write           0        0.0          0.0          0.0       28480       116.7          1.12          1.29
rand_read      204612      838.1         0.15         0.24           0         0.0           0.0           0.0
rand_write          0        0.0          0.0          0.0       23328        95.6          1.37          1.61
rand_rw         55156      225.9         0.09         0.14       23621        96.8          1.14          1.38
4k/16k record-size (w/Cache)
job         read_iops  read_MB/s  read_avg_ms  p99_read_ms  write_iops  write_MB/s  write_avg_ms  p99_write_ms
----------  ---------  ---------  -----------  -----------  ----------  ----------  ------------  ------------
seq_read       159751      654.3          0.2         0.45           0         0.0           0.0           0.0
seq_write           0        0.0          0.0          0.0      274093      1122.7          0.12          0.33
rand_read      180406      738.9         0.18         0.29           0         0.0           0.0           0.0
rand_write          0        0.0          0.0          0.0       91839       376.2          0.35          4.55
rand_rw        115432      472.8         0.17         0.66       49509       202.8          0.24          5.41
4k/16k record-size (NO Cache)
job         read_iops  read_MB/s  read_avg_ms  p99_read_ms  write_iops  write_MB/s  write_avg_ms  p99_write_ms
----------  ---------  ---------  -----------  -----------  ----------  ----------  ------------  ------------
seq_read       109383      448.0         0.29         0.64           0         0.0           0.0           0.0
seq_write           0        0.0          0.0          0.0       77803       318.7          0.41          0.87
rand_read       97661      400.0         0.33          0.5           0         0.0           0.0           0.0
rand_write          0        0.0          0.0          0.0       15276        62.6          2.09          6.91
rand_rw         33002      135.2         0.07         0.18       14147        57.9          2.08          6.85
XFS single NVMe drive, 4k block size
job         read_iops  read_MB/s  read_avg_ms  p99_read_ms  write_iops  write_MB/s  write_avg_ms  p99_write_ms
----------  ---------  ---------  -----------  -----------  ----------  ----------  ------------  ------------
seq_read       573044     2347.2         0.05         0.06           0         0.0           0.0           0.0
seq_write           0        0.0          0.0          0.0      391306      1602.8          0.08           0.1
rand_read      433023     1773.7         0.07         0.13           0         0.0           0.0           0.0
rand_write          0        0.0          0.0          0.0      354747      1453.0          0.09          0.11
rand_rw        197536      809.1         0.15         0.52       84672       346.8          0.02          0.04
XFS single NVMe drive, 128k block size
job         read_iops  read_MB/s  read_avg_ms  p99_read_ms  write_iops  write_MB/s  write_avg_ms  p99_write_ms
----------  ---------  ---------  -----------  -----------  ----------  ----------  ------------  ------------
seq_read        50125     6570.1         0.63         0.88           0         0.0           0.0           0.0
seq_write           0        0.0          0.0          0.0       31400      4115.8          1.02          1.09
rand_read       43914     5756.1         0.72         1.97           0         0.0           0.0           0.0
rand_write          0        0.0          0.0          0.0       12220      1601.9          2.61          3.59
rand_rw         15551     2038.4          0.9         1.88        6667       874.0          2.69          3.62
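To put a number on the gap, the sketch below divides the single-drive XFS 4k IOPS by the ZFS 4k/128k (NO Cache) IOPS for each job, using values copied from the tables in this post. It only restates the measurements as ratios; it does not explain the cause. Note that it compares one drive on XFS against the whole 6-drive pool on ZFS.

```python
# 4k IOPS from the single-drive XFS table.
XFS_4K_IOPS = {
    "seq_read": 573044,
    "seq_write": 391306,
    "rand_read": 433023,
    "rand_write": 354747,
}

# 4k IOPS from the ZFS 4k/128k (NO Cache) table.
ZFS_4K_128K_NOCACHE_IOPS = {
    "seq_read": 55607,
    "seq_write": 157557,
    "rand_read": 92540,
    "rand_write": 9265,
}


def slowdown(baseline: dict, candidate: dict) -> dict:
    """Return baseline IOPS divided by candidate IOPS for each shared job."""
    return {
        job: baseline[job] / candidate[job]
        for job in baseline
        if job in candidate and candidate[job] > 0
    }


if __name__ == "__main__":
    ratios = slowdown(XFS_4K_IOPS, ZFS_4K_128K_NOCACHE_IOPS)
    # Print the worst case first.
    for job, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
        print(f"{job:<10} XFS is {ratio:.1f}x faster")
```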