I have a fresh zpool on Linux consisting of a single 9-HDD RAIDZ2 vdev. I am not currently using a SLOG (separate ZIL log) device. I have a dataset with recordsize=1M on it that I want to rsync/scp a few dozen TB of large files to. Once this dataset is filled up it will rarely be written to again, and the zpool will never get anywhere near full.
I want the files to be laid out as contiguously as possible, and I am concerned about fragmentation because I am not using a SLOG.
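For monitoring while the transfer runs, I assume the pool-level FRAG column is the main number to watch, with the caveat that it measures free-space fragmentation rather than file contiguity ("tank" is just a placeholder pool name):

```sh
# FRAG here is free-space fragmentation, not how contiguous the written files are
zpool list -o name,size,alloc,free,frag,cap tank
```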
- Would it help reduce fragmentation if I make the transaction groups longer (e.g. 30 seconds) while filling it up, with module options like these?
      options zfs zfs_dirty_data_max=50000000000
      options zfs zfs_dirty_data_max_max=50000000000
      options zfs zfs_txg_timeout=30
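  As far as I can tell, the first and the last of these are also tunable at runtime through the module's parameters in sysfs, so a sketch of applying them without a reboot would be (zfs_dirty_data_max_max appears to be load-time only, so that one still needs the modprobe option):

  ```sh
  # Inspect the current values
  cat /sys/module/zfs/parameters/zfs_txg_timeout
  cat /sys/module/zfs/parameters/zfs_dirty_data_max

  # Lengthen the txg interval and raise the dirty-data ceiling for the duration of the copy
  echo 30 | sudo tee /sys/module/zfs/parameters/zfs_txg_timeout
  echo 50000000000 | sudo tee /sys/module/zfs/parameters/zfs_dirty_data_max
  ```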
- Would it help reduce fragmentation if I set `sync=disabled` on the dataset while filling it up? I am not concerned about consistency; if anything goes wrong during the transfer I can just start over.
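  If it does help, the plan would be the standard property toggle, roughly like this (dataset name is a placeholder):

  ```sh
  # Turn off synchronous write semantics for the bulk load (placeholder dataset "tank/archive")
  zfs set sync=disabled tank/archive

  # ... run the rsync/scp transfer ...

  # Put the property back to its inherited/default behavior afterwards
  zfs inherit sync tank/archive
  ```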
- The setting of `logbias` should be irrelevant if `sync=disabled` anyway, right?
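  For reference, this is how I would check what both properties are currently set to and where they are inherited from (placeholder dataset name again):

  ```sh
  zfs get sync,logbias tank/archive
  ```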
- I am not 100% sure, but I am assuming rsync writes each file sequentially and does not write in parallel. I am also not sure how big rsync's write buffer is or how often it flushes/syncs. Is there an rsync/SSH/network socket option that can reduce fragmentation? Or is fragmentation from ZIL / interleaved metadata writes never an issue with writes coming from rsync?
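  For concreteness, this is the kind of invocation I have in mind; the paths and host are placeholders, and I am only assuming these flags are reasonable for a one-time bulk copy, not that they do anything in particular for on-disk layout:

  ```sh
  # --whole-file : skip the delta-transfer algorithm (it only matters when updating existing files)
  # --partial    : keep partially transferred files so an interrupted run can resume
  # -e "ssh ..." : cipher choice affects CPU/throughput on the wire, not how ZFS lays out blocks
  rsync -av --whole-file --partial \
        -e "ssh -c aes128-gcm@openssh.com" \
        /source/big-files/ user@nas:/tank/archive/
  ```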
I am probably overanalyzing this but please bear with me.