Not sure if this is the right place to ask; if it isn't, can someone point me elsewhere?
I just built a pool on FreeBSD 12.4-RELEASE-p4 and did a "zfs send -R" of an existing pool over the network, using netcat, to a "zfs recv -Fdv" on the new server. That all went perfectly, but when I scrubbed the new pool I noticed something odd: somehow the scrub is causing steady writing to the disks as it works.
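For reference, this is roughly the pair of commands I used; the snapshot name, hostname, and port below are placeholders rather than the exact values:

On the new server (receive side):
nc -l 8023 | zfs recv -Fdv sas15k

On the old server (send side):
zfs snapshot -r oldpool@migrate
zfs send -R oldpool@migrate | nc newserver 8023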
There is absolutely no outside activity to the pool yet; the zfs filesystems aren't even mounted. When not scrubbing, gstat shows absolutely no disk activity. When I start the scrub I see a steady write of roughly 30-35MB/sec. It starts during the initial scanning phase and continues the entire time, through the issuing phase and even after scanning is done and it's only issuing. This does NOT happen on the server the pool was sent from, which is running an admittedly older FreeBSD 12.3-RELEASE-p5, and I've never seen it on any other ZFS implementation, although admittedly those pools are usually in service, so you do see writes; they're just not steady like this.
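For concreteness, this is roughly how I'm kicking it off and watching it (standard commands, nothing exotic):

zpool scrub sas15k     # the steady ~30-35MB/sec of writes starts almost immediately
gstat -p               # physical providers only; completely idle until the scrub starts
zpool status sas15k    # the writes persist through both the scanning and issuing phases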
Stop the scrub and this write activity stops as well; I'm 100% positive the scrub is causing it. The scrub has finished perfectly fine several times. There are absolutely no disk errors in dmesg, and SMART shows nothing of concern: not a single UCE, grown bad sector, or logged sector rewrite/reassignment, and not even any substantial number of error-correction invocations. These disks are all completely healthy; I cherry-picked them specifically for this array based on SMART stats and very thorough testing.
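To be concrete about the health checks, this is the sort of thing I ran against every member disk (the device node here is just an example):

zpool scrub -s sas15k    # cancel the scrub; the write activity stops immediately
smartctl -x /dev/da2     # full SMART/defect/error-log output for one of the SAS disks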
gstat shows similar numbers, but here is a sample of what "zpool iostat -v 5" shows for basically the whole time the scrub is running.
                            capacity     operations     bandwidth
pool                        alloc   free   read  write   read  write
------------------------  -----  -----  -----  -----  -----  -----
rpool                      28.5G  98.5G      0      0      0      0
  mirror                   28.5G  98.5G      0      0      0      0
    gpt/os0                    -      -      0      0      0      0
    gpt/os1                    -      -      0      0      0      0
------------------------  -----  -----  -----  -----  -----  -----
sas15k                     2.64T  3.33T  6.86K  1.09K   598M  37.5M
  mirror                    245G   311G    676     89  55.7M  3.28M
    diskid/DISK-0XV339ZJ       -      -     60     46  55.2M  3.28M
    diskid/DISK-0XGXWS2P       -      -     59     46  55.8M  3.28M
  mirror                    246G   310G    666     82  55.1M  3.60M
    diskid/DISK-0XV0Z03H       -      -     57     43  54.7M  3.60M
    diskid/DISK-0XH9GVYP       -      -     61     42  55.5M  3.60M
  mirror                    246G   310G    663     86  57.2M  3.87M
    diskid/DISK-0XV12PRL       -      -     61     38  57.1M  3.87M
    diskid/DISK-0XV14YJH       -      -     61     39  57.7M  3.87M
  mirror                    245G   311G    514    117  52.6M  4.40M
    diskid/DISK-0XV0NP3H       -      -     55     46  52.6M  4.40M
    diskid/DISK-0XH9GV2P       -      -     64     47  52.2M  4.40M
  mirror                    245G   311G    607     82  51.1M  3.31M
    diskid/DISK-0XV0NMLH       -      -     54     38  51.0M  3.31M
    diskid/DISK-0XV0R2SH       -      -     57     38  52.0M  3.31M
  mirror                    245G   311G    614     66  55.1M  2.68M
    diskid/DISK-0XV3U9GJ       -      -     59     32  56.1M  2.68M
    diskid/DISK-0XV0VYLL       -      -     58     31  55.1M  2.68M
  mirror                    247G   309G    792     76  53.7M  2.68M
    diskid/DISK-0XV0PHHH       -      -     57     39  53.6M  2.68M
    diskid/DISK-0XV13K0H       -      -     60     39  55.3M  2.68M
  mirror                    246G   310G    647    139  54.5M  3.32M
    diskid/DISK-0XV0W4TH       -      -     58     60  54.5M  3.32M
    diskid/DISK-0XH2VLXV       -      -     58     59  54.7M  3.32M
  mirror                    245G   311G    708    142  55.1M  3.12M
    diskid/DISK-0XV450MJ       -      -     58     59  54.9M  3.12M
    diskid/DISK-0XV182KH       -      -     58     58  55.1M  3.12M
  mirror                    245G   311G    558    146  53.5M  3.83M
    diskid/DISK-0XV0YJXJ       -      -     57     61  53.9M  3.83M
    diskid/DISK-0XV0P03H       -      -     57     61  53.5M  3.83M
  mirror                    245G   311G    571     90  54.3M  3.46M
    diskid/DISK-0XV3KN8L       -      -     68     46  54.3M  3.46M
    diskid/DISK-0XV0NWDH       -      -     58     45  54.4M  3.46M
------------------------  -----  -----  -----  -----  -----  -----
If it matters, the topology of both the sending and receiving pools is a set of mirror vdevs: 8 vdevs in the source pool and 11 in the destination. The pool is used for NFS VMware datastores and fibre channel zvol targets.
This is the only thing I've been able to find that talks about anything similar, and there are no real answers in it: Why does 'zfs scrub' write to disk every 5 seconds? - Unix & Linux Stack Exchange