One or two pools

gausus · February 27, 2024, 12:33pm

Hello,

I’m building a small box with 12 disks to store my backups. I’m wondering whether it is better to create a single pool with 12 disks or two pools with 6 disks each. Capacity is not a problem, but I’m wondering how the choice affects data security and performance. Any comments and recommendations would be great.
Thanks in advance

HankB · February 27, 2024, 2:47pm

My choice would be 2 vdevs of 6 drives each in a RAIDZ2 configuration. Each vdev could lose up to 2 drives and still work, and dividing I/O between 2 vdevs would provide better performance than a single vdev.

Top performance would probably come from 6 vdevs of mirrored drives, at the cost of half of the drives’ capacity and the risk that the failure of two drives in the same vdev takes out the entire pool.
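To put rough numbers on the layouts mentioned so far, here is a minimal Python sketch. It is back-of-the-envelope arithmetic only: it assumes 12 identical drives, ignores padding, metadata and reserved space, and the `Layout` class and its field names are made up for illustration, not anything ZFS provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical description of a pool layout (illustration only)."""
    name: str
    vdevs: int       # number of top-level vdevs in the pool
    width: int       # drives per vdev
    redundancy: int  # drives one vdev can lose: parity count, or mirror copies - 1

    @property
    def drives(self) -> int:
        return self.vdevs * self.width

    @property
    def data_drives(self) -> int:
        # Drives' worth of usable space, before padding and metadata overhead.
        return self.vdevs * (self.width - self.redundancy)

    @property
    def guaranteed_failures(self) -> int:
        # Worst case: every failure lands in the same vdev.
        return self.redundancy

    @property
    def best_case_failures(self) -> int:
        # Best case: failures are spread evenly across all vdevs.
        return self.vdevs * self.redundancy


LAYOUTS = [
    Layout("2 x 6-wide RAIDZ2", vdevs=2, width=6, redundancy=2),
    Layout("6 x 2-way mirror", vdevs=6, width=2, redundancy=1),
    Layout("1 x 12-wide RAIDZ2", vdevs=1, width=12, redundancy=2),
]

for layout in LAYOUTS:
    print(
        f"{layout.name:20s} usable drives: {layout.data_drives:2d}/{layout.drives}  "
        f"survives any {layout.guaranteed_failures} failure(s), "
        f"up to {layout.best_case_failures} if spread out"
    )
```

The "survives any N" column is the number to plan around, since losing one vdev loses the whole pool. The sketch says nothing about performance, which is where the number of vdevs matters.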

There are also configurations that would use hot spare drives. I have no experience with that, but with that many drives it would be something to look into. I believe that with two 5-drive RAIDZ vdevs and two spare drives, the pool would keep operating through 3 drive failures, as long as each spare finishes resilvering before the next drive in the same vdev fails.

As an aside… Consider how you are going to ID a drive when one needs to be replaced. If the H/W can flash an LED to identify a drive, the problem is solved. Otherwise you probably want to mark the drives so you can relate the ‘bad drive’ indicator in a zpool status report to the actual drive. (DAMHIK, IJK.)

Edit: I just realized you asked about 1 vs. 2 pools. A single pool will provide the greatest flexibility. You can manage datasets within the pool more easily than distributing data between two pools.

mercenary_sysadmin · February 27, 2024, 3:38pm

You want two six-wide vdevs here. A single twelve-wide vdev will perform significantly worse, be less robust to drive failures, and, as many people have discovered to their surprise, won’t actually store data much (if at all) more efficiently, since it requires padding where the six-wide vdevs don’t, and since so many short blocks will have to be written to narrower stripes anyway.
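The padding point can be illustrated with a little arithmetic. The Python sketch below follows my reading of how OpenZFS sizes a RAIDZ allocation: data sectors, plus parity sectors for every row of data, rounded up to a multiple of parity + 1. It assumes RAIDZ2, 4 KiB sectors (`ashift=12`) and no compression. The function names are hypothetical, so treat it as an approximation and not as the allocator itself.

```python
SECTOR = 4096  # bytes per sector; assumes ashift=12


def ceil_div(a: int, b: int) -> int:
    return -(-a // b)


def raidz_alloc_sectors(data_sectors: int, width: int, parity: int) -> int:
    """Sectors consumed by one block on a RAIDZ vdev, parity and padding included."""
    data_cols = width - parity
    rows = ceil_div(data_sectors, data_cols)  # each row of data needs its own parity
    total = data_sectors + parity * rows
    multiple = parity + 1                     # allocations are padded to this multiple
    return ceil_div(total, multiple) * multiple


def efficiency(block_bytes: int, width: int, parity: int = 2) -> float:
    """Fraction of the allocated space that holds actual data."""
    data_sectors = ceil_div(block_bytes, SECTOR)
    return data_sectors / raidz_alloc_sectors(data_sectors, width, parity)


for block_kib in (4, 16, 128):
    for width in (6, 12):
        eff = efficiency(block_kib * 1024, width)
        print(f"{block_kib:4d} KiB block on {width:2d}-wide RAIDZ2: {eff:.1%} data")
```

Under these assumptions a 128 KiB block on the six-wide vdev needs 48 sectors with no padding, which is exactly the nominal 4/6. On the twelve-wide vdev it needs 40 sectors padded to 42, which works out to about 76% against a nominal 10/12 (about 83%). Blocks of 4 KiB and 16 KiB take the same space on either width.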

Note that we’re still talking about a single pool, addressable as one big block of undifferentiated storage. Just two six-wide vdevs in that pool.

You don’t generally want two separate pools unless you need to isolate one workload from latency introduced by another workload: for example, you might not want an extremely busy key-value store on the same pool as a heavy streaming workload, because the streaming workload will significantly increase latency on the KVS queries.