I’ve just started to learn ZFS (I’ve known of its existence for a long time, but never got around to learning or using it). It will have a place in my PC and in some (hobbyist) servers ;-).
I currently use Snapraid for my cold data, i.e. archive data (mostly a media library and backups). It is written very seldom and mostly read, quite slowly (to play back music/videos or to pull some ancient backup); so for this use case it is certainly optimal not to spin up the whole array just to read a single file. Syncing the parity does not need to happen immediately.
This is the use case where Snapraid is optimal: typical real RAID solutions cannot avoid spinning up the whole array when data is read. (Also: if too many disks fail, all data is lost; whereas with Snapraid, all disks that have not failed still hold intact data. They do not depend on each other in any way, and parity can simply be used to rebuild a failed data disk.)
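For context, this is roughly what that setup looks like in Snapraid: each data disk keeps its own complete files, and one or more dedicated disks hold parity. A minimal `snapraid.conf` sketch (all paths here are hypothetical examples, not my actual setup):

```
# /etc/snapraid.conf -- illustrative sketch; paths are hypothetical
parity /mnt/parity1/snapraid.parity      # parity lives on its own dedicated disk
content /var/snapraid/snapraid.content   # content (checksum/metadata) file
content /mnt/disk1/snapraid.content      # keep extra copies on data disks
data d1 /mnt/disk1/                      # each data disk holds its own plain files
data d2 /mnt/disk2/
```

Parity is then updated on demand with `snapraid sync`, verified with `snapraid scrub`, and a failed disk is rebuilt with `snapraid fix`; reading a file only ever spins up the one disk it lives on.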
However, I could see some benefit in doing the same with ZFS; it seems it cannot currently do this, or perhaps I’m mistaken (?). To do what Snapraid currently can, ZFS would need these features:
- Make datasets in a vdev that are tied to one physical device (and, optionally, add a quota so that the dataset cannot overflow the disk); no other disks should spin up when data from this dataset is read;
- Perhaps a new vdev / RAID type is needed, e.g. raidCold or raidZX-Cold? Or current types could gain an option (perhaps named ColdStorage)? For this use case the parity cannot be striped; like the actual data, it must be tied to one physical device.
- Adjust the sync frequency of the parity; this doesn’t matter much if writes are seldom, though, as the disk would be spun up quite rarely in any case (i.e. a “nice to have” feature)
- (For partial recovery) be able to re-import an individual data disk as a temporary pool+vdev; again not really required, but a “nice to have” feature (i.e. take a disk temporarily elsewhere, read the dataset, put it back into the original array). For a minimal feature set, only resilvering would be enough (as with a regular pool).
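For comparison, the closest thing ZFS offers today (without any parity) is one single-disk pool per drive: only the accessed disk spins up, and each disk stays independently readable and portable. A sketch, with hypothetical device paths and pool/dataset names (these commands need root and real block devices, so treat this as illustrative):

```shell
# One pool per physical disk: disks are fully independent,
# and only the disk actually being read spins up.
# Device path, pool name, dataset name and quota are hypothetical.
zpool create cold1 /dev/disk/by-id/ata-EXAMPLE-DISK1
zfs create -o quota=3500G cold1/media   # quota keeps the dataset within the disk

# "Take a disk temporarily elsewhere, read it, put it back":
zpool export cold1
# ...attach the disk to another machine, 'zpool import cold1', read, export...
zpool import cold1
```

What this obviously lacks is exactly the parity layer; that is the part Snapraid adds on top of independent disks, and the part the wishlist above would require ZFS to grow.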
Now, considering the above, I have three questions:
- Would this kind of functionality be welcome in ZFS? I don’t want to file a feature request on GitHub just yet since I feel like I could be adding unwanted noise there =)
- Not knowing ZFS internals, I wonder: are there underlying design choices that would make implementing this difficult, or is it perhaps even a low-hanging fruit?
- Or is this a case of “right tool for the job”, i.e. should I just keep using Snapraid, since that is what suits the use case best? =)
EDIT: I had mixed up pool and vdev in my post; I hope that’s understandable as I’m new to ZFS ;-). I’ve tried to edit this error out. I also pointed out that the parity cannot be striped for this use case (it must be tied to a physical device, just like any non-parity disk).
EDIT: I’ve realized this would mostly be like standard RAID 4 … implemented in ZFS, but with the addition of an arbitrary number of parity disks (as in Snapraid). On further reflection, this is best implemented at the FS level, just like Snapraid does it; however, in case anyone has any thoughts, you’re welcome to post them here ;-). It seems a user had a somewhat similar use case in mind here, but nobody mentioned Snapraid there. Conclusion: just use Snapraid, if that is what you need!