Since you’re planning your pool layout, I’ll try to address this at a relatively high level. Keep in mind that because ZFS handles everything from the raw drives through the filesystem, a ZFS URE outcome very likely differs from what you’d see on a traditional RAID array. Classically, a traditional RAID controller would abort a RAID5 rebuild if a URE occurred, as it could not determine where that sector fell in relation to the filesystem or LUNs presented.
Hard drives carry reliability ratings. One of these is generally called the “Nonrecoverable Read Error Rate” or, as you supplied, the URE (Unrecoverable Read Error) rate. Consumer drives are generally rated at 1 per 10^14; enterprise drives at 1 per 10^15, ten times better, though as you will see this is partly offset by how they are used. This rating is in bits. Converting 1 per 10^14 bits to bytes works out to 1 per 11.37 TiB read, so the rating says you could expect a URE once every 11.37 TiB read on a consumer drive. Since you’re discussing 20TB drives, I’m going to switch my explanation to enterprise-level drives.
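If you want to sanity-check that conversion yourself, here’s a quick back-of-the-envelope sketch (the constants are just the ratings quoted above):

```python
# Convert a "1 error per 10^14 bits" URE rating into TiB read per expected error.
URE_RATE_BITS = 10**14        # consumer rating: one error per 10^14 bits read
BITS_PER_TIB = 2**40 * 8      # 1 TiB = 2^40 bytes, 8 bits per byte

tib_per_error = URE_RATE_BITS / BITS_PER_TIB
print(f"consumer:   ~{tib_per_error:.2f} TiB read per expected URE")     # ~11.37 TiB
print(f"enterprise: ~{tib_per_error * 10:.1f} TiB (10^15 rating)")       # ~113.7 TiB
```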
Given that enterprise drives have a rating ten times better, you might say: then unless I have an array larger than 113.7 TiB, I’m golden. Unfortunately that is not the case, because enterprise drives are generally grouped together to hold data.
To simplify things let’s talk about six-sided dice. If I have one die, then I have a 1 in 6 (16.7%) chance of rolling a six. If I have 2 dice then my chances of rolling at least one six increase to 11 in 36 (about 30.6%). As I add more dice it becomes more likely that I will get at least one six every time I roll them. This is the equivalent of raidz1 or RAID5, which can survive only one drive loss.
raidz2 or RAID6 is like having to roll 2 sixes. With 2 dice my chances are 1 in 36 (2.8%) of rolling 2 sixes at the same time, much better than 1 in 6. With 3 dice it becomes 2 in 27 (7.4%). The arithmetic gets tedious with additional dice so I won’t go further. Suffice it to say, the chances stay significantly lower.
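These dice odds are just a binomial sum, so they’re easy to verify with a few lines (note: “at least one six with two dice” works out to 11/36, about 30.6%):

```python
from math import comb

def p_at_least_k_sixes(n_dice: int, k: int) -> float:
    """Probability of rolling at least k sixes with n fair six-sided dice."""
    return sum(comb(n_dice, i) * (1/6)**i * (5/6)**(n_dice - i)
               for i in range(k, n_dice + 1))

print(f"{p_at_least_k_sixes(1, 1):.1%}")  # 16.7%  one die, one six
print(f"{p_at_least_k_sixes(2, 1):.1%}")  # 30.6%  two dice, at least one six
print(f"{p_at_least_k_sixes(2, 2):.1%}")  # 2.8%   two dice, both sixes
print(f"{p_at_least_k_sixes(3, 2):.1%}")  # 7.4%   three dice, at least two sixes
```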
This also applies to hard drive error rates. As I add more drives, my chance of getting a URE somewhere in the group increases; with enough enterprise drives, the group’s aggregate URE chance can approach that of a single consumer drive. On the other hand, the chances of 2 UREs affecting the same exact sector on a pair of drives are something like 1 in 10^30. All of that is not to say that one couldn’t hit the lottery, but it is extremely unlikely.
I’m going to use 20TB (18.19 TiB) enterprise drives in these scenarios. Three of the top-of-mind configs would be:
- 4+1 raidz1/RAID5 - 72.76 TiB usable
- 4+2 raidz2/RAID6 - 72.76 TiB usable
- 3 x 2-drive mirrors - 54.57 TiB usable
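The usable figures above fall straight out of the geometry (data drives × per-drive TiB); here’s the arithmetic spelled out, assuming 20 TB decimal-rated drives:

```python
# Usable capacity for the three candidate layouts with 20 TB (decimal) drives.
TB, TIB = 10**12, 2**40
drive_tib = 20 * TB / TIB        # ~18.19 TiB per drive

layouts = {
    "4+1 raidz1":        4 * drive_tib,  # 4 data drives' worth
    "4+2 raidz2":        4 * drive_tib,  # still 4 data drives' worth
    "3 x 2-way mirrors": 3 * drive_tib,  # one drive's worth per mirror vdev
}
for name, tib in layouts.items():
    print(f"{name}: {tib:.2f} TiB usable")
```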
We’re going to talk about the likelihood of a URE that results in data loss (important caveat) during the rebuild after a single complete drive failure.
Scenario 1: we have 4 drives left and no parity to spare. There is a chance we could experience a URE, and because we don’t have any parity to recover the data from, ZFS will register a data error on the affected file or zvol. That data can’t be recovered without outside intervention.
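To put a rough number on that risk (a sketch, treating the 10^15-bit rating as a uniform per-bit error probability, which is a simplification of how real drives fail): rebuilding the 4+1 raidz1 means reading the 4 surviving drives in full.

```python
from math import exp

URE_PER_BIT = 1e-15              # enterprise 1-per-10^15-bit rating
BITS_PER_TIB = 2**40 * 8

# Rebuild reads 4 surviving drives x ~18.19 TiB each.
bits_read = 4 * 18.19 * BITS_PER_TIB
expected_ures = bits_read * URE_PER_BIT          # ~0.64 expected errors
p_at_least_one = 1 - exp(-expected_ures)         # Poisson approximation
print(f"~{p_at_least_one:.0%} chance of at least one URE during the rebuild")  # ~47%
```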
Scenario 2: we have 5 drives left, with parity intact. Again there is the chance of a URE, but because we still have parity, ZFS can rebuild that data automagically with no data loss.
Scenario 3 is a little trickier and the subject of many debates. With one mirror drive lost, its partner now has no redundancy. The URE chance is only one per 113.7 TiB read, but we are putting that single drive under significant stress during the rebuild. Remember, it takes 22+ hours to read 20TB off a drive @ 250MB/sec. If we do hit the lottery and get a URE on the remaining drive, ZFS will register a data error on the affected file or zvol. That data can’t be recovered without outside intervention.
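The mirror-rebuild numbers work out like this (same per-bit simplification as before; 250MB/sec sustained is an assumption, real resilver speed varies with pool activity):

```python
from math import exp

# Time to read one 20 TB drive end-to-end at a sustained 250 MB/s.
hours = 20 * 10**12 / (250 * 10**6) / 3600
print(f"full read of the surviving mirror: ~{hours:.1f} hours")   # ~22.2 h

# URE chance during that single-drive read (1-per-10^15-bit rating).
bits_read = 18.19 * 2**40 * 8
p_ure = 1 - exp(-bits_read * 1e-15)
print(f"~{p_ure:.0%} chance of a URE on the remaining drive")     # ~15%
```

Notice that’s much lower than the raidz1 rebuild, simply because only one drive’s worth of data has to be read.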
- Scenario 1: bad… just say no (or possibly say goodbye to your data). FYI, I personally use RAID5/z1 on a NAS that does nothing but store backups, as I feel I have enough data redundancy elsewhere to make up for it.
- Scenario 2: safest, but not most performant
- Scenario 3: more performant than scenario 2, but somewhat riskier.
ETA: All of this to also say: MAKE SURE YOU HAVE BACKUPS!
ETA2: Also, ZFS will continue the resilver as best it can, just marking errors on files or zvols it can’t recover properly.