RAID 1+0 vs RAID 0+1

This article clarifies the eternal argument about whether RAID 1+0 is any different from RAID 0+1.

There are two RAID layouts, RAID 0+1 and RAID 1+0. In both cases, data is striped and mirrored across an even number of physical disks. Let's illustrate the difference by drawing the layouts.

Layouts

RAID 1+0 layout

RAID 1+0 (also called RAID 10) with N drives is a RAID0 (stripe) created over N/2 RAID1s (mirrors).

                    RAID0 array (stripe)
+-------------------+-------------------+-------------------+
|     Mirror 1      |     Mirror 2      |     Mirror 3      |
+---------+---------+---------+---------+---------+---------+
| Disk 1  | Disk 2  | Disk 3  | Disk 4  | Disk 5  | Disk 6  |
+---------+---------+---------+---------+---------+---------+
|    A    |    A    |    B    |    B    |    C    |    C    |
|    D    |    D    |    E    |    E    |    F    |    F    |
+---------+---------+---------+---------+---------+---------+

RAID 10 (1+0) layout. Letters A through F denote data blocks.
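
To make the mapping concrete, here is a minimal Python sketch (an illustration only, with function names of my own choosing, not code from any particular RAID implementation) of how a block number maps onto a mirror pair in RAID 1+0:

    # RAID 1+0: a stripe over mirror pairs. Block i goes to mirror (i mod number_of_mirrors),
    # and that mirror writes it to both of its disks.
    def raid10_disks_for_block(block_index, total_disks):
        mirrors = total_disks // 2                 # each mirror pair uses two disks
        mirror = block_index % mirrors             # striping selects the mirror
        return (2 * mirror + 1, 2 * mirror + 2)    # 1-based disk numbers in that pair

    # Reproduce the table above: blocks A through F on six disks.
    for i, block in enumerate("ABCDEF"):
        print(block, "->", raid10_disks_for_block(i, 6))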

RAID 0+1 layout

RAID 0+1 (also called RAID 01) with N drives is a mirror created over two RAID0 (stripe) arrays, each containing N/2 disks.

                    RAID1 array (mirror)
+-----------------------------+-----------------------------+
|      RAID0, left side       |      RAID0, right side      |
+---------+---------+---------+---------+---------+---------+
| Disk 1  | Disk 2  | Disk 3  | Disk 4  | Disk 5  | Disk 6  |
+---------+---------+---------+---------+---------+---------+
|    A    |    B    |    C    |    A    |    B    |    C    |
|    D    |    E    |    F    |    D    |    E    |    F    |
+---------+---------+---------+---------+---------+---------+

RAID 01 (0+1) layout. Letters A through F denote data blocks.
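
The corresponding sketch for RAID 0+1, under the same assumptions, maps a block to one disk on each side:

    # RAID 0+1: a mirror over two stripes. Block i goes to stripe position
    # (i mod disks_per_side) on the left side and to the same position on the right side.
    def raid01_disks_for_block(block_index, total_disks):
        per_side = total_disks // 2                      # disks in each RAID0 side
        position = block_index % per_side                # striping within each side
        return (position + 1, per_side + position + 1)   # left-side disk, right-side disk

    for i, block in enumerate("ABCDEF"):
        print(block, "->", raid01_disks_for_block(i, 6))

Comparing the two outputs shows that the same pairs of disk contents appear in both layouts; only the disk numbering differs, which is exactly the point of the next section.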

Data recovery

Looking at how the data blocks are arranged on the disks, you can see that the content on the disks is identical between these two layouts. You get one pair of disks containing blocks A and D, one pair containing B and E, and one containing C and F. What changes is the disk order (the numbering of disks). Since the recovery is based on the disk content rather than disk order, both layouts are identical from a data recovery perspective. Recovery is generally performed by reducing the array to RAID0 and goes like this:

  • You receive a set of disks, some identical to each other. Par for the course, the disks are, at this point, shuffled out of order, and there is no metadata available.
  • Identify the mirror pairs.
  • From each pair, remove one disk at random. This assumes the disks in the pair are perfectly identical, i.e., fully in sync. See below for more details.
  • You now have a RAID0, which you recover as you normally would.

This process does not differentiate between RAID 1+0 and RAID 0+1. You do not even need to be aware that such a distinction exists. Furthermore, if there is no recognizable metadata, it is impossible to go back and tell if it was initially RAID 1+0 or RAID 0+1.
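
As a rough illustration of the pairing step, here is a minimal Python sketch. The function names and the idea of hashing whole images are my own simplifications; real recovery tools compare sector ranges and tolerate partial divergence rather than requiring byte-for-byte identical images:

    import hashlib

    def content_hash(image_path, chunk_size=1024 * 1024):
        # Hash a disk image in chunks so the whole image never has to fit in memory.
        digest = hashlib.sha256()
        with open(image_path, "rb") as image:
            while chunk := image.read(chunk_size):
                digest.update(chunk)
        return digest.hexdigest()

    def identify_mirror_pairs(image_paths):
        # Group images with identical content; each group is one mirror pair.
        groups = {}
        for path in image_paths:
            groups.setdefault(content_hash(path), []).append(path)
        return list(groups.values())

    def reduce_to_raid0(pairs):
        # Keep one image from each pair; the survivors are then recovered as a plain RAID0.
        return [members[0] for members in pairs]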

The mirror pairs which are not identical are called desynchronized. Desynchronization occurs when one disk of the pair is updated while the other is not, for any reason. Desynchronization reduces the chances of successful data recovery. The more out of sync your disk pairs are, the worse the result you get.

Theoretical difference in reliability

Strictly speaking, there isn't any. The array is guaranteed to survive single disk failure, and it is not guaranteed to survive the second one. If you want a guarantee against double disk failure, use RAID6 or three-way mirroring. However, the second failure does not always bring the array down, and computing probabilities is quite entertaining and controversial. The controversy comes mostly from two things:

  • conflating unavailability (the data is there; you can't access it right now, but it can be recovered) with data loss (when the data is no longer there and can't be recovered), and
  • conflating single-controller implementation (where both RAID 0 and RAID 1 layers are implemented by the same hardware controller or the same software driver, like md-raid) with multi-controller implementation (where the first level is done with a hardware controller, and hardware RAIDs are then combined using a software RAID).

All of the calculations in this section assume that disks fail completely and irrecoverably, e.g., explode in such a manner that no repair is possible. Also, the calculations do not account for any divergence between mirror pairs, which may or may not occur after the first failure. While this is quite far from real life, it simplifies the initial discussion and highlights the differences. A later section will address some real-life complexities, but I'll be sticking to the basics for now.

Two disk failures in RAID 10 (RAID 1+0)

                    RAID0 array (stripe)
+-------------------+-------------------+-------------------+
|     Mirror 1      |     Mirror 2      |     Mirror 3      |
+---------+---------+---------+---------+---------+---------+
| Disk 1* | Disk 2  | Disk 3  | Disk 4  | Disk 5  | Disk 6  |
+---------+---------+---------+---------+---------+---------+
|    A    |    A    |    B    |    B    |    C    |    C    |
|    D    |    D    |    E    |    E    |    F    |    F    |
+---------+---------+---------+---------+---------+---------+

RAID 10 (1+0) with a single failed drive (Disk 1, marked with an asterisk).

One disk fails. Let it be Disk 1, although the calculation does not change if we start with any other disk. Mirror 1 is still online because Disk 2 can provide a copy of blocks A and D. Now, we have five disks left. Of the remaining five, failure of Disk 2 will cause data loss since no copies of blocks A and D remain. Any one of the other four disks can fail, and the array will still be functional. Therefore, there is no chance of losing data if one disk fails, and a one in five chance of losing data if two disks fail.

In this case, there is no difference between inaccessibility and data loss. If Disk 1 and Disk 2 are both lost, the data is both unavailable and unrecoverable. If Disk 1 fails and any other disk but Disk 2 fails, there is no data loss and no loss of availability. Also, there is no difference between single-controller implementation and multiple controllers.
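
For readers who prefer to check the arithmetic, a minimal Python enumeration (my own illustration, not code from any RAID implementation) of all possible double failures in this six-disk RAID 10 looks like this:

    from itertools import combinations

    # Six-disk RAID 10: the mirror pairs are (1,2), (3,4) and (5,6), as in the table above.
    MIRROR_PAIRS = [(1, 2), (3, 4), (5, 6)]

    def data_lost(failed):
        # Data is lost only when both disks of some mirror pair have failed.
        return any(a in failed and b in failed for a, b in MIRROR_PAIRS)

    double_failures = list(combinations(range(1, 7), 2))
    lost = sum(data_lost(set(pair)) for pair in double_failures)
    print(f"{lost} of {len(double_failures)} possible double failures lose data")
    # Prints "3 of 15 possible double failures lose data", which is the same one in five.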

Two disk failures in RAID 01 (RAID 0+1)

                    RAID1 array (mirror)
+-----------------------------+-----------------------------+
| RAID0, left side (offline)  |      RAID0, right side      |
+---------+---------+---------+---------+---------+---------+
| Disk 1* | Disk 2  | Disk 3  | Disk 4  | Disk 5  | Disk 6  |
+---------+---------+---------+---------+---------+---------+
|    A    |    B    |    C    |    A    |    B    |    C    |
|    D    |    E    |    F    |    D    |    E    |    F    |
+---------+---------+---------+---------+---------+---------+

RAID 01 (0+1) with a single failed drive. The asterisk marks the physically failed drive (Disk 1); the entire left-side RAID0 becomes inaccessible.

Let's again say that Disk 1 fails first. From a data recovery point of view, there is no difference between this and the previously discussed RAID 10 case. Disk 1 is lost, but Disk 4 still holds copies of blocks A and D, so no big deal. From the operational point of view, though, things may look very different:

  • If the same controller controls the entire disk set, the controller is aware of all disks and knows that blocks A and D are still available. The controller notes Disk 1 has failed, and there is no further action.
  • If the RAID 1 and RAID 0 levels are handled by different controllers, unaware of each other, the controller responsible for the left side RAID 0 fails the entire left side.

Now, if another disk were to fail:

  • In a same-controller case, there is a one in five chance that Disk 4 will fail, rendering the array both inaccessible and unrecoverable because both copies of blocks A and D are destroyed. Any other disk failure is duly noted by the controller, and since the controller knows where to find a copy of the data, the array remains online.
  • In a multi-controller case, there is a three in five chance that a disk on the right side fails. Since the RAID0 controller for the right side is not aware of the other disks, it fails the entire right side. The top-level RAID1 controller then discovers that both sides of the mirror have failed and, in turn, fails the entire setup. Because the RAID1 controller does not know it manages composite devices, it cannot request individual blocks directly from the underlying disks, so the entire array becomes unavailable. However, the chance of data loss remains one in five, because the data is still recoverable unless Disk 4 fails, destroying the only copy of blocks A and D. Let's spell out all possible outcomes and their probabilities for the multi-controller case (a short enumeration sketch follows this list):
    • two in five chance of nothing happening (Disk 2 or 3 from the left side, which is already failed);
    • one in five chance of losing both availability and data if Disk 4 fails;
    • two in five chance of losing availability, but not data (data is still recoverable if Disk 5 or 6 fails).
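
Here is a small Python enumeration (again my own illustration) of the five possible second failures in the multi-controller case described above:

    # RAID 0+1 on six disks with layered controllers. Left RAID0: disks 1-3,
    # right RAID0: disks 4-6. Mirror partners: 1-4, 2-5, 3-6. Disk 1 has already
    # failed, which took the whole left side offline.
    LEFT, RIGHT = {1, 2, 3}, {4, 5, 6}
    partner = {1: 4, 2: 5, 3: 6, 4: 1, 5: 2, 6: 3}
    already_failed = {1}

    for second in sorted(set(range(1, 7)) - already_failed):
        failed = already_failed | {second}
        # Data is lost when both disks of some mirror pair have failed.
        lost = any(partner[d] in failed for d in failed)
        # A layered side controller fails its whole side if any of its disks is down;
        # the top-level mirror goes offline when both sides are down.
        offline = bool(failed & LEFT) and bool(failed & RIGHT)
        print(f"second failure = Disk {second}: array offline = {offline}, data lost = {lost}")

    # Output: 2 of 5 cases leave the array online, 1 of 5 loses data,
    # and 2 of 5 take the array offline with the data still recoverable.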

Some discussions fold the last two cases into one, assume the two layers are implemented separately, and conclude that RAID 1+0 is generally more reliable than RAID 0+1. While true to some extent, the difference is largely implementation-dependent.

Important practical aspects

There is a practical aspect of the issue which increases complexity, often to the point where no reliable predictions can be made. It favors RAID 10 (a stripe over multiple mirrors) and has to do with drive contents diverging over time.

Divergence of mirror pairs with time

In every mirror pair, both disks are supposed to be identical at all times. The controller ensures this by writing two copies of the data simultaneously, updating both sides of a mirror. As expected, a failed drive is no longer updated. Less expectedly, if two different controllers handle RAID 0+1, the entire failed side, including its surviving disks, is no longer updated. With time, mirror pairs diverge, producing a situation like this:

                    RAID1 array (mirror)
+-----------------------------+-----------------------------+
| RAID0, left side (offline)  |      RAID0, right side      |
+---------+---------+---------+---------+---------+---------+
| Disk 1* | Disk 2  | Disk 3  | Disk 4  | Disk 5  | Disk 6  |
+---------+---------+---------+---------+---------+---------+
|   A1    |   B1    |   C1    |   A2    |   B2    |   C2    |
|   D1    |   E1    |   F1    |   D2    |   E2    |   F2    |
+---------+---------+---------+---------+---------+---------+

RAID 0+1 after running for some time with one side failed. A1 through F1 are data blocks valid at the time of the first disk failure. A2 through F2 are data blocks written or changed after the first disk failure. Disk 1 is faulty (marked with an asterisk), and the whole left-side RAID0 is inaccessible.

This behavior only occurs when the array is implemented as separate layers. Single-layer implementation, where the controller is aware of all the disks, continues to update all the surviving copies.

Now, no recovery is possible if any drive on the right side fails because all the mirror pairs are completely out of sync. This happens in RAID 01 (but not RAID10) if the array is allowed to continue running for a long time with one side failed. The divergence increases with time, making successful data recovery less and less likely. With zero divergence time, the RAID 01 theoretical calculation of the fault tree (as above) still holds. However, as divergence time increases, the chance of losing data on RAID 01 approaches three in five.
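
To see where the three in five figure comes from, here is a tiny continuation of the earlier enumeration (again my own illustration), treating the surviving left-side disks as stale:

    # Layered RAID 0+1, Disk 1 failed, array left running for a while: Disks 2 and 3
    # survive physically but hold stale data, so they no longer count as usable copies.
    current_copy_on = {4: "A/D", 5: "B/E", 6: "C/F"}   # only the right side is up to date

    loss_cases = [disk for disk in (2, 3, 4, 5, 6) if disk in current_copy_on]
    print(f"data lost in {len(loss_cases)} of 5 possible second failures")
    # Prints "data lost in 3 of 5 possible second failures".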

Other factors affecting the divergence of mirrors

  • Filesystem in use. Copy-on-write (CoW) filesystems (BTRFS and ReFS) will fare slightly better. In a CoW filesystem, the data is never modified in place. Instead, the filesystem writes the modification elsewhere on the disk, and the filesystem pointers are then updated to point to a new (modified) version. Thus, the filesystem maintains previous versions of data longer than the traditional modify-in-place approach would. However, we are still talking at most hours between two consecutive failures, not days and certainly not weeks.
  • Usage pattern. Applications accepting data for archival storage, similar in pattern to write-once-read-never backups or CCTV archival streams, will fare better. This is because the data written before the disk failure is never modified and will remain in sync on the disks even in a RAID 01 case. Editing files in place will fare worse, depending again on the specific combination of filesystem type and editor software.

Conclusion

The only case where the difference between RAID 0+1 and RAID 1+0 matters is when you are building the array by combining different controllers, either creating a software RAID over several hardware RAIDs or combining two different software RAIDs. If you have to do this, create many RAID 1 mirrors first (on the bottom level), then combine them into a RAID 0 (on the top level). Doing so will save you time and money on data recovery in some cases of double drive failure.

As with any other RAID, ensure alerts are set up and working for when a drive fails. Once the drive fails, you want to replace it as soon as possible to bring the array back to its full redundancy. This is very important, and people tend to overlook it, thinking that as long as the RAID works, one can delay the maintenance.

Remember that neither RAID10 nor RAID01 is equipped to handle a second disk failure. If you want your system to be fault-tolerant against two failed disks, use RAID6 or three-way mirrors. Thinking about the intricacies of double failures in RAID 10 is like thinking about playing Russian Roulette. It does not matter whether the probability is one in six or two in six; you should not play in the first place.


Created Monday, September 7, 2020

Updated 12 September 2020

Updated 27 September 2020