Klennet Storage Software

RAID10 quirks in ZFS

The RAID10 implementation in ZFS is quirky. Technically, there is no RAID10 in ZFS. What ZFS provides is some undetermined distribution of data over any number of mirrors (RAID1s). In some cases, this distribution is as close to RAID10 as you need for any practical purpose. In other cases, not so much.

  • If you build your system once and for all, never adding new disks to it, ZFS is no different from traditional RAID10.
  • If you plan to expand capacity by adding new disks at some later point, the differences are significant and worthy of consideration.

Let’s see. Traditional RAID10 is built over an even number of drives (4, 6, 8, and so on), organized as a set of mirrored pairs (the RAID1 part). Data blocks are written onto each mirror pair in turn (the RAID0 part). The resulting layout looks like this:

Mirror 1          Mirror 2
Disk A   Disk B   Disk C   Disk D
   1        1        2        2
   3        3        4        4
   5        5        6        6

Typical RAID10 layout with 4 disks.

RAID10 works fast because many blocks can be read in parallel. In the above example, blocks 1 to 4 can all be read in parallel. Disks A and B serve blocks 1 and 3, while disks C and D serve blocks 2 and 4. In a RAID10, the maximum number of blocks you can read in parallel is equal to the number of disks. Also, linear read speed increases proportionally to the number of disks.
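The parallel-read arithmetic above can be sketched in a few lines of Python. This is an illustrative model, not ZFS code: blocks are striped round robin across mirror pairs, and each member disk of a pair can serve a separate read, so the parallel-read ceiling equals the number of disks.

```python
def raid10_layout(num_blocks, num_mirrors):
    """Map each block number to the mirror pair that stores it (round robin)."""
    return {block: block % num_mirrors for block in range(1, num_blocks + 1)}

def max_parallel_reads(layout, disks_per_mirror=2):
    """Each mirror pair can serve one read per member disk at a time."""
    mirrors_used = set(layout.values())
    return len(mirrors_used) * disks_per_mirror

# Four disks as two mirror pairs, six blocks written:
layout = raid10_layout(num_blocks=6, num_mirrors=2)
print(max_parallel_reads(layout))  # 4 — blocks 1 to 4 can be read at once
```

With three mirror pairs the same model gives six parallel reads, matching the "number of disks" rule stated above.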

So far, ZFS works the same as traditional RAID10.

Array expansion in ZFS and in traditional RAID10

Now if we want to expand an array by adding more disks to it, things become very different between ZFS and traditional hardware (or software) RAID10. Say, for example, we add two disks to the sample array above and then write three new blocks (7, 8, and 9) onto the new array.

Hardware RAID rebalances data across the full set of disks. The same applies to the most common software RAID implementations. Once you add three blocks of new data, the resulting layout is like this:

Mirror 1          Mirror 2          Mirror 3
Disk A   Disk B   Disk C   Disk D   Disk E   Disk F
   1        1        2        2        3        3
   4        4        5        5        6        6
   7        7        8        8        9        9

Hardware RAID10 after it was expanded and three more blocks of data were written. The newly written blocks 7, 8, and 9 occupy the bottom row, spread across all three mirrors.

So, hardware RAID10 will integrate the disks fully into the array.

  • The process of rebalancing (often called reshaping) takes a long time (hours and sometimes days). During this time, the array is either offline or very slow.
  • After rebalancing, the performance is the same on the entire array. No matter which data you need to read, the maximum number of blocks you can read in parallel equals the new number of disks. Large sequential read requests are served by all disks.
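The effect of reshaping can be sketched with the same kind of illustrative model (again, not actual controller firmware logic): every existing block is redistributed round robin across the new, larger set of mirror pairs, so reads spread over every disk once more.

```python
def reshape(blocks, new_num_mirrors):
    """Return block -> mirror assignments after a full rebalance."""
    return {b: i % new_num_mirrors for i, b in enumerate(sorted(blocks))}

# Nine blocks rebalanced over three mirror pairs:
after = reshape(blocks=range(1, 10), new_num_mirrors=3)
print(after[1], after[2], after[3])  # 0 1 2 — each mirror holds a third of the data
```

This matches the expanded-layout table above: blocks 1, 4, 7 on mirror 1; blocks 2, 5, 8 on mirror 2; blocks 3, 6, 9 on mirror 3.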

ZFS does not rebalance. This is the important difference. After you add new disks, the array becomes non-uniform, with two areas that have different performance characteristics.

Mirror 1          Mirror 2          Mirror 3
Disk A   Disk B   Disk C   Disk D   Disk E   Disk F
   1        1        2        2        7        7
   3        3        4        4        8        8
   5        5        6        6        9        9

ZFS RAID10 after it was expanded and three more blocks of data were written. The newly written blocks 7, 8, and 9 live only on the new mirror (disks E and F).

• The ZFS expansion process is instantaneous or nearly so, because no rebalancing is involved.
  • After expansion
    • old data works with old performance (up to four reads in parallel),
    • new data is only written onto the newly added mirror (up to two reads in parallel),
  • little or no data can be read at six reads in parallel.

Note that this only applies if the original 4-disk array was nearly full before expansion.
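Why new data lands only on the new mirror can be sketched with a simplified allocation model. This is an approximation of ZFS behavior, not its actual allocator (ZFS weighs free space across vdevs, among other factors): each new block goes to the mirror pair with the most free space, so when the old mirrors are nearly full, virtually all new writes land on the freshly added pair.

```python
def allocate(vdev_free, num_blocks):
    """Greedy sketch: place each new block on the emptiest vdev (mirror pair)."""
    placements = []
    free = list(vdev_free)
    for _ in range(num_blocks):
        target = free.index(max(free))  # pick the vdev with most free blocks
        placements.append(target)
        free[target] -= 1
    return placements

# Mirrors 1 and 2 nearly full (1 free block each), new mirror 3 empty (100 free):
print(allocate([1, 1, 100], 3))  # [2, 2, 2] — all new blocks on mirror 3
```

With evenly empty vdevs the same model spreads writes across all mirrors, which is why a pool built with all its disks from the start behaves like traditional RAID10.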

Conclusions

  • If you never add new hard drives to your setup, there is nothing special to consider when using ZFS.
  • If you add drives to your ZFS-based RAID10 setup, consider backing up the data, destroying the pool, rebuilding the pool from scratch with all drives in it, and restoring from backup. Otherwise, you will most likely lose some performance.

Created Sunday, August 4, 2019
