There is no JBOD in ZFS (most of the time)

A forum question made me think about non-redundant ZFS pools:

if I configured a pool of drives in ZFS without redundancy (like JBOD) and one of the drives failed, would I still be able to somehow repair the file system and get back the files that weren't stored on the failed drive?

This discussion assumes that you understand traditional RAID levels. If you need a refresher, read the Wikipedia article about RAID.

Let's start with a disclaimer: regardless of the filesystem, RAID level, and storage media, there are no guarantees about recovering anything from a non-redundant system. ZFS, however, is complicated, and it is fun to go down the rabbit hole of its details.

ZFS uses three-tier logic to manage physical disks. Disks are combined into virtual devices (vdevs). Vdevs are then combined into a pool (or multiple pools, but I'm talking about a single pool now). Vdevs can be of different types: simple (a single disk), mirrors (two or more disks holding identical copies), or RAIDZ/Z2/Z3 (similar to RAID5, tolerating one, two, or three failed disks, respectively). You can add vdevs to an existing pool, which expands accordingly (this will be significant later).
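The three tiers can be sketched as a small data model. This is only a toy illustration, not a ZFS API: the names `Vdev`, `tolerated_failures`, and `pool_is_alive` are made up for this example, and it assumes that a pool stays usable only while every one of its vdevs is usable.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vdev:
    """Hypothetical toy model of a vdev; not part of any ZFS tooling."""
    kind: str        # "single", "mirror", "raidz1", "raidz2", or "raidz3"
    disks: int       # number of member disks
    failed: int = 0  # number of member disks that have failed

    def tolerated_failures(self) -> int:
        if self.kind == "single":
            return 0
        if self.kind == "mirror":
            # A mirror survives as long as one copy remains.
            return self.disks - 1
        return {"raidz1": 1, "raidz2": 2, "raidz3": 3}[self.kind]

    def is_alive(self) -> bool:
        return self.failed <= self.tolerated_failures()


def pool_is_alive(vdevs: List[Vdev]) -> bool:
    # Assumption: there is no redundancy across vdevs,
    # so the pool needs every single vdev to be alive.
    return all(v.is_alive() for v in vdevs)


# Redundant vdevs absorb failures inside themselves.
print(pool_is_alive([Vdev("mirror", 2, failed=1), Vdev("raidz2", 6, failed=2)]))  # True
# Two single-disk vdevs, one disk failed: nothing absorbs the failure.
print(pool_is_alive([Vdev("single", 1), Vdev("single", 1, failed=1)]))            # False
```

In this model, redundancy exists only inside a vdev, which is why a pool built from single-disk vdevs has none at all.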

It may seem that if we make several vdevs consisting of a single disk each and then combine them into a pool, the result will resemble a traditional JBOD. That does not happen. A traditional JBOD allocates space for data from the start to the end of the array. When one of the disks fills up, the next disk is used, and so on (this is not exactly correct, but a good approximation nonetheless). A ZFS pool allocates data blocks on different vdevs in turn. When ZFS writes a large file, it puts the file's blocks onto different vdevs. However, if you add a new disk (and thus a new vdev) to a pool filled to near capacity, no automatic rebalancing takes place. Whatever files you add to the pool afterwards will be mostly written to the newly added disk.
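The two allocation strategies can be compared with a toy simulation. This is a deliberately simplified sketch: `jbod_layout` and `ToyPool` are hypothetical names, every block has the same size, and the pool uses strict round-robin that skips full vdevs. The real ZFS allocator is more involved than this, so treat the output as an approximation of the behaviour described above.

```python
def jbod_layout(capacities, n_blocks):
    """Concatenation: fill the first disk completely, then the next one."""
    if n_blocks > sum(capacities):
        raise ValueError("not enough space")
    disks = [[] for _ in capacities]
    d = 0
    for block in range(1, n_blocks + 1):
        while len(disks[d]) >= capacities[d]:
            d += 1  # current disk is full, move on to the next
        disks[d].append(block)
    return disks


class ToyPool:
    """Toy pool that hands out blocks to its vdevs in turn."""

    def __init__(self, capacities):
        self.capacities = list(capacities)
        self.vdevs = [[] for _ in self.capacities]
        self.rotor = 0       # index of the vdev that gets the next block
        self.next_block = 1  # blocks are numbered in write order

    def add_vdev(self, capacity):
        # Expanding the pool does not move any existing blocks.
        self.capacities.append(capacity)
        self.vdevs.append([])

    def write(self, n_blocks):
        for _ in range(n_blocks):
            if all(len(v) >= c for v, c in zip(self.vdevs, self.capacities)):
                raise RuntimeError("pool is full")
            # Skip vdevs that have no free space left.
            while len(self.vdevs[self.rotor]) >= self.capacities[self.rotor]:
                self.rotor = (self.rotor + 1) % len(self.vdevs)
            self.vdevs[self.rotor].append(self.next_block)
            self.next_block += 1
            self.rotor = (self.rotor + 1) % len(self.vdevs)


print(jbod_layout([4, 4], 7))  # [[1, 2, 3, 4], [5, 6, 7]]

balanced = ToyPool([4, 4])     # created with two vdevs
balanced.write(7)
print(balanced.vdevs)          # [[1, 3, 5, 7], [2, 4, 6]]

unbalanced = ToyPool([4])      # created with one vdev ...
unbalanced.write(4)            # ... filled to capacity ...
unbalanced.add_vdev(4)         # ... then expanded
unbalanced.write(3)
print(unbalanced.vdevs)        # [[1, 2, 3, 4], [5, 6, 7]]
```

The pool that started with two vdevs ends up interleaved, while the pool that was filled and then expanded ends up with the same layout as the concatenated array.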

Let's draw some diagrams:

Traditional JBOD        Traditional RAID 0
Disk 1    Disk 2        Disk 1    Disk 2
1         5             1         2
2         6             3         4
3         7             5         6
4                       7

Traditional JBOD and RAID 0 layouts.

Balanced ZFS pool,      Unbalanced ZFS pool,
two vdevs,              one vdev,
never expanded          expanded to two
vdev 1    vdev 2        vdev 1    vdev 2
1         2             1         5
3         4             2         6
5         6             3         7
7                       4

Two-vdev ZFS pool layouts.
Left - the pool created initially with two vdevs;
Right - the pool created with one vdev, filled to capacity, then expanded.

Notice that the balanced ZFS pool looks like a RAID 0, and the unbalanced pool looks like a traditional JBOD. In practice, the difference is that ZFS stores some important metadata in two copies, placed on two different vdevs.

Knowing the above, we can answer the original question – if [in] a pool ... without redundancy (like JBOD) ... one of the drives failed, would I still be able to somehow ... get back the files that weren't stored on the failed drive?

  1. In the general case, no. A ZFS pool without redundancy is not like a JBOD. It behaves more like a RAID 0.
  2. There is an edge case: when we expand a nearly full pool by adding drives one by one, it behaves more like a JBOD. Some files can be recovered then, but there is still no guarantee.
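Both answers can be illustrated with the block layouts from the diagrams. The sketch below is a toy model under strong assumptions: the function `surviving_files` and the three files are hypothetical, a file counts as recoverable only if none of its blocks sat on the failed vdev, and metadata is ignored entirely, as is the question of whether the damaged pool can be read at all.

```python
def surviving_files(layout, files, failed):
    """Return the names of files that have no block on the failed vdev."""
    lost_blocks = set(layout[failed])
    return [name for name, blocks in files.items()
            if not lost_blocks & set(blocks)]


# Block layouts taken from the diagrams (vdev 1 first, vdev 2 second).
balanced = [[1, 3, 5, 7], [2, 4, 6]]    # pool created with two vdevs
unbalanced = [[1, 2, 3, 4], [5, 6, 7]]  # pool filled, then expanded

# Hypothetical files, each occupying consecutive blocks.
files = {"a": [1, 2], "b": [3, 4], "c": [5, 6, 7]}

# The second vdev (index 1) fails.
print(surviving_files(balanced, files, failed=1))    # []
print(surviving_files(unbalanced, files, failed=1))  # ['a', 'b']
```

In the balanced layout every multi-block file touches both vdevs, so nothing survives. In the unbalanced layout the files written before the expansion sit entirely on the first vdev.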

Another important question is where the metadata goes. ZFS maintains several copies of metadata, but I'm not sure what happens if one of the vdevs is full. It may be that after a certain point, updates only go to the most recently added vdev. This part of the question is probably largely academic because it only arises when a vdev is literally full to capacity.

Filed under: RAID, ZFS.

Created Thursday, December 20, 2018

Updated 26 August 2019