There is no JBOD in ZFS (most of the time)

A forum question got me thinking about non-redundant ZFS pools:

if I configured a pool of drives in ZFS without redundancy (like JBOD) and one of the drives failed, would I still be able to somehow repair the file system and get back the files that weren't stored on the failed drive?

This discussion assumes that you understand traditional RAID levels. If you need a refresher, read the Wikipedia article about RAID.

Let’s start with a disclaimer: regardless of filesystem, RAID level, and storage media, there are no guarantees about recovering anything from a non-redundant system. ZFS, however, is complicated, and it is fun to go down the rabbit hole of its details.

ZFS uses three tiers to manage physical disks. Disks are combined into virtual devices (vdevs). Vdevs are then combined into a pool (or multiple pools, but I’m talking about a single pool here). Vdevs can be of different types: simple (a single disk), mirrors (two or more disks holding identical copies), or RAIDZ/Z2/Z3 (similar to RAID5, tolerating one, two, or three failed disks respectively). You can add vdevs to an existing pool, and the pool expands accordingly (this will be significant later).

It may seem that if we make several vdevs consisting of a single disk each, and then combine them into a pool, the result will resemble a traditional JBOD. That does not happen. A traditional JBOD allocates space for data from the start to the end of the array: when one disk fills up, the next disk is used, and so on (this is not exactly correct, but a good approximation nonetheless). A ZFS pool allocates data blocks on different vdevs in turn, so if a large file is written, its blocks are spread across the vdevs. However, if you add a new disk (and thus a new vdev) to a pool that is filled to near capacity, no automatic rebalancing takes place. Whatever files you then add to the pool will mostly be written to the newly added disk.
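The three allocation patterns described above can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: blocks are numbered 1 to 7, every device holds four blocks, and the function names are made up for this example. The real ZFS allocator is considerably more involved than strict round-robin or "newest vdev only".

```python
def fill_sequential(num_blocks, disk_capacity, num_disks):
    """JBOD-like: fill the first disk completely, then the next, and so on."""
    disks = [[] for _ in range(num_disks)]
    for block in range(1, num_blocks + 1):
        disks[(block - 1) // disk_capacity].append(block)
    return disks


def fill_round_robin(num_blocks, num_devices):
    """RAID 0-like or balanced pool: hand blocks to each device in turn."""
    devices = [[] for _ in range(num_devices)]
    for block in range(1, num_blocks + 1):
        devices[(block - 1) % num_devices].append(block)
    return devices


def fill_then_expand(num_blocks, first_vdev_capacity):
    """Pool created with one vdev, filled to capacity, then given a second vdev."""
    vdevs = [[]]
    for block in range(1, num_blocks + 1):
        if len(vdevs) == 1 and len(vdevs[0]) == first_vdev_capacity:
            vdevs.append([])  # pool expanded; existing blocks are not rebalanced
        # Simplification: all new blocks land on the most recently added vdev.
        vdevs[-1].append(block)
    return vdevs


print("JBOD:         ", fill_sequential(7, 4, 2))
print("RAID 0:       ", fill_round_robin(7, 2))
print("Balanced pool:", fill_round_robin(7, 2))
print("Expanded pool:", fill_then_expand(7, 4))
```

In this model the balanced pool comes out identical to the RAID 0 layout, and the expanded pool comes out identical to the JBOD layout.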

Let's draw some diagrams.

Traditional JBOD        Traditional RAID 0
Disk 1   Disk 2         Disk 1   Disk 2
  1        5              1        2
  2        6              3        4
  3        7              5        6
  4                       7

Traditional JBOD and RAID 0 layouts.

Balanced ZFS pool,      Unbalanced ZFS pool,
two vdevs,              one vdev,
never expanded          expanded to two
vdev 1   vdev 2         vdev 1   vdev 2
  1        2              1        5
  3        4              2        6
  5        6              3        7
  7                       4

Two-vdev ZFS pool layouts.
Left - the pool created initially with two vdevs;
Right - the pool created with one vdev, filled to capacity, then expanded.

Notice that a balanced ZFS pool looks very much like a RAID 0, and an unbalanced pool looks very much like a traditional JBOD. In practice, the difference from a JBOD is that ZFS keeps two copies of some important metadata and stores them on two different members of the pool.

Knowing the above, we can answer the original question – if [in] a pool ... without redundancy (like JBOD) ... one of the drives failed, would I still be able to somehow ... get back the files that weren't stored on the failed drive?

  1. In the general case, no. A ZFS pool without redundancy is not like a JBOD. It behaves more like a RAID 0.
  2. There is an edge case: when the pool is expanded by adding drives one by one, each time the pool is nearly full, it behaves more like a JBOD. Some files can be recovered then, but there is still no guarantee.
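The two cases can be illustrated with a small Python sketch that checks which files have no blocks on a failed device. The file names, block numbers, and the `surviving_files` helper are hypothetical and exist only for this example. The model looks at data blocks alone: it ignores metadata entirely and says nothing about whether a real pool could still be imported after losing a vdev.

```python
def surviving_files(files, layout, failed_device):
    """Return the names of files that have no block on the failed device.

    files:  dict mapping a file name to the list of block numbers it occupies
    layout: list of devices, each a list of the block numbers stored on it
    """
    lost_blocks = set(layout[failed_device])
    return [name for name, blocks in files.items()
            if not lost_blocks.intersection(blocks)]


# Hypothetical files occupying consecutive blocks.
files = {"a": [1, 2], "b": [3, 4], "c": [5, 6, 7]}

balanced = [[1, 3, 5, 7], [2, 4, 6]]   # pool created with two vdevs
expanded = [[1, 2, 3, 4], [5, 6, 7]]   # one vdev, filled, then expanded

print(surviving_files(files, balanced, failed_device=1))  # []
print(surviving_files(files, expanded, failed_device=1))  # ['a', 'b']
```

In the balanced layout every multi-block file touches both vdevs, so nothing is left intact. In the expanded layout the files written before the expansion sit entirely on the first vdev.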

Another important question is where the metadata goes. ZFS maintains several copies of metadata, but I’m not sure what happens if one of the vdevs is full. It might happen that after a certain point, updates only go to the most recently added vdev. This part of the question is probably largely academic, because it only arises when a vdev is literally full to capacity.

Created Thursday, December 20, 2018

Updated 26 August 2019
