Using Klennet ZFS Recovery

The recovery process is as follows:

  1. set up hardware and software for recovery,
  2. select source drives, click Proceed,
  3. configure pool and scan options, click Scan,
  4. wait for the analysis to complete,
  5. select what you need from the output (interpreting output is complicated, and I have a dedicated howto for it),
  6. copy the files out.

Hardware and software setup

Klennet ZFS Recovery runs on Windows, even though ZFS is a filesystem used on Linux, FreeBSD, and similar systems, and is not available on Windows. Therefore, you need to do one of the following:

  • bring drives (or clones of the drives) to a Windows machine,
  • install Windows on a spare hard drive, SSD, or even a big USB stick, and bring it to the Linux machine,
  • install Windows in a virtual machine and pass the drives through into that virtual machine.

No matter which method you choose, never initialize or format the drives from the pool if Windows prompts you to.

Selecting source drives

ZFS pools typically consist of multiple physical drives, so the recovery needs to analyze many drives simultaneously. Select as many drives from the pool as possible, but do not select drives from multiple pools. All drives must be members of the same ZFS pool. If you need to recover multiple pools, recover each pool in turn.

You may want to enable the Read Partitions checkbox and select partitions instead of entire hard drives. However, do not mix hard drives and partitions. Select either hard drives or partitions, but not both. Typically, ZFS Recovery works with entire hard drives just fine.
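If you are not sure which drives belong to which pool, you can sort that out before bringing them to Windows. The sketch below is not part of Klennet ZFS Recovery. It assumes you still have a Linux or FreeBSD machine where you can run zdb -l against each device and save the text, and that each label dump contains a pool_guid line. The function names and the sample dumps are made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as "    pool_guid: 1234567890" in a saved label dump.
POOL_GUID_RE = re.compile(r"^\s*pool_guid:\s*(\d+)\s*$", re.MULTILINE)


def pool_guid_from_label(label_text):
    """Return the first pool_guid found in a label dump, or None."""
    match = POOL_GUID_RE.search(label_text)
    return int(match.group(1)) if match else None


def group_drives_by_pool(label_dumps):
    """Group drive names by pool GUID.

    label_dumps maps a drive name to the text of its label dump.
    Drives without a readable label are collected under the key None.
    """
    groups = defaultdict(list)
    for drive, text in label_dumps.items():
        groups[pool_guid_from_label(text)].append(drive)
    return dict(groups)


# Hypothetical sample data: three drives from two pools, one unreadable.
sample_dumps = {
    "sda": "    name: 'tank'\n    pool_guid: 111\n",
    "sdb": "    name: 'tank'\n    pool_guid: 111\n",
    "sdc": "    name: 'backup'\n    pool_guid: 222\n",
    "sdd": "failed to unpack label 0\n",
}
pools = group_drives_by_pool(sample_dumps)
```

Each group with a numeric key is one pool, and that group is what you select in a single recovery run. Drives that end up under None have no readable label. They may still belong to the pool, so decide about them by hand.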

Configuring scan options

The general rule of configuration is: when in doubt, use defaults.

Dataset types

ZFS can provide two types of storage:

  • Regular filesystems, with files and directories. These are directly accessed if you run Linux or FreeBSD. If you run a ZFS-based NAS, you access files and directories via a network share.
  • Block-based storage, known as zvol. From the ZFS side, it is a single large file. To the outside, it is presented as a block device, often over iSCSI.

You need to select which of these dataset types you want to recover. If in doubt, select both.

Partition alignment

Normally, ZFS Recovery computes the offset of each ZFS partition on each disk independently. If you know that all disks use an identical partition layout, you can enforce it by selecting All identical.

  • When the pool was created on identical disks at the same time by some pre-packaged NAS system, either hardware or FreeNAS, use All identical.
  • With custom-built ZFS installations and with pools that were expanded by adding drives at some point in the past, use Independent mode.
  • When in doubt, leave this setting at Independent.

RAID levels, checksums, and compression

You can specify which features are in use on your pool to reduce the amount of time spent computing possible combinations. This is primarily for use with pools containing 20 or more drives.

  • There is no harm in specifying extra features. This is why all features are selected by default.
  • Deselecting unused features will result in a slightly faster recovery.
  • Deselecting a feature that is used by the pool will likely result in failed recovery.
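To see why the feature selection matters, here is a rough illustration of how the number of combinations shrinks when features are deselected. This is a sketch under my own assumptions, not the actual search done by ZFS Recovery: the option names are ordinary ZFS terms, but the grouping into three lists and the function candidate_combinations are hypothetical.

```python
from itertools import product

# Standard ZFS option names. Treating them as three independent lists
# is an assumption made for this illustration only.
ALL_FEATURES = {
    "raid": ["single", "mirror", "raidz1", "raidz2", "raidz3"],
    "checksum": ["fletcher2", "fletcher4", "sha256", "sha512", "skein", "edonr"],
    "compression": ["off", "lzjb", "gzip", "lz4", "zle", "zstd"],
}


def candidate_combinations(enabled):
    """Return every combination of one value per enabled feature group."""
    keys = sorted(enabled)
    value_lists = [enabled[key] for key in keys]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


everything = candidate_combinations(ALL_FEATURES)

# A pool known to be RAIDZ2 with default-style checksums and LZ4 or no compression.
narrowed = candidate_combinations({
    "raid": ["raidz2"],
    "checksum": ["fletcher4", "sha256"],
    "compression": ["off", "lz4"],
})

print(len(everything), len(narrowed))  # 180 4
```

The same sketch shows the risk in the last bullet. If the pool actually uses gzip and gzip is not in the list, the true configuration is never among the candidates, no matter how long the search runs.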

Setting up encryption keys

If one or several of your ZFS datasets were encrypted, you need to provide the same passphrases or keys you used to unlock these datasets. The following considerations apply:

  • If there are multiple datasets, provide multiple passphrases or keys in any order. ZFS Recovery will try all available passphrases and keys against all datasets, so you do not need to match keys to datasets.
  • If you changed a passphrase or a key on the dataset, provide both new and old keys if possible. Due to a ZFS implementation quirk, ZFS Recovery can sometimes recover all data with any one of these keys available.
  • You cannot provide additional passphrases or keys once the analysis is started, so be careful when typing.
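The "any order" rule above boils down to trying every key against every dataset. The following is a minimal sketch of that idea only. The fingerprint check is a stand-in I invented so the example can run; real ZFS key verification works differently, and the names key_fingerprint and match_keys are hypothetical.

```python
import hashlib
import hmac


def key_fingerprint(passphrase):
    """Stand-in check value, used here in place of real ZFS key verification."""
    return hashlib.sha256(passphrase.encode("utf-8")).digest()


def match_keys(datasets, passphrases):
    """Try every passphrase against every dataset.

    datasets maps a dataset name to its expected fingerprint.
    Returns a dict of dataset name -> passphrase for datasets that unlocked.
    Datasets with no matching passphrase are left out of the result.
    """
    unlocked = {}
    for name, expected in datasets.items():
        for phrase in passphrases:
            if hmac.compare_digest(key_fingerprint(phrase), expected):
                unlocked[name] = phrase
                break
    return unlocked


# Hypothetical datasets and passphrases, supplied in no particular order.
datasets = {
    "tank/home": key_fingerprint("correct horse"),
    "tank/vm": key_fingerprint("battery staple"),
    "tank/old": key_fingerprint("forgotten"),
}
result = match_keys(datasets, ["battery staple", "typo in key", "correct horse"])
```

A mistyped passphrase costs nothing except that it matches no dataset. A missing one means the dataset stays locked for the whole run, which is why it pays to check your typing before you start the scan.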

Progress stages and indications

The recovery goes through the following steps:

  1. Disk scan, two passes. During this step, the entire disk set is scanned twice. Two passes are required due to the complexity of ZFS. All disks are scanned in parallel, so more disks do not require proportionally more scan time.
  2. Disk order analysis is when ZFS Recovery figures out RAID levels, which disk is at which position in the pool, and so on.
  3. Object set parameters. During this step, ZFS Recovery extracts dataset names, encryption keys (if the dataset is encrypted), and checksum salts (if you use Skein or Edon-R checksums).
  4. Object set analysis. In ZFS, an object set is a table describing all files in the filesystem. Because ZFS is a copy-on-write filesystem, many previous versions of the object set are scattered all over the disks. There may be millions of object sets, most of them partially overwritten and damaged.
    Object sets are processed in reverse transaction order. The latest (most recent) object sets are processed first. You can stop the processing after some time, take a look at whatever is recovered, and if you are not satisfied with it, continue the analysis to hopefully recover more files or older files. However, pushing the "Stop analysis" button does not stop the process immediately. Instead, the object set currently being processed is processed till completion. Thus, stopping may take several minutes with large object sets.
  5. Checksum verification. Once all files are located, ZFS Recovery proceeds to verify the data to see if checksums match the data on disks. This is roughly equivalent to resilvering the pool and takes a long time. You can cancel checksum verification, especially if you only need to recover some small number of files. In this case, there is no point in waiting while ZFS Recovery verifies checksums for the entire pool. You can select the specific files and verify checksums for these files only from the file list.
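The stop behavior described in step 4 is easier to see in code. This is a sketch of the general pattern (newest first, stop checked only between items), not the actual implementation in ZFS Recovery. The function name, the (txg, payload) pairs, and the use of threading.Event as the stop button are all assumptions.

```python
import threading


def analyze_object_sets(object_sets, process, stop_requested):
    """Process object sets newest-first, checking for a stop only between sets.

    object_sets: iterable of (txg, payload) pairs
    process: callable applied to each payload
    stop_requested: threading.Event set when the user asks to stop
    Returns the list of transaction numbers that were fully processed.
    """
    done = []
    newest_first = sorted(object_sets, key=lambda item: item[0], reverse=True)
    for txg, payload in newest_first:
        if stop_requested.is_set():
            break
        # Once started, an object set is always processed to completion,
        # even if the stop request arrives in the middle of it.
        process(payload)
        done.append(txg)
    return done


# Hypothetical run: the user presses Stop while object set "b" is being processed.
stop = threading.Event()
seen = []


def process(payload):
    seen.append(payload)
    if payload == "b":
        stop.set()


finished = analyze_object_sets(
    [(10, "a"), (30, "c"), (20, "b"), (5, "z")], process, stop
)
```

In this run, the object sets with transaction numbers 30 and 20 are processed, and the stop takes effect only after 20 is finished. With a very large object set in place of "b", that is where the several minutes of waiting come from.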

Continue to Understanding output.
