Using Klennet ZFS Recovery

The recovery process is as follows:

  1. set up the hardware and software for recovery,
  2. select source drives, click Proceed,
  3. configure pool and scan options, click Scan,
  4. wait for the analysis to complete,
  5. select what you need from the output (interpreting output is complicated, and I have a dedicated howto for it),
  6. copy the files out.

Hardware and software setup

Klennet ZFS Recovery runs on Windows, even though ZFS is a filesystem used on Linux, FreeBSD, and similar systems and is not available on Windows. Therefore, you need to do one of the following:

  • bring drives (or clones of the drives) to a Windows machine,
  • install Windows on a spare hard drive, SSD, or even a big USB stick, and bring it to the Linux machine,
  • install Windows in a virtual machine and pass the drives through into that virtual machine.

No matter which method you choose, never initialize or format the drives from the pool if Windows prompts you to.

Selecting source drives

ZFS pools typically consist of multiple physical drives, so the recovery needs to analyze multiple drives simultaneously. Select as many drives as possible, but only drives that are members of the same ZFS pool. If you need to recover multiple pools, recover each pool in turn.

You may want to enable the Read Partitions checkbox and select partitions instead of entire hard drives. However, do not mix hard drives and partitions. Select either hard drives or partitions, but not both. Typically, ZFS Recovery works with entire hard drives just fine.

Configuring scan options

The general rule for configuration: when in doubt, use the defaults.

Dataset types

ZFS can provide two types of storage:

  • Regular filesystems, with files and directories. These are directly accessed if you run Linux or FreeBSD. If you run a ZFS-based NAS, you access files and directories via a network share.
  • Block-based storage, known as zvol. From the ZFS side, it is a single large file. To the outside, ZFS presents it as a block device, often over iSCSI.

You need to select which of these dataset types you want to recover. If in doubt, select both.

Partition alignment

Normally, ZFS Recovery computes the offset of each ZFS partition on each disk independently. However, if you know that all disks use an identical partition layout, you can enforce it by selecting All identical.

  • When the pool was created on identical disks simultaneously by some pre-packaged NAS system, either hardware or FreeNAS, use All identical.
  • Use Independent mode with custom-built ZFS installations and pools that you expanded by adding drives at some point in the past.
  • When in doubt, leave this setting at Independent.

Allow files without names

If this option is enabled, ZFS Recovery allows files whose names are not available because the corresponding directory is missing. However, there is a good chance the scan will find the name later, so the default behavior is to hold the file until then. Enabling this option typically allows recovering slightly more of the older versions of files at the cost of a messier recovery for the newer versions, and is generally counterproductive.
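
The hold-until-named behavior can be pictured with a small Python sketch. It is an illustration only, not how ZFS Recovery is implemented; the `PendingFiles` class, its method names, and the `unnamed-<id>` placeholder are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PendingFiles:
    """Toy model of holding files until a name turns up (hypothetical)."""
    allow_nameless: bool = False
    names: dict = field(default_factory=dict)     # object id -> file name
    held: set = field(default_factory=set)        # ids waiting for a name
    released: list = field(default_factory=list)  # (object id, name) pairs

    def found_file(self, obj_id: int) -> None:
        if obj_id in self.names:
            self.released.append((obj_id, self.names[obj_id]))
        elif self.allow_nameless:
            # Released immediately under a placeholder name.
            self.released.append((obj_id, f"unnamed-{obj_id}"))
        else:
            # Default: wait, the name may still be found later in the scan.
            self.held.add(obj_id)

    def found_name(self, obj_id: int, name: str) -> None:
        self.names[obj_id] = name
        if obj_id in self.held:
            self.held.remove(obj_id)
            self.released.append((obj_id, name))
```

In this model, the default setting holds a file found before its directory entry and later releases it under its real name. With `allow_nameless=True`, the file is released at once under a placeholder, which is where the messier output comes from.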

Scan depth

This setting determines how many ZFS transactions are processed. The processing always starts with the latest (most recent) transactions and goes back in time (towards earlier transactions). There can be millions of transaction remnants on a large pool. However, as we move back in time, the chances of recovering something useful drop, and processing all the transactions wastes time and RAM.

The default setting (one million transactions maximum) should be good for most cases.
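
A minimal sketch of what a depth limit means, assuming transactions are identified by increasing integer numbers. The function name `select_transactions` and its parameters are made up for illustration; this is not the actual ZFS Recovery code.

```python
import heapq


def select_transactions(txgs, max_depth=1_000_000):
    """Return at most max_depth distinct transaction numbers, newest first."""
    # nlargest yields the highest numbers in descending order, so the
    # most recent transactions come first and older ones are cut off.
    return heapq.nlargest(max_depth, set(txgs))


# Five remnants found on disk (one duplicate), depth limited to three.
print(select_transactions([5, 9, 7, 9, 1], max_depth=3))  # [9, 7, 5]
```

Raising the limit only adds older transactions at the tail of the list, which is why a larger depth costs more time and memory for progressively less useful data.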

Setting up encryption keys

If one or several of your ZFS datasets were encrypted, you need to provide the same encryption keys you used to unlock these datasets. The following considerations apply:

  • If there are multiple datasets, provide multiple passphrases or keys in any order. ZFS Recovery will try all available passphrases and keys against all datasets, so you do not need to match keys to datasets.
  • If you changed a passphrase or a key on the dataset, provide both new and old keys if possible. Due to a ZFS implementation quirk, ZFS Recovery can sometimes recover all data with any one of these keys available.
  • You cannot provide additional passphrases or keys after you start the analysis, so be careful when typing.
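
The try-every-key-against-every-dataset idea can be sketched as follows. This is a simplified illustration, not the ZFS key verification scheme: here each dataset is assumed to carry a SHA-256 fingerprint of its key, and `key_fingerprint` and `match_keys` are hypothetical names.

```python
import hashlib
from typing import Dict, List, Optional


def key_fingerprint(key: bytes) -> str:
    """Stand-in for a real key check (assumption: a plain SHA-256 digest)."""
    return hashlib.sha256(key).hexdigest()


def match_keys(datasets: Dict[str, str],
               candidates: List[bytes]) -> Dict[str, Optional[bytes]]:
    """Try every candidate key against every dataset, in any order."""
    result: Dict[str, Optional[bytes]] = {}
    for name, fingerprint in datasets.items():
        # The first candidate that fits wins; None means no key fits.
        result[name] = next(
            (k for k in candidates if key_fingerprint(k) == fingerprint),
            None,
        )
    return result
```

Because every candidate is tested against every dataset, the order of the list does not matter, and a mistyped passphrase simply matches nothing. A dataset left at `None` stays locked, which mirrors why the keys must be typed correctly before the analysis starts.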

Progress stages and indications

The recovery goes through the following steps:

  1. Disk scan, two passes. During this step, ZFS Recovery scans the entire disk set twice. The analysis requires two passes due to the complexity of ZFS. ZFS Recovery scans all disks in parallel, so more disks do not require proportionally more scan time.
  2. Disk order analysis is when ZFS Recovery figures out RAID levels, which disk is at which position in the pool, and so on.
  3. Object set parameters. ZFS Recovery extracts dataset names, encryption keys (if the dataset is encrypted), and checksum salts (if you use Skein or Edon-R checksums) during this step.
  4. Object set analysis. In ZFS, an object set is a table describing all files in the filesystem. Because ZFS is a copy-on-write filesystem, many previous versions of the object set are scattered all over the disks. There may be millions of object sets, most partially overwritten and damaged.
    ZFS Recovery processes object sets in reverse transaction order. The latest (most recent) object sets are processed first. You can stop the processing after some time to look at whatever is recovered. If you are not satisfied with it, continue the analysis to hopefully recover more or older files. However, the process is complicated enough to warrant a dedicated documentation page.
  5. Checksum verification. Once ZFS Recovery locates all files, it verifies the data to see if checksums match the data on the disks. The checksum verification is roughly equivalent to resilvering the pool and takes a long time. You can cancel checksum verification, especially if you only need to recover a small number of files. In this case, there is no point in waiting while ZFS Recovery verifies checksums for the entire pool. Instead, select the specific files in the file list and verify checksums for those files only.
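
To show why verifying only selected files is so much cheaper than verifying the whole pool, here is a small sketch. It assumes each file is a list of (data, expected checksum) blocks and uses SHA-256, which is one of the checksum algorithms ZFS supports; the data layout and the `verify_selected` name are assumptions for illustration, not the tool's internals.

```python
import hashlib


def verify_selected(files, selected):
    """Check stored checksums for the selected files only.

    files:    dict mapping file name -> list of (data, expected_hex) blocks
    selected: iterable of file names to verify
    Returns a dict mapping file name -> list of indices of bad blocks.
    """
    report = {}
    for name in selected:
        report[name] = [
            index
            for index, (data, expected) in enumerate(files[name])
            if hashlib.sha256(data).hexdigest() != expected
        ]
    return report
```

The work done is proportional to the size of the selected files, not to the size of the pool. An empty list for a file means every block matched its stored checksum.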

Continue to Understanding output.