Klennet Storage Software

Understanding ZFS Recovery output

Understanding recovery output may seem a tad complicated because of how ZFS works. A ZFS pool contains multiple object sets, each of which describes all files in a particular filesystem or a particular version of a zvol. There are tens of thousands, and sometimes millions, of old versions of each object set in any given pool. Many, if not most, of these versions are partially damaged.

Object sets and TXG numbers

Each object set has a sequence number assigned to it as it is written to disk. This number is called the Transaction Group Number, or TXG number, or simply TXG. The TXG number corresponds to the age of the object set. The latest (newest) object set has the highest TXG number; older object sets have lower TXG numbers. TXG numbers are very useful for ordering, but it is not possible to derive actual clock time, in hours or minutes, from TXG numbers.
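As a minimal sketch of what ordering by TXG means, the snippet below sorts a few object sets newest first. The `ObjectSet` record and its field names are hypothetical and are not part of ZFS or Klennet ZFS Recovery; the only thing taken from the text is the rule that a higher TXG means a newer object set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectSet:
    # Hypothetical record for illustration; not a real ZFS or Klennet structure.
    name: str
    txg: int


def newest_first(object_sets):
    """Return object sets ordered from highest (newest) to lowest (oldest) TXG."""
    return sorted(object_sets, key=lambda s: s.txg, reverse=True)


sets = [ObjectSet("a", 51), ObjectSet("b", 53), ObjectSet("c", 50), ObjectSet("d", 52)]
ordered = newest_first(sets)
print([s.txg for s in ordered])  # [53, 52, 51, 50]
```

The sort gives relative age only. Nothing in the TXG value says how much wall-clock time passed between two object sets.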

Duplicate files and file versions

Each of the many recoverable object sets in a ZFS pool describes all the files. If a file is not changed between two transactions, both object sets will contain identical copies of it. So, to reduce the number of duplicate files recovered, ZFS Recovery keeps only one copy of the file, in the latest object set the file was seen in.

Let's walk through the example:

  • There are a total of four file slots in the sample object set, numbered 1 through 4.
  • There are four recoverable object sets, with the corresponding TXG numbers 50 to 53.
  • Slot 2 is initially occupied by a green file, which is edited to produce a blue file between transactions 50 and 51.
  • Slot 3 is initially occupied by a red file, which is deleted between transactions 51 and 52.
  • Slot 4 is occupied by a yellow file, which never changes.
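
TXG number  | Slot 1 | Slot 2 | Slot 3 | Slot 4
53 (newest) |        | blue   |        | yellow
52          |        | blue   |        | yellow
51          |        | blue   | red    | yellow
50 (oldest) |        | green  | red    | yellow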

Sample filesystem object sets as seen on disk; remember that time goes from bottom to top

The recovery result keeps each file only in the object set it was last seen in. All other duplicates are discarded.

  • TXG 50 holds the green file, which is the last version of the green-blue file before editing.
  • TXG 51 holds the red file, because it is the latest version of the filesystem before the red file is deleted.
  • TXG 52 is removed entirely, because it does not contain any changed files.
  • TXG 53 holds green-blue (after editing) and yellow files, representing the latest actual version of the filesystem.

TXG number  | Slot 1 | Slot 2 | Slot 3 | Slot 4
53 (newest) |        | blue   |        | yellow
51          |        |        | red    |
50 (oldest) |        | green  |        |

Sample filesystem as recovered, after removing duplicate files
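The walk-through above can be reproduced with a short sketch. This is only an illustration of the rule "keep each file in the latest object set it was seen in", not the actual algorithm used by Klennet ZFS Recovery. The function name `deduplicate`, the dictionary layout, and the use of a color label to identify a file version are all assumptions made for the example. Slot 1 is not described in the text, so it is left out.

```python
def deduplicate(object_sets):
    """Keep each file version only in the newest object set that contains it.

    object_sets maps a TXG number to {slot: file version label}.
    Object sets left with no files are dropped from the result.
    """
    seen = set()  # (slot, version) pairs already kept in a newer object set
    result = {}
    for txg in sorted(object_sets, reverse=True):  # walk from newest to oldest
        kept = {}
        for slot, version in object_sets[txg].items():
            if (slot, version) not in seen:
                seen.add((slot, version))
                kept[slot] = version
        if kept:
            result[txg] = kept
    return result


# Sample object sets from the walk-through (hypothetical representation).
sample = {
    50: {2: "green", 3: "red", 4: "yellow"},
    51: {2: "blue", 3: "red", 4: "yellow"},
    52: {2: "blue", 4: "yellow"},
    53: {2: "blue", 4: "yellow"},
}

recovered = deduplicate(sample)
for txg in sorted(recovered, reverse=True):
    print(txg, recovered[txg])
```

With the sample data, TXG 53 keeps the blue and yellow files, TXG 51 keeps only the red file, TXG 50 keeps only the green file, and TXG 52 disappears because everything in it was already seen in TXG 53.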

Although a real filesystem has far more than four files and four transaction groups, the presentation still follows the logic above. Let's look at the screenshot.

Klennet ZFS Recovery results view

  • Object sets (which are effectively filesystem snapshots made at various times) are listed in the top section. You can sort object sets in different ways, but I recommend you sort by TXG, which places the most recent object sets on top.
  • Once you select one of the object sets, its corresponding directory tree is displayed at the bottom left.
  • If you select a directory, the bottom right panel shows the list of files, with the size and quality of each file. Quality estimation is based on ZFS checksums. It shows "Good" if all the content can be recovered and matches checksums, "Bad" if nothing can be recovered, or a percentage value if only part of the file can be recovered.
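The three quality outcomes can be pictured with a small sketch. The text only says that quality is based on ZFS checksums; the per-block counting, the function name `quality_label`, and the rounding down of the percentage are assumptions made for this illustration and may differ from what the program actually does.

```python
def quality_label(good_blocks, total_blocks):
    """Map a count of checksum-verified blocks to a display label.

    Hypothetical helper: assumes quality is the share of blocks that
    were recovered and passed checksum verification.
    """
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    if not 0 <= good_blocks <= total_blocks:
        raise ValueError("good_blocks must be between 0 and total_blocks")
    if good_blocks == total_blocks:
        return "Good"
    if good_blocks == 0:
        return "Bad"
    # Round down so that a partially damaged file never shows as 100%.
    return f"{100 * good_blocks // total_blocks}%"


print(quality_label(8, 8))  # Good
print(quality_label(0, 8))  # Bad
print(quality_label(6, 8))  # 75%
```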

Let's examine each line. Keep your eye on the number of files (Total files column).

  • The first (topmost) object set, TXG 4948, is a small filesystem used by FreeNAS for its own purposes. It holds 29 files.
  • The second object set, TXG 4946, is the primary filesystem loaded with test data, slightly more than 30 million files.
  • All the subsequent object sets contain only previous versions of files. As the filesystem is pretty static, only a few files are changed, and older object sets hold very few files.
  • Note that two datasets can be changed in a single transaction group and thus have identical TXG numbers (TXG 48FE).

Possible actions

  1. Resume analysis. If you stopped object set analysis earlier and did not find the data you need in the results, clicking Resume analysis returns to the analysis stage and resumes it. Because analysis goes from the newest to the oldest object sets, continuing the analysis lets you go further back in time.
  2. Select files. Opens the file selection dialog, where you can do mass selection and deselection of files.
  3. Verify checksums. This verifies checksums for the currently selected files. If checksums are already verified for some of the selected files, they are not verified again. This is only needed if you skipped checksum verification during analysis.
  4. Export file names. This exports the list of file names, sizes, and checksum status into a CSV file.
  5. Copy selected files. The most useful of them all, this asks you where to put the selected files, asks for some other settings detailed below, and then starts copying.

File copy options

There are two sets of options associated with copying the files:

  • Write each object set in its own separate directory, or combine all object sets into a single directory.
  • If duplicate files or multiple versions of the same file are encountered, the existing file can be overwritten or skipped, or multiple versions can be kept.

Copying proceeds in the same order as the object sets are displayed, top to bottom. In the typical order, sorted by TXG number with the highest on top, the most recent versions of files are copied first and the oldest versions last. Thus, if you want the most recent versions, set the copier to skip existing files; if you want the earliest versions, set it to overwrite; and if you want all versions, set it to rename.
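To see why the copy order and the conflict setting together decide which version you end up with, here is a small simulation of copying several versions of one file into a single combined directory. It is a sketch only: `copy_versions`, the file name `report.txt`, and the `name (n).ext` renaming scheme are hypothetical and do not describe how the program names renamed files.

```python
import os


def copy_versions(name, versions, policy):
    """Simulate copying versions of one file, in order, into one directory.

    versions: labels of the file versions, in the order they are copied.
    policy: "skip", "overwrite", or "rename" (applied when the name exists).
    Returns {destination file name: version label}.
    """
    if policy not in ("skip", "overwrite", "rename"):
        raise ValueError(f"unknown policy: {policy}")
    stem, ext = os.path.splitext(name)
    dest = {}
    for version in versions:
        if name not in dest or policy == "overwrite":
            dest[name] = version
        elif policy == "rename":
            # Hypothetical renaming scheme: first free "stem (n).ext".
            n = 1
            while f"{stem} ({n}){ext}" in dest:
                n += 1
            dest[f"{stem} ({n}){ext}"] = version
        # policy == "skip": leave the existing file untouched
    return dest


# Object sets sorted by TXG with the highest on top, so the newest is copied first.
order = ["TXG 53", "TXG 51", "TXG 50"]
print(copy_versions("report.txt", order, "skip"))       # keeps the TXG 53 version
print(copy_versions("report.txt", order, "overwrite"))  # ends with the TXG 50 version
print(copy_versions("report.txt", order, "rename"))     # keeps all three versions
```

If the object sets were sorted the other way, oldest on top, the effect of skip and overwrite would be reversed.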
