Undeleting ZFS datasets

So, I came across this Reddit post today (emphasis mine).

I booted to Windows and tried Klennet ZFS Recovery, which, also took about a week to scan the whole drive, and the results were not terribly useful. I could see the changes including deleting the datasets, and even documents and pictures I had added to tank/personal/old-secret before attempting any of this, but it was very piecemeal since it was a record of the transactions, not the data.

Doing the undelete is a bit more complicated than doing a recovery after a full-stop crash. This is because files are deleted in groups ranging from a few to several hundred files per transaction, the recovery process is performed per transaction, and the recovery results are grouped by transaction.

The recovery results are grouped by transaction because that is the only reasonable option for a damaged filesystem. The transaction ID is always present, while any other file or dataset attribute may be missing because of the damage. This includes file names, child-to-parent relationships, and timestamps.
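To illustrate why the transaction ID works as the grouping key, here is a minimal sketch (not the tool's actual code), assuming each recovered record is a dict with a mandatory `txg` field and attributes that may be `None` on a damaged pool:

```python
from collections import defaultdict

def group_by_txg(records):
    """Group recovered file records by transaction group (txg).

    The txg is always present; names, parent links, and timestamps
    may be None on a damaged pool, so txg is the only reliable key.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec["txg"]].append(rec)
    return dict(groups)
```

Grouping still works even when every other attribute of a record is missing, which is exactly the situation the recovery has to handle.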

This use case is covered by the following checklist:

  1. Let recovery run its course, including complete checksum verification.
  2. Have enough space to hold a copy of the entire pool. There will be no separation of datasets: the two options are to get it piecemeal or to get the entire lot, and you will be getting the entire lot.
  3. In the object set view, use file selection, and select all files which have good checksums.
  4. Now, with all good files across all transactions and all datasets selected, start copying. You will be prompted for the copier options. Configure the copier to
    • put all object sets into a single directory, and
    • skip existing files.

This will produce a copy of the latest versions of all recoverable files. It works because the copy is performed in sort order, and the default sort order is most recent transactions first. So, if a transaction contains a recoverable file, that file is copied out. If the next (older) transaction contains the same file, it is skipped, because the most recent version was already extracted.

The unfortunate side effect is that all datasets are blended into a single directory tree. This is because it is often not possible to determine the name of an object set (be it a dataset or a snapshot) due to filesystem damage. With tens of thousands of object sets, squashing everything into a single directory tree seems like the most practical option.
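The "most recent transaction first, skip existing files" logic reduces to a simple deduplication pass. Here is a simplified model (not Klennet's actual code), with file records reduced to `(txg, path)` pairs:

```python
def latest_versions(records):
    """Given (txg, path) pairs across all recovered transactions,
    keep the most recent txg per path.

    Scanning newest transaction first and skipping paths already
    seen has the same effect as copying in the default sort order
    with the 'skip existing files' option enabled.
    """
    kept = {}
    for txg, path in sorted(records, key=lambda r: r[0], reverse=True):
        if path not in kept:  # 'skip existing files'
            kept[path] = txg
    return kept
```

For example, if `a.txt` appears in transactions 100 and 120, only the version from transaction 120 survives; the older copy is skipped because the destination path already exists.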

Created 13 November, 2020
