Preventing data changes during the recovery

I originally intended to write a quick note about problems running ZFS Recovery inside a VM. The subject, however, turned out to be broader. The VM concerns are discussed at the end of this article.

When running any data recovery software, including any Klennet software and Klennet ZFS Recovery in particular, ensure the recovery software has exclusive access to the disks.

"Exclusive access" means that nothing else writes to the disks.

This is important because the recovery software relies on the data on the disk not changing. The recovery software often does not keep the filesystem information in RAM. Instead, it keeps references to the data on disks and reads the data from the disks as needed. If a particular item has been read once and the software holds a reference to it, the software expects that item to still be there, unchanged, when it is needed again.
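To illustrate the assumption, here is a minimal Python sketch (not how any Klennet product works internally) that fingerprints a few sampled regions of a disk image and later checks whether they still read the same. The function names, the 4096-byte region length, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def fingerprint_regions(path, offsets, length=4096):
    """Return {offset: SHA-256 hex digest} for each sampled region of a disk image."""
    digests = {}
    with open(path, "rb") as source:
        for offset in offsets:
            source.seek(offset)
            digests[offset] = hashlib.sha256(source.read(length)).hexdigest()
    return digests


def changed_regions(before, after):
    """Return the sorted offsets whose contents differ between two fingerprints."""
    return sorted(offset for offset in before if before[offset] != after.get(offset))
```

Taking one fingerprint before the recovery and another afterwards, then comparing them with `changed_regions`, gives a rough indication of whether something wrote to the sampled areas in the meantime. It only covers the regions you sample, so an empty result is not proof that nothing changed.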

There are two ways for the data to change unexpectedly:

  1. a sector on the disk may go bad (or the entire disk may fail), or
  2. some other application (including the operating system) writes something to the disk.

Mechanical problems

There isn't much you can do to protect against disks failing unexpectedly or developing bad sectors.

If you are recovering a single disk or a small number of disks, make copies. Make disk image files or clone the disks, then have data recovery software work with these image files or clones. In this case, you can make another clone if either the original or the clone develops a problem.
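As a rough sketch of what making a disk image involves, the Python function below copies a source to an image file chunk by chunk and zero-fills any chunk that cannot be read, so that offsets in the image stay aligned with the original. The name `image_disk` and the 1 MiB chunk size are assumptions for the example. It also assumes the source is an image file or a device whose size can be found by seeking to its end, which may not hold for raw devices on every platform. For disks that are actually failing, a dedicated imaging tool is a better choice than a sketch like this.

```python
import os


def image_disk(source_path, image_path, chunk_size=1024 * 1024):
    """Copy source to image in chunks; zero-fill chunks that cannot be read.

    Returns the list of offsets at which a read failed.
    """
    failed_offsets = []
    with open(source_path, "rb", buffering=0) as source, open(image_path, "wb") as image:
        total = source.seek(0, os.SEEK_END)  # assumes seeking to the end reports the size
        offset = 0
        while offset < total:
            want = min(chunk_size, total - offset)
            source.seek(offset)
            try:
                data = source.read(want)
            except OSError:
                data = b""
            if not data:
                # Unreadable chunk: record it and keep the image aligned with zeros.
                failed_offsets.append(offset)
                data = b"\x00" * want
            image.write(data)
            offset += len(data)  # a short read simply advances by what was read
    return failed_offsets
```

A non-empty return value tells you which parts of the image are zero-filled placeholders rather than real data.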

If you are recovering a configuration with many disks, making copies of everything may be impractical. At least look through the SMART attributes for all the disks and make copies of any disks that don't have perfectly clean SMART.
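The triage can be sketched in Python as follows. What counts as "perfectly clean" is not defined above, so this example assumes it means a zero raw value for three commonly watched attributes (IDs 5, 197, and 198). That threshold, the function name, and the input format (a dictionary of raw values you have already collected with whatever SMART tool you use) are assumptions of the sketch.

```python
# Attribute IDs commonly treated as warning signs when their raw value is non-zero.
# Treating exactly these three as the test for "clean" is an assumption of this sketch.
WATCHED_ATTRIBUTES = {
    5: "Reallocated sectors count",
    197: "Current pending sector count",
    198: "Offline uncorrectable sector count",
}


def disks_to_copy_first(smart_by_disk):
    """Map each suspicious disk to the reasons it was flagged.

    smart_by_disk maps a disk name to {attribute id: raw value}.
    """
    flagged = {}
    for disk, attributes in smart_by_disk.items():
        reasons = [
            f"{name} = {attributes[attr_id]}"
            for attr_id, name in WATCHED_ATTRIBUTES.items()
            if attributes.get(attr_id, 0) > 0
        ]
        if reasons:
            flagged[disk] = reasons
    return flagged
```

For example, `disks_to_copy_first({"disk0": {5: 0, 197: 0, 198: 0}, "disk1": {5: 8, 197: 0, 198: 0}})` flags only `disk1`.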

Software interference

During the recovery, you need to avoid any modifications of the data on disks by any software that may be running. Since all my recovery software runs on Windows, the scope of the discussion is limited to Windows.

Windows running directly on the hardware

If you are recovering a Linux filesystem (EXT, F2FS, ZFS, or some such), Windows will not touch it. As a general rule, if Windows sees a filesystem it does not recognize, it asks you once to format the disk, and after you answer NO, Windows does not touch that filesystem. So, if you bring a NAS disk set to a Windows machine for recovery, it is enough to answer NO to a "format" prompt for every partition (typically, four times per physical disk).

The same applies to what's known as the "RAW" filesystem, that is, a filesystem so badly damaged that Windows cannot recognize it. Decline one "format" prompt, and Windows loses all interest in the partition.

Now, if you are recovering a FAT, NTFS, or ReFS filesystem, Windows may want to write to it. Whatever quick automatic fixes it attempts (a feature called Self-Healing NTFS) are already done by the time you are thinking about recovery, so it is too late to worry about these. However, you want to prevent any background activity by antivirus software, search indexing, backup software, and whatnot. To do so, you need two things:

  1. Watch the screen while Windows boots for the first time after you have connected the disk. If you get the CHKDSK prompt on system startup, press any key to cancel the disk check. If you can't catch it in time, cut the power off and try again. You don't want CHKDSK doing anything to the damaged disk.
  2. Once in Windows, use Disk Management to unassign the drive letter from the drive you are recovering. Windows is generally not interested in partitions that have no drive letter assigned.
Unassigning the drive letter using Windows Disk Management.

Removing a drive letter does not change data on the disk. You can later assign the drive letter back. However, do not remove the drive letter from your system drive, for that will cause issues.
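If you prefer the command line to Disk Management, the same step can be scripted with `diskpart`. The Python sketch below only generates the script text and refuses to target the system drive. The function name and the `system_drive` parameter are assumptions made for the example, and the default of `C:` should be checked against your own system.

```python
def diskpart_remove_letter_script(letter, system_drive="C:"):
    """Build a diskpart script that removes a drive letter from a volume.

    Raises ValueError for a malformed letter or for the system drive.
    """
    letter = letter.rstrip(":\\").upper()
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError("expected a single drive letter such as 'E' or 'E:'")
    if letter == system_drive.rstrip(":\\").upper():
        raise ValueError("refusing to remove the letter of the system drive")
    # 'select volume' accepts a drive letter; 'remove letter=' unassigns it.
    return f"select volume {letter}\nremove letter={letter}\n"
```

Save the returned text to a file, for example `remove_letter.txt`, and run `diskpart /s remove_letter.txt` from an elevated command prompt.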

Windows running inside a virtual machine

Running a recovery from inside a virtual machine introduces a couple of new factors.

  1. VM hypervisors are not very good at handling hardware problems. If something is wrong with the hardware, be it disks, controllers, or cabling, being inside the VM often makes it more difficult to isolate the problem.
  2. While Windows does not read non-Windows filesystems, a Linux-based hypervisor host does. Take special care to ensure the host does not have the disks mounted at the same time as the VM is using them. Yes, this is possible in some configurations.
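One way to check the second point on a Linux host is to look through the host's mount table for the disks you have passed to the VM. The Python sketch below parses text in the format of `/proc/mounts` and lists entries whose source is one of the given devices or a partition on them. The function names and the device paths in the usage note are assumptions for the example. Note the limitation: this only catches filesystems mounted directly from a device node. An imported ZFS pool, an assembled md array, or an activated LVM volume appears under a different source name and needs to be checked separately.

```python
def _is_on_device(source, device):
    """True if source is the device itself or a partition on it (sdb1, nvme0n1p2)."""
    if not source.startswith(device):
        return False
    suffix = source[len(device):]
    return suffix == "" or suffix.lstrip("p").isdigit()


def host_mounts_using(mounts_text, devices):
    """Return (source, mount point, filesystem type) for mounts on the given devices."""
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        source, mount_point, fs_type = fields[:3]
        if any(_is_on_device(source, device) for device in devices):
            hits.append((source, mount_point, fs_type))
    return hits
```

On the host, something like `host_mounts_using(open("/proc/mounts").read(), ["/dev/sdb", "/dev/sdc"])` should return an empty list before you start the VM.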

Filed under: ZFS.

Created Wednesday, April 26, 2023