Comment on Runtime's (GetDataBack) 'Fragmented files are not recoverable'

Runtime (developers of GetDataBack) list their expectations and estimates for recovery success in an article here.

While the article is for the most part well written, its section on fragmentation says:

As annoying as it is, ... [fragmented] files are unrecoverable.

There is no automated data recovery software available that can solve fragmentation satisfactory. If you want to recombine a file consisting of 10 clusters on a 20 GB drive you must analyze, given a cluster size of 32 KB, all possible combinations of one known cluster with 9 other clusters out of possible 625000. There are 625000^9 possible combinations, a number with 52 digits.

The only possible and more intelligent approach is a "manual" data recovery for a particular file.
...

Even data recovery service companies will most likely not produce better results.
...

Putting aside the fact that I know such files are recoverable, because I have been doing these recoveries routinely for some time, the calculation itself is incorrect.

Given there are M clusters in a file and N clusters on the disk, the naïve approach is indeed to try N^(M-1) combinations and see which, if any, produces a valid file. Obviously, this implementation is too slow. What's worse, it is error-prone: the validation routine is bound to produce false positives if you validate too many candidate files.
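To put a number on the naïve count, here is a minimal arithmetic sketch using the figures from the quoted example. It assumes decimal units (20 GB = 20 * 10^9 bytes, 32 KB = 32 * 10^3 bytes), which is what yields the 625000 clusters quoted above; the variable names are mine.

```python
# Figures from the quoted example; decimal units are an assumption.
DISK_BYTES = 20 * 10**9      # 20 GB drive
CLUSTER_BYTES = 32 * 10**3   # 32 KB clusters
M = 10                       # clusters in the file

N = DISK_BYTES // CLUSTER_BYTES   # clusters on the disk
naive_tests = N ** (M - 1)        # one known cluster, M-1 unknown ones

print(N)                      # 625000
print(f"{naive_tests:.2e}")   # order of 10**52
```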

However, there are other factors at play, and there are algorithms that are not nearly that slow:

  • A file is unlikely to be fragmented so much as to have no adjacent clusters at all.
  • There is no need to determine all clusters simultaneously in one go. In most files, we can look for clusters in turn, one after another. This changes the required number of tests from N^(M-1) to N*(M-1), that is, from N to the power of M-1 to N multiplied by M-1. It is still bad, but not nearly as bad as before.
  • If the file format allows us to detect long continuous fragments, the effort is reduced even further.
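The "clusters in turn" idea can be sketched roughly as follows. This is a toy illustration, not the algorithm of any particular recovery tool: the function `carve_in_turn`, the `continues` callback and the sample data are all hypothetical, and a real continuity test would be specific to the file format. The sketch tries the physically adjacent cluster first, then scans the rest, so it performs at most N*(M-1) tests.

```python
from typing import Callable, List, Sequence, Tuple

def carve_in_turn(
    clusters: Sequence[bytes],
    first: int,
    length: int,
    continues: Callable[[bytes, bytes], bool],
) -> Tuple[List[int], int]:
    """Greedy, one-cluster-at-a-time reassembly (hypothetical helper).

    Returns the chain of cluster indices found and the number of
    continuity tests performed, which is at most N * (M - 1).
    """
    chain = [first]
    used = {first}
    tests = 0
    while len(chain) < length:
        prev = chain[-1]
        # Try the physically adjacent cluster first, then all the others.
        order = [prev + 1] + [i for i in range(len(clusters)) if i != prev + 1]
        for i in order:
            if i in used or not 0 <= i < len(clusters):
                continue
            tests += 1
            if continues(clusters[prev], clusters[i]):
                chain.append(i)
                used.add(i)
                break
        else:
            break  # no candidate fits; stop with a partial chain
    return chain, tests

# Toy "file format" (an assumption): each cluster starts with the byte
# that follows the last byte of the previous cluster.
def toy_continues(prev: bytes, cand: bytes) -> bool:
    return cand[0] == (prev[-1] + 1) % 256

disk = [bytes([0, 1]), bytes([2, 3]), bytes([9, 9]), bytes([6, 7]), bytes([4, 5])]
chain, tests = carve_in_turn(disk, first=0, length=4, continues=toy_continues)
print(chain, tests)  # [0, 1, 4, 3] 6
```

Here N = 5 and M = 4, so the bound is 15 tests, and the adjacent-first ordering brings it down to 6. A greedy search like this takes the first candidate that passes, so a false positive from the continuity test will derail it; it only works where the format lets one cluster be checked against the next with reasonable confidence.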

The conclusion is that while the task is indeed computationally enormous, it is not intractable, especially for media of practical size like memory cards.

Filed under: File carving.

Created Wednesday, January 24, 2018

Updated 20 May 2018