Bifragment gap carving in real-world applications

What is bifragment gap carving?

Bifragment gap carving is a data recovery algorithm for a specific use case. It can recover fragmented files within the following constraints:

  1. A file must be stored in exactly two fragments, separated by a gap that can contain arbitrary data.
  2. The two fragments must be stored in the correct order, with the second fragment after the first (the gap size must be positive).
  3. The gap size must be below a configurable limit. This is not a theoretical requirement, because setting the limit larger than the media size effectively allows any gap size. However, the amount of computation grows with the range of gap sizes to test, so the practical limit is low (up to about 1000 clusters as of 2018). More fundamentally, a larger gap limit increases the probability of a false positive, in which fragments of two different files are merged into one.

Figure: bifragment gap carving. Top: unfragmented file; bottom: fragmented file.
S is the unfragmented file size, L is the gap size, and D = S + L.
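
The constraints above suggest a simple brute-force search: for a given header and footer, try each allowed gap size and each gap position, and keep the first candidate that validates. The following is a minimal Python sketch of that idea, not Klennet Carver's implementation. It assumes the media is already split into clusters, that the header and footer clusters were found by signature search, and that a format-specific validator is available. The names `bifragment_gap_carve` and `is_valid`, and the SHA-256 check in the demo, are hypothetical stand-ins.

```python
import hashlib
from typing import Callable, List, Optional, Tuple


def bifragment_gap_carve(
    clusters: List[bytes],
    header: int,
    footer: int,
    max_gap: int,
    is_valid: Callable[[bytes], bool],
) -> Optional[Tuple[int, int]]:
    """Return (gap_start, gap_size) for the first valid candidate, else None.

    header and footer are the cluster indexes of the first and last
    cluster of the file (assumed to be known in advance).
    """
    span = footer - header + 1  # header-to-footer distance in clusters
    # Each fragment must keep at least one cluster, so the gap is at most span - 2.
    for gap_size in range(1, min(max_gap, span - 2) + 1):
        for gap_start in range(header + 1, footer - gap_size + 1):
            first = clusters[header:gap_start]
            second = clusters[gap_start + gap_size:footer + 1]
            if is_valid(b"".join(first + second)):
                return gap_start, gap_size
    return None


# Demo on a toy "media": file clusters A, B, C, D with two foreign clusters between B and C.
media = [b"XXXX", b"AAAA", b"BBBB", b"g1g1", b"g2g2", b"CCCC", b"DDDD", b"YYYY"]
expected = hashlib.sha256(b"AAAABBBBCCCCDDDD").digest()


def is_valid(candidate: bytes) -> bool:
    # Stand-in validator; a real carver would decode the file format instead.
    return hashlib.sha256(candidate).digest() == expected


result = bifragment_gap_carve(media, header=1, footer=6, max_gap=4, is_valid=is_valid)
print(result)  # (3, 2): the gap starts at cluster 3 and is 2 clusters long
```

In this sketch, a file of S clusters has S - 1 possible gap positions for every gap size tested, so the number of candidates grows roughly in proportion to the gap limit multiplied by the file size. This is one way to see why a low gap limit keeps the search fast.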

How effective is it?

Bifragment gap carving might seem like a very limited method of data recovery. However, it is much faster than a full-scale analysis of the entire media, and it scales well as media size increases. It can therefore be worth using even though it recovers only some of the files. Based on a relatively small sample of just under 10,000 fragmented files, the statistics look like this:

Figure: bifragment file statistics. Distribution of the number of fragments per file among fragmented files.

Figure: bifragment gap size distribution. Percentage of all fragmented files that can be recovered for a given gap size.

In words, the charts show the following:

  • 65% of all fragmented files are stored in exactly two fragments.
  • About 35% of all fragmented files can be recovered with a maximum gap size of 128 filesystem clusters.
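
The two figures can be combined into a rough estimate of how many two-fragment files have a small gap. The short calculation below is only a sketch based on the rounded percentages quoted above. It assumes that every file counted as recoverable at 128 clusters is a two-fragment file, which follows from the constraints of the method.

```python
# Figures quoted above, as fractions of all fragmented files in the sample.
bifragment_share = 0.65    # files stored in exactly two fragments
recoverable_at_128 = 0.35  # recoverable with a gap limit of 128 clusters

# Only two-fragment files are candidates, so this ratio estimates the share
# of two-fragment files whose gap is at most 128 clusters.
share_of_bifragment = recoverable_at_128 / bifragment_share
print(f"{share_of_bifragment:.0%}")  # 54%
```

So, by this estimate, roughly half of the two-fragment files in the sample have a gap of 128 clusters or less.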

How does Klennet Carver use bifragment gap carving?

Klennet Carver uses a simplified implementation of bifragment gap carving as the first stage of image recovery. The goal is to quickly eliminate all bifragment files before starting a full-scale analysis, thus reducing overall processing time.

Bifragment gap carving is not used for video or ZIP-based files because it does not significantly improve on the algorithms already in use for those formats.

The bifragment algorithm is less effective on hard drives because their cluster sizes are much smaller than those on memory cards.

Filed under: File carving.

Created Friday, April 6, 2018

Updated 20 May 2018