Bifragment gap carving in real-world applications

What is bifragment gap carving?

Bifragment gap carving is a data recovery algorithm for a very specific use case. It can recover fragmented files within the following constraints:

  1. The file must be stored in exactly two fragments, separated by a gap that can contain arbitrary data.
  2. The two fragments must be correctly ordered (that is, the gap size must be positive).
  3. The gap size must be below a certain configurable limit. This is not really a theoretical requirement, because one can set the limit to be larger than the media size, effectively allowing any gap size. However, because computational requirements increase as the range of possible gap sizes increases, the practical limit is low (think up to 1000 clusters as of 2018). More fundamentally, increasing the gap size increases the probability of a false positive, in which fragments of two different files are merged.

Figure: Bifragment gap carving. Top: unfragmented file; bottom: fragmented file. S is the unfragmented file size, L is the gap size, and D = S + L.
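
The search described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any particular tool. It assumes that the file's first cluster offset and its size S are already known (for example, from a parsed header), that the start is cluster-aligned, and that a format-specific validator is available. The names bifragment_carve and is_valid are hypothetical.

```python
from typing import Callable, Optional, Tuple


def bifragment_carve(
    image: bytes,
    start: int,          # byte offset of the file's first cluster (assumed known)
    size: int,           # unfragmented file size S in bytes (assumed known)
    cluster: int,        # cluster size in bytes
    max_gap: int,        # configurable limit on the gap size, in clusters
    is_valid: Callable[[bytes], bool],  # hypothetical format validator
) -> Optional[Tuple[int, int]]:
    """Return (split, gap) in clusters for the first valid reassembly, or None."""
    total = -(-size // cluster)  # clusters occupied by the file (ceiling division)
    for gap in range(1, max_gap + 1):        # gap must be positive
        end = start + gap * cluster + size   # D = S + L, measured from the start
        if end > len(image):
            break                            # larger gaps would run past the media
        for split in range(1, total):        # clusters in the first fragment
            first = image[start:start + split * cluster]
            second = image[start + (split + gap) * cluster:end]
            if is_valid(first + second):
                return split, gap
    return None
```

In this sketch the validator is called at most max_gap × (clusters in file − 1) times per file, which is why the cost grows with the allowed range of gap sizes. A weak validator would also accept more wrong combinations as that range grows.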

How effective is it?

Bifragment carving might seem a very limited method of data recovery. However, it is much faster than a full-scale analysis of the entire media, and it scales well as media size increases, so it may be worth employing even if only some of the files can be recovered. Based on a fairly small sample of just under 10,000 fragmented files, the statistics look like this:

Figure: Bifragment file statistics. Distribution of the number of fragments per file among fragmented files.

Figure: Bifragment gap size distribution. Percentage of all fragmented files that can be recovered for a given gap size.

Put in words, the charts mean this:

  • 65% of all fragmented files are stored in exactly two fragments.
  • About 35% of all fragmented files can be recovered with a maximum gap size of 128 filesystem clusters.
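
For readers who want to compute the same two figures on their own data, here is a rough sketch. It assumes each file is described by its extents in file order, as (first cluster, length in clusters) pairs. The function name bifragment_stats and the extent layout are assumptions made for illustration and do not come from the original measurements.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (first cluster, length in clusters), in file order


def bifragment_stats(files: List[List[Extent]], max_gap: int) -> Tuple[float, float]:
    """Return (share in two fragments, share recoverable) among fragmented files."""
    fragmented = [f for f in files if len(f) > 1]
    if not fragmented:
        return 0.0, 0.0
    two = [f for f in fragmented if len(f) == 2]

    def gap(f: List[Extent]) -> int:
        (s1, n1), (s2, _) = f
        return s2 - (s1 + n1)  # clusters between the end of fragment 1 and fragment 2

    # Recoverable: correctly ordered (positive gap) and within the limit.
    recoverable = [f for f in two if 0 < gap(f) <= max_gap]
    return len(two) / len(fragmented), len(recoverable) / len(fragmented)
```

Both shares use the number of fragmented files as the denominator, matching the way the percentages above are stated.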

How does Klennet Carver use bifragment gap carving?

Klennet Carver uses a simplified implementation of bifragment gap carving as a first stage of image recovery. The goal is to quickly eliminate all bifragment files before starting a full-scale analysis, thus reducing overall processing time.
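
The benefit of such a first stage can be illustrated with a small sketch: every cluster claimed by a quickly recovered file no longer needs to be considered by the slower analysis. This is not Klennet Carver's code. The function remaining_clusters and the extent representation are hypothetical and only show the idea of shrinking the work left for the second stage.

```python
from typing import Iterable, List, Set, Tuple

Extent = Tuple[int, int]  # (first cluster, length in clusters)


def remaining_clusters(total_clusters: int,
                       recovered: Iterable[List[Extent]]) -> List[int]:
    """List the clusters not claimed by files recovered in the quick first pass."""
    claimed: Set[int] = set()
    for extents in recovered:
        for start, length in extents:
            claimed.update(range(start, start + length))
    return [c for c in range(total_clusters) if c not in claimed]
```

For example, if the first pass recovers one file occupying clusters 0 to 1 and 4 to 6 on a 10-cluster image, only clusters 2, 3, 7, 8 and 9 are left for the full-scale analysis.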

Bifragment gap carving is not used for video or ZIP-based files, because it does not provide any significant improvement over the algorithms already in use.

On a hard drive, the bifragment algorithm is less effective because cluster sizes are much smaller than on a memory card.

Created Friday, April 6, 2018

Updated 20 May 2018