Klennet Storage Software

Hash stability

Hashing is used to ensure that the content remains the same over time. Once the image is acquired, it is associated with its hash value. At any later time, the hash value can be recomputed over the image to ensure the image was not changed in any way. This is possible because any change to the image, even a single flipped bit, dramatically changes the hash value. For the hashes to match, two images must be identical.

This is only useful in forensics; data recovery people should not bother, really.
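The record-then-recompute workflow described above can be sketched roughly as follows. This is a minimal illustration, not a description of how any particular tool does it: the function names `hash_image` and `verify_image`, the choice of SHA-256, and the 1 MiB chunk size are all assumptions made for the example.

```python
import hashlib


def hash_image(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of the image file at `path`.

    The file is read in chunks so that a large image does not have
    to fit in memory. The algorithm and chunk size are illustrative.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as image:
        while True:
            chunk = image.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(path, recorded_hash, algorithm="sha256"):
    """Recompute the hash and compare it with the value recorded at acquisition."""
    return hash_image(path, algorithm) == recorded_hash.strip().lower()
```

In use, `hash_image` would be called once right after acquisition and the result stored alongside the image; `verify_image` would be called at any later time with that stored value.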

There are complications, though, because any change in the data, no matter how minor, completely invalidates the hash. The hash tells you that something changed, but not how much was changed or where.
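A small experiment makes this concrete. The sketch below flips one bit in a made-up 4 KiB buffer, once at the very start and once at the very end, and compares the SHA-256 digests. The helper names `flip_bit` and `differing_bits` are hypothetical, and the buffer stands in for a real image.

```python
import hashlib


def flip_bit(data, bit_index):
    """Return a copy of `data` with exactly one bit inverted."""
    changed = bytearray(data)
    changed[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(changed)


def differing_bits(a, b):
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = bytes(4096)                                # made-up stand-in for an image
changed_at_start = flip_bit(original, 0)              # first bit flipped
changed_at_end = flip_bit(original, 4096 * 8 - 1)     # last bit flipped

h_original = hashlib.sha256(original).digest()
h_start = hashlib.sha256(changed_at_start).digest()
h_end = hashlib.sha256(changed_at_end).digest()

# Both single-bit changes produce a mismatch, and a large share of the
# 256 digest bits differ in each case.
print(differing_bits(h_original, h_start), "of 256 digest bits differ")
print(differing_bits(h_original, h_end), "of 256 digest bits differ")
```

Neither digest gives any hint that one change was at the start of the data and the other at the end; all the comparison says is "not the same".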

Using a hash to ensure nothing has changed sounds like a great idea, and, when applied to a file, it actually is. A live filesystem volume is an entirely different ballgame. Any modern filesystem writes to the volume when the volume is opened. This happens even if the volume is not mounted (that is, has no drive letter assigned to it). NTFS, for example, will update its journal every time the physical drive is connected. Effectively, every time you connect a hard drive to a Windows PC, every volume on that hard drive has its contents changed and will get a different hash. Obviously, the same applies to reboots.

Drives with bad sectors are often unstable, in the sense that different sectors will be bad on each imaging pass. Obviously, this will produce a different hash value with each new pass. So, with bad sectors, it makes sense to hash the target image, not the source drive, if you want to ensure that the image is stable over time and was not tampered with.
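The effect can be simulated in a few lines. The sketch below is only a toy model: the 512-byte sector size, the made-up source data, the `image_pass` helper, and the convention of zero-filling unreadable sectors are all assumptions for illustration, not a description of any specific imaging tool.

```python
import hashlib

SECTOR_SIZE = 512  # assumed sector size for this toy model


def image_pass(source, unreadable_sectors):
    """Simulate one imaging pass over `source`.

    Sectors listed in `unreadable_sectors` could not be read on this
    pass and are zero-filled in the resulting image (an assumption).
    """
    image = bytearray(source)
    for sector in unreadable_sectors:
        start = sector * SECTOR_SIZE
        image[start:start + SECTOR_SIZE] = bytes(SECTOR_SIZE)
    return bytes(image)


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


source = bytes(range(256)) * 16          # 8 sectors of made-up data
first_pass = image_pass(source, {2})     # sector 2 fails on the first pass
second_pass = image_pass(source, {5})    # sector 5 fails on the second pass

# Two passes over the same drive do not agree with each other...
passes_match = sha256_hex(first_pass) == sha256_hex(second_pass)

# ...but the image saved from the first pass hashes the same every time.
recorded = sha256_hex(first_pass)
image_still_matches = sha256_hex(first_pass) == recorded

print("passes match:", passes_match)
print("saved image matches its recorded hash:", image_still_matches)
```

The saved image is a fixed file, so its hash is reproducible even though the drive it came from is not.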