Tuesday, May 23, 2006

Crypto with Schneier II: One Way Hash


My reading of Schneier has continued. He still proves to be a very accommodating writer; consider the following on the "One-Time Pad," an encryption scheme that can be literally unbreakable:

"Many Soviet spy messages to agents were encrypted using one-time pads. These messages are still secure today and will remain that way forever. It doesn't matter how long the supercomputers work on the problem. Even after the aliens from Andromeda land with their massive spaceships and undreamed-of computing power, they will not be able to read the Soviet spy messages encrypted with one-time pads (unless they can also go back in time and get the one-time pads)."

In the second chapter he discusses the basics of protocols and describes the one-way hash: "a function, mathematical or otherwise, that takes a variable-length input string (pre-image) and converts it to a fixed-length output string." Essentially, a computed hash is a fixed-length byte array that acts as a fingerprint of its input stream. It is not strictly unique, since two different inputs can in principle produce the same hash, but with a good hash function that is extremely unlikely to happen by accident. This is interesting and useful beyond cryptography.
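To make the "variable-length in, fixed-length out" property concrete, here is a minimal sketch in C#. The class and method names (HashLengthDemo, HashOf, ToHex) are my own, made up for illustration, and I'm assuming MD5 as the hash function. Whatever the length of the input string, the resulting array is always 16 bytes (128 bits).

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashLengthDemo
{
    // Hypothetical helper: hashes the UTF-8 bytes of a string with MD5.
    // The result is always 16 bytes, regardless of the input length.
    public static byte[] HashOf(string input)
    {
        using (MD5 hasher = MD5.Create())
        {
            return hasher.ComputeHash(Encoding.UTF8.GetBytes(input));
        }
    }

    // Hypothetical helper: formats a hash as lowercase hex for display.
    public static string ToHex(byte[] hash)
    {
        StringBuilder sb = new StringBuilder(hash.Length * 2);
        foreach (byte b in hash)
        {
            sb.Append(b.ToString("x2"));
        }
        return sb.ToString();
    }
}
```

Calling HashLengthDemo.ToHex(HashLengthDemo.HashOf("abc")) and then the same thing on a ten-thousand-character string gives two 32-character hex strings, which is the whole point: the output size tells you nothing about the input size.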

Here's a simple idea from Steve Oualline's book "Wicked Cool Perl Scripts," but implemented in C#: you can compute a hash to check to see if two files are different or the same.

The following code uses the System.IO and System.Security.Cryptography namespaces:

// Returns true when both files produce the same MD5 hash.
private static bool CompareFiles(string path1, string path2)
{
    byte[] hash1 = GetMD5Hash(path1);
    byte[] hash2 = GetMD5Hash(path2);
    // MD5 hashes are always 16 bytes, so both arrays have the same length.
    for (int i = 0; i < hash1.Length; i++)
    {
        if (hash1[i] != hash2[i])
            return false;
    }
    return true;
}

// Reads the whole file into memory and computes its MD5 hash.
private static byte[] GetMD5Hash(string pth)
{
    using (MD5 hasher = MD5.Create())
    {
        byte[] fileBytes = File.ReadAllBytes(pth);
        return hasher.ComputeHash(fileBytes);
    }
}

I was curious about the performance of this technique. I was originally testing it on 40 MB files, so I extracted some old database backup files, about 200 MB apiece, and the comparison took somewhere on the order of 20 seconds. My laptop has a 3 GHz P4 with 1 GB of RAM: nice, but not out of the ordinary (especially these days). I also suspect that most of that time went not to the hash computation itself but to reading each file into a byte array. I'll test sometime to see...
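One way to check that suspicion is to time the read and the hash separately. The sketch below is only that, a sketch: the class and method names (HashTiming, GetMD5HashStreamed, TimeReadAndHash) are made up for illustration, and the numbers will depend heavily on the disk and on whether the operating system already has the file cached from a previous run. It also includes a variant of GetMD5Hash that passes a FileStream to ComputeHash, so a 200 MB file never has to sit in memory all at once.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

public static class HashTiming
{
    // Hashes a file by streaming it through ComputeHash(Stream),
    // instead of loading the whole file into a byte array first.
    public static byte[] GetMD5HashStreamed(string path)
    {
        using (MD5 hasher = MD5.Create())
        using (FileStream stream = File.OpenRead(path))
        {
            return hasher.ComputeHash(stream);
        }
    }

    // Times the file read and the hash computation separately,
    // reporting elapsed milliseconds for each step.
    public static void TimeReadAndHash(string path, out long readMs, out long hashMs)
    {
        Stopwatch timer = Stopwatch.StartNew();
        byte[] fileBytes = File.ReadAllBytes(path);
        readMs = timer.ElapsedMilliseconds;

        timer = Stopwatch.StartNew();
        using (MD5 hasher = MD5.Create())
        {
            hasher.ComputeHash(fileBytes);
        }
        hashMs = timer.ElapsedMilliseconds;
    }
}
```

If readMs dwarfs hashMs on a first run against a large file, the disk is the bottleneck and a faster hash would not help much. The streamed version should return exactly the same bytes as GetMD5Hash for the same file.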

