[Breathe, Jonathan. Just breathe. Oh, the reader has arrived.]
Oh, hey friend. Do you remember a couple of months ago when Gail Riplinger claimed the MD5 hashing algorithm was used to distort her voice? If not, allow me to jog your memory:
“Since the MD5 algorithm is open source, programs to distort a voice and make it subtly more difficult to discern are widely available to non-professionals; dozens of apps can be purchased to do this.”
In the above sentence, the word “since” means “because”. The statement “the MD5 algorithm is open source” is given as the cause of the statement “programs to distort a voice…are widely available.” In other words, Gail claims MD5 can distort audio.
I pointed out in my response that MD5 hashes are used for many things, but they aren’t used for audio distortion. Her assertion to the contrary demonstrates a complete lack of relevant knowledge on her part.
Recall that the MD5 hashing algorithm accepts variable-length input and produces a fixed 128-bit digest, conventionally written as 32 hexadecimal characters. Most importantly, recall that the original data cannot be recovered from that digest. Pass in 64,512 characters of data, and you will get 32 characters of output. Pass in 0 characters of data, and you will get 32 characters of output.
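For the curious, here is a minimal sketch of that fixed-length behavior, assuming Python and its standard `hashlib` module. The helper name `md5_hex` is my own invention for illustration, not part of any library.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


# Zero bytes of input and 64,512 bytes of input.
empty_digest = md5_hex(b"")
large_digest = md5_hex(b"a" * 64512)

# Both digests are 32 hexadecimal characters long, whatever the input size.
print(len(empty_digest), empty_digest)
print(len(large_digest), large_digest)
```

Nothing in those 32 characters lets you rebuild the input, which is why a hash can fingerprint audio but cannot alter it.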
Well, Gail is now trying to save face. Or not. Honestly, I have no idea what this lady is thinking. To even attribute thought to her aimless scrawlings is generous.
In her most recent post, Gail claims that I am “unsure as to how MD5 is related to audio forensics.” As shown in my earlier article, I understand how MD5 is used to test file integrity. But Gail didn’t tie MD5 to file integrity; she tied it to voice distortion.
It seems now that Gail is claiming to have used MD5 to check the file’s signature (that is, the hash generated from the file). As I pointed out in my original article, this is a legitimate use of the technology.
Gail provides a couple of helpful quotes:
“Any changes, even the simple act of opening and resaving a file without any content changes, can alter the calculated MD5 value.”
“Two of the most common hashes used in the audio/video forensic field are message-digest algorithm 5 (MD5….)”
Correct. No disagreement here. These authors understand the topic. Gail does not.
Let us, for now, pretend that Gail never claimed MD5 was used in audio distortion. Let us pretend that Gail claimed from the start that she used MD5 to authenticate James’ debate audio. Now, let us see how quickly even our imagination betrays us.
What audio does Gail have in her possession? Well, she claims to have received “originals” in the form of one or more cassette tapes:
“On the originals, my voice is clear…I received the original tape from a listener.”
I called Gail after reading this, hoping to get her copy of the debate to compare for myself, and perhaps upload for the consideration of others. After all, if what she claims is correct, and James has an edited copy, James should be confronted.
Gail told me that she did not have a digital copy; only a cassette. She did, however, say that she would try to get a family member to digitize the audio. I believe she was going to check with her son-in-law.
So let’s assume that Gail does have a recording, and that the recording differs from James’. Let’s assume she has her very own cassette tape (as she claims), just as James has his.
What problem has Gail created for herself now? Gail claims to have performed a forensic analysis using MD5 on the debate audio; but MD5 operates on the bytes of a digital file (such as a WAV or MP3), and an analog cassette has no bytes to hash. You cannot do digital analysis on a cassette tape, Gail.
You may be asking yourself, is there any way Gail can redeem this MD5 story? There might be, but it’s going to take a little more lying on her part.
If KRDS had provided identical (bit for bit) MP3 files to White and Riplinger, and White uploaded a manipulated version to a public share, Riplinger could then download White’s file and check its MD5 hash against the hash from her own original MP3. This would tell her whether White made modifications or not.
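Here is a rough sketch of what that check could look like, assuming Python's standard `hashlib` module. The function names, the file names, and the stand-in bytes are all hypothetical; no real debate audio is involved.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=65536):
    """Hash a file in chunks so large recordings need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(path_a, path_b):
    """True when both files produce the same MD5 digest."""
    return file_md5(path_a) == file_md5(path_b)


with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "original.mp3")
    exact_copy = os.path.join(folder, "exact_copy.mp3")
    altered = os.path.join(folder, "altered.mp3")

    # Stand-in bytes, not real MP3 data.
    payload = b"stand-in bytes, not real MP3 data" * 1000
    contents = {
        original: payload,
        exact_copy: payload,
        altered: payload[:-1] + b"!",  # a single byte changed
    }
    for path, data in contents.items():
        with open(path, "wb") as handle:
            handle.write(data)

    copy_matches = files_match(original, exact_copy)
    altered_matches = files_match(original, altered)

print(copy_matches, altered_matches)
```

Note that the comparison only means something when both parties start from the same bit-for-bit file, which is exactly the condition this alternate history has to invent.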
But even in this alternate history, Gail’s own source says “…the simple act of opening and resaving a file without any content changes, can alter the calculated MD5 value.” So even if Gail performed a digital analysis, a mismatched MD5 hash doesn’t necessarily mean manipulation; it could mean nothing more than a harmless re-saving of the data.
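A toy illustration of that point, assuming Python's standard `hashlib` module: the "audio" below is just stand-in bytes, and the `encoder=` headers are a made-up format meant to mimic the metadata a program might rewrite on save. Real audio containers are more involved, but the principle is the same.

```python
import hashlib

# Toy stand-in for audio samples; a real file would hold encoded audio frames.
audio_payload = bytes(range(256)) * 4

# Hypothetical metadata headers, as two different programs might write them.
saved_by_recorder = b"encoder=RecorderApp\n" + audio_payload
saved_by_editor = b"encoder=EditorApp\n" + audio_payload

hash_recorder = hashlib.md5(saved_by_recorder).hexdigest()
hash_editor = hashlib.md5(saved_by_editor).hexdigest()

# The audio bytes are identical in both files...
payload_unchanged = (
    saved_by_recorder.endswith(audio_payload)
    and saved_by_editor.endswith(audio_payload)
)
# ...yet the file-level hashes disagree, because the headers differ.
hashes_match = hash_recorder == hash_editor

print(payload_unchanged, hashes_match)
```

A mismatch tells you the bytes differ somewhere in the file. It does not tell you where, or whether anyone touched the audio itself.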
Quit now, Gail. For your own sake, quit now.