This guy took some interesting leaps of faith to get his result. I'm impressed. I wonder if he had a background in this industry that led him to expect each long square wave to represent a 1 or 0? I would have expected a more intricate binary-to-audio algorithm, considering how much science must exist in this area (modems, ham radio, etc.).
I don't think there were any leaps of faith in there.
First, they thought 1s/0s might match the waveform. Apparently that wasn't the case.
So they decided to look at the data: see the histogram, which is incredibly skewed towards four values. So they tried 1s/0s on those values. Nada.
So they wrote a script to try different endianness, alignments, flipped bits, etc., and they got lucky because 'KORG' was there in ASCII.
Had those failed, there probably would have been quite a few more attempts, but Korg didn't remove their signature from the data, so it wasn't necessary.
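For anyone curious what that kind of brute-force script might look like, here is a minimal sketch in Python. It is not the author's actual script: the function names, the set of conventions tried (bit offset, bit order within each byte, inversion), and the input format (a flat list of 0/1 ints) are all my own assumptions for illustration.

```python
from itertools import product

def pack_bits(bits, offset, msb_first, invert):
    """Pack a list of 0/1 ints into bytes under one candidate convention.

    offset    -- number of leading bits to skip (alignment guess)
    msb_first -- True if the first bit of each group is the high bit
    invert    -- True if every bit should be flipped before packing
    """
    out = bytearray()
    usable = bits[offset:]
    for i in range(0, len(usable) - 7, 8):
        chunk = usable[i:i + 8]
        if invert:
            chunk = [b ^ 1 for b in chunk]
        if not msb_first:
            chunk = chunk[::-1]
        value = 0
        for b in chunk:
            value = (value << 1) | b
        out.append(value)
    return bytes(out)

def find_signature(bits, signature=b"KORG"):
    """Try every combination and report the ones containing the signature."""
    hits = []
    for offset, msb_first, invert in product(range(8), (True, False), (False, True)):
        data = pack_bits(bits, offset, msb_first, invert)
        pos = data.find(signature)
        if pos != -1:
            hits.append((offset, msb_first, invert, pos))
    return hits
```

That is only 32 combinations, so it runs instantly. It only works when you have a known plaintext to look for, which is exactly why the 'KORG' string left in the data was such a gift.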
It's a reasonable assumption, given that the audio file looks like a rough square wave. Also, I'm unfamiliar with the monotribe, but it's probably a rather simple device without enough processing power for complex modulation. PWM is already fairly common in electronics as well.
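To make the square wave idea concrete, here is a rough Python sketch of how you might measure pulse widths in such a signal: count how many consecutive samples stay on the same side of a threshold, then tally the run lengths. This is only an illustration, not what the blog post does. The function names and the zero threshold are my assumptions, and I don't know whether the post's histogram is over pulse widths or over something else.

```python
from collections import Counter

def run_lengths(samples, threshold=0.0):
    """Return the lengths of consecutive runs on the same side of threshold."""
    runs = []
    if not samples:
        return runs
    current_high = samples[0] > threshold
    length = 0
    for s in samples:
        high = s > threshold
        if high == current_high:
            length += 1
        else:
            # The signal crossed the threshold, so the previous run has ended.
            runs.append(length)
            current_high = high
            length = 1
    runs.append(length)  # close the final run
    return runs

def width_histogram(samples, threshold=0.0):
    """Count how often each run length occurs."""
    return Counter(run_lengths(samples, threshold))
```

If the encoding really is width based, the counts should cluster around a small number of run lengths, and each cluster is a candidate symbol.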
I stripped out some of the things I tried that led me down the wrong path, so while the blog post reads like I kept making progress, I left out the 3 or 4 things I tried without any results.
I have done a bit of reverse engineering work in the past:
- I figured out the SHA-1 encoding mechanism on Xbox 360 .XEX files
- I modified a laptop's BIOS loading bitmap image (that originally said 'toshiba') - http://os-fun.blogspot.com/
- A friend and I extracted the level data for the game "SkyRoads". We couldn't figure out the file's compression format, so we ran the game in DOSBox, dumped the entire 640K of memory with GDB, searched it for the decompressed levels, and then ported them to an open source SkyRoads game.
This sort of analysis is pretty common for legacy formats. I've never seen it done for modern data, though, so that's pretty awesome. Here's a story about doing the same sort of analysis with a cassette tape for the Apple I: http://www.pagetable.com/?p=32