Codec2: GNU low-bitrate speech codec (2400 bits/second)

mctavjb9 · on Sept 24, 2010

Low bit rate codecs with acceptable quality will be a boon for VoIP over satellite. Satellite Internet service is both bandwidth-limited and hideously expensive: Inmarsat Fleet Broadband for cargo ships, for instance, typically costs $6/MB. The idea would be to set up local phone networks that use one of the GSM codecs or mu-law (allowing the use of plain vanilla handsets) and then transcode the voice to a more efficient codec like this one on either side of the satellite link to the PSTN.
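To put rough numbers on that, here is a back-of-the-envelope sketch of what one minute of one-way voice would cost at $6/MB. It assumes 1 MB = 1,000,000 bytes and counts codec payload only; the function name and the 13 kbit/s figure for GSM full rate are my own additions for comparison, not something from the article.

```python
# Rough cost of one minute of one-way voice over a metered link.
# Assumptions: 1 MB = 1,000,000 bytes, payload only (no IP/UDP/RTP headers).

USD_PER_MB = 6.0  # the Fleet Broadband price quoted above


def cost_per_minute(bitrate_bps, usd_per_mb=USD_PER_MB):
    """Return the cost in USD of 60 seconds of audio at the given bitrate."""
    bytes_per_minute = bitrate_bps / 8 * 60
    return bytes_per_minute / 1_000_000 * usd_per_mb


if __name__ == "__main__":
    codecs = [
        ("mu-law (G.711)", 64_000),
        ("GSM full rate", 13_000),
        ("Codec2", 2_400),
    ]
    for name, bps in codecs:
        print(f"{name:>15}: ${cost_per_minute(bps):.3f} per minute")
```

Real VoIP traffic would cost more than this, since every packet also carries headers, and at bitrates this low the headers can easily outweigh the payload.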

Incidentally, David Rowe is one of the pioneers of the Mesh Potato.

IgorPartola · on Sept 24, 2010

Thanks for the Mesh Potato mention. I haven't seen this project before but am very interested in stuff like this. My question is: what would happen if we got an entire country wired overnight? What kind of social, political and economic consequences would that have? I'm not thinking of the US, which already has decent infrastructure, even if we want it to be better/faster/more neutral. I am thinking of African countries, a market largely underexposed to the internet. I suppose I can picture a startup trying to do this. Just think how many more Facebook users there could be...

mctavjb9 · on Sept 24, 2010

Here are some suggestions for further reading: http://www.villagetelco.org

Steve Song's blog (Telecommunications Fellow at the Shuttleworth Foundation): http://manypossibilities.net/

http://openmobile.futuretext.com/

There was an article in Linux Journal last December that includes more technical details on the Mesh Potato.

IgorPartola · on Sept 25, 2010

Thank you very much.

blj0280 · on Sept 24, 2010

I agree that VoIP does not currently work great over satellite. I also agree that satellite internet has a lot of room for improvement. However, I have to say it is a life saver if your only other option is dial-up. I also know that the satellite internet companies are upgrading their networks next year to increase bandwidth. So, there is definitely hope. There is more information on this topic on my blog at mybluedish.com/blog.

adbge · on Sept 24, 2010

It's been my experience that latency is a bigger issue with VoIP on a satellite connection than bandwidth limitations.
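For a sense of scale, here is a minimal sketch of the propagation delay alone. It assumes a geostationary satellite (altitude about 35,786 km) sitting directly overhead, which is the best case; the function names are just for illustration, and codec, buffering and routing delays come on top.

```python
# Best-case propagation delay through a geostationary satellite.
# Assumptions: satellite directly overhead, signal travels at the speed of
# light in vacuum, no processing or queueing delay.

GEO_ALTITUDE_KM = 35_786
SPEED_OF_LIGHT_KM_S = 299_792.458


def one_way_delay_ms(altitude_km=GEO_ALTITUDE_KM):
    """Ground -> satellite -> ground delay in milliseconds."""
    return 2 * altitude_km / SPEED_OF_LIGHT_KM_S * 1000


def round_trip_delay_ms(altitude_km=GEO_ALTITUDE_KM):
    """Delay before the speaker can hear a reply start, in milliseconds."""
    return 2 * one_way_delay_ms(altitude_km)


if __name__ == "__main__":
    print(f"one way:    {one_way_delay_ms():.0f} ms")
    print(f"round trip: {round_trip_delay_ms():.0f} ms")
```

That is roughly a quarter of a second each way before any other delay is added, and no codec can compress it away.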

hvs · on Sept 24, 2010

I can finally dig out my old U.S. Robotics 2400 baud modem and start doing some low-bandwidth voice communication!

Seriously, though, the samples are very impressive. I was surprised at the quality.

woodson · on Sept 24, 2010

The samples sound quite good, suspiciously good even. I'll give it a try on speech samples from other languages and speakers (this often makes quite a difference).

nitrogen · on Sept 24, 2010

I've always wondered if some human languages work better with speech codecs than others, ever since I saw someone having a cell phone conversation in an Asian language without appearing confused or asking for clarification (based on facial expression, body language, and conversation pacing). My experience speaking English on cell phones is one of constantly repeating myself.

AndrewHampton · on Sept 24, 2010

I don't know much about voice encoding, but I'm really curious as to why all the example files are the same size (46.9 KB). Could someone explain why this is an advancement if the file sizes remain the same?

I suspect it has something to do with all of them using wav as the container, but would love to hear from someone more knowledgeable.

woodson · on Sept 24, 2010

The example files are produced by encoding and then decoding the original. In raw 16-bit PCM format the decoded output ends up the same uncompressed size as the original. The encoded bitstream files are a lot smaller.

For example: hts1a

original: 48000 bytes

encoded: 1050 bytes

Edit: Note that this is only the size of the bitstream written to disk. I didn't look into the actual format.
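A quick sanity check on those two sizes, as a sketch only. It assumes the raw file is 8 kHz, 16-bit mono PCM and uses the 20 ms frame length mentioned elsewhere in the thread; neither was checked against the actual file format.

```python
# What the two file sizes imply, under assumed audio parameters.
# Assumptions: 8 kHz sample rate, 16-bit mono PCM, 20 ms codec frames.

SAMPLE_RATE_HZ = 8000
BYTES_PER_SAMPLE = 2
FRAME_MS = 20

ORIGINAL_BYTES = 48000  # hts1a, raw PCM
ENCODED_BYTES = 1050    # hts1a, encoded bitstream on disk


def duration_seconds(pcm_bytes):
    """Length of the raw PCM clip in seconds."""
    return pcm_bytes / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)


def on_disk_bitrate_bps(encoded_bytes, seconds):
    """Average bitrate of the encoded file as stored on disk."""
    return encoded_bytes * 8 / seconds


seconds = duration_seconds(ORIGINAL_BYTES)
bitrate = on_disk_bitrate_bps(ENCODED_BYTES, seconds)
frames = seconds * 1000 / FRAME_MS
bytes_per_frame = ENCODED_BYTES / frames
ratio = ORIGINAL_BYTES / ENCODED_BYTES

if __name__ == "__main__":
    print(f"duration:        {seconds:.1f} s")
    print(f"on-disk bitrate: {bitrate:.0f} bit/s")
    print(f"bytes per frame: {bytes_per_frame:.1f}")
    print(f"size ratio:      {ratio:.1f}x")
```

Under these assumptions the on-disk rate comes out somewhat above the headline 2400 bit/s. That would be consistent with each frame being padded to a whole number of bytes, but that is a guess, not something verified.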

AndrewHampton · on Sept 24, 2010

Ah, that explains it. I didn't see that in the article.

zandorg · on Sept 24, 2010

Since the source files are not in WAV format, they have to be converted to WAV to be playable in the browser (or from the file system).

nitrogen · on Sept 24, 2010

I can hear the 50Hz modulation caused by the 20ms frames in the Codec2 samples. It's particularly problematic on the sibilants from the female sample. Codec2 seems to have a wider frequency response, but MELP is more intelligible.

That said, it sounds like a great version 0.1, and I look forward to hearing what comes in the future.
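For anyone wondering where the 50Hz comes from: it is simply the frame rate. A tiny sketch of the arithmetic, assuming an 8 kHz sample rate (my assumption; the helper names are only illustrative):

```python
# Frame rate and frame size implied by a given frame length.
# Assumption: 8 kHz sample rate for the samples-per-frame figure.


def frame_rate_hz(frame_ms):
    """Number of codec frames per second for a given frame length."""
    return 1000.0 / frame_ms


def samples_per_frame(frame_ms, sample_rate_hz=8000):
    """Number of audio samples covered by one codec frame."""
    return round(sample_rate_hz * frame_ms / 1000)


if __name__ == "__main__":
    print(f"frame rate:        {frame_rate_hz(20):.0f} Hz")
    print(f"samples per frame: {samples_per_frame(20)}")
```

Any parameter that jumps at each frame boundary repeats at that rate, which is why it can be heard as a buzz at the frame rate.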

seltzered · on Sept 24, 2010

I'm still trying to understand the processing involved, but it could be interesting if one could use it to do VoIP calls effectively over EDGE. It may require an external DSP dongle, though.