
This is the sampling theorem. You start with a continuous band-limited signal (e.g. sound pressure [0], low-pass filtered such that there is essentially no content above 20kHz [1]). You then sample it by measuring and recording the pressure, f_s times per second (e.g. 48 kHz). The result is called PCM (Pulse Code Modulation).
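
A minimal sketch of that sampling step, assuming NumPy; the 1 kHz tone, the 48 kHz rate, and the 16-bit quantization are just illustrative choices:

    import numpy as np

    f_s = 48_000                 # sampling rate, Hz
    t = np.arange(480) / f_s     # 10 ms worth of sample instants

    # A band-limited "pressure" signal: a single 1 kHz tone, well below f_s/2.
    x = np.sin(2 * np.pi * 1_000 * t)

    # PCM is just this sequence of measured levels, here quantized to 16 bits.
    pcm = np.round(x * 32767).astype(np.int16)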

Now you could play it back wrong by emitting a sharp pulse f_s times per second with the indicated level. This will have a lot of frequency content above 20kHz and, in fact, above f_s/2. It will sound all kinds of nasty. In fact, it’s what you get by multiplying the time-domain signal by a pulse train, which is equivalent to convolving the spectrum with a Dirac comb: the baseband spectrum gets copied as images around every multiple of f_s, and the result is not pretty.
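
You can see those images numerically with a sketch like this (assuming NumPy; the 8x-oversampled grid is an arbitrary stand-in for "analog" time):

    import numpy as np

    f_s = 48_000
    f_hi = 8 * f_s                       # fine grid emulating the analog output
    n = np.arange(4_800)                 # 0.1 s of PCM samples
    pcm = np.sin(2 * np.pi * 1_000 * n / f_s)

    # Narrow-pulse playback: one pulse per sample, zeros in between.
    pulses = np.zeros(len(n) * 8)
    pulses[::8] = pcm

    spectrum = np.abs(np.fft.rfft(pulses))
    freqs = np.fft.rfftfreq(len(pulses), 1 / f_hi)
    # Strong peaks not only at 1 kHz but at 47, 49, 95, 97 kHz, ...: the images.
    for f in (1_000, f_s - 1_000, f_s + 1_000):
        print(f, spectrum[np.argmin(np.abs(freqs - f))])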

Or you do what the sampling theorem says and emit a sinc-shaped pulse for each sample, and you get exactly the original signal. Except that sinc pulses are infinitely long in both directions.
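
A sketch of that ideal reconstruction, assuming NumPy (np.sinc is the normalized sinc, sin(pi u)/(pi u)); note it has to truncate the infinitely long sinc to a finite window, which is exactly the compromise real resamplers make:

    import numpy as np

    f_s = 48_000
    n = np.arange(-2_000, 2_000)               # sample indices
    pcm = np.sin(2 * np.pi * 1_000 * n / f_s)  # the recorded samples

    def reconstruct(t, samples, indices):
        # x(t) = sum_n x[n] * sinc(f_s * t - n)
        return np.sum(samples * np.sinc(f_s * t - indices))

    # Evaluate halfway between two sample instants: matches the original tone.
    t = 10.5 / f_s
    print(reconstruct(t, pcm, n), np.sin(2 * np.pi * 1_000 * t))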

[0] Energy is proportional to pressure squared. You’re sampling pressure, not energy.

[1] This is necessary to prevent aliasing. If you feed this algorithm a signal at f_s/2 + 5kHz, it would come back out at f_s/2 - 5kHz, which may be audible.
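
A quick demonstration of that folding, assuming NumPy, with f_s = 48 kHz so the 29 kHz input aliases down to 19 kHz:

    import numpy as np

    f_s = 48_000
    f_in = f_s / 2 + 5_000                    # 29 kHz, above Nyquist
    n = np.arange(4_800)                      # 0.1 s of samples

    x = np.sin(2 * np.pi * f_in * n / f_s)    # sampled with no anti-alias filter

    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(n), 1 / f_s)
    print(freqs[np.argmax(spectrum)])         # 19000.0, i.e. f_s/2 - 5 kHz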





> Now you could play it back wrong by emitting a sharp pulse f_s times per second with the indicated level. This will have a lot of frequency content above 20kHz and, in fact, above f_s/2. It will sound all kinds of nasty.

Wouldn’t the additional frequencies be inaudible with the original frequencies still present? Why would that sound nasty?


Because the rest of the system is not necessarily designed to tolerate high frequency content gracefully. Any nonlinearities can easily cause that high frequency junk to turn back into audible junk.

This is like the issues xiphmont talks about with trying to reproduce sound above 20kHz, but worse, as this would be (trying to) play back high energy signals that weren’t even present in the original recording.
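
A hedged sketch of that mechanism, assuming NumPy; the quadratic nonlinearity and the two ultrasonic tones are made-up stand-ins for whatever misbehaves downstream, not a model of real hardware:

    import numpy as np

    f_s = 192_000                  # rate high enough to represent the ultrasonics
    t = np.arange(19_200) / f_s    # 0.1 s

    # Two ultrasonic tones, individually inaudible.
    x = np.sin(2 * np.pi * 24_000 * t) + np.sin(2 * np.pi * 26_000 * t)

    # A mildly nonlinear stage: y = x + 0.1 x^2.
    y = x + 0.1 * x**2

    spectrum = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(t), 1 / f_s)
    # The quadratic term produces a difference tone at 26 kHz - 24 kHz = 2 kHz:
    # audible junk created entirely from inaudible input.
    print(spectrum[np.argmin(np.abs(freqs - 2_000))])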


That would mean that higher sampling rates (which can carry more inaudible high-frequency content) could cause similar problems. OK, xiphmont actually mentions that; sorry, I had only watched the video when I replied.

If I were designing a live audio workflow from scratch, my intuition would be to sample at a fairly high rate (at least 48kHz, maybe 96kHz), do the math to figure out the actual latency / data-rate tradeoff, and also filter the data as needed to minimize high-frequency content (again, being careful with the latency and fidelity tradeoffs).
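
To get a feel for the numbers, here's a sketch using SciPy's firwin (the tap count and cutoff are arbitrary): a linear-phase FIR low-pass delays the signal by (numtaps - 1) / 2 samples, which is the latency side of that tradeoff.

    from scipy.signal import firwin

    f_s = 96_000
    numtaps = 255                            # sharper cutoff <-> longer delay
    taps = firwin(numtaps, cutoff=20_000, fs=f_s)

    latency_ms = (numtaps - 1) / 2 / f_s * 1e3
    print(f"added latency: {latency_ms:.2f} ms")   # ~1.32 ms at 96 kHz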

But I have never done this and don't plan to, so I'll let other people worry about it. Maybe some day I'll carry out my evil plot to write an alternative to brutefir that gets good asymptotic complexity without adding latency. :)



