Synchronous rate conversion

Let's say we are dealing with a SYNCHRONOUS rate conversion problem, from 44.1kHz to 48kHz, where both rates are derived from the SAME clock source. That means that, when measured with the same stopwatch (clock), the 44.1kHz rate is EXACTLY 44.1kHz and the 48kHz rate is EXACTLY 48kHz. In this environment, one can simply interpolate by 160 up to 7.056MHz and then decimate by 147 down to 48kHz, since 48/44.1 = 160/147:

The first thing to do (at least conceptually) is to take the 44.1kHz data stream and stuff 160-1=159 zeros, evenly spaced in the time domain, between each pair of samples. Then you apply the "zero-stuffed" sequence to a low-pass filter with a cutoff at half the original sample rate, or 22.05kHz, which finalizes the interpolation process by "filling in" the zeros with real data points. (Hard to envision, but this is what happens with digital filters.)

Now we simply grab every 147th sample to generate our output sequence at 48kHz. That "grabbing every 147th sample" is precisely DECIMATION: just digital re-sampling, or down-sampling. We didn't need a special filter for decimation in this case, because the interpolation filter did its job for us.
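As an illustration (my own, not from the original discussion), here is a minimal numpy sketch of this interpolate-by-160 / decimate-by-147 scheme. The windowed-sinc low-pass is a simple stand-in; a real converter would use a properly designed (polyphase) filter:

```python
import numpy as np

def resample_44k1_to_48k(x, taps=1601):
    """Convert 44.1 kHz samples to 48 kHz by interpolating by 160
    and decimating by 147 (48/44.1 = 160/147)."""
    # 1) Zero-stuff: insert 159 zeros between samples -> 7.056 MHz rate
    up = np.zeros(len(x) * 160)
    up[::160] = x
    # 2) Low-pass at 22.05 kHz (pi/160 at the 7.056 MHz rate):
    #    a Hamming-windowed sinc, with gain 160 to restore amplitude
    n = np.arange(taps) - (taps - 1) / 2
    h = np.sinc(n / 160) * np.hamming(taps)
    h *= 160 / h.sum()
    filled = np.convolve(up, h, mode="same")  # zeros "filled in" with real data
    # 3) Decimate: keep every 147th sample -> 48 kHz output
    return filled[::147]
```

For a constant (DC) input the output should sit at the same level away from the edges, and the output length is 160/147 times the input length, rounded up.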

Asynchronous rate conversion

In ASYNCHRONOUS sample rate conversion, the 44.1kHz clock and the 48kHz clock are NOT derived from the same time base, meaning they are not exactly the numbers they claim to be when measured with the same stopwatch. So, in this asynchronous world, we have two time lines: one with ticks at (nominally) 44.1kHz, the other with ticks at (nominally) 48kHz. Thus, by definition, Asynchronous Rate Conversion comes down to this:

There is NO interpolation by any FINITE integer you can do on the INPUT sample points ... meaning, no finite number of samples you can evenly space between the original samples ... that will give you perfect alignment with all the OUTPUT sample points.

Asynchronous Sample Rate Conversion: How does it work

Here we have a situation where we need to digitally CALCULATE output data samples at new points in time, different from the input data time points, and the two sets of time points are NOT synchronized ... so there is no finite integer interpolation we can perform on the input data that will align PERFECTLY with the output time points we need.

But we don’t have to be PERFECT. Conceptually, we will interpolate the input data by a pretty huge number, in fact a pretty huge INTEGER, so that we effectively "fill in" very many samples in between the original Fs_in samples. Then, instead of providing an interpolated sample at the exact output point in time, we can provide the closest (in time) interpolated sample that was calculated. How BIG will the error be? The question therefore is: how far do I have to interpolate, by what integer, so that grabbing the closest sample will give me acceptably LOW error?

We can never achieve perfection here, but we can get ARBITRARILY CLOSE ... simply because the higher I interpolate, the lower the error. So the question is, how big should the interpolation number “N” be?
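As a quick numerical illustration (my own construction, not from the original post), we can measure the worst-case error of this "grab the closest interpolated sample" scheme on a test sine, by evaluating the ideal signal directly at the points of an N-times-interpolated grid. The 1 kHz test tone and the kHz/ms units are arbitrary choices:

```python
import numpy as np

def drop_sample_error(N, f=1.0, fs_in=44.1, fs_out=48.0, n_out=10000):
    """Worst-case error from using the nearest point of an N-times
    interpolated 44.1 kHz grid instead of the exact 48 kHz instants.
    Rates in kHz, time in ms; the test signal is a 1 kHz sine."""
    t_out = np.arange(n_out) / fs_out        # asynchronous output ticks
    step = 1.0 / (fs_in * N)                 # interpolated grid spacing
    t_near = np.round(t_out / step) * step   # closest interpolated instant
    return np.max(np.abs(np.sin(2 * np.pi * f * t_near)
                         - np.sin(2 * np.pi * f * t_out)))
```

Doubling N roughly halves the error (it is bounded by the signal's slope times half the grid spacing), so N can be chosen to meet any error target.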

After a lot of math, it can be shown that input data at the Fs_in rate needs to be interpolated by N ≈ 2^20 (1,048,576 - over a million!) so that the error incurred by asynchronous decimation to Fs_out will be acceptably small. But 2^20 is HUGE: for an input sample frequency of 44.1kHz it results in an interpolated rate of about 46GHz.

Fortunately, because most of those samples are going to be ignored when decimating to the output sample rate (e.g. 48kHz), we don't have to calculate them in the first place. We only need the samples closest to the output sample instants. To do this we use FIR (Finite Impulse Response) filters. Through clever implementation, and by knowing the output sample rate, the computational demands of implementing the FIR filter are manageable in a typical commercial implementation. (Meaning you don't have to do all the math or have heavy-duty computational engines.)
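To make the "don't compute the skipped samples" idea concrete, here is a toy polyphase sketch (my own construction, not any vendor's algorithm). For each output sample, only the one polyphase branch of the FIR that lines up with the nonzero (original) input samples is evaluated, so each output costs len(h)/up multiplies instead of len(h):

```python
import numpy as np

def polyphase_resample(x, h, up, down):
    """Rational resampling by up/down without ever materializing the
    zero-stuffed high-rate stream. h is an FIR low-pass designed at
    the up-sampled rate (with gain `up`)."""
    n_out = (len(x) * up) // down
    y = np.zeros(n_out)
    for m in range(n_out):
        pos = m * down            # index on the virtual up-sampled grid
        phase = pos % up          # which polyphase branch applies here
        k0 = pos // up            # newest input sample under the filter
        branch = h[phase::up]     # taps that hit nonzero samples only
        for j, tap in enumerate(branch):
            i = k0 - j
            if 0 <= i < len(x):
                y[m] += tap * x[i]
    return y
```

The result is identical to zero-stuffing, filtering with h, and keeping every down-th sample; the polyphase form just skips all the multiply-by-zero work.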

My observations: Rate conversion from 44.1kHz to 192kHz is the same as 44.1kHz to 176.4kHz

From the above discussion, it can be seen that because the system is designed to select the interpolated sample that is closest to the output sampling rate, it does not matter whether the output sample rate is an integer multiple of the original sample rate or not.

This is important because in some audio circles it is believed that CD material, with its native sample frequency of 44.1kHz, would "sound better" if sample-rate converted to 88.2kHz or 176.4kHz (an integer multiple) than to 96kHz or 192kHz (not an integer multiple). The argument presented here (by the expert at DIYaudio) implies, at least from a theoretical point of view, that the closest matching sample to the clock ticks of a 192kHz clock, for example, should be no worse than the closest matching sample to the clock ticks of a 176.4kHz clock.
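A small numeric check of this claim (my own, reusing the N ≈ 2^20 figure from above): measure how far each output clock tick falls from the nearest point on the interpolated 44.1 kHz grid, for both a 176.4 kHz and a 192 kHz output clock:

```python
import numpy as np

def worst_tick_offset(fs_out, fs_in=44.1, N=2**20, n_out=100000):
    """Largest distance (in ms; rates in kHz) from an output clock
    tick to the nearest point on the N-times-interpolated input grid."""
    t = np.arange(n_out) / fs_out
    step = 1.0 / (fs_in * N)
    return np.max(np.abs(t - np.round(t / step) * step))
```

The integer-multiple 176.4 kHz ticks land exactly on the grid, while the 192 kHz ticks are off-grid, but both offsets are bounded by half a grid step (about 11 ps here), which is the error bound the converter is designed around either way.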

## 1 comment:

Such drop-sample interpolation is archaic. Read the SRC4192 patent for something up to date. Notice it needs no 7MHz oversampling but only 16x. It's possible to reduce this even more, to only 4x.
