With the aid of Schuyler Quackenbush, former chair of the MPEG audio subgroup, I created a basic perceptual audio coder in MATLAB. This coder uses a number of tricks to lower the bitrate of the signal while minimizing perceptual differences.
The coder relies on three main techniques:
- Dynamic quantization lets the number of quantizer bins change every 20 milliseconds. This improves compression because fewer bits are needed to represent the signal during periods of low amplitude.
- Transient detection lets the coder switch to shorter blocks around sharp transients. This limits pre-echo, the quantization noise that would otherwise be audible just before the transient and degrade the perceived quality.
- A perceptual model exploits frequency masking to reduce the bitrate further. The model uses knowledge of the human cochlea to discard frequency components that are present in the signal but would not be perceived by a listener.
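The first two techniques can be illustrated with a small sketch. It is written in Python rather than MATLAB and is not taken from the coder itself: the function names, the fixed step size, the sub-block count, and the energy-ratio threshold are all assumptions chosen for illustration. It also works directly on time-domain samples, whereas a real perceptual coder would typically quantize transform coefficients.

```python
import math


def quantize_block(block, step=1.0 / 256):
    """Uniformly quantize one block; return (bits_per_sample, indices).

    The bit count is chosen per block from the largest quantizer index,
    so quiet blocks need fewer bits than loud ones. The step size is an
    illustrative assumption.
    """
    indices = [round(x / step) for x in block]
    peak = max(abs(i) for i in indices)
    # Signed indices in [-peak, peak] need 2 * peak + 1 bins.
    bits = math.ceil(math.log2(2 * peak + 1)) if peak > 0 else 0
    return bits, indices


def dequantize_block(indices, step=1.0 / 256):
    """Map quantizer indices back to sample values."""
    return [i * step for i in indices]


def encode(signal, sample_rate=44100, block_ms=20, step=1.0 / 256):
    """Quantize the signal block by block (20 ms blocks by default)."""
    block_len = sample_rate * block_ms // 1000
    return [
        quantize_block(signal[start:start + block_len], step)
        for start in range(0, len(signal), block_len)
    ]


def has_transient(block, sub_blocks=8, ratio=8.0):
    """Flag a block whose energy jumps sharply between sub-blocks.

    Hypothetical detector: a coder could use this flag to pick shorter
    blocks. The sub-block count and ratio are illustrative values.
    """
    n = len(block) // sub_blocks
    energies = [
        sum(x * x for x in block[i * n:(i + 1) * n])
        for i in range(sub_blocks)
    ]
    floor = 1e-12  # avoids comparing against exactly zero energy
    return any(
        curr > ratio * max(prev, floor)
        for prev, curr in zip(energies, energies[1:])
    )
```

Because the step size is fixed, the reconstruction error stays within half a step for every block, and the savings come entirely from spending fewer bits on blocks with a small peak amplitude.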
Entropy coding was also explored, but it is not used in the final coder.
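One rough way to judge whether entropy coding is worth the effort is to compare the Shannon entropy of the quantizer indices with the fixed number of bits spent per index. The sketch below (Python, with a hypothetical function name) only computes that estimate. It does not implement an entropy coder, and it says nothing about which scheme was actually explored.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(symbols):
    """Shannon entropy of a symbol sequence, in bits per symbol.

    This is a lower bound on the average code length of any coder that
    treats the symbols as independent draws from their histogram.
    """
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values()
    )
```

If the estimate is close to the fixed bit count already in use, an entropy coder has little left to gain. A large gap suggests the index distribution is skewed enough to benefit from one.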
Here’s a comparison of a recording of Suzanne Vega, before and after audio coding with my codec.
You are also welcome to peruse the source code in my GitHub repo.