Everything about File Compression: Take 3
November 18, 2008. Posted by ConnorSmith in: Chit Chat, File Compression, Project Studio
…which, although commonly referred to as “Apple Audio Codec”, actually stands for Advanced Audio Coding.
(This article assumes you have read Everything About Compression: Take 2)
Although it originally came out of the MPEG-2 specification (as MPEG-2 Part 7), AAC is best known as part of MPEG-4 Part 3, the MPEG-4 audio standard. AAC was designed as a newer, “better”-sounding alternative to MP3. Generally, the AAC encoder follows the same steps as the MP3 encoder, but let’s take a look at what makes the two formats so different.
First off, the filterbank (used to “chop up” the original audio file and determine the frequency content) in the AAC encoder is not a hybrid filterbank like MP3’s (which feeds a 32-band polyphase filterbank into an MDCT). The AAC encoder just uses an MDCT. The most important thing to note here is that an MDCT overlaps its blocks of analysis (each block shares half of its samples with the next one), thus reducing any errors that might come at the “edge” points between blocks. AAC’s MDCT also has greater frequency resolution than MP3’s (1024 vs. 576 frequency lines per long block). AAC can also switch its block size (the length of the audio pieces it is analyzing), dropping from 1024-line long blocks to 128-line short blocks when the signal changes quickly.
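To make the overlap idea a bit more concrete, here is a minimal sketch of a windowed MDCT with 50% overlap, written in plain Python. It is not taken from any real AAC encoder: the function names, the sine window, and the tiny block size (8 coefficients instead of AAC’s 1024) are all my own assumptions, chosen to keep the example short and quick to run. The point is just that each block on its own is “wrong” at the edges, but overlapping and adding neighbouring blocks cancels those errors out.

```python
import math

def sine_window(length):
    # Sine window: w[n]^2 + w[n + length/2]^2 = 1, which the overlap-add needs.
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def mdct(block):
    # Forward MDCT: 2N windowed samples in, N coefficients out.
    n_coeffs = len(block) // 2
    n0 = 0.5 + n_coeffs / 2
    return [
        sum(block[n] * math.cos(math.pi / n_coeffs * (n + n0) * (k + 0.5))
            for n in range(2 * n_coeffs))
        for k in range(n_coeffs)
    ]

def imdct(coeffs):
    # Inverse MDCT: N coefficients in, 2N (time-aliased) samples out.
    # The 2/N scale assumes the same window is applied before and after.
    n_coeffs = len(coeffs)
    n0 = 0.5 + n_coeffs / 2
    return [
        (2.0 / n_coeffs) * sum(
            coeffs[k] * math.cos(math.pi / n_coeffs * (n + n0) * (k + 0.5))
            for k in range(n_coeffs))
        for n in range(2 * n_coeffs)
    ]

def analyze_and_rebuild(signal, n_coeffs):
    # Hypothetical helper: transform overlapping blocks, then overlap-add them.
    # Assumes len(signal) is a multiple of n_coeffs.
    window = sine_window(2 * n_coeffs)
    padded = [0.0] * n_coeffs + list(signal) + [0.0] * n_coeffs
    output = [0.0] * len(padded)
    # Each block is 2N samples long but starts only N samples after the last.
    for start in range(0, len(padded) - 2 * n_coeffs + 1, n_coeffs):
        block = [padded[start + n] * window[n] for n in range(2 * n_coeffs)]
        rebuilt = imdct(mdct(block))
        for n in range(2 * n_coeffs):
            output[start + n] += rebuilt[n] * window[n]
    return output[n_coeffs:n_coeffs + len(signal)]

signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(32)]
rebuilt = analyze_and_rebuild(signal, 8)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
print("largest reconstruction error:", max_error)
```

With no quantization in between, the rebuilt signal should match the original down to floating-point rounding. A real encoder throws information away between the two transforms, which is where the overlap earns its keep: the quantization errors get smoothed across block boundaries instead of clicking at the edges.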
AAC also has a superior psychoacoustic model. The model was developed later (and likely grew out of the original MP3 research). Presumably, an encoder with a more accurate perceptual model will be able to achieve better quality at lower bit rates (kbps). The perceptual model is arguably the most important element in an encoder, so AAC’s improved psychoacoustic model is the most prominent reason why AAC sounds better than MP3 at identical bit rates. (Note: “better” is obviously very subjective and open to debate. This is speaking rather theoretically.)
Finally, AAC also has some additional tools which can help with efficiency, including Temporal Noise Shaping (TNS) and a Prediction stage. The inner processes of TNS are a little beyond the scope of this entry… but basically it’s a type of noise shaping that runs a predictor across the frequency coefficients to control where in time the quantization noise ends up. It was originally implemented to help improve the sound of speech at lower bit rates, and it helps with both the efficiency and the quality of the encoding. The prediction stage is not always used, as it is most useful in signals that are easily predictable (sine tones or tone-like sources).
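To show why “easily predictable” matters, here is a small sketch comparing a sine tone with random noise. Fair warning: this is not AAC’s actual prediction stage (that one works on spectral data inside the encoder). It is just a generic second-order linear predictor on plain samples, and every name in it is made up for illustration. It fits the best guess of each sample from the two samples before it, then measures how much energy is left over after subtracting that guess.

```python
import math
import random

def fit_order2_predictor(x):
    # Least-squares fit of x[n] ~ a1 * x[n-1] + a2 * x[n-2] (2x2 normal equations).
    idx = range(2, len(x))
    r11 = sum(x[n - 1] * x[n - 1] for n in idx)
    r22 = sum(x[n - 2] * x[n - 2] for n in idx)
    r12 = sum(x[n - 1] * x[n - 2] for n in idx)
    p1 = sum(x[n] * x[n - 1] for n in idx)
    p2 = sum(x[n] * x[n - 2] for n in idx)
    det = r11 * r22 - r12 * r12
    a1 = (p1 * r22 - p2 * r12) / det
    a2 = (p2 * r11 - p1 * r12) / det
    return a1, a2

def residual_ratio(x):
    # Energy left after prediction, divided by the original energy.
    # Near 0 means the predictor explains almost everything; near 1 means it does not help.
    a1, a2 = fit_order2_predictor(x)
    idx = range(2, len(x))
    residual = sum((x[n] - a1 * x[n - 1] - a2 * x[n - 2]) ** 2 for n in idx)
    energy = sum(x[n] ** 2 for n in idx)
    return residual / energy

rng = random.Random(0)  # fixed seed so the noise is repeatable
sine = [math.sin(0.2 * n) for n in range(200)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(200)]

print("sine  residual ratio:", residual_ratio(sine))
print("noise residual ratio:", residual_ratio(noise))
```

The sine should leave practically nothing behind (a pure tone obeys an exact two-sample recurrence), while the noise should leave almost all of its energy. Less leftover energy means fewer bits needed to code it, which is why an encoder would only bother switching prediction on for tone-like material.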
Phew… enough of the technical jumble…
Oh, I should also mention that most AAC encoders will have the option to encode with variable bit-rate (which I went into in the VBR Explained Article).
So that’s AAC…
The Studio Files