September 1, 2004
Compressed Audio Can Sound Better than Uncompressed!
On June 22, 1633, in Florence, Italy, Galileo was convicted of heresy by the Inquisitors General of His Holiness Pope Urban VIII (formerly Galileo's friend and patron). Galileo's crime? Teaching, and supporting with evidence from his telescope, the hypothesis of Polish astronomer Copernicus, which suggested that the earth revolves around the sun and not vice versa.
In 1992, 359 years later, Pope John Paul II, himself Polish, convened a commission at which he admitted "limits" on the competence of his holy office and exonerated the scientist. Galileo could not be reached for comment.
I tell you this story as a personal security measure, to discourage all of you authoritative audiophiles out there from donning your black robes and burning me at the stake for the heresy I'm about to commit.
A few weeks ago, a friend sent me a CD containing four tracks from various artists. Each track had been recorded three times. The first was a direct copy, the other two "unknowns." I was told that one of the unknowns was another identical copy, but that the other was made after first compressing the track using a compression algorithm known as Ogg Vorbis.
Ogg Vorbis is an open-source codec (compression-decompression algorithm) designed to compete with MP3. Using the principles of psychoacoustics, it cleverly removes information that you either won't hear anyway, or will fill in as part of an auditory illusion. At the bit-rate settings my friend used, Vorbis throws away about 80% of the data. The compressed version was then reconstituted in CD format using the remaining 20%. My friend's challenge to me was that I pop the CD into my player and determine by ear which of the two copies had been compressed.
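To put the "80%" figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the 80% is measured against standard CD audio (44.1kHz, 16 bits, two channels); the actual Vorbis setting my friend used is not stated, so the function names and the result are illustrative only.

```python
# Standard CD audio parameters.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def cd_bitrate_bps() -> int:
    """Bit rate of uncompressed CD audio, in bits per second."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def remaining_bitrate_bps(fraction_discarded: float) -> float:
    """Bit rate left over after discarding a given fraction of the data.

    Assumption: the fraction is relative to the CD bit rate above.
    """
    return cd_bitrate_bps() * (1.0 - fraction_discarded)


if __name__ == "__main__":
    print(f"CD audio:        {cd_bitrate_bps() / 1000:.1f} kbps")
    print(f"After 80% is cut: {remaining_bitrate_bps(0.8) / 1000:.1f} kbps")
```

Under these assumptions, the compressed stream would run at roughly 280kbps, against about 1411kbps for the original.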
No problem, I assured him. I laugh at danger. With my golden ears, $40,000 high-resolution stereo, and trusty listening partner, I would have no difficulty determining the identity of the compressed tracks. After all, no [snort] compressed format can compete with the real thing. As audiophiles, we all agree that more data, not fewer, are the answer to better sound.
Here is an excerpt from the e-mail I sent my friend:
"Brian and I have listened to your so-called Compression Challenge disc. After a total of eight minutes (minus two for lighting cigarettes and a discussion of the size of my growing waistline), we have arrived at our conclusions . . . Either the tracks we picked were compressed or they were recorded out of the trunk of someone's car several blocks away. The sonic differences we noted were smallness of bass, lack of treble extension, lack of midrange fullness and articulation, general flatness and lesser emotional involvement, and tendency to pitch up. These results are considered accurate to within 5%, 19 times out of 20."
He wrote back (with unseemly Schadenfreude, I might add) that we were wrong on two of the four tracks. As it turned out, we might as well have flipped a coin. I spiraled into a deep depression. I pleaded with him to admit that he was pulling my leg. We went back and listened again . . . to no avail. On a second listening, not only was it clear that identifying the compressed tracks was incredibly difficult, it was also apparent that some aspects of the compressed tracks actually sounded better than the originals. Unbelievably, at times the compressed versions were cleaner, had more apparent tonal richness, more grunt, more detail.
How is it done? In short, smoke and mirrors. Sound arises from subtle variations in air pressure. Being physical waveforms, these variations contain a huge, if not infinite, amount of information. The more closely you examine them, the more fine detail there is. Obviously, your ears and brain need to disregard most of that and concentrate on the essential aspects. To do this, multiple levels of rejection are built into the human auditory system. Sounds above and below the frequency and loudness range of the ear are rejected. Loud sounds mask soft sounds (intensity masking). Soft sounds occurring just before or just after a loud sound are rejected (temporal masking). A soft sound that is very close in frequency to a louder sound is rejected (frequency masking). And so on.
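The rejection rules described above can be sketched as a toy filter. This is emphatically not how Vorbis or any real codec models hearing: the `Tone` class, the 10% "closeness" band, and the 20dB margin are all hypothetical values picked for illustration. Only the 20Hz to 20kHz range is the commonly cited limit of human hearing, and temporal masking is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tone:
    freq_hz: float
    level_db: float


# Hypothetical toy parameters, not taken from Vorbis or any real
# psychoacoustic model.
AUDIBLE_RANGE_HZ = (20.0, 20_000.0)  # commonly cited range of hearing
MASKING_BANDWIDTH_RATIO = 0.1        # "very close": within 10% of the masker
MASKING_MARGIN_DB = 20.0             # masker must be at least this much louder


def is_masked(tone: Tone, others: list[Tone]) -> bool:
    """True if some other tone is both close in frequency and much louder."""
    for masker in others:
        if masker is tone:
            continue
        close = abs(tone.freq_hz - masker.freq_hz) <= (
            MASKING_BANDWIDTH_RATIO * masker.freq_hz
        )
        louder = masker.level_db - tone.level_db >= MASKING_MARGIN_DB
        if close and louder:
            return True
    return False


def audible_tones(tones: list[Tone]) -> list[Tone]:
    """Drop out-of-range tones, then drop tones masked by a louder neighbor."""
    low, high = AUDIBLE_RANGE_HZ
    in_range = [t for t in tones if low <= t.freq_hz <= high]
    return [t for t in in_range if not is_masked(t, in_range)]


if __name__ == "__main__":
    tones = [
        Tone(1_000, 80),   # loud tone, survives
        Tone(1_050, 50),   # soft and close to the 1kHz tone: masked
        Tone(4_000, 50),   # soft but far from any loud tone, survives
        Tone(25_000, 90),  # loud but above the range of hearing: dropped
    ]
    for tone in audible_tones(tones):
        print(tone)
```

In this toy example, half the "data" goes out the window and, by the rules above, nobody would notice.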
Even more interesting, the brain will fill in "missing" sounds, such as the fundamental of a series of harmonics. Present the harmonics, your brain "knows" that the fundamental should be there, so you seem to hear it. That's how tiny little speakers in headphones can seem to produce deep bass. The trick to compression is to drastically drop data from segments of the music that are likely to be masked or filled in as an audible illusion.
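The missing-fundamental effect has a simple mathematical side that can be sketched in a few lines. The example below builds a signal from the second, third, and fourth harmonics of a 100Hz tone, with no 100Hz component at all, and then checks how often the waveform repeats. The sample rate, the choice of harmonics, and the function names are assumptions made for illustration; the sketch shows only the periodicity of the waveform, not how the brain actually arrives at the pitch.

```python
import math

SAMPLE_RATE_HZ = 8_000   # assumed; chosen so each period is a whole number of samples
FUNDAMENTAL_HZ = 100     # hypothetical fundamental, deliberately left out of the signal
HARMONICS = (2, 3, 4)    # only 200Hz, 300Hz, and 400Hz are present


def harmonics_only_signal(num_samples: int) -> list[float]:
    """Sum of sine waves at the chosen harmonics, with no fundamental."""
    return [
        sum(
            math.sin(2 * math.pi * h * FUNDAMENTAL_HZ * n / SAMPLE_RATE_HZ)
            for h in HARMONICS
        )
        for n in range(num_samples)
    ]


def repetition_rate_hz(signal: list[float], tolerance: float = 1e-9):
    """Rate at which the waveform repeats, from the smallest exact-repeat lag.

    Returns None if no repeat is found within half the signal length.
    """
    for lag in range(1, len(signal) // 2):
        repeats = all(
            abs(signal[n + lag] - signal[n]) <= tolerance
            for n in range(len(signal) - lag)
        )
        if repeats:
            return SAMPLE_RATE_HZ / lag
    return None


if __name__ == "__main__":
    signal = harmonics_only_signal(400)
    print(repetition_rate_hz(signal))
```

The waveform repeats 100 times per second even though nothing in it vibrates at 100Hz, which is the pattern the ear appears to latch onto.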
In theory (and this is just me talking), not only can compressed music sound as good as uncompressed, it might sometimes sound better. Because they are manufactured by your brain, auditory illusions are free from distracting intrusions of reality and must be accepted as real at the sensory level. There's nothing disturbing about this. After all, a good stereo produces a convincing illusion of the presence of musicians making music in your living room. Actually bringing musicians into your living room would require a booking agent.
Nor is Ogg Vorbis the final word. The clever developers at Xiph.Org who designed this algorithm have more where that came from, and other designers around the world are hard at work. Eventually, the concept of gigabyte-consuming high-resolution media such as SACD may seem as outdated as hiring a platoon of monks to copy the Bible by hand.
For those monks who remain unconvinced, there is a footnote to the Galileo story. In exchange for a death sentence commuted to house arrest for life, the old and infirm Galileo signed a retraction of the moving-earth hypothesis, which he submitted to the court. As he scratched out his famous signature, he was heard to mutter, "Nevertheless, it does move."
Ultra Audio is part of the SoundStage! Network.