An unexpected marriage: Tom’s Diner & The MP3 with Prof Karlheinz Brandenburg

How a simple song and a scientist’s curiosity sparked the creation of the MP3, revolutionizing how we listen to, share, and experience music worldwide.

2.20.2025
Words by:
Rodrick Rahim Chattaika Jr.

A familiar melody for many, an intrusively calming voice for some, but not everyone knows what Suzanne Vega’s hypnotic yet retreating voice inspired for the music industry. Her chanting hums and melodies on “Tom’s Diner” have led to her being called the “Mother of the MP3”, with Professor Karlheinz Brandenburg as the father: he was one of the main contributors (among others) to the invention of the MP3 codec and the models that allowed it to be widely distributed.

Brandenburg says that many other songs were used along the way, but he had heard of people using this song to test loudspeakers and decided to try it out for himself. He found the simplicity and clarity of her vocals to be perfect for benchmarking the compression quality of the then yet-to-be-formalised MP3 codec.

Frequencies with: Karlheinz Brandenburg 

Chattaika: Suzanne Vega...Tom's Diner, how did that come along as the inspiration for the psychoacoustics?

Brandenburg: That's a very simple answer. Throughout the development of compression technologies, it was always easy to get some songs to sound nice, and it was always the main task to find the worst-case items. I had read in some high-fidelity journals that people use this song to test loudspeakers. At our lab in Erlangen, somebody brought the CD and I said, okay, let's see what happens. What happened was that Suzanne's voice, after being compressed and decompressed with an early predecessor to MP3, sounded terrible, which was clearly a fault on our end.

Therefore, the song became the benchmark for what needed to be improved in the algorithm for my PhD thesis. We continued to use it as a test item, and although it took a few years, at some point, including work from Jürgen Herre on two-channel stereo coding, it was solved. It sounded very nice.

Chattaika: Interesting, not as random as some may think…

Brandenburg: Sometimes people say it was the first song coded with MP3; this is not true. There is no first song, and there was always only a selection of items. In fact, when we had a real-time version of a predecessor of MP3, at some point I even took that box home and listened to many hours of the different music I had at home to find whether there were poorly performing items, and there were others at the time.

Dr.-Ing. Bernhard Grill, Prof. Dr.-Ing. Karlheinz Brandenburg, Dipl.-Ing. Harald Popp

Chattaika: Now that you mentioned that, which genres/sounds were the most difficult at that time, in the early discovery phase, to compress?

Brandenburg: There are certain sounds that are very complex, and then there is the symphony orchestra or pop music, where you have a lot of different sounds; these are generally easier to encode. However, single instruments, or the single voice of Suzanne Vega, are more difficult. In tests, we found that people are more critical of their favorite music. So we had one person in our team who liked hard rock, and he was the only one who could hear differences there.

Chattaika: What long term impacts do you think digital audio formats like MP3 have had on the economies of the music industry, particularly for artists?

Brandenburg: Regardless of what service you use, you now have access to a lot of music for a reasonable amount of money. In the long term it definitely benefited the music industry, although sales went down.

But what I heard at the same time was that people really flocked to concerts, so live music was up for quite some time. We also have to remember that music is one of the very few goods where you have to have had access to it before you can like it and spend money on it.

Yes, I would never just buy music because it's music, I build the emotional connection first.

Chattaika: How do you see modern lossless formats like FLAC playing a role in enhancing the listening experience, balancing power, size and accessibility?

Brandenburg: There I say something which a lot of people don't like that much. 

MP3 has clear deficiencies; I can detect compression artifacts at any bitrate. AAC, however, especially Apple’s version, is typically indistinguishable from CDs, even for 'golden ears.' Lossless formats are crucial for any post-processing—compressed formats should only be for final listening. Tandem processing (re-encoding and decoding compressed files) causes audible artifacts, but lossless allows full quality preservation across conversions.

Now, with more demand for spatial audio, as more and more people want to experience music on the go as if they are in the concert hall or recapture the nostalgia of the sound of the club, creating that immersive effect for headphones is still challenging. While some companies claim success, real spatial audio for consumers still needs better setups or specialized headphones. We’ve developed a professional solution for multi-channel studio mixing, but this requires a professional set-up; true consumer solutions are still evolving.
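The point about tandem processing can be illustrated with a toy sketch. It is not how MP3 or AAC work: as an assumption for illustration only, each "lossy codec" here is a plain uniform quantiser, and the step sizes and function names are hypothetical. It passes a test tone through several different lossy round trips in a row and compares the accumulated error with a lossless copy.

```python
import math


def make_tone(num_samples=1000, cycles=5.0):
    """Return a test tone: a sine wave with amplitude 1."""
    return [math.sin(2 * math.pi * cycles * i / num_samples)
            for i in range(num_samples)]


def lossy_round_trip(samples, step):
    """Toy stand-in for a lossy encode/decode: snap samples to multiples of `step`.

    Assumption for illustration only; real codecs are far more sophisticated.
    """
    return [round(s / step) * step for s in samples]


def rms_error(reference, candidate):
    """Root-mean-square difference between two equally long signals."""
    total = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    return math.sqrt(total / len(reference))


original = make_tone()

# Hypothetical step sizes, one per "codec" in the tandem chain.
steps = [0.05, 0.07, 0.03, 0.11]

signal = original
errors = []
for step in steps:
    # Each generation re-encodes the already degraded output of the previous one.
    signal = lossy_round_trip(signal, step)
    errors.append(rms_error(original, signal))

# A lossless conversion returns exactly the samples that went in.
lossless_copy = list(original)
lossless_error = rms_error(original, lossless_copy)

for generation, err in enumerate(errors, start=1):
    print(f"generation {generation}: RMS error {err:.4f}")
print(f"lossless copy: RMS error {lossless_error:.4f}")
```

In this toy chain the error after the last generation is larger than after the first, while the lossless copy stays at zero error. Note that an RMS figure is exactly the kind of objective number that cannot replace listening; it only shows that the degradation accumulates.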

Chattaika: With immersive audio gaining traction, it becomes another barrier for artists who aren't backed by labels, because they can't afford the equipment to record such content. Do you see solutions for that?

Brandenburg: Major streaming services now require both stereo and multi-channel mixes for uploads. The main challenge, however, is in playback rather than mixing. Multi-channel mixes can create an immersive experience with a quality speaker setup or with new headphones expected in the coming years.

Chattaika: Do you see any modern equivalents to how the MP3 revolution happened at that time with regard to distribution, accessibility and democratization for artists and for music?

Brandenburg: Two-channel playback is fairly straightforward; even a home setup today can rival top studios from 30 years ago, though mixing skill is essential. Multi-channel is trickier, especially with uncertain format standards. Currently, we see Dolby Atmos and Sony 360, the latter of which uses MPEG-H, but I prefer the ADM format by the ITU, as it supports object-based audio with channels and metadata, delivering full spatial details to the listener.

I recently heard an impressive studio mix in one of these formats—it was a big step up from any two-channel experience, though it required professional gear and room treatment.

Chattaika: And then regarding that, as you also mentioned, streaming and how it's made music more accessible, do you also see any issues that streaming may have caused to the industry and the artists?

Brandenburg: Of course. The problem is that we have only a few very big streaming providers, which aren't interoperable either, each with its own format. I don't know how other people see that. I'm still old-fashioned in the sense that I want to have downloaded music at a specific quality.

Chattaika: Yeah, exactly. What are you listening to the most these days?

Brandenburg: Okay. As is always the case, I listen to the music that was around when I was young, such as The Beatles, and if I’m on a plane then something lower-tempo, such as classical. As your platform focuses on electronic music: I like that too, though it's not the best for travel listening. Pioneers such as Kraftwerk are ones I can mention.

Chattaika: How do you envision AI shaping the future of music and platforms?

Brandenburg: Before I left Fraunhofer, I directed the Ilmenau institute for nearly 20 years, working on music recommendation engines and music recognition similar to Shazam. While the results were mixed, we saw strong potential in machine learning. Today, AI is perhaps good at creating royalty-free music as there is no artist or composer, which benefits those needing background tracks. Although AI-driven audio compression research has progressed, I remain skeptical; it sometimes 'works' based on metrics but doesn't necessarily sound good. I recently observed PhD candidates relying heavily on objective measurements without actually listening to the results, which misses the point. Sound quality needs trained ears to assess, not just numbers.

Chattaika: With the resurgence of physical formats such as vinyl alongside digital formats, do you think the next evolution in digital audio will be able to balance both nostalgia for analog and the demands of the digital age?

Brandenburg: From all I've heard, I think this is a question of psychology. Yes, you can distinguish the sound of vinyl from a well-done digital format, and if you like that better, of course you will spend your money there. That's like a lot of other things in audio, where people really want to hear what they expect it should sound like. This is getting more traction these days, but I don't think it will be a real game changer. In the end, we'll have the different systems all going on, and some people will like the physical formats.

A tight-knit team led by Heinz Gerhäuser challenged industry giants with the MP3. Pictured in 1987 at Fraunhofer IIS: Harald Popp, Stefan Krägeloh, Hartmut Schott, Bernhard Grill, Heinz Gerhäuser, Ernst Eberlein, Karlheinz Brandenburg, and Thomas Sporer.

Chattaika: When you developed the MP3 format, did you anticipate the level of disruption it would cause in the industry, particularly for artists? 

Brandenburg: I think, looking back in history, that this was not the first disruption in the industry. In fact, I read that early on there were physical formats made by Edison and others, and then radio came up and transmitted music directly. The rising record industry went down quite a bit. Later the same thing happened when compact discs came up, and even before that when cassette tape recorders came up. So it's worth remembering, and sometimes the music industry forgets this, that things change. There is no way to just wish that the old business models will hold in the future. You always have to look at what is next and try to get on the bandwagon, and then you have good business in the future as well.

Of course, to get something produced in a very high quality is easier now than ever before. But that means you are still in the long tail. You have to find your fans and you have to do live gigs and all that to become known. And then you can get up to the relatively few people who really make a lot of money from this.

Chattaika: Looking forward, what technological innovations are you most excited about that could transform the music industry again? 

Brandenburg: The old dream is to move on from something that reproduces the sound of two loudspeakers on your head, which is great, but there are people working to make immersive audio much better, and that could be a big breakthrough.

Regarding headphones, we have another vision: most people forget they have their glasses on and go looking for them, and we envision creating headphones with the same “not being there” feeling. Headphones that adapt themselves to their environment.

Being in a meeting with multiple people, and feeling like the people are in different places in your specific environment, distinguishing different voices… That's a vision; there's a lot of work to be done. We call it personalized auditory reality. We think that could be the future for listening everywhere, not just to music, but to all kinds of sounds.

Chattaika: I assume the main challenge is the real time matching of frequencies between the external input and the playback?

Brandenburg: Some of this technology is achievable. In our latest demo, the headphones can replace loudspeaker sound so well that users don’t realize they’re wearing them—until they remove the headphones and notice the silence. However, many technical hurdles remain, requiring basic research to address. For truly adaptive headphones, the device must assess and respond to the environment in real time. This requires 'auditory scene analysis,' which remains challenging due to the need for real-time processing without draining the battery. While a few universities are making progress, no major breakthrough has been achieved yet.

We’re beyond excited to welcome Karlheinz Brandenburg as an advisor to Peachz and to have him speak at our upcoming Next—Set conference in Berlin this June. His groundbreaking work has reshaped how we experience music, and we can’t wait to explore the future of sound together.

