Podcast Production Pt. 3--Encoding

To some extent, MP3 encoding is the most important step in podcast production. Sure, recording and post-production are up there on the list, but those are fairly easy, and they're the ones most people pay the most attention to. Exporting as an MP3 is easy (so it seems), so most people don't give it a second thought. Which is a shame really, because taking just a few minutes to tweak some settings in the right tool is the difference between a very listenable podcast and one that makes you reach for the "next track" button.

So here's how we roll. For my money, the LAME MP3 encoder is just about the best out there (mainly because it's free and really, really good). To be fair, I haven't sampled every audio editing package's MP3 encoding, but most I've heard are lacking. The easiest way to access the LAME encoder is to use it in conjunction with the free, open source Audacity. Now, if you read last week's post, you know I edit in Soundtrack Pro and then export for encoding in Audacity. I could edit in Audacity, but I don't like it as an editor, and I don't like their compressor as much. Use the right tool for the job. If you like Audacity, however, editing there will save you a step.

We encode 2 versions of the sermon every week. One is a high-bitrate (relatively) "master" version, the other is the podcast version. Since we're talking about podcasts, that's what I'll focus on. If there is enough interest in the "master" process, I'll post those settings as well. For reasons that will become apparent in a minute, I encode the "master" first.

The first step in the podcast version is to run a "stereo to mono" filter on the file. Why? Because what's the point of a stereo preacher? The goal is high-quality sound at low bitrates, so why waste bits on a totally unnecessary track? Run it in mono, and you effectively double your bitrate. Smart and efficient--what's not to like?
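
For the curious, here's a rough sketch of what a stereo-to-mono step boils down to. It assumes the simplest approach, averaging the left and right samples; I can't say that's exactly what Audacity's filter does internally, and the function name and sample values are made up for illustration.

```python
def downmix_to_mono(left, right):
    """Average two equal-length channels into a single mono channel."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    # Each mono sample is the midpoint of the left and right samples.
    return [(l + r) / 2 for l, r in zip(left, right)]


# Hypothetical sample values, just to show the shape of the data.
stereo_left = [0.5, -0.25, 1.0]
stereo_right = [0.5, 0.25, 0.0]
print(downmix_to_mono(stereo_left, stereo_right))
```

One track comes out where two went in, so the encoder only has half as much audio to spend its bits on.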

Second, I switch the project sample rate to 22050 Hz. Again, this is for efficiency. Since there's not much content in the human voice above 10 kHz, we can safely encode at 22050. There is a principle in digital recording built around the Nyquist frequency: the highest frequency that a digital signal can reproduce properly is 1/2 the sampling rate. That's why CDs are 44.1 kHz--because we can't hear anything over 20 kHz, and that gives a little room for a steep low-pass filter to get rid of digital artifacts.
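
If you want to check the numbers yourself, the Nyquist arithmetic is a one-liner. This is just a minimal sketch; the function name is my own.

```python
def nyquist_khz(sample_rate_hz):
    """Highest frequency (in kHz) a given sample rate can represent."""
    return sample_rate_hz / 2 / 1000


for rate in (44100, 22050):
    print(rate, "Hz ->", nyquist_khz(rate), "kHz")
```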

Use the right sample rate for the job. 22050 is perfect for voice.

So, the Nyquist frequency of 22050 is about 11 kHz, which is above the range of all but the highest sibilance harmonics (which are annoying anyway). Again, why waste bits on trying to produce frequencies that we don't need? Halving the frequency response range again effectively doubles your bitrate. So now, in theory anyway (and it sounds pretty much this way in practice), we can get away with a bitrate that's 1/4 what we would need to reproduce a 44.1, stereo signal. Sweet!
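
Here's the "1/4" claim as back-of-the-envelope math. It's a sketch that only compares raw sample counts (channels times sample rate) against a 44.1, stereo reference; how an MP3 encoder actually spends its bits is more complicated than this, and the function name is mine.

```python
def relative_raw_data(channels, sample_rate_hz,
                      ref_channels=2, ref_rate_hz=44100):
    """Samples per second relative to a 44.1 kHz stereo reference."""
    return (channels * sample_rate_hz) / (ref_channels * ref_rate_hz)


# Mono at 22050 Hz compared with stereo at 44100 Hz.
print(relative_raw_data(1, 22050))
```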

Once we have our settings right in Audacity (stereo to mono has been run, and project sampling rate is 22050), we can export as an MP3. There are 2 key settings here. They are accessed by hitting the "Options" button in the save dialog box. The first is bitrate. I like to use Variable rate encoding with a quality setting of 8 (65-105 kbps).

[UPDATE] After it was pointed out that there was a lot of digital artifact noise in our recent recordings, I did some testing. Sure enough, there's just enough room noise in my office to mask most of it in my speakers. What sounded great through the speakers sounded not so great when I plugged in my UM-1s. After a little more testing, I found that bumping the quality setting to 8 made a huge difference. I A/B'd the original recording with the highly compressed version and found it to be quite comparable. The penalty of the higher bit rate is about a 30% larger file, but the trade off is worth it. Thanks to knewhart for pointing this out and making me double-check my stuff. [End UPDATE]

I normally use the Standard encoding speed (it's a little cleaner, and I'm not in a huge hurry). The most important setting is Joint Stereo. This will create a mono MP3 file from a mono track. If you leave it set to Stereo, you'll end up with a stereo track, and you'll be able to hear the difference immediately. In my experience, this is what separates a mediocre-sounding podcast from a great one.
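
If you'd rather skip the export dialog, the LAME command-line encoder has options that look like a close match: -V for VBR quality, -m m for a mono file, and --resample for the output sample rate in kHz. The sketch below only builds the command and doesn't run it. Treat the mapping as my assumption: I'm guessing that Audacity's quality number lines up with -V, and I'm asking for mono directly rather than reproducing the Joint Stereo setting. The file names are placeholders.

```python
def build_lame_command(wav_path, mp3_path, vbr_quality=8):
    """Build (but do not run) a LAME command for a mono, 22.05 kHz VBR MP3."""
    if not 0 <= vbr_quality <= 9:
        raise ValueError("LAME VBR quality must be between 0 and 9")
    return [
        "lame",
        "-V", str(vbr_quality),   # VBR quality level (assumed to match the dialog)
        "-m", "m",                # mono output
        "--resample", "22.05",    # output sample rate in kHz
        wav_path,
        mp3_path,
    ]


# Placeholder file names, for illustration only.
print(" ".join(build_lame_command("sermon.wav", "sermon-podcast.mp3")))
```

To actually encode, you could hand the list to subprocess.run, assuming lame is installed and on your path.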

My updated encoding settings.

Our podcasts typically end up in the lower end of the variable range in terms of bit rate, yet sound like they were encoded at 128 kbps. A 30-40 minute message is typically 8-10 Megs, which is far smaller than other podcasts I encounter, yet it sounds better.
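
You can back out the average bitrate from a file's size and length if you want to sanity-check your own exports. This sketch assumes a "Meg" is 1,000,000 bytes and ignores tag and header overhead, so treat the result as a ballpark figure; the function name is my own.

```python
def average_kbps(size_megabytes, minutes):
    """Rough average bitrate in kbps from file size and running time."""
    bits = size_megabytes * 1_000_000 * 8   # assumes 1 Meg = 1,000,000 bytes
    seconds = minutes * 60
    return bits / seconds / 1000


# The two ends of the range quoted above: 8 Megs at 30 minutes, 10 Megs at 40.
print(round(average_kbps(8, 30), 1))
print(round(average_kbps(10, 40), 1))
```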

So there you have it. My secret sauce for podcast encoding. You can download Audacity free from SourceForge, and they have full instructions on installing the LAME encoder package. It's worth the effort.

[UPDATE] On second thought, scrap the visit to the UR website. The stuff was not as good as I thought. Here are a few samples to compare the sound quality at 8 and at 9.

test-9-qual

test-8-qual

The test-9-qual is the way I used to encode. If you listen through speakers, you probably won't notice the artifacts (unless your room is really quiet). The second file, test-8-qual, is the next quality level up. It's much quieter. The noise you hear in the file is the result of our (apparently) noisy recording chain. Sorry for the non-embedded audio links; I need to find a better way to do this.

[End UPDATE]