Better Recordings of Sermons

I've been thinking about the recording of messages lately. I've been asked about it a few times in the last few months. The question usually goes something like this: "We want to record our pastor's sermon. Should we go straight to a CD recorder, or into a computer and then burn a CD?" If you have read this blog for any length of time, you're probably thinking my answer is, "It depends." But you'd be wrong. In this case, I always like to go to a computer for recording. The reason is simple: when you record straight to a CD, what you record is what you get. If you start too early, you have a bunch of dead time up front. Stop too late, same thing at the end. If your level was too low, the CD will be too quiet. Too loud...well, even the computer won't save you there.

So I'm going to tell you what I do here at Crosswinds. This is not necessarily the definitive way of recording a message. But it works really well for us. We have dual destinations for the recording: CD duplication and the web. My goal for the finished product is a message that is easy to listen to, without a lot of intervention (i.e., adjusting the volume up and down) on the part of the user. I consider the environment and equipment people will use to listen to the message--it will be either in their car or at the computer. Not exactly the greatest places to discern maximum quality. That's why I go for listenable. Yeah, I know that's not a word, but work with me here, OK?

Signal routing wise, we take the direct out of the preacher's microphone and run it into a compressor. You should know that the pastor's channel is already insert compressed, albeit pretty mildly, to even out the volume in the room. Yes, I know this is double compression and I could eliminate it if I double bussed the channel. For what we're doing, it's not worth it. From the compressor, we run into a third-party sound card (way, way better than on-board audio). You could also use a USB or FireWire audio interface. While recording, we take care to keep the peaks at about -12 dB to ensure we don't run out of bits and distort the signal. Finally, once the message is recorded, I apply some additional dynamics control to it. Yup, that's right, a third pass of compression.

At this point, audio purists are tearing their robes, putting ashes on their heads and crying, "Oh, the humanity!" I don't care. I want a CD that I can put into my somewhat loud truck's in-dash and listen to it while driving down the road without turning the volume up every time the pastor gets quiet.

I thought some visual aids might be helpful to explain some of what we do. The following example was taken from another church's website to demonstrate via waveforms what we're doing. I'm guessing the recording was not compressed in any way prior to being recorded. In my setup, we already have a mild 2:1 comp, plus a little more aggressive 3.5:1 before we get to the recording. But this shows what you can do "in post" when you record to a computer (or if you're just a purist and want to jump through whatever hoops it takes to only compress once--knock yourself out).

Let me issue a disclaimer here: I am going to discuss ratios of Loud to Quiet in a minute. I determined the ratios based on pixel counts of the waveforms. I know they're not accurate dB ratios and do not represent true loudness levels. But for the purposes of illustration, work with me here. We're going for concept, not 100% theoretical accuracy.

Let's look at the original waveform, as recorded.

The Initial Waveform

As you can see, there is a pretty wide dynamic range to this recording. The red bar represents the volume of the loud parts; the yellow, the soft. It was also recorded pretty low, so I really had to crank it up to hear it. What we're seeing is that the loudest parts of this passage are roughly 13 times louder than the quiet parts (refer to above disclaimer...). In practice, that burst at the beginning (on the left) was really loud, but by the time we got to the right, I was having trouble hearing it over the noise of the fan in my room. I had to keep turning it up.
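For anyone who wants to put that rough 13:1 figure on a dB scale, here is a minimal Python sketch. It assumes the pixel heights track linear amplitude, which (per the disclaimer above) is only approximately true, and the function name is just something made up for illustration.

```python
import math

def amplitude_ratio_to_db(ratio):
    """Convert a linear amplitude ratio (loud / quiet) to decibels."""
    return 20.0 * math.log10(ratio)

# 13:1 is the pixel-count estimate from the waveform above, not a measured value.
loud_to_quiet = 13.0
print(f"{loud_to_quiet}:1 is roughly {amplitude_ratio_to_db(loud_to_quiet):.0f} dB")
```

Under that assumption, the loud parts sit roughly 22 dB above the quiet ones, which is a lot of ground to cover with a volume knob while driving.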

So let's see what happens if we were to apply a compressor filter to this in our favorite audio editing program (for this example I used Audacity, which is really cool, really powerful and really free). To start, I tried a compressor with the threshold set at -30 dB and a 3:1 ratio. This is how it looked afterward.

-30 dB, 3:1 Ratio

1 Compressor Was Applied

Look at the difference. Now our Loud to Soft ratio is somewhere around 5.5:1 (again, see above disclaimer...). What this means is that there is significantly less difference between the softest passages and the loudest ones. The overall level is pretty low, however, so we'll apply a normalizing filter to it. After we do that, it looks like this:

Normalizer Dialog

I will typically normalize to -1 dB, just to give it a little buffer (which, by the way, is why I didn't normalize to 0 dB in the compressor dialog). The normalization process takes a look at the signal and applies gain to the entire recording until the highest peaks touch the level you specify (the maximum amplitude). We do this at this point because we know what our highest peaks are (unlike when we are recording live), and we may as well take advantage of all the headroom the system has available.
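If it helps to see the idea in code, here is a minimal Python sketch of peak normalization as described above. It is not how Audacity or Audition implement it internally; the function names and the handful of sample values are made up for illustration, and samples are assumed to be floats between -1.0 and 1.0 (so 0 dB means full scale).

```python
import math

def peak_dbfs(samples):
    """Level of the highest absolute sample, in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")

def normalize(samples, target_dbfs=-1.0):
    """Apply one gain to the whole recording so the highest peak hits target_dbfs."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    gain = 10.0 ** (target_dbfs / 20.0) / peak
    return [s * gain for s in samples]

# Made-up samples standing in for a recording whose peaks are quite low.
quiet_take = [0.02, -0.05, 0.11, -0.08, 0.03]
louder = normalize(quiet_take, target_dbfs=-1.0)
print(f"before: {peak_dbfs(quiet_take):.1f} dB, after: {peak_dbfs(louder):.1f} dB")
```

Note that every sample gets the same gain, so normalizing makes the whole thing louder without changing the difference between loud and soft. That part is the compressor's job.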

1 Compressor, normalized

That looks better, but when I played it back, there was still a little more dynamic range than I wanted. So I got a little more aggressive with the compressor. This is how it looks with the threshold set at -35 dB and a 4:1 ratio:

-35 dB, 4:1


1 Compressor is applied

OK, now we're getting somewhere. What we are doing here is taking the loudest parts of the message and bringing them down closer to the softer parts. In this case we're down to just over a 3.5:1 ratio between the loud and soft. Again, we'll hit it with a normalizing filter and it looks like this:

1 with a normalizer

Now that's what I'm talking about! That will be super easy to listen to, and believe it or not, there's still plenty of dynamic range in the recording to easily tell when the speaker is emphasizing his words and when he's pulling back and speaking softly. The implied dynamics are there, but now I can actually hear what he's saying.

If this had been recorded at the proper level (-12 dB peaks), a -35 dB threshold would have been way too low, and would likely have sucked the life out of the recording. You don't want all the peaks at exactly the same level, because that just sounds weird. You will have to experiment with this to determine what sounds good for your system.
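To see why the threshold has to follow the recording level, here is a rough Python sketch of the textbook static compressor curve: below the threshold nothing changes, and above it every `ratio` dB of input yields 1 dB of output. It ignores attack, release, knee and make-up gain, so it is a simplification rather than what Audacity's compressor actually does. The function names and the -25 dB "recorded too low" peak are assumptions for illustration only.

```python
def compress_db(level_db, threshold_db, ratio):
    """Static downward-compressor curve: output level (dB) for an input level (dB)."""
    if level_db <= threshold_db:
        return level_db  # below the threshold the signal is left alone
    return threshold_db + (level_db - threshold_db) / ratio

def gain_reduction_db(level_db, threshold_db, ratio):
    """How many dB the compressor pulls a given input level down."""
    return level_db - compress_db(level_db, threshold_db, ratio)

# -25 dB is a made-up peak for a low recording; -12 dB is the target peak level.
for peak_db in (-25.0, -12.0):
    cut = gain_reduction_db(peak_db, threshold_db=-35.0, ratio=4.0)
    print(f"peak at {peak_db} dB -> {cut:.2f} dB of gain reduction")
```

With these numbers, the low recording's peak gets pulled down 7.5 dB, while a properly recorded -12 dB peak would be pulled down 17.25 dB by the very same settings. That is the "sucked the life out of it" territory.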

What is the lesson here? Whether you want to compress the recording on the way to the computer is up to you (though I think every speaking mic should be insert compressed for better performance live anyway), but the above examples give you a good idea of what you can do to maximize the listening pleasure of the audience after the fact. I have two presets set up in Audition (the software we use at church to record with) that first compress then normalize the signal. We save a .wav file, which is burned to a CD, then an MP3 which goes on the web. Since we started doing this, we've not had a single complaint about the volume level of the sermons on the CDs or on the web.

And again, I know I'm taking liberties with the concept of dynamic range here, and I know this is not, purely speaking, the most pristine way to record audio. So no flaming comments about the inaccuracy of my math or the flaws in my signal chain, OK?

Given the time constraints we're under to turn these recordings around, and the fact that the FOH engineer has enough to do mixing the service, never mind constantly tweaking recording settings, this works really well for us.

Try applying some post-recording compression this weekend. Best of all, if you don't like it, there's always "Undo!"