For well over twenty years, people working with audio have seen dither on hardware, digital audio workstations, and plug-ins, yet many fear to tread near it. Some have filed it away as unexplainable and irrelevant. Others proclaim startling “truths” about its miraculous powers. The good news, or perhaps bad, is that dither will not transform your musical mud into gold, nor will it topple your towering wall of sound into rubble and debris. So, relax a little, but not too much.
Let’s start with some facts that your parents probably never told you. Dither is nothing to fear. It’s nothing more than randomized noise. Yes, that’s it. Noise. It’s not a lot of it either. It’s typically quite low in level. Liken it to the sound of that subtle hiss from an old analog tape machine. This may lead you to ask the big question, “Why is it here?” It may seem strange that audio engineers, equipment manufacturers, and plug-in makers, who spend countless hours in pursuit of LESS noise, would willingly use something that adds it.
Threshold of Pain
People add dither as a necessary evil in digital audio, though it’s not always necessary. To understand when it is needed, we must know and accept what happens in the digital domain without it. When audio is digitized, its dynamic (level) resolution is limited by the system’s bit-depth value. The higher the bit depth, the more accurately the system can assign the values of incoming audio levels. Each bit yields roughly 6 decibels (dB) of dynamic range. So, a 1-bit PCM recording would only support digital levels from as high as 0 dBFS (0 decibels full scale, the ceiling in digital audio) to as low as -6 dBFS. This means that any signal quieter than -6 dBFS would cease to exist. That would be tragic! There are dynamic variations greater than 6 dB in many spoken sentences. Most modern digital audio workstations and recorders use 24-bit resolution (or higher), which provides about 144 dB (24 bits × 6 dB per bit) of dynamic range. This means only the signals quieter than -144 dBFS would vanish.
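The 6 dB per bit figure is a rounding of 20 × log10(2), which is about 6.02 dB. Here is a minimal Python sketch of that arithmetic; the function name `dynamic_range_db` is made up for illustration, and the result is the theoretical figure only, not what any particular converter achieves.

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of fixed-point PCM: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)

for bits in (1, 8, 16, 24):
    print(f"{bits:>2}-bit: {dynamic_range_db(bits):6.1f} dB")
```

The 16-bit and 24-bit results come out a touch above the round numbers of 96 and 144 dB used in this article, because each bit is worth slightly more than 6 dB.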
To give some perspective, the human hearing system can only handle about 120 dB of dynamic range safely (the threshold of pain resides between 120 and 140 dB above the threshold of hearing). Speaking of humans, most humans listen to digital audio at a reduced bit depth. CDs use 16-bit resolution, which yields 96 dB of dynamic range. Music streaming sites typically utilize codecs such as MP3 or AAC, which can further reduce resolution through a process called data compression.
Here’s the scoop on one common situation. You have a radiant, glorious mix that is ready to be burned to CD. That means that you need to convert your precious 24-bit data from your digital system to a 16-bit file for CD compatibility. You’re going to lose some bits in the process. Oh, your poor, precious bits. The loss of 8 bits, when truncating from 24 to 16 bits, yields a loss of dynamic range. Since 24-bit resolution supports 144 dB of dynamic range, but 16-bit only supports 96, any audio signals below -96 dBFS will be lost. That’s what happens without dither. That’s not exactly encouraging. Furthermore, due to the decreased resolution provided by 16 bits, the system may incorrectly plot low-level signals near the -96 dBFS cut-off point, rounding some levels up and some levels down. These errors are called quantization errors or quantization distortion.
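To see those two failure modes in numbers, here is a rough Python sketch. It assumes samples are floats between -1.0 and 1.0 and that the conversion simply rounds to the nearest 16-bit step; the helper names `to_16_bit` and `sine_at` are invented for this example, and a real converter may truncate rather than round.

```python
import math

def to_16_bit(sample: float) -> int:
    """Round a float sample (-1.0..1.0) to the nearest 16-bit code, no dither."""
    return max(-32768, min(32767, round(sample * 32768)))

def sine_at(level_dbfs: float, n: int = 64, period: int = 16) -> list:
    """A sine wave whose peak sits at the given dBFS level."""
    amplitude = 10 ** (level_dbfs / 20)
    return [amplitude * math.sin(2 * math.pi * i / period) for i in range(n)]

# A -100 dBFS sine never reaches half a 16-bit step, so every sample becomes 0.
lost = [to_16_bit(s) for s in sine_at(-100.0)]
print(set(lost))

# A -90 dBFS sine survives, but only as a crude staircase of -1, 0, and +1.
stepped = [to_16_bit(s) for s in sine_at(-90.0)]
print(sorted(set(stepped)))
```

The first signal is simply gone. The second is still there, but its smooth curve has been replaced by three coarse levels, which is the quantization distortion described above.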
When Should You Apply Dither?
This is where dither steps in and becomes the thing you’ll want to take home to meet the parents. Remember how dither is just low-level randomized noise? If you add dither at the right stage, that low-level randomized noise is mixed in with your glorious mix before the bit depth reduction of 24 to 16 bits. The noise randomizes the rounding, so a signal too quiet to register on its own now nudges the lowest bit up or down some of the time, and its shape survives in the pattern of those flips instead of vanishing below -96 dBFS (the cut-off point imposed by 16-bit resolution). The quantization errors mentioned previously don’t disappear, but they stop tracking the music; what would have been distortion becomes a steady, benign hiss. That’s great! “Wait just a tick,” you might be thinking as you contemplate the merits of extra noise in your hot track. No, it’s not going to ruin your mix. The dither noise sits down near the 16-bit noise floor (in the neighborhood of -90 dBFS for plain, unshaped dither), so it will be masked by the louder level of your original mix. So, you won’t hear the added noise. However, if there were an extremely quiet passage and your monitors were extremely cranked (do NOT do this), you might hear something that sounds rather like tape hiss. Yep, that would be the dither.
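If that sounds like hand-waving, a small experiment may help. The Python sketch below is one simple way to dither, not the method used by any particular plug-in: it assumes triangular (TPDF) noise spanning plus or minus one 16-bit step, added before rounding. All of the names (`tpdf_dither`, `requantize`, `recovered_amplitude`) are made up for this example.

```python
import math
import random

FULL_SCALE = 32768  # 16-bit steps per unit of amplitude

def tpdf_dither(rng: random.Random) -> float:
    """Triangular-PDF noise in units of one 16-bit step, spanning -1..+1."""
    return (rng.random() - 0.5) + (rng.random() - 0.5)

def requantize(samples, rng=None):
    """Round float samples to 16-bit codes; dither first if rng is given."""
    codes = []
    for s in samples:
        value = s * FULL_SCALE
        if rng is not None:
            value += tpdf_dither(rng)  # the noise goes in BEFORE rounding
        codes.append(max(-32768, min(32767, round(value))))
    return codes

def recovered_amplitude(codes, reference):
    """Estimate, in 16-bit steps, how much of the reference sine is in codes."""
    return 2 * sum(c * r for c, r in zip(codes, reference)) / len(codes)

n, period = 48000, 48
reference = [math.sin(2 * math.pi * i / period) for i in range(n)]
amplitude = 10 ** (-100 / 20)                    # a -100 dBFS sine
quiet = [amplitude * r for r in reference]

plain = requantize(quiet)                        # no dither
dithered = requantize(quiet, random.Random(1))   # seeded so runs repeat

print(amplitude * FULL_SCALE)                    # true amplitude, in steps
print(recovered_amplitude(plain, reference))     # nothing left to recover
print(recovered_amplitude(dithered, reference))  # close to the true amplitude
```

Without dither the -100 dBFS tone is rounded to silence. With dither, the output is mostly noise, yet the tone can be pulled back out of it at very nearly its original level, which is the whole point of the exercise.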
If you find yourself in the situation that requires you to convert from 24 to 16 bits, you need to know exactly when to apply dither. It should be applied at the absolute end of your digital signal path, before the bit-depth conversion occurs. If you are using a DAW and route everything to a master fader, dither should be applied on the master fader. If you have other processors such as EQs and compressors on your master fader, dither should be positioned after those processors. Dither may be available as a separate plug-in or integrated into “mastering” plug-ins and brickwall limiters. When you use it, set it to the destination bit depth (16-bit in this example).
Stream It or Beam It
When setting the dither’s bit depth, you may find selectable noise-shaping types. Noise shaping refers to the shape or frequency response curve of the dither noise. One noise shape will produce different frequencies at different levels when compared to another noise shape. Different noise shapes effectively shift the dither noise into different areas of the frequency spectrum. These are designed for various types of source content, from speech to full-range music. To find out if noise-shaping is right for you, consult your doctor or just read the dither device’s manual.
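For the curious, the simplest textbook form of noise shaping is first-order error feedback: whatever error was made on one sample is subtracted from the next. The Python sketch below assumes that approach, with samples expressed in units of one output step. Commercial noise shapers generally use more elaborate curves, so treat this as an illustration of the idea only; the function names are invented for the example.

```python
import math
import random

def tpdf(rng: random.Random) -> float:
    """Triangular-PDF dither in units of one output step."""
    return (rng.random() - 0.5) + (rng.random() - 0.5)

def requantize_flat(samples, rng):
    """Dither and round each sample on its own (no shaping)."""
    return [round(x + tpdf(rng)) for x in samples]

def requantize_shaped(samples, rng):
    """First-order error feedback: subtract the previous sample's error."""
    out, prev_error = [], 0.0
    for x in samples:
        target = x - prev_error          # pre-correct for the last mistake
        y = round(target + tpdf(rng))
        prev_error = y - target          # error made on this sample
        out.append(y)
    return out

def errors(codes, samples):
    return [c - x for c, x in zip(codes, samples)]

def mean_square(values):
    return sum(v * v for v in values) / len(values)

n = 4800
signal = [0.3 * math.sin(2 * math.pi * i / 48) for i in range(n)]
flat_err = errors(requantize_flat(signal, random.Random(7)), signal)
shaped_err = errors(requantize_shaped(signal, random.Random(7)), signal)

# Summed error is a rough stand-in for very-low-frequency noise.
print(abs(sum(flat_err)), abs(sum(shaped_err)))
# Total noise power: shaping raises it, but moves it up in frequency.
print(mean_square(flat_err), mean_square(shaped_err))
```

With shaping, the errors cancel almost completely when summed, meaning very little noise is left at the lowest frequencies. The price is more noise overall, pushed toward the top of the spectrum where it is harder to hear.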
There are those who claim, “CDs are for old codgers! Stream it or beam it, but don’t CD it!” Whether or not you pitch your tent in that thicket, download- and streaming-only releases are standard fare. So, back to that glorious 24-bit mix. If you’ll be going from a 24-bit source to a data-compressed file type such as MP3 or AAC, you may not want to add dither. This is because many data-compression algorithms, including the ones used for MFit (Mastered for iTunes), SoundCloud, and Bandcamp, are optimized for 24-bit source files. However, standard iTunes (not MFit) and CDBaby are designed for 16-bit files. You should dither for them. If you will be sending your mix to a mastering engineer, do not dither your bits. Send the mastering engineer your full-resolution, 24-bit (or higher) files. Dithering will be handled by the mastering engineer.
Now, armed with facts, let us address some lies and general chicanery surrounding dither. I was once told by a self-professed “audio guru” that dither fixes phase problems. This charming character claimed, “If you’re mixing layers of guitar tracks that are out of phase, bus them to a stereo channel, then dither that channel. The dither will fix the phase problems.” Real truth or fake news? Well, phase deals with timing. Adding noise will not change the timing relationships of individual tracks. So, it seems that the illustrious audio guru was a sneaky snake. Another falsehood you should decry is, “You can add dither to every channel in your DAW to make it sound like an analog console.” Lies! Dither is simply noise. Simple noise cannot emulate the complexities of the equalization, transformers, and harmonic distortion offered by analog consoles.
If you’ll be converting the bit depth up, then down, which may be necessary when integrating multiple software packages into your workflow, the decision to dither depends greatly upon the source and destination bit depths. Unfortunately, the number of variables involved (bit depths, number of conversions, pre-mixing or pre-mastering) means there is no single answer. The general recommendation is that if you’ll be converting from a high fixed-point bit depth to a low fixed-point bit depth (24 to 16, 16 to 8, etc.), you should dither. If you’ll be converting from a floating-point bit depth to a floating-point bit depth, you don’t need to dither. If you’ll be converting from 32-bit float to 24-bit (fixed), the usual argument for skipping dither is that 32-bit float only carries 24 bits of precision, with an 8-bit exponent that lets those 24 bits “float” up and down in level. Be aware, though, that this precision scales with the signal, so quiet passages in a float file can hold detail finer than one 24-bit step, and that detail is rounded away in the conversion; dithering here is the cautious choice. If you’ll be converting from 32-bit float to 16-bit (fixed), you should dither because you’d truncate 8 bits otherwise. So, “It depends.”
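The 32-bit float case is easy to poke at with a few lines of Python. This sketch uses only the standard `struct` module to store a number as a 32-bit float and read it back. It assumes the usual convention that full scale is plus or minus 1.0; the names `as_float32` and `STEP_24` are invented for the example. It is plain arithmetic and says nothing about what is audible.

```python
import struct

def as_float32(value: float) -> float:
    """Store a number as a 32-bit float and read it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

STEP_24 = 1 / 2 ** 23  # one 24-bit step when full scale is +/-1.0

# 24-bit sample codes fit in a 32-bit float exactly (edge cases checked here).
codes = (-(2 ** 23), -1, 0, 1, 2 ** 23 - 1)
print(all(as_float32(c * STEP_24) == c * STEP_24 for c in codes))

# A 32-bit float has 24 bits of precision, so a 25-bit odd integer does not fit.
print(as_float32(float(2 ** 24 + 1)) == 2 ** 24)

# A float sample of 0.3 of a 24-bit step is kept by float32...
tiny = as_float32(0.3 * STEP_24)
print(tiny != 0.0)
# ...but rounds to zero when converted to 24-bit fixed.
print(round(tiny / STEP_24))
```

So going from 24-bit fixed up to 32-bit float loses nothing, while the trip back down can round away very quiet detail that only existed in the float version.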
Conclusion
In conclusion, it’s generally advisable to dither only when converting from a high bit depth to a lower bit depth (32 to 24, 24 to 16, 16 to 8, etc.). Since that conversion should only be done once, dithering should only be done once. However, remember that if you will not be lowering the bit depth of your content, as is common when preparing files for mastering, Bandcamp, or MFit, you should not dither. So, keep your wits about you and be wary of wild claims. Dither wisely, dear readers.
2 Comments
The comment about reducing from 32-float to 24-fixed is demonstrably wrong. This is easy to prove in any good spectral editor where you can observe the effects of truncation distortion. Whether or not it's audible is another matter, and open to debate, but regardless, the correct way to reduce bit depth, whether from 32- or 64-bit float to 24-bit fixed or lower, is to use dither.
Excellent explanation of something I knew nothing about. I am one of those who had no idea what dithering was used for. As it turns out, I don't believe I've used dither in my recordings since I began digital recording somewhere around 1997. That's a long time to have been ignorant on the subject. As my recording has typically been voice-over for broadcast and sent as .wav and later .mp3 files, I apparently skated by with no issues.
Thanks for the article and the much delayed education!