Studio Monitors: Color Calibration for Sound
Why should you purchase studio monitors for audio production when you already own speakers? Working at B&H I’ve met a lot of photographers and videographers who understand the need for using a calibrated graphics monitor instead of their TV sets to establish accurate color values when making correction decisions. They get it. They know that audio engineers also need special-purpose gear to evaluate how loud material is when mixing. Meet the studio monitor.
Roughly speaking, in audio, the kind of color (i.e. "blue") is termed "frequency" and the color's intensity ("dark blue" vs. "light blue") relates to how loud any given frequency is -- the "amplitude." Frequency and amplitude are defining properties of "sound," which is a fundamentally physical event originating from molecules repeatedly vibrating outwards and inwards in mediums like liquids, gases, solids or plasmas. The number of times the molecules vibrate both outwards and inwards in one second is quantified in "hertz" or "Hz." One Hz is one "cycle" per second. Humans can hear molecular disturbances between about 20 and 20,000 cycles per second. Or you could say humans hear between 20Hz and 20,000Hz (20kHz). While we quantify how many times molecules move outwards and inwards in one second in Hz, we formally refer to this as a sound's "frequency," which we perceive as "pitch." 'Does it sound "high?"' 'Does it sound "low?"' Etc. The wave's amplitude or loudness is then determined by the "vigor" of the disturbance rather than the number of vibrations. When you look at a pulsating speaker "driver" -- a "woofer" or a "tweeter" -- you are, along these lines, actually regarding the generation of a range of molecular disturbances within the spectrum you know to be sound (20 Hz - 20 kHz) at a variety of amplitudes, based on the source material. The degree to which a speaker faithfully "replicates" frequencies in accordance with their actual amplitudes is the specific difference between speakers and studio monitors.
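To make the two properties concrete, here is a minimal Python sketch that builds a pure tone from a frequency and an amplitude. The function names, the 48 kHz sample rate and the 440 Hz test tone are illustrative assumptions, not anything taken from a particular audio tool.

```python
import math

def sine_wave(frequency_hz, amplitude, duration_s=1.0, sample_rate=48000):
    """Return samples of a pure tone.

    frequency_hz sets how many cycles occur per second (the pitch);
    amplitude sets how far each cycle swings (the level).
    """
    n = int(duration_s * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * i / sample_rate)
        for i in range(n)
    ]

def count_cycles(samples):
    """Count upward zero crossings, a rough tally of completed cycles."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)

# Hypothetical example: one second of a 440 Hz tone at half of full scale.
tone = sine_wave(440, 0.5)
peak = max(abs(s) for s in tone)   # tracks the amplitude argument
cycles = count_cycles(tone)        # lands within one cycle of the frequency
```

Changing the first argument changes how many cycles fit into the second; changing the second changes only how large each swing is.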
We use the term "frequency response" for measuring how much the amplitude of frequencies is altered upon playback by the speaker. Take a look at the two frequency response graphs below for a visual representation. The number of "dB(SPL)" the plotted lines diverge from the 0dB axis is the measurement of the speaker's prejudice in amplitude replication for any given frequency. ("dB" is just a generic logarithmic unit of measurement, and "dB(SPL)" is, more specifically, a logarithmic measurement of sound pressure level, which relates to how loud a sound is -- though we often abbreviate this as "dB.") You will notice the graph on the left has a plotted line that diverges significantly from the axis relative to the plotted line in the graph on the right. Which one looks like the studio monitor?
The graph on the right is, of course, the studio monitor and the one on the left the speaker. Notice though that even the studio monitor is not entirely flat in its replication. Monitors can diverge up to +/- 3dB from the amplitude of source frequencies. You will consequently still need to "learn" the loudness-nuances of your monitors, but, to put it in perspective, unless you have well-trained ears, you won't notice changes under 3dB without prior warning. With this standard in mind, studio monitors are highly accurate.
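To put a number on what a few dB means, the sketch below converts between decibels and amplitude (sound pressure) ratios using the standard 20 x log10 relationship, then checks a set of deviations against a +/- 3dB window. The function names and the sample deviation values are hypothetical and are not measurements of any real monitor or speaker.

```python
import math

def db_to_ratio(db):
    """Convert a level difference in dB to an amplitude (pressure) ratio."""
    return 10 ** (db / 20)

def ratio_to_db(ratio):
    """Convert an amplitude (pressure) ratio to a level difference in dB."""
    return 20 * math.log10(ratio)

def within_tolerance(deviations_db, tolerance_db=3.0):
    """True if every band deviates from flat by no more than tolerance_db."""
    return all(abs(d) <= tolerance_db for d in deviations_db.values())

print(round(db_to_ratio(3.0), 3))    # 1.413
print(round(db_to_ratio(-3.0), 3))   # 0.708
print(round(ratio_to_db(0.5), 2))    # -6.02

# Hypothetical deviations from flat, in dB, keyed by frequency in Hz.
monitor = {60: -2.5, 250: 1.0, 1000: 0.0, 4000: 1.5, 12000: -2.0}
speaker = {60: 7.0, 250: 2.0, 1000: -4.5, 4000: 3.5, 12000: -6.0}
print(within_tolerance(monitor))     # True
print(within_tolerance(speaker))     # False
```

In other words, a 3dB deviation corresponds to roughly 1.4 times (or 0.7 times) the original sound pressure, while halving the pressure amounts to about a 6dB drop.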
Speakers, on the other hand, replicate frequencies to easily audible degrees of difference from their original amplitudes. Such modification may be great for adding a compelling "thump" to the low-end of the song you are listening to on your home stereo, but it makes mixing a song on that set of speakers especially challenging. You could easily make the mistake of thinking you had the low-end "sitting" at a pleasing level when it could, in extreme cases, be closer to only half as loud as you thought it was! Conversely, a dip somewhere in the mids on the speakers you are mixing on could lead to an ill-informed decision to significantly increase the loudness of that portion of the frequency spectrum, resulting in an unwanted peak when the mix is heard on a different set of speakers without that particular frequency response characteristic. Either archetypal scenario will lead to an exaggerated mix heard through exaggerated speakers when what you are ultimately striving to present is a neutral mix to be enjoyed on exaggerated speakers.
But wait! There is yet another frequency response curve you must factor into your quest for a more neutral perspective: the frequency response of the room.
Returning to our color correction example, consider that you could be color correcting on a calibrated graphics monitor displaying accurate colors but still not be seeing true values due to the way they are modified by the lighting in the room before making it to your eye. The same holds true for audio, where it is the physical size and shape of the room that additionally amends your sonic experience. This occurs as waves reflect off surfaces and fold back in on themselves in ways that -- depending on the dimensions of the room relative to the dimensions of the wave -- will increase the loudness of some frequencies and decrease the loudness of others. This phenomenon is termed "constructive and destructive interference," and it means you could have a top-notch set of studio monitors emitting a signal that, after passing through an untreated room, ends up sounding, by the time it reaches your ear, like a signal emanating from those speakers you were trying to avoid mixing on in the first place!
Now imagine how much worse it would be trying to mix with speakers instead of monitors in an untreated room! It's this kind of scenario that can especially lead to the earlier example of perceiving frequencies at half or double their real amplitude.
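How the room's dimensions relate to the frequencies it exaggerates can be estimated with a simplified sketch. Between two parallel walls, the strongest resonances (often called axial room modes) fall at multiples of the speed of sound divided by twice the distance between the walls. The Python below assumes rigid, parallel walls and a speed of sound of 343 m/s (air at roughly room temperature); the function name and the 5 m room length are hypothetical, and real rooms behave in more complicated ways.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C

def axial_modes(distance_m, count=3, speed=SPEED_OF_SOUND_M_S):
    """Return the first `count` axial mode frequencies (Hz) between two
    parallel walls `distance_m` apart, using f = n * speed / (2 * distance)."""
    return [n * speed / (2 * distance_m) for n in range(1, count + 1)]

# Hypothetical room that is 5 m long.
for f in axial_modes(5.0):
    print(round(f, 1))
# 34.3
# 68.6
# 102.9
```

Frequencies near these values tend to sound louder or quieter depending on where you sit, which is why the low end is usually the hardest part of a small room to judge.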
To counteract our room, we acoustically treat our space to absorb some waves and spread others out until our room's frequency response levels off around the +/- 3dB range we look for in monitors.
The Master Handbook of Acoustics is a great resource for learning more about this science. However, short of actually mastering acoustics, solutions like RPG's Studio in a Box can prove handy for those needing a simple and relatively inexpensive method of achieving a "flatter" room. Another useful option is Auralex's free room assessment form. Follow the steps to detail the specifics of your space and Auralex will specify your room's problem frequencies in conjunction with what kind of treatment you should be using and where.
You can additionally harness modern room-correction technology that essentially takes a sonic imprint of your mix environment and corrects against the room's frequency response curve before emitting a signal. The idea is that the signal will be fairly accurate by the time it gets to you. I wouldn't recommend this solution by itself but would recommend it in conjunction with actual room treatment. If your room is already decent, digital room calibration like this could definitely take it to the next acoustic level. Two such examples are KRK's ERGO and IK Multimedia's ARC.
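The general idea of correcting against a room's response curve can be sketched as an inverse curve: wherever the room adds level, the correction removes it, and vice versa. The Python below is only a conceptual illustration with hypothetical names and made-up measurements; it is not how ERGO, ARC or any other product is implemented. The limits on boost and cut are also assumptions, included because very large boosts are a poor remedy for deep dips caused by cancellation.

```python
def correction_gains(room_deviation_db, max_boost_db=6.0, max_cut_db=12.0):
    """Return a per-band gain (dB) that opposes the measured room deviation,
    limited to the range [-max_cut_db, +max_boost_db]."""
    return {
        freq: max(-max_cut_db, min(max_boost_db, -deviation))
        for freq, deviation in room_deviation_db.items()
    }

def corrected_response(room_deviation_db, gains_db):
    """Deviation that remains after the correction gains are applied."""
    return {f: room_deviation_db[f] + gains_db[f] for f in room_deviation_db}

# Hypothetical room measurement: deviation from flat in dB, keyed by Hz.
room = {63: 8.0, 125: -10.0, 1000: 0.5}
gains = correction_gains(room)
print(gains)                            # {63: -8.0, 125: 6.0, 1000: -0.5}
print(corrected_response(room, gains))  # {63: 0.0, 125: -4.0, 1000: 0.0}
```

The 125 Hz dip is only partly recovered because the boost is capped, which illustrates why this kind of correction works best alongside physical treatment rather than in place of it.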
In any event, you will still need to learn your treated room's remaining amplitude distortions, and, ultimately, the combined characteristics of your studio monitors in your treated space so that you can begin interpolating "Good" mixes that sound good on all speakers in all rooms. This outcome is, as we have seen, a tall order when referencing information you can't trust!