The Adventures of Audio Post for Film
Early in 2008, a friend sent me an email with a link to a YouTube video. The video was a preview of a movie he was producing/directing called "Overnite Shift". I was immediately intrigued by the excellent look of this no-budget movie, and something about the characters piqued my interest. I did however have reservations concerning the audio, which to my ears was problematic. On getting in touch with him, he informed me that most of the film had already been shot, but that he was having issues with the sound person delivering on his commitments. The director and I had previously worked together, so he asked me if I wanted to become involved in the project.
My main focus is music, with a background that includes a commercially released CD along with client production, engineering and mastering credits. I also love a good movie and a challenge, so I agreed to join the team. Many years prior I had worked as the production-sound person on several professional projects, and therefore knew what was involved in the early stages of movie making, but I had little direct experience of audio post production. As it turned out, my involvement ended up covering virtually every aspect of the post process, from sound design through dialogue editing, Additional Dialogue Recording (ADR) and Foley to mixing. What follows is a sharing of my trials and tribulations, not a definitive "how to" article concerning audio post production. Hopefully, in the telling, some of my experiences may help others who dare to embark on a similar journey.
The 30+ minute movie tells the story of an older, immigrant New York City taxi driver who picks up a series of passengers during the evening of Halloween. Through witnessing their interactions we start to learn a little of his life, and in a confrontation with his final passenger, are reminded that life is not always linear: past misdeeds committed thousands of miles away (however innocently) can be exposed at any moment, allowing those ghosts to resurface and haunt us once again.
All the scenes had been filmed entirely in a moving vehicle (except where it had to stop for traffic lights, stop signals, or to let off passengers). And since there was no room in the taxi to fit an additional crew member, there was no dedicated production sound available. Production sound is what the dialogue editor and mixer use after the movie has been shot to complete the final audio track, and usually comprises clean dialogue and any ambient material that is relevant to a scene.
The camera person had initially experimented with lavaliere microphones and a dedicated audio recorder to try and isolate the dialogue, but because there was no hard sync between the camera and the recorder, post-synchronization became a problem. Also, hiding the microphones (such as in the talent's hair) was too time consuming, so the decision was made to go exclusively with on-camera sound. To add a further complication, the driver and his passengers were never filmed together - in fact, none of the actors ever met each other. They were all filmed on different days, in slightly different cars, resulting in dialogue that contained different underlying ambiences (mostly the road noise).
That meant all I had to work with was the recorded dialogue from the on-board microphone of a Panasonic AG-HVX 200 video camera, along with a few bits of ambient noise captured at various moments throughout the shoot, such as doors opening and closing and directional indicators. Using these resources and anything else I could get my hands on, my job was to clean up the dialogue and recreate the sense of a continuous ambience of a taxi as it travels the streets of New York City.
Since the film consists of an introduction and four chapters, the overall workflow fell conveniently into five distinct "reels". The director sent me a QuickTime movie of the first chapter (2nd reel), along with the individual audio tracks - the driver, the passenger and the music. I specifically requested that he export each audio file from the beginning of the video clip, regardless of whether there was any audio at that point. This meant that even in the absence of a time code reference, when imported into my editing system, all the files would automatically line up and be in sync. The files were too big for sending via email, so we used one of the many FTP sites that allow for free or low cost transfer of data, a method we employed for the rest of the project.
My editing system was based around a Digidesign 003 Rack using Pro Tools LE 7.4 software running under Mac OS X 10.4.11. Monitoring was via a pair of Dynaudio Acoustics BM6A powered speakers. In retrospect, it would have been wonderful to have access to some of the new features in Digidesign Pro Tools LE 8 software, especially the track-comping feature while doing ADR. However, since I started the project well before the new version was released, I followed the old adage – DO NOT UPGRADE YOUR SYSTEM in the MIDDLE of a TIME-CRITICAL PROJECT.
Another indispensable item was Digidesign Music Production Toolkit 2, which includes Smack LE, a wonderful sounding plug-in compressor featuring a side chain input that I used for ducking various elements under the dialogue, and Hybrid, a great integrated virtual synthesizer used for certain special sound effects. (Both of the last items are available separately.)
Having aligned everything into Pro Tools, I started work on cleaning up the first-passenger audio. I have many different noise reduction plug-in bundles, but my favorite is iZotope RX. As with any of these programs, it is impossible to completely clean up source material with a heavy noise component without damaging what it is you are trying to isolate. So compromises had to be made. Also, the program is very DSP intensive when working in real-time, introducing a substantial amount of delay (4094 samples) in the treated audio. However, this was easily resolved by selecting all the regions of the processed track, and spotting them back to their original location using the "Spot" function in Pro Tools.
iZotope RX processing
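To put the 4094-sample delay in perspective, here is a minimal sketch of the arithmetic, not a description of what Pro Tools or iZotope RX does internally. Only the 4094-sample figure comes from my session; the 48 kHz sample rate, the 24 fps picture rate and the region position are assumptions chosen for illustration.

```python
# Only PLUGIN_DELAY_SAMPLES comes from the article; everything else is assumed.
PLUGIN_DELAY_SAMPLES = 4094
ASSUMED_SAMPLE_RATE_HZ = 48_000   # assumption: the session rate is not stated
ASSUMED_FRAME_RATE_FPS = 24.0     # assumption: the picture rate is not stated


def delay_in_ms(delay_samples: int, sample_rate_hz: int) -> float:
    """Convert a delay expressed in samples to milliseconds."""
    return 1000.0 * delay_samples / sample_rate_hz


def delay_in_frames(delay_samples: int, sample_rate_hz: int, fps: float) -> float:
    """Convert a delay expressed in samples to picture frames."""
    return delay_samples / sample_rate_hz * fps


def compensated_start(late_start_samples: int, delay_samples: int) -> int:
    """Return the earlier timeline position that cancels a fixed delay."""
    return late_start_samples - delay_samples


if __name__ == "__main__":
    ms = delay_in_ms(PLUGIN_DELAY_SAMPLES, ASSUMED_SAMPLE_RATE_HZ)
    frames = delay_in_frames(
        PLUGIN_DELAY_SAMPLES, ASSUMED_SAMPLE_RATE_HZ, ASSUMED_FRAME_RATE_FPS
    )
    print(f"{ms:.1f} ms, {frames:.2f} frames")
    # Hypothetical region that ended up 4094 samples late on the timeline.
    print(compensated_start(100_000, PLUGIN_DELAY_SAMPLES))
```

Under those assumed rates the delay works out to roughly 85 ms, or about two frames of picture, which is more than enough to make dialogue look out of sync.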
It was at this point that the director decided that he was not happy with the driver's accent and wanted to replace it using an actor who spoke with a genuine Hungarian accent. He also voiced the idea of calling in all the other actors to replace all of their dialogue to allow for a consistent sonic imprint. (Note: in these instances, patience and a calm demeanor are a definite plus!)
ADR is a process whereby actors repeat their lines in a controlled environment, either while watching the original performance on a video screen or while simply listening to the original audio, repeating the lines over and over until a good performance is captured. It is generally used for dialogue that has undesirable intrusions or is too noisy, and there are several requirements for getting good results from it.
A very important requirement for capturing good ADR is to make the performing talent feel comfortable and to make sure they have good access to the audio and visual cues. The recording space needs to be as close to dead (acoustically speaking) as possible without being an anechoic chamber, something that is not a very comfortable environment to be in. Since the human ear is very sensitive to aural information gathered from reflections generated by the surrounding spaces, it is important to have as little of that information as possible in the recorded material. (Imagine trying to place re-recorded dialogue that sounded as if it had been spoken in a large hall into the interior of an automobile. It would sound completely wrong.)
In the end, the passenger ADR idea was shelved, because there was no money available in the dwindling budget to rent a properly equipped ADR studio. And the free spaces offered were far from comfortable or well isolated. However, the driver’s accent was still an issue for the director and so we eventually settled on a solution. Since the director’s first language was Hungarian, we would work with his voice and record the ADR in my studio. It would cost us nothing except time, and the fact that my studio was not really equipped for that kind of work would not be an issue for either of us.
Pro Tools does not have a dedicated ADR function, so a workaround had to be devised. The solution was to place accurate markers at the beginning of each line that needed to be replaced. A "floating" audio file was created, containing 4 evenly spaced audio beeps as countdown prompts. The last beep was muted and this floating audio file was then tail-sync-aligned to each marker using Command-Control-Click with the Time Grabber Tool. After a few practice runs of listening to the original audio (we had settled on the audio-only ADR method), Pro Tools was placed in loop record and as many passes as necessary were recorded.
Floating click file
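For anyone wanting to build a similar countdown file outside Pro Tools, the idea can be sketched roughly as follows. This is an illustration only: the beep spacing, length, pitch and level are assumed values, and I am also assuming the file is trimmed so that it ends exactly where the muted fourth beep would have started, so that tail-aligning it to a marker puts the silent "fourth beat" on the first syllable of the line.

```python
import math


def build_countdown(sample_rate=48_000, interval_s=1.0, beep_s=0.05,
                    freq_hz=1000.0, beeps=4):
    """Return mono float samples holding (beeps - 1) audible beeps.

    The file stops where the final (muted) beep would begin. All default
    values are assumptions, not settings taken from the original session.
    """
    interval_n = int(round(interval_s * sample_rate))
    beep_n = int(round(beep_s * sample_rate))
    samples = [0.0] * ((beeps - 1) * interval_n)
    for b in range(beeps - 1):
        start = b * interval_n
        for i in range(beep_n):
            # Short sine burst at half of full scale.
            samples[start + i] = 0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate)
    return samples


def tail_aligned_start(marker_sample, region_length):
    """Timeline position that makes the region END on the marker."""
    return marker_sample - region_length


if __name__ == "__main__":
    countdown = build_countdown()
    marker = 480_000  # hypothetical marker, 10 s into a 48 kHz session
    print(tail_aligned_start(marker, len(countdown)))
```

With the assumed one-second spacing, the three audible beeps fall three, two and one second before the marker, and the performer comes in on the silent beat.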
To capture the voice I mounted a Sennheiser MKH-416 shotgun microphone slightly above the director's head, pointing downward towards his mouth, in a position that would normally mimic a boom-microphone position when used on a movie set. Ideally, one wants to use the exact same or similar microphone that was used in the original shoot. In this instance, since it didn't make sense to once again rent a video camera, I went with a "classic" microphone that imparts, for want of a better term, a "movie" type sound.
The next step was to re-sync the new audio with the picture, again bearing in mind that the eyes and ears are very sensitive to things being out of sync. For this process I used a program called VocALign Pro from Synchro Arts. Although stripped down, VocALign Project does the same thing and would have been quite up to the job. The basic idea is that the software analyzes both the original and processed waveforms, and then places the new audio where it thinks it ought to go. While it generally works very well, file selection is important and sometimes shorter segments need to be processed. This was especially so in my case, where the original audio had content (noise) that was not present in the new signal and could throw off the original analysis. In extreme cases minute adjustments were made manually.
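To give a feel for what "analyzing both waveforms" can mean, here is a deliberately simple sketch. It is not VocALign's algorithm, which I cannot describe and which does far more than this. The sketch only estimates a single fixed offset between two signals using brute-force cross-correlation; the function name and the toy signals are made up for illustration.

```python
def best_lag(reference, candidate, max_lag):
    """Return the lag (in samples) at which candidate best matches reference.

    A positive result means the candidate is late by that many samples.
    Brute-force cross-correlation: fine for short toy signals, far too slow
    for real audio, and it cannot correct timing that varies within a line.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(candidate):
                score += r * candidate[j]
        if score > best_score:
            best, best_score = lag, score
    return best


if __name__ == "__main__":
    # Toy envelopes: the same "syllable" shape, three samples later in the new take.
    original = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
    new_take = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
    print(best_lag(original, new_take, max_lag=4))
```

Even this toy version hints at why file selection matters: anything present in one signal but not the other, such as road noise under the original dialogue, feeds into the score and can pull the best match away from where the words actually line up.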
Having finished these two jobs, we decided to have a progress check. The audio from the first ADR session was combined with the cleaned passenger audio, temporary car sounds were added, and a rough audio mix exported. This was then placed back into the director's video editing software for a large screen viewing. The first thing to become apparent was that although emotionally and contextually accurate, and with excellent sync, the sound of the director's ADR did not match the "look" of the original actor "age-wise". It is difficult to describe in writing, but, even with the process of "aging" the voice by pitch and formant manipulation using Celemony's Melodyne plug-in, the result was not quite right. It still amazes me how sensitive the ears are when presented with information that conflicts with visual stimuli.
However, as we watched the clip a far greater problem gradually emerged, which negated several months of intense audio editing. By the end of the 6 minute reel, the audio was completely out of sync with the video, by around 3 seconds. It didn't feel good! On the plus side, it was felt that the restoration of the passenger audio was on track, but the project could not continue until the issue of synchronization was remedied. What would need to be redone was not immediately obvious - but the answer to that will have to wait for Part 2.
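To show just how large that error was, here is a rough sketch of the arithmetic. It assumes the drift built up at a constant rate over the reel, and it uses an assumed picture rate of 24 fps, since I have not stated the actual one. It says nothing about the cause.

```python
def drift_ratio(total_drift_s: float, duration_s: float) -> float:
    """Fraction of running time by which audio and picture diverge."""
    return total_drift_s / duration_s


def seconds_until(target_drift_s: float, ratio: float) -> float:
    """Running time needed to accumulate a given drift at a constant ratio."""
    return target_drift_s / ratio


if __name__ == "__main__":
    ratio = drift_ratio(3.0, 6 * 60)   # about 3 s over a 6 minute reel
    one_frame_s = 1 / 24.0             # assumption: 24 fps picture
    print(f"drift: {ratio * 100:.2f}% of running time")
    print(f"one frame out after about {seconds_until(one_frame_s, ratio):.0f} s")
```

On those assumptions the drift is a little under 1% of running time, meaning a full frame of error would accumulate within the first few seconds of the reel.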
Jurek Ugarow is a technical writer for B&H, creating professional audio content for the web. He brings to the task over 40 years of experience in many aspects of the music and audio industry. His early days were spent as a musician playing London’s cabaret circuit and doing spots for the BBC. In the early 70s, he was a founding member of a British Arts Council funded, residential arts community, an endeavor that would eventually give birth to the renowned Foots Barn Theatre. Subsequent years were spent working in TV and film as a production sound engineer, and as a member of various English rock bands. These days he spends his time in his own studio, writing music, recording and producing other artists, and doing audio post for film. He also practices and teaches Tai Chi Chuan in New York City. His music can be found here.