HDSLR Guide Chapter 14: Post Production
In general, the post requirements for most of these cameras are quite similar. All shoot to flash cards in one of several major formats that are reasonably easy to transcode into an editable format. The choice is less about the media they produce than it is about the essential characteristics of that media: resolution, frame rate, and potential problems like noise and moiré.
At the time of this writing, the popular Canon EOS 5D Mark II was due for a firmware update to expand its frame rate options, after which the entire Canon range (the 1D Mark IV, 5D Mark II, 7D and T2i) will offer the same selection of basic frame rates. The fractional 23.98 and 29.97 rates are clearly intended for people who need to shoot for NTSC territories, while the 25 frame-per-second (fps) option will be useful for PAL people. We’ll discuss why these fractional frame rates are important below.
The availability of rates beyond 30 fps on cameras other than the 5D is useful for anyone who needs slow motion, although the reduced resolution and more problematic aliasing make these frame rates less attractive than they might have been.
Canon’s cameras record h.264 Quicktime at a data rate of around 40-45Mbps. That is quite generous: it is slightly more than a Blu-ray disc, which achieves very good subjective quality. The critical feature of the h.264 codec is that it is not intra-frame: it uses similarities between frames to achieve better compression. This works, but it also means that frames are not entirely independent. It can improve quality, but it is much harder work for both the computer and the camera; at a purely practical level, the camera's encoding is not nearly as well done as that of a commercially mastered Blu-ray movie.
Canon & Brightness
Users of Canon’s DSLRs need to take care over the way luminance is handled during any conversion process. Understanding what’s going on here means understanding a bit of history about how video signal levels are handled digitally, and how that differs from normal computer imaging practice.
When computers first started handling digitized raster images (that is, pictures made up of rows of pixels), the obvious approach was to represent each RGB component of each pixel as a number. Familiar practice puts these numbers in the range of 0-255 for an 8-bit image, where (0,0,0) represents full black and (255,255,255) full white. The problem arises when trying to encode video images, where a zero voltage level on a video signal line does not represent full black. Black is in fact a level somewhere above 0V. Because of considerations like this, the decision was made to represent digital video images between the levels of 16 and 235, where (16,16,16) represents black and (235,235,235) represents white. This is sometimes referred to as ‘studio swing,’ whereas the full range of data would be ‘full swing.’ Studio swing is specified in ITU-R Recommendation BT.709, among other places, and is the predominant approach used for YUV high-definition video.
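The relationship between the two ranges can be sketched as a simple linear rescale. This is a minimal illustration for 8-bit luma-style values only; the function names are hypothetical, and real converters also deal with chroma (which uses a slightly different range), rounding policy, and higher bit depths.

```python
def full_to_studio(value: int) -> int:
    """Map an 8-bit full-swing level (0-255) onto studio swing (16-235)."""
    return round(16 + value * 219 / 255)


def studio_to_full(value: int) -> int:
    """Map an 8-bit studio-swing level onto full swing (0-255).

    Values below 16 or above 235 land outside 0-255 and are clamped,
    which is where information can be lost.
    """
    scaled = round((value - 16) * 255 / 219)
    return max(0, min(255, scaled))


if __name__ == "__main__":
    for level in (0, 128, 255):
        print(level, "->", full_to_studio(level))
```

Going from full swing to studio swing squeezes 256 levels into 220, while going the other way stretches them; either direction is harmless if done once and interpreted consistently, and damaging if applied to data that was already in the target range.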
Skipping to the present day, we’re in a situation where this distinction is often not well handled. Some software assumes that anything that’s in YUV is studio swing, and anything that’s in RGB is full swing. This assumption often falls apart, particularly when decoding an h.264 Quicktime from a Canon DSLR, which feeds full-range image data to a codec that usually represents YUV data in studio swing. This will often result in the black and white levels of the image being misinterpreted, altering brightness and contrast and potentially destroying significant amounts of information. These shifts can happen at the point the original file is read, during the conversion, or when the output file is brought into the NLE software.
Anyone converting material from a Canon DSLR into a more editable format will usually hit one of a few situations, some of which can be detected only with test tools such as the waveform monitor.
Correct conversion: All the information from the input file is present in the output file, and the NLE interprets the output file appropriately, placing the original black at zero and white at 100% on a waveform display.
Premiere Pro will behave this way if the camera's original file is imported, laid on a timeline, and rendered out to an AVI with a codec such as Blackmagic’s 8-bit 4:2:2 HD uncompressed, then reimported into a project set up with a Blackmagic preset. Unfortunately, Adobe saw fit to omit the batch conversion feature that was present in older versions of Premiere, so it’s difficult to use Premiere Pro as a conversion utility unless the user is willing to manually convert clips one by one or simply lay all the clips on a timeline and render them out into one large file. If it's preferable to work in a compressed intermediate format, Cineform’s NeoScene encoder achieves correct results as well.
Expanded range: The output file contains all the information from the input file, with black still at 0 and white still at 255, but is interpreted as studio swing. This means that some image data, which uses the full 0-255 range, may lie outside the 16-235 studio swing range and will be clipped. This is visible on the waveform monitor:
Because the NLE will not display blacker-than-black or whiter-than-white material, the result appears high in contrast, with crushed shadows and clipped highlights. This can usually be repaired using something like a processing amplifier filter or even in general color correction, but this is time-consuming and not a very practical approach as a general workflow. This problem will occur when using command-line tools such as “ffmpeg” to convert to uncompressed YUV AVIs.
Cropped: The input file is incorrectly assumed to be studio swing, and all information outside the 16-235 range is ignored. Output file may be studio or full swing, but in either case information is lost.
This is particularly insidious because the image looks exactly like the “expanded range” problem, since NLEs will not render information outside the 0-100% range on the waveform monitor. However, there is no fix for this: information below 16 and above 235 is absolutely lost and cannot be recovered. This problem will occur in several situations, but most often when converting to an RGB output.
Almost right: This principally applies to Quicktime, which, depending on the environment in which it’s working, may apply a small gamma shift to the image.
Sometimes, this may be acceptable, especially if it was known about from the start and camera setups were made with it in mind. Any Quicktime-based application may do this, particularly the free MPEG Streamclip application. Because MPEG Streamclip is so easy and useful, and since this situation does not cost much information when going to a 10-bit output, this gamma shift is often overlooked.
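The difference between the "expanded range" and "cropped" cases described above can be illustrated with a handful of sample levels. This is a toy sketch, not a model of any particular NLE or converter: the helper names are hypothetical and the "proc amp" repair is reduced to a linear rescale of 0-255 into 16-235.

```python
def shown_by_nle(level: int) -> int:
    """What a studio-swing reader displays: 16 becomes black, 235 becomes
    white, and anything outside that window is clipped."""
    return max(0, min(255, round((level - 16) * 255 / 219)))


def crop_to_studio(level: int) -> int:
    """The 'cropped' failure: levels outside 16-235 are thrown away."""
    return max(16, min(235, level))


def proc_amp_repair(level: int) -> int:
    """Rescale 0-255 data into 16-235 so a studio-swing reader sees all of it."""
    return round(16 + level * 219 / 255)


camera_levels = [0, 8, 16, 128, 235, 245, 255]          # full-swing source
expanded = list(camera_levels)                           # stored untouched
cropped = [crop_to_studio(v) for v in camera_levels]     # clipped on the way in

# Both failures look identical on screen...
same_on_screen = [shown_by_nle(v) for v in expanded] == [shown_by_nle(v) for v in cropped]

# ...but only the expanded file still holds distinct shadow and highlight levels.
distinct_after_repair_expanded = len({proc_amp_repair(v) for v in expanded})
distinct_after_repair_cropped = len({proc_amp_repair(v) for v in cropped})

if __name__ == "__main__":
    print(same_on_screen, distinct_after_repair_expanded, distinct_after_repair_cropped)
```

With these seven sample levels, the expanded file keeps all seven values distinct after the repair, while the cropped file has collapsed them to three.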
Frame Rates & Audio
Anyone who hasn’t run into this situation before will very reasonably be asking why we care about complicated fractional frame rates, and thus why the 7D is important. It’s received knowledge that PAL TV runs at 25fps and NTSC at 30; in fact, there hasn’t been a video format that runs at precisely 30 frames per second since the U.S. got color TV.
This originated during development of the NTSC color system. It was necessary to maintain the compatibility of the color broadcast with existing black-and-white TVs, so engineers added an additional component to the radio broadcast to encode the color information, which monochrome sets would ignore. At the 30fps of the previous black-and-white system, aspects of this new component happened to collide with the radio spectrum allocated to the audio broadcast, producing audible interference. The only solution was to change the frequency relationship between the two: essentially, to detune one away from the other. The easy choice would have been to change the audio frequency, but that would have made existing monochrome TVs unable to receive audio with the new color system. Instead, a decision was made to alter the color encoding signal. Maintaining essential relationships within the signal then required a change to absolutely every other number in the system, including the frame rate. At the time, this was not viewed as a problem because the analog synchronization circuitry in most then-current TVs would absorb the change happily; in the long term, it has become one of broadcasting’s most notorious technical decisions.
The joys of drop-frame timecode, which skips frame numbers (not actual frames) in NTSC video streams in order to clean up the timing discrepancy over long periods, are one result of this choice. Strange fractional frame rates are another. A complete discussion of 3:2 pulldown is outside the scope of this guide, but information can be easily found on the Internet. The upshot is that for many years, theoretically 24p productions, especially in NTSC territories, were almost invariably shot at 23.98fps. This allowed NTSC postproduction to be used with no synchronization issues. Only if NTSC had actually been truly 30fps would 3:2 pulldown simulate 24fps exactly. There is no aesthetic difference between the two (humans usually can’t tell the difference between 25fps PAL and 24fps film, let alone 24 and 23.98), but this is one of the things that can cause serious postproduction headaches if someone gets it wrong.
It’s worth digressing into a brief discussion of sound recording to clarify this point. If we shoot at exactly 24 frames per second, then transfer our picture to NTSC video, it will be running about 0.1% more slowly, and this will happen today if you drop 24p DSLR footage into an NLE or transcoding app such as Compressor and interpret it to 23.98. Doing this does not alter anything beyond the rate at which the computer clocks out frames; it's merely taking the same stack of pictures and playing them back more slowly. However, while it isn’t immediately obvious, the picture is now running slower than the sound, resulting in a small but ever-growing synchronization error that must be corrected. Simply put, what might have been a 24p video stream with 48,000Hz audio is now a 23.98fps video stream with 47,952Hz audio, which isn’t legal in most audio systems. Most NLEs will do a quick and dirty correction for this that may not be suitable for mastering. It must be corrected by special audio software for proper results.
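The arithmetic behind those figures can be checked with a short sketch. It assumes the usual definition of the "23.98" rate as 24 x 1000/1001; the function names are hypothetical.

```python
from fractions import Fraction

# NTSC-related rates are the nominal rate scaled by 1000/1001.
NTSC_FACTOR = Fraction(1000, 1001)


def slowed_audio_rate(sample_rate_hz: int) -> Fraction:
    """Effective audio sample rate if sound is slowed by the same factor as picture."""
    return sample_rate_hz * NTSC_FACTOR


def sync_drift_frames(minutes: float, nominal_fps: int = 24) -> float:
    """Frames of sync error after `minutes` of playback when the picture is
    slowed to the fractional rate but the audio is left untouched."""
    playback_fps = nominal_fps * NTSC_FACTOR
    seconds = Fraction(minutes) * 60
    return float((nominal_fps - playback_fps) * seconds)


if __name__ == "__main__":
    print(float(24 * NTSC_FACTOR))           # about 23.976 fps
    print(float(slowed_audio_rate(48000)))   # about 47952.05 Hz
    print(sync_drift_frames(10))             # about 14.4 frames after ten minutes
```

A drift of roughly fourteen frames after ten minutes is more than half a second, which is why the error cannot simply be ignored on anything longer than a short clip.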
These days, for productions that will not be postproduced on NTSC video, the whole issue of fractional frame rates may be purely of historical interest. It is entirely reasonable to shoot on a Nikon that won’t do anything but 24p, then interpret it down to 23.98fps and retime the audio for an NTSC-rate master that can be delivered on DVD or for broadcast. It may also be easier to shoot the production at 23.98fps, although then producing a true 24p master (perhaps for theatrical exhibition) will require extra work. The potential permutations of this are too complex to enumerate individually. Suffice it to say that careful planning and engineering are required to avoid problems, and may affect which camera will be used for a shoot. The choices break down like this when shooting the progressive-scan video that DSLRs offer:
Broadcast non-drama in NTSC territories: shoot 29.97fps, but this is quite limiting: the picture must be electronically retimed for any other purpose, compromising quality. Consider shooting 23.98 anyway.
Drama for broadcast or theatrical in NTSC territories: shoot 23.98fps. 23.98 or 24fps material can be sped up for distribution in PAL territories without problems.
Broadcast non-drama in PAL territories: shoot 25fps
Drama in PAL territories: shoot 25fps. 25fps material can be slowed to 24 for theatrical release (or NTSC distribution with 3:2 pulldown, if progressive) without significant problems.
If in doubt, when in NTSC territories shoot 23.98 and be prepared to retime both picture and audio for PAL territories; or in PAL territories shoot 25fps and be prepared to retime for NTSC and theatrical. From this, we see that the 7D would be the most flexible choice since it handles 23.98, even if it didn’t have a selection of other rates.
If film must be shot with a camera that won’t produce the desired output frame rate (for instance, shooting on a camera offering only 24fps for delivery in a 25fps PAL territory), the first option is simply to play the shot at a different frame rate to the one at which it was recorded. Playing a 24fps shot at 25fps is actually done all the time when showing 24fps movies on PAL television systems, and the difference is almost invisible. This will require retiming the audio, and will only be an option if the difference in frame rate is small enough to get away with it, as it is from 24 to 25 but not from 30 to 24.
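To put rough numbers on "small enough," the sketch below works out the change in running time and, if the audio is simply played faster or slower along with the picture, the resulting pitch change in semitones. The function names are hypothetical, and the pitch figure assumes no pitch correction is applied.

```python
import math


def retimed_duration(duration_s: float, shot_fps: float, playback_fps: float) -> float:
    """Running time when footage shot at `shot_fps` is played at `playback_fps`."""
    return duration_s * shot_fps / playback_fps


def pitch_shift_semitones(shot_fps: float, playback_fps: float) -> float:
    """Pitch change of uncorrected audio sped up or slowed by the same ratio."""
    return 12 * math.log2(playback_fps / shot_fps)


if __name__ == "__main__":
    # A 90-minute programme shot at 24fps and played at 25fps.
    print(retimed_duration(90 * 60, 24, 25) / 60)   # 86.4 minutes
    print(pitch_shift_semitones(24, 25))            # about +0.7 semitones
    print(pitch_shift_semitones(30, 24))            # about -3.9 semitones
```

A shift of about 0.7 semitones passes unnoticed by most viewers, whereas nearly four semitones (and a 20% change in the speed of all motion) does not.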
Retiming pictures (for instance, stretching 30fps footage into slow motion without having to drop to a lower resolution to shoot at a higher frame rate) is more complex than retiming audio. However, the state of the art can produce reasonable results using a technique called optical flow interpolation. This technique involves tracking motion over small areas of the image and using that information to apply a warp or morph effect to move one frame toward the next.
The techniques involved are similar to those used by compression codecs to track and encode motion in a scene, allowing a frame to be intelligently generated representing a notional point in time where no original frame was recorded. These techniques are far more advanced than older conversion technologies that often created ghosting and other artifacts, but they will still fail when rendering very fast movement. They're especially bad with layers of fast-moving, semi-transparent objects such as fences.
This can be done on the desktop in Apple’s Shake, or with ReTimer or Twixtor, two After Effects plugins that will do optical flow interpolation. The process will never be perfect, and tests on a representative subject will reveal the sort of artifacts it may produce. Shooting with a reasonably fast shutter to reduce motion blur is a compromise that may produce a cleaner result since it is harder to interpolate blurred images. Similar techniques can actually be used to add motion blur to overly sharp frames. Avoiding problem subjects — repetitive patterns, fast moving objects, or multiple transparent layers — will also help.
Storage & Backup
Whatever is done for backups and reliability checks, the material will pass through and in many cases remain on hard disks. It's not so much the absolute reliability of hard disks that's the problem, but their tendency toward sudden and total failure. For this reason RAIDs are used, but even though they provide good protection against mechanical failure, they don't protect against human error.
At the same time, DSLR footage has an advantage in that it simply isn’t that big — it's measured in terms of minutes per gigabyte rather than the seconds per gigabyte of an uncompressed HD master. This makes approaches like writing to an optical disk practical. However, optical media are subject to both write errors and long-term storage issues associated with exposure to heat and light, and should be verified both after writing and periodically using the techniques discussed above.
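One common way to verify a copy is to compare checksums of the original and the backup. The sketch below assumes that is an acceptable form of verification for the workflow in question; the function names and the choice of SHA-256 are illustrative rather than anything prescribed by this guide.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """SHA-256 of a file, read in chunks so large clips need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source_dir, backup_dir) -> list:
    """Return relative paths of files that are missing from the backup
    or whose contents differ from the source."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(rel.as_posix())
    return problems
```

Calling `verify_copy("/path/to/card", "/path/to/backup")` (both paths are placeholders) returns an empty list when every file matches. Storing the digests alongside the backup would allow the periodic re-checks to run without the original card present.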
Higher end work has for several years been moving toward LTO tape. The standard is maintained by a consortium founded by Seagate, Hewlett-Packard, and IBM, an organizational structure that was designed to avoid the previous inter-brand incompatibilities between devices that even used the same physical tape cartridge.
LTO uses half-inch tape in a linearly-scanned format. This is important: older tape systems (though not SDLT) often used spinning helical scanners like those in video recorders and DAT tape decks, which can be a reliability problem. The reason LTO is used for high-end uncompressed work is that it is big and fast, with LTO-4 offering 800GB per tape at 80 or 120MB/s at the time of this writing. Companies like Cache-A and Quantum offer LTO in convenient FTP-attached drives that can be directly attached to anything with a spare Gigabit Ethernet port. After a small amount of IP address configuring, file access is then drag-and-drop. Later in post, a mag tape drive can also be used to back up larger, less compressed intermediate and final material that is difficult to handle in any other way.
Most HDSLRs produce either h.264 or, more rarely, MJPEG material in Quicktime or AVI format, respectively. Most NLEs, including Final Cut and Premiere, will to some extent handle these formats directly, but it isn’t a good idea to work that way. All compressed formats use computer horsepower in decompression, but h.264 is notoriously hard work for your edit station. In Final Cut, for instance, everything will need rendering, and Premiere Pro will crawl along on even the best hardware. The quality of DSLR video is also so compromised by compression that recompressing it, which is a requirement if you want to do more than cuts-only editing, is going to be incredibly toxic to image quality.
The solution to all these issues is to convert the material from h.264 or MJPEG to some other intermediate format. The choice of that format, and how one gets material into it, is a critical one. There are two choices to make at this stage: either work completely uncompressed, for a truly gold-standard finish, or live with a little compression to make things easier for the computer’s storage system.
To maintain maximum quality, the most obvious solution is to work with completely uncompressed material. Recompressing one codec with another, as would be done when using any form of compression, will always cost something. This is true even if the output codec has very high bandwidth.
Doing this immediately puts the filmmaker into the “seconds per gigabyte” range of storage consumption, but that isn’t very difficult to accomplish when terabyte hard disks can be purchased for a few hundred dollars. If all the original material has been backed up to another device, it might be decided that an unprotected RAID-0 array offers acceptable reliability. In this case it’s possible to cut uncompressed, full-resolution, 4:4:4 RGB, equivalent to just under 200MB/s at 24fps, on as few as four hard disks. These can be readily installed in most PCs and some Macs and configured with standard operating system tools. A purist would say that an external, rack-mounted storage array would be a better bet, but it can be done with less. HD capture board vendors such as Blackmagic and Aja ship their products with test utilities designed to stress hard disks in the same way they will be stressed during real work, allowing the user to evaluate what the system will tolerate. Free utilities such as HDTach are less ideal, but will provide a rough idea.
Getting a camera's original material into an uncompressed format can be done in several ways, bearing in mind the caveats outlined above concerning Canon cameras. Mac users would want to use Quicktime rather than AVI, but the procedure is otherwise identical. The resulting uncompressed material will be very large on disk at around 12GB per minute. However, it can be imported directly into an appropriately configured project in either Final Cut or Premiere Pro and worked on in all its 10-bit uncompressed glory without the risk of losing more.
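The storage figures quoted above can be reproduced with simple arithmetic. The sketch assumes 1920x1080 frames with 10-bit RGB packed into 32-bit words (4 bytes per pixel), which is one common layout and happens to match the numbers in the text; other packings will give different results. The function names are hypothetical.

```python
def uncompressed_rate(width: int, height: int, fps: float, bytes_per_pixel: float) -> float:
    """Data rate in bytes per second for uncompressed video."""
    return width * height * bytes_per_pixel * fps


def seconds_per_gigabyte(bytes_per_second: float) -> float:
    """How long one (decimal) gigabyte of storage lasts at a given data rate."""
    return 1e9 / bytes_per_second


# Assumed: 1920x1080, 24fps, 10-bit 4:4:4 RGB in 32-bit words.
master_rate = uncompressed_rate(1920, 1080, 24, 4)

# Camera original at roughly 45 megabits per second.
camera_rate = 45e6 / 8

if __name__ == "__main__":
    print(master_rate / 1e6)                        # about 199 MB/s
    print(master_rate * 60 / 1e9)                   # about 11.9 GB per minute
    print(seconds_per_gigabyte(master_rate))        # about 5 seconds per gigabyte
    print(seconds_per_gigabyte(camera_rate) / 60)   # about 3 minutes per gigabyte
```

The last two lines show the "seconds per gigabyte" versus "minutes per gigabyte" contrast directly: the uncompressed master consumes storage roughly 35 times faster than the camera original.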
Viewing this sort of material on a calibrated HD-SDI monitor really requires an uncompressed I/O board from a company like Blackmagic or Aja. Most vendors will ensure that their drivers provide project presets for NLEs that will allow things to work smoothly.
While work can be done solely on the computer monitor, a lower-cost option for external viewing is Blackmagic’s Intensity. This is a good solution if high end HD-SDI connectivity isn't needed, and allows the image to be viewed on a normal 1920x1200 TFT monitor at full resolution.
If the idea of the huge data rates involved in full-bore uncompressed video handling is unappealing, the alternative is to convert from the DSLR’s native format to a much less compressed format. This generally means something that doesn’t use inter-frame compression and has good NLE support, such as Avid’s DNxHD, Apple’s ProRes, or Cineform.
Technically speaking, DNxHD and ProRes are nothing new – they’re DCT codecs broadly similar to MJPEG, but with some enhancements on that standard. Cineform is a wavelet-based codec, which has, subjectively, better quality-per-bit performance than DCT at the cost of being harder work for the computer. Cineform is designed specifically for this sort of intermediate work and is compatible with Blackmagic I/O hardware to satisfy the display chain requirement. There are third-party and free video codecs that could be used as an intermediate codec, particularly including Morgan Multimedia’s MJPEG2000, which approximates Cineform. Consideration in this guide, however, will be limited to those codecs that are packaged and intended to be used as edit intermediates.
In most cases the choice will depend on the platform. Avid users will use DNxHD. Sorenson Squeeze comes with Media Composer and will convert DSLR footage to DNxHD Quicktime that will fast-import, although it may suffer the gamma offset common to many Quicktime applications with Canon footage. Without an Avid Log Exchange, it can be difficult to maintain timecode and other metadata in Avid projects, but with compressed files being comparatively easy to handle it may be easy enough to cut straight to the final material without an offline.
Final Cut users get ProRes support with their software and can use Compressor or MPEG Streamclip to convert.
Premiere users will have to spend a small amount of money on Cineform’s NeoScene converter. This is a good option, as Cineform has ensured that NeoScene reads Canon’s slightly unusual luminance encoding correctly. Those who prefer Final Cut can also choose this route, especially if they need to move material to Windows systems as well.
Most desktop edit environments are not equipped with calibrated monitors, or even monitors that can be calibrated. This may not cause a QC failure (although it can if black levels, color or gamma are wrong enough to be obviously incorrect), but it may cause the final material to look significantly different from the way it looked during the edit.
Most people who will use DSLRs for video work are either filmmakers or photographers, and both will be aware of the potential of postproduction adjustment toward both technical and artistic ends. There are two principal concerns here: producing a technically acceptable image, and creating an artistically appropriate one.
Grading for broadcast or theatrical release is not something that can be done to the very highest levels on the desktop with computer displays, especially as these differ between themselves: Apple and Windows-based machines have very different display gamma by design. In either case, TFT monitors lack the contrast to match industry standards, rarely exceeding 1000:1 in absolute contrast ratio (despite what a lot of advertising states).
Some DLP projectors, typically three-chip types, are much better, and can be brought into reasonable trim using commercially available calibration tools or services. This is probably the place to start. However, even the highest-end calibration tools, such as Filmlight’s Truelight system, often rely eventually on a side-by-side comparison of the corrected electronic output and a final medium such as 35mm projection, with manual adjustments to achieve a final match.
Doing a side-by-side comparison and adjustment session is a reasonable way to get a rough, eyeball calibration, even if the “grading facility” is nothing more than a desktop computer and a good DLP projector. This might require renting presentation equipment or having the material transferred to the intended output format, or taking whatever other measures might be required to view the material in its intended form. These sorts of fine color adjustments can be made in something like a Blackmagic HDLink, which offers conversion from HD-SDI as output by one of its I/O cards to DVI which might be used for a monitor or projector. Many HD-SDI boards offer similar facilities, and if the material will be viewed on a computer desktop, a lot of graphics cards provide at least some gamma adjustments in their driver software.
No matter which approach is taken, it’s critical to maintain consistency both in the hardware configuration and the viewing environment. Clearly it’s not a good idea to fiddle with colorimetry controls while work is going on, but high end grading also takes place in a room with controlled lighting and often a reference illuminated with color-controlled white light behind or near the display.
In the end, accurate color can become an obsession. It’s possible to get a reasonable result for purposes up to and including digital cinema projection using desktop equipment, and even if that result isn’t necessarily entirely accurate to what was seen during grading, it may still be acceptable. Grabbing those last few percentage points of accuracy is extremely expensive and for anyone other than the most exacting producers not really necessary in a world where cinema screens, televisions and computers are often poorly calibrated.
Using the Canon Picture Style Editor
All HDSLRs output compressed video, which can limit what can be done in post production because of the "lost" information. Although the cameras are designed to output a good quality image, the compression used will cause the image to break down if excessive image manipulation is attempted in post.
That's where Picture Style saves the day. Picture Style essentially allows the user to alter the way the image is recorded before it is compressed. This ability is unmatched in most prosumer video cameras. Used well, it can place the camera closer to the level of other digital cinema cameras by avoiding the need to do a lot of post processing, which otherwise reveals the true nature of the lossy compression.
The Canon Picture Style is featured here, but the same basic concepts apply to Nikon's equivalent feature, called Picture Control. The Nikon Picture Control utility must be purchased separately (about $145), while the Canon Picture Style Editor is included with all Canon HDSLRs.
Picture Style can control the hue, saturation, and luminosity levels of the video being recorded. These levels can also be adjusted selectively for just one color or a range of colors.
The Picture Style Editor allows the user to create custom Picture Styles to upload to the camera for use in the field. The editor is a mini-color grading application, except the color grading is done in-camera and before the footage is compressed.
- The process begins by snapping a picture in RAW of the scene or scenes that are to be shot in video. This, of course, assumes that the project has a certain degree of pre-planning. However, the user will still be able to take advantage of Picture Style by taking general reference images for more unplanned shoots, such as documentaries or weddings. After the RAW image has been taken, it's imported into the Picture Style Editor; and then the curves and settings are applied to the image until the exact look is achieved.
- Aside from applying overall curves to the way the image handles highlights, midtones and shadows, specific colors can be sampled from the RAW image and curves can be applied to that specific color. For example, a certain look can be applied to a scene without ruining an important color (e.g. skin tones).
- For even greater control, several images from the same scene can be used to cover more variables that will appear in the shot. Just save the Picture Style, import the next RAW image (from the same scene), refine the settings further, then save the updated Picture Style. This can be repeated as many times as needed. Ideally, different Picture Styles should be created for each change of scene.
- Once the Picture Style is created, use a USB cable to connect the camera to the computer that contains the Picture Style.
- Open the Canon EOS Utility software (included with each camera).
- Click on "Camera settings/Remote shooting" on the first screen.
- Set the camera to "Photo Mode."
- On the left control screen that appears, click on the "Register User Defined Style" under the "Shooting Menu" list.
- Choose the Picture Style 1 tab (unless this is already used).
- Click on the folder icon to navigate to and select the custom Picture Style.
- Click OK to save the Picture Style to the camera. It is now accessible in the Picture Style menu on the camera.
Note: If the custom Picture Style is more stylized than the "natural" look, a more accurate way to expose may be to set exposure using the "Flat" profile and then change to the custom profile.
For a detailed tutorial on how to use all of the settings in the Picture Style Editor, check out: http://www.canon.co.jp/imaging/picturestyle/index.html.
EOS Movie Plugin
The Canon EOS Movie Plugin-E1 for Final Cut Pro (FCP) lets editors transcode and ingest video footage from their Canon HDSLRs using the familiar Log and Transfer function in FCP.
Simply plug in your memory card, launch FCP, and transcode/import video clips through the familiar interface. Enter reel name, clip name, scene, shot/take, angle, and log notes for each clip. Set in and out points to grab only what you need from each clip. Select your preferred editing format and ingest your video.
The plugin also preserves camera metadata like shutter speed, ISO, and most importantly, time stamps.
It's a huge step forward for Canon HDSLR videographers. Canon H.264 video clips typically had to be transcoded into an editing format like Apple ProRes with Apple Compressor or another application before being ingested into FCP, a process that didn't preserve metadata. The EOS Movie Plugin-E1 preserves camera metadata and time codes, letting editors sync audio and B-roll shots with EOS footage. Finally, the plugin allows users to create a pristine disk image (DMG file) of each memory card for archival purposes.
The Canon EOS Movie Plugin-E1 for Final Cut Pro is free and available at Canon's website. Simply follow links to drivers and other software for HDSLRs http://www.usa.canon.com/cusa/consumer/products/cameras/slr_cameras/eos_5d_mark_ii_