Motion capture, or mocap, is a technique for digitally recording the movements of real things -- usually humans -- so that those movements can be played back with computer animation. The technique is used increasingly in film and in video games, perhaps most notably for the computer-generated character Gollum in the latter two Lord of the Rings movies.

A motion capture session records only the movements of the actor, not his visual appearance. These movements are recorded as animation data, which is then "mapped" onto a 3D model (of a normal human, a giant robot, or anything else) created by a computer artist, so that the model performs the same movements that were recorded.
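The separation between recorded movement and model can be illustrated with a minimal sketch. It assumes a much simpler setup than any real mocap pipeline: a flat, two-dimensional chain of bones, with the animation data stored as one rotation angle per joint. The function name pose_chain, the bone lengths, and the angles are all hypothetical and chosen only for illustration.

```python
import math

def pose_chain(bone_lengths, joint_angles):
    """Return 2D joint positions for a chain of bones.

    bone_lengths -- length of each bone, root to tip (part of the model)
    joint_angles -- rotation of each joint in radians, relative to its
                    parent bone (part of the captured animation data)
    """
    x, y, heading = 0.0, 0.0, 0.0
    positions = [(x, y)]
    for length, angle in zip(bone_lengths, joint_angles):
        heading += angle                  # rotations accumulate down the chain
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions

# One hypothetical captured frame: first joint raised 90 degrees,
# second joint bent back by 90 degrees.
frame = [math.pi / 2, -math.pi / 2]

human_arm = pose_chain([0.3, 0.25], frame)  # assumed human-sized bones
robot_arm = pose_chain([3.0, 2.5], frame)   # same motion, ten times larger
```

Because the frame holds only joint rotations, the same data poses both the human-sized arm and the robot-sized arm. Only the bone lengths, which belong to the model, differ.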

In the motion capture session itself, an actor, often a martial artist, dancer, or mime, wears a leotard with a number of reflective markers taped or glued to specific points all over his body. At least two cameras, and preferably an array of cameras, film the actor as he acts, or performs specific motions. The cameras report to a computer the exact position of each reflective marker, many times per second.
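One reason at least two cameras are needed is that a single camera can only tell in which direction a marker lies, not how far away it is. The following sketch shows one common way to combine two such directions into a 3D position, by taking the midpoint of the shortest segment between the two viewing rays. It assumes that each camera's position and the direction from it toward the marker are already known. Real systems must first calibrate the cameras and usually combine more than two views. The function name and the sample coordinates are hypothetical.

```python
def triangulate_midpoint(origin_a, dir_a, origin_b, dir_b):
    """Estimate a marker's 3D position from two camera rays.

    Each ray starts at a camera position (origin) and points toward the
    marker (dir). Measured rays rarely intersect exactly, so this returns
    the midpoint of the shortest segment joining them.
    """
    def dot(u, v):
        return sum(p * q for p, q in zip(u, v))

    w = [p - q for p, q in zip(origin_a, origin_b)]
    a = dot(dir_a, dir_a)
    b = dot(dir_a, dir_b)
    c = dot(dir_b, dir_b)
    d = dot(dir_a, w)
    e = dot(dir_b, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the marker position is ambiguous")
    s = (b * e - c * d) / denom   # distance parameter along ray A
    t = (a * e - b * d) / denom   # distance parameter along ray B
    point_a = [o + s * k for o, k in zip(origin_a, dir_a)]
    point_b = [o + t * k for o, k in zip(origin_b, dir_b)]
    return [(p + q) / 2 for p, q in zip(point_a, point_b)]

# Two hypothetical cameras two units apart, both sighting one marker.
marker = triangulate_midpoint([0, 0, 0], [1, 1, 5], [2, 0, 0], [-1, 1, 5])
```

In this example both rays pass through the point (1, 1, 5), so the estimate lands there. With noisy measurements the rays would miss each other slightly and the midpoint would serve as a compromise between them.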

Instead of such an optical system, a magnetic system can be used, in which the actor wears a number of sensors which detect a nearby magnetic field and transmit data on each sensor's inferred 3D position to the computer.

In particularly complex scenes shot with particularly expensive equipment, a motion control camera can pan, tilt, or dolly around the stage while the actor is performing. Either the camera's motions are tracked meticulously and fed to the computer, or the computer controlling the camera has been programmed with the motion control data in advance and the camera meticulously follows its directions.

The computer then uses software to post-process this mass of data and determine the exact movement of the actor, as inferred from the 3D position of each marker at each moment. Mocap data is notorious for requiring a great deal of human time to "clean up". A single sensor misreading might cause the computer to believe that the actor's arm was pointed straight up into the air for a fraction of a second, for example, when it was not.
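A small part of this clean-up can be automated. The sketch below shows one simple heuristic, not the method of any particular mocap package: a frame is treated as a misreading if the marker jumps implausibly far from the previous frame and then jumps back, and that frame is replaced by the midpoint of its neighbours. The function name remove_spikes and the max_jump threshold are hypothetical. A real threshold would depend on the capture rate and on how fast the actor moves.

```python
import math

def remove_spikes(frames, max_jump):
    """Replace single-frame spikes in one marker's trajectory.

    frames   -- list of (x, y, z) positions, one per captured frame
    max_jump -- assumed largest plausible movement between two
                consecutive frames, in the same units as the positions
    """
    cleaned = [tuple(p) for p in frames]
    for i in range(1, len(frames) - 1):
        prev, cur, nxt = cleaned[i - 1], frames[i], frames[i + 1]
        jump_in = math.dist(prev, cur)
        jump_out = math.dist(cur, nxt)
        # The neighbours must be close to each other; otherwise the marker
        # may really have moved and the frame is left alone.
        neighbours_agree = math.dist(prev, nxt) <= 2 * max_jump
        if jump_in > max_jump and jump_out > max_jump and neighbours_agree:
            cleaned[i] = tuple((p + q) / 2 for p, q in zip(prev, nxt))
    return cleaned

# A hypothetical wrist marker moving steadily along x, with one bad frame.
track = [(0, 0, 0), (1, 0, 0), (2, 50, 0), (3, 0, 0), (4, 0, 0)]
fixed = remove_spikes(track, max_jump=5)
```

Here the third frame, which briefly places the marker 50 units away, is replaced with a point between its neighbours. Longer glitches, swapped markers, and markers hidden from the cameras are not handled by this sketch and are among the reasons manual clean-up remains necessary.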

After post-processing, the computer exports animation data, which computer animators can associate with a 3D model and then manipulate using normal computer animation software such as Maya or 3D Studio Max. If the actor's performance was good and the software post-processing was accurate, this manipulation is limited to placing the actor in the scene that the animator has created and controlling the 3D model's interaction with objects. The animator does not have to move that particular model's arms and legs around manually -- the movement is already present in the animation data.

Motion capture equipment is expensive. The digital video cameras, lights, software, and staff needed to run a mocap studio can cost many tens of thousands of dollars, and this investment can become obsolete every few years as better software and techniques are invented. Some large movie studios and video game publishers have established their own dedicated mocap studios, but most mocap work is contracted out to independent companies that specialize in mocap.

Mocap offers several advantages over traditional computer animation of a 3D model:

  • Mocap can take far fewer man-hours of work to animate a character. One actor working for a day (and then technical staff working for many days afterwards to clean up the mocap data) can create a great deal of animation that would have taken months for traditional animators.
  • Mocap can capture secondary animation that traditional animators might not have had the skill, vision, or time to create. For example, a slight movement of the hip by the actor might cause his head to twist slightly. This nuance might not be imagined by a traditional animator, but it would be captured accurately in a mocap session, and this is the reason that mocap animation often seems shockingly realistic compared with traditionally animated models.
  • Mocap can accurately capture difficult-to-model physical movement. For example, if the mocap actor does a backflip while holding nunchucks by the chain, the cameras will capture both sticks of the nunchucks moving in a perfectly realistic fashion. A traditional animator might not be able to simulate the movement of the sticks adequately, because it depends on the other motions of the actor.

On the negative side, mocap data is very difficult to manipulate once captured and processed. If the data is wrong, it is often easier to throw it away and reshoot the scene than to try to manipulate the data, as could be done easily with traditionally animated computer models. Another important point is that it is common and comparatively easy to mocap a human actor in order to animate a biped model, but this is only because humans are patient, will take direction, and will not attempt to lick markers off their bodies, unlike dogs, cats, tarantulas, and so forth. Motion capture is not typically used for non-biped 3D models for this reason. It may not even be suitable for animating bipeds with superhuman powers, due to the mocap actor's inability to fly through the air, morph his fists into hammers, and so forth.

Motion capture can be applied to animals or even things like cars or trucks, though humans constitute the vast majority of mocap actors.

Video games increasingly use motion capture for animation such as the movement of a football or basketball player or the combat moves of a martial artist.

Movies have increasingly used motion capture as computer-generated animation has replaced traditional cel animation, and as it has become fashionable to use completely computer-generated creatures, such as Gollum and Jar Jar Binks, in live-action movies.

Facial motion capture is sometimes used to digitally capture the complex movements of a human face, especially while speaking. This is generally performed with an optical setup using a single camera at close range, with small reflective markers glued or taped to the actor's face.

Due to current technology limitations, a motion capture session records the movement of only a few key points on the actor's body, where the sensors or reflective markers are placed. Future technology might include full-frame imaging from many camera angles that would record the exact position of every inch of the actor's body, clothing, and hair for the entire duration of the session, resulting in a higher resolution of detail than is possible today.