

Basic Components of a Synthetic Movie

The basic components of a synthetic movie are the description of the movie and the description of the objects in the movie. The semantic content of the movie is transmitted in the movie description. This is not to say that the object view generated from the object description does not convey information, but that the presence of that particular view is dictated by the movie description.

The actual structure of the components depends on the implementation of the synthetic movie. For example, a videodisc based synthetic movie uses a very different movie description from a computer graphic animation. If intraframe synthesis is used, the movie description does not describe the contents of individual frames, but rather their sequencing. Alternatively, if interframe synthesis is used, the movie description must contain a description of the contents of each frame.
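
As a rough illustration, the two forms of movie description might be sketched as the following data structures. The field names and layout are hypothetical, chosen only to make the distinction concrete; they are not taken from any particular synthetic movie system.

    # A minimal sketch, using hypothetical field names, of the two forms a
    # movie description may take.  Neither structure is drawn from an actual
    # synthetic movie system; they only make the distinction above concrete.
    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class IntraframeDescription:
        # Sequencing only: the prerecorded image sequences to be played, in
        # order.  The frames themselves are stored at the receiver.
        sequence_order: List[str] = field(default_factory=list)

    @dataclass
    class FrameContents:
        # Contents of one frame: which objects appear, and where.
        placements: List[Tuple[str, int, int, float]] = field(default_factory=list)

    @dataclass
    class InterframeDescription:
        # One content description for every frame of the movie.
        frames: List[FrameContents] = field(default_factory=list)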

The object descriptions, typically being relatively large, are usually stored locally at the receiver. They may be pre-transmitted or provided on local mass storage such as videodiscs, CD-ROMs, or hard disks. The movie description may either be included on the mass storage device or transmitted separately.

The Object Description

The object descriptions contain representations of the objects used in the image sequence. The complexity of the receiver is determined largely by the complexity of the object descriptions. The desirable qualities of the object description are low complexity, small size, and manipulability. Several classes of object representation exist, each with advantages and disadvantages.

3D Representations

A common class of object representations describes an object by specifying a three dimensional surface and the reflectance at every point on the surface. These representations differ in how the three dimensional shape is described. Examples of these representations are polygonal patch descriptions and particle representations. Another common class of object representations attempts to describe objects as combinations of primitive geometrical objects. Examples of this class are constructive solid geometry (CSG) and superquadric representations.

These representations tend to be very manipulable. The two dimensional view required for display by conventional display devices must be rendered from the three dimensional object description. In this rendering step, any desired object view, using any lighting and lens parameters, may be generated.
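
The heart of the rendering step is the projection of the three dimensional description onto the image plane. The following sketch shows only that projection for a single polygonal patch, assuming a pinhole camera with an arbitrary focal length; lighting, shading and hidden-surface removal are omitted.

    # A toy sketch of the rendering step for a polygonal patch description:
    # project the 3D vertices of one patch through a pinhole camera onto the
    # image plane.  Lighting, shading and hidden-surface removal are omitted,
    # and the focal length is an arbitrary illustrative parameter.
    from typing import List, Tuple

    Vertex = Tuple[float, float, float]      # (x, y, z) in camera coordinates

    def project_patch(vertices: List[Vertex], focal_length: float = 1.0):
        """Return the 2D image plane coordinates of each vertex."""
        projected = []
        for x, y, z in vertices:
            if z <= 0.0:
                raise ValueError("vertex behind the camera")
            projected.append((focal_length * x / z, focal_length * y / z))
        return projected

    # Example: a square patch two units in front of the camera.
    patch = [(-1.0, -1.0, 2.0), (1.0, -1.0, 2.0), (1.0, 1.0, 2.0), (-1.0, 1.0, 2.0)]
    print(project_patch(patch))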

The description of the motion (including deformation) of realistic objects presents a difficult problem, because most real objects (in particular living objects) are non-rigid or, even more problematically, deformable. Non-rigid objects are typically represented as articulated objects, with links between rigid sub-parts. An acceptable representation of deformable objects has been the subject of much recent research [Terzopoulos88][Platt88]. One new deformable object representation, promising because of its low computing requirements, describes the deformations using a modal analysis [Pentland89].

The rendering stage that gives these representations much of their manipulability is computationally expensive, and well beyond the limited computing power presently available in personal computers. Other representations are currently implementable, and have their own advantages.

A 2D Image Representation

A simpler representation is a limited view representation, in which a set of 2D views of the object, taken from a limited number of viewing angles, is stored. This representation provides simple, realistic images and is very fast, since little or no computation is required.

The control of this representation is very limited. The only manipulations possible are translation, scaling, and simple 2D remappings of the object views. Generating an object view not explicitly contained in the object description is not a simple problem. If a view similar to the one desired is present in the object description, it could be used, but calculating an error metric to select the closest approximation may be impossible, given the large number of possible deformations. Interpolation to obtain views not contained explicitly in the object database may be possible.
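
The following sketch illustrates one way such a fallback might work, assuming (purely for illustration) that the stored views are indexed by a single viewing angle; the view whose angle is nearest the requested one is substituted.

    # A minimal sketch of view selection in a limited view representation.
    # It assumes, purely for illustration, that the stored views are indexed
    # by a single viewing angle in degrees.
    from typing import Dict

    def closest_view(views: Dict[int, bytes], requested_angle: float) -> bytes:
        """Return the stored view whose angle is nearest the requested one."""
        def distance(stored_angle: int) -> float:
            # Angular distance on a circle: 0 and 350 degrees are 10 apart.
            d = abs(stored_angle - requested_angle) % 360.0
            return min(d, 360.0 - d)
        best_angle = min(views, key=distance)
        return views[best_angle]

    # Example: views stored every 45 degrees; a request for 100 degrees
    # falls back on the 90 degree view.
    stored = {a: f"view_{a}".encode() for a in range(0, 360, 45)}
    print(closest_view(stored, 100.0))    # b'view_90'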

The methods of encoding the views differ. Full color (either RGB or YIQ) images may be used, or the images may be color quantized to compress them. Additionally, the object view data may be entropy or run-length encoded to further lower the bandwidth. The redundancy present between adjacent views should allow a large amount of compression if more advanced image compression techniques were used, such as multiscale VQ or other multi-channel techniques [Netravali89]. Multiscale (pyramid) encoding would allow superb overlaying and would simplify the adjustment of image scale and focus, but would require an extremely powerful computer.
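
As an illustration of the simplest of these techniques, the following sketch run-length encodes one scan line of a color quantized object view, storing runs of identical color map indices as (count, index) pairs. The routine names are illustrative only.

    # A minimal sketch of run-length encoding one scan line of a color
    # quantized object view: runs of identical color map indices are stored
    # as (count, index) pairs.  The routine names are illustrative only.
    from typing import List, Tuple

    def run_length_encode(pixels: List[int]) -> List[Tuple[int, int]]:
        runs: List[Tuple[int, int]] = []
        for pixel in pixels:
            if runs and runs[-1][1] == pixel:
                runs[-1] = (runs[-1][0] + 1, pixel)
            else:
                runs.append((1, pixel))
        return runs

    def run_length_decode(runs: List[Tuple[int, int]]) -> List[int]:
        return [pixel for count, pixel in runs for _ in range(count)]

    line = [0, 0, 0, 0, 7, 7, 3, 3, 3, 3, 3]
    assert run_length_decode(run_length_encode(line)) == line
    print(run_length_encode(line))      # [(4, 0), (2, 7), (5, 3)]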

The Movie Description

The movie description describes the position of objects in the scene over time. This description may be generated or modified locally in response to user input or input received from external devices. The movie description is the ``essence'' of the image sequence. It is what needs to be transmitted to recreate the original image sequence at the receiver. If an image detail (such as which flag is flying over the fort) is not included in the movie description, it will not be present in the synthesized movie. The receiver, of course, must have the required object descriptions.

Intraframe Movie Descriptions

Intraframe descriptions specify the sequence of images to be displayed. Each image must be one of a large set of images present in the object descriptions. Typically, a sequence of images is associated together for display, with the ability to enter and leave the sequence at any particular frame. The movie description specifies the order in which the sequences are viewed.
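
The following sketch shows one hypothetical form such a description could take: named sequences of stored frame numbers, played in a specified order, with the option of entering and leaving the resulting sequence at an arbitrary frame.

    # A minimal sketch, with a hypothetical structure, of an intraframe movie
    # description: named sequences of stored frame numbers, played in a given
    # order, entered and left at arbitrary frames.
    from typing import Dict, List, Optional

    def play_order(sequences: Dict[str, List[int]],
                   order: List[str],
                   enter_at: int = 0,
                   leave_at: Optional[int] = None) -> List[int]:
        """Return the stored frame numbers to display, in display order."""
        frames: List[int] = []
        for name in order:
            frames.extend(sequences[name])
        return frames[enter_at:leave_at]

    clips = {"walk": [10, 11, 12, 13], "turn": [20, 21]}
    print(play_order(clips, ["walk", "turn", "walk"], enter_at=2, leave_at=8))
    # [12, 13, 20, 21, 10, 11]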

In many intraframe synthetic movies, the movie description is actually a set of constraints and procedures for manipulating the sequences. Hypermedia, for example, provides a set of procedures for browsing through information, with the ability to establish and examine links to related information [Brøndmo89]. The course of the movie is determined by the user input received, possibly prompted by additional visual or aural cues.

Interframe Movie Descriptions

Interframe descriptions describe the actual contents of the image sequence in a frame by frame manner. The computer graphics community has developed several synthetic movie scripting languages for describing their animations [Reynolds82][Feiner82][Fortin83]. These languages differ in many respects, but share several features in common.

The movie description may not be explicitly defined in some interframe synthetic movies. Instead, these movies are driven directly by software state machines or constraint systems. Indeed, examples of these may be more numerous than explicitly defined movies; they include interactive video games, flight simulators, interactive graphical simulators [Zeltzer88], and many user interface applications.
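
As a rough illustration, the following sketch drives a trivial "movie" from a software state machine: each user input advances the machine, and each step yields the content description of the next frame. The states, inputs and object names are invented for the example.

    # A minimal sketch of an interframe synthetic movie driven by a software
    # state machine rather than an explicit script.  The states, inputs and
    # object names are invented; each step yields the content description of
    # the next frame.
    def state_machine_movie(inputs):
        state, x = "standing", 0
        for key in inputs:
            if key == "right":
                state, x = "walking", x + 1
            elif key == "stop":
                state = "standing"
            # The frame description: which object view to composite, and where.
            yield {"object": "figure", "pose": state, "position": (x, 0)}

    for frame in state_machine_movie(["right", "right", "stop"]):
        print(frame)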


