Previous: Contents Up: MagicEight: System Description Next: The MagicEight Virtual Machine

Introduction

MagicEight is a computing system developed to enable the construction of parallel computers for processing media: audio, text, and video, among others. It supports medium- to coarse-grain parallelism, using a dataflow model of execution, on a range of machine architectures scaling from a single von Neumann or general purpose processor (GPP) to networks of several hundred heterogeneous processors.

It is a common assumption that since general purpose processors will be able to deliver performance on the order of 800 MOPS in a year or two, digital video processing will shortly be done exclusively using software executing on a single GPP, in a manner similar to digital audio. This assumption ignores the computational requirements of digital video. While six channels of uncompressed audio require only 4.2 Mbits/sec, an uncompressed video channel today requires between 147 and 21,000 Mbits/sec. Note also that while the audio data rate has been set by the response of the human auditory system, even high-definition video is not within an order of magnitude of the human eye's response, and thus the video data rate will be technology limited for many years to come.
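The low ends of these figures can be checked with simple arithmetic. The sketch below is illustrative only: the audio format (44.1 kHz sampling, 16 bits per sample) and the video format (640x480 pixels, 16 bits per pixel, 30 frames/sec) are our assumptions, chosen because they reproduce the 4.2 and 147 Mbits/sec figures quoted above. The function name `data_rate_mbits` is hypothetical.

```python
def data_rate_mbits(samples_per_sec, bits_per_sample, channels=1):
    """Uncompressed data rate in Mbits/sec (1 Mbit = 10**6 bits)."""
    return samples_per_sec * bits_per_sample * channels / 1e6

# Assumed audio format: 44.1 kHz sampling, 16 bits per sample, six channels.
audio_rate = data_rate_mbits(44_100, 16, channels=6)

# Assumed low-end video format: 640x480 pixels at 30 frames/sec,
# 16 bits per pixel, one channel.
video_rate = data_rate_mbits(640 * 480 * 30, 16)

print(f"audio: {audio_rate:.1f} Mbits/sec")
print(f"video: {video_rate:.1f} Mbits/sec")
print(f"video/audio ratio: {video_rate / audio_rate:.0f}")
```

Under these assumptions even the smallest video channel carries roughly 35 times the data of six audio channels.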

Since a single GPP cannot provide the required performance, the simplest approach is to combine a number of general purpose processors, operating in parallel and interconnected using either shared memory or a message passing mechanism. A MagicEight system is designed to efficiently coordinate the management of resources and the execution of applications on such a parallel processor.

Parallel Systems

There are two fundamental issues encountered in building a parallel processor computer system [AI87] :

  1. The non-deterministic latency associated with accessing shared memory in a multiprocessor
  2. Obtaining efficient synchronization across multiple processors

An imperative algorithm specification (i.e. a sequence of instructions to be executed sequentially), while an ideal means of controlling a single von Neumann computer architecture, provides little opportunity for parallel execution. Instruction level parallelism across small numbers of instructions is available, but larger amounts of parallelism (allowing multiple processors to be utilized) are not available unless supported explicitly by the algorithm. This requires the programmer to determine the parallelism, in many cases fixing the granularity at compile time for a particular machine. The fundamental issues listed above are particularly difficult to overcome using an imperative algorithm specification.

A fine-grained dataflow specification provides the maximum available execution parallelism for a given algorithm, and directly addresses the fundamental issues listed above. Unfortunately, this parallelism comes at a cost of low performance in real implementations due to the high overhead of synchronizing (matching, or scheduling) every instruction.
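As an illustration of where this overhead comes from, the following toy interpreter (a sketch of the general dataflow firing rule, not MagicEight's scheduler) evaluates `a*b + c*d`. Every instruction is a node that may fire only once all of its operand tokens have arrived. The names `GRAPH`, `run_dataflow`, and `match_checks` are hypothetical.

```python
import operator

# Each node maps to (operation, operand names). Operand names that are not
# nodes themselves are external inputs.
GRAPH = {
    "t1": (operator.mul, ["a", "b"]),
    "t2": (operator.mul, ["c", "d"]),
    "out": (operator.add, ["t1", "t2"]),
}

def run_dataflow(graph, inputs):
    """Fire nodes in waves; a node fires once all its operands are present."""
    values = dict(inputs)      # tokens produced so far
    pending = dict(graph)      # nodes that have not fired yet
    waves = []                 # nodes fired together could run in parallel
    match_checks = 0           # operand-matching tests performed
    while pending:
        ready = []
        for name, (_op, args) in pending.items():
            match_checks += 1
            if all(arg in values for arg in args):
                ready.append(name)
        if not ready:
            raise ValueError("deadlock: some operands never arrive")
        # Compute the whole wave before publishing its results, so nodes in
        # one wave only consume tokens from earlier waves.
        results = {}
        for name in ready:
            op, args = pending[name]
            results[name] = op(*(values[arg] for arg in args))
        values.update(results)
        for name in ready:
            del pending[name]
        waves.append(sorted(ready))
    return values, waves, match_checks

values, waves, match_checks = run_dataflow(GRAPH, {"a": 2, "b": 3, "c": 4, "d": 5})
print(values["out"], waves, match_checks)
```

The two multiplications are found to be independent without any help from the programmer, but the interpreter performs more matching tests than useful arithmetic operations. That ratio is the cost referred to above.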

Hybrid Dataflow

Several of the refinements to static dataflow are attempts to make use of data and instruction locality. In particular, many hybrid dataflow schemes propose a scheduling quantum larger than a single instruction [Bab84] [Ian88]. Sequences of instructions that execute in a determinate manner (operating only on data local to the processor) are the basic unit of scheduling and synchronization. These hybrid schemes attempt to minimize the amount of synchronization used, while still providing an acceptable level of parallelism. The number of basic instructions in the sequence varies from one (fine-grained dataflow) through the tens and hundreds (medium) up to entire applications (coarse). Note that the machine used to execute the sequences may itself use multiple processing units to exploit the instruction level parallelism available within the sequence.
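The trade-off can be made concrete with a small calculation. The sketch below assumes, for illustration only, that each scheduled sequence costs exactly one synchronization and that all sequences are independent; the function name `num_sequences` is hypothetical.

```python
import math

def num_sequences(total_instructions, grain_size):
    """Number of scheduled sequences for a given grain size.

    Under the stated assumptions this is both the number of synchronization
    events and an upper bound on the number of processors that can be kept
    busy at once.
    """
    return math.ceil(total_instructions / grain_size)

for grain in (1, 100, 1_000_000):
    print(f"grain {grain:>9}: {num_sequences(1_000_000, grain):>9} sequences")
```

At a grain of one instruction every instruction is synchronized; at the other extreme a single sequence needs no synchronization but offers no parallelism. Medium grains sit between the two.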

We extend hybrid dataflow with the introduction of streams: multidimensional arrays of relatively small (8 to 1024 bits) scalar data elements. Streams may be divided into appropriate granularities for execution at runtime, and provide a convenient framework for performing synchronization.
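A minimal sketch of this idea follows, assuming a stream is a two-dimensional array of 8-bit pixels that is split by rows at runtime according to the number of workers available. The functions `split_stream`, `invert_rows`, and `process_stream` are hypothetical, and Python threads merely stand in for processing elements; this is not the MagicEight stream interface.

```python
from concurrent.futures import ThreadPoolExecutor

def split_stream(num_rows, num_chunks):
    """Return (start, stop) row ranges covering num_rows in near-equal chunks."""
    num_chunks = max(1, min(num_chunks, num_rows))
    base, extra = divmod(num_rows, num_chunks)
    ranges, start = [], 0
    for i in range(num_chunks):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges

def invert_rows(rows):
    """Example per-element operation: invert 8-bit pixel values."""
    return [[255 - pixel for pixel in row] for row in rows]

def process_stream(frame, num_workers):
    """Choose the granularity at runtime, then process the chunks in parallel."""
    ranges = split_stream(len(frame), num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(invert_rows, frame[a:b]) for a, b in ranges]
        # One synchronization point per chunk rather than per element.
        chunks = [future.result() for future in futures]
    return [row for chunk in chunks for row in chunk]

# A small 10x4 test frame of 8-bit values.
frame = [[(r * 4 + c) % 256 for c in range(4)] for r in range(10)]
print(split_stream(len(frame), 3))
print(process_stream(frame, 3) == invert_rows(frame))
```

The same program runs unchanged with one worker or many; only the chunk boundaries change, which is the property that makes streams a convenient unit for both scheduling and synchronization.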

Specialized Processors

Building a system with the required processing power for digital video using only a network of GPPs is a costly and unwieldy solution. We propose instead that some of the processing elements be specialized processors -- capable of executing a restricted set of algorithms much more efficiently than a general purpose processor. Some examples of specialized processors are SIMD and MIMD arrays of processing units, well suited to such tasks as matrix multiplication, convolution, and vector distance calculations (VQ or Motion Estimation), or a reconfigurable processor, loaded with application specific functionality.

Specialized processors are supported in MagicEight by the use of dynamic scheduling and dynamic linking of shared libraries. At runtime, an algorithm is mapped onto the most appropriate processing elements available in the executing system (i.e. those that execute it fastest). Multiple-architecture shared libraries allow an operation to be defined once per architecture, so that system-specific processors can be supported easily.
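The selection step might look like the following sketch. It assumes a library that holds one implementation of each operation per processor kind, each tagged with a relative cost estimate. All names here (`Implementation`, `LIBRARY`, `register`, `select`, the processor kinds, and the cost values) are hypothetical and are not taken from the MagicEight design.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass(frozen=True)
class Implementation:
    processor: str   # hypothetical processor kind, e.g. "gpp" or "simd_array"
    cost: float      # assumed relative execution time; lower is faster
    run: Callable

# Stand-in for a multiple-architecture shared library:
# operation name -> implementations for different processor kinds.
LIBRARY: Dict[str, List[Implementation]] = {}

def register(operation, processor, cost, run):
    LIBRARY.setdefault(operation, []).append(Implementation(processor, cost, run))

def select(operation, available):
    """Pick the fastest implementation whose processor is present at runtime."""
    candidates = [impl for impl in LIBRARY.get(operation, [])
                  if impl.processor in available]
    if not candidates:
        raise LookupError(f"no implementation of {operation!r} for {sorted(available)}")
    return min(candidates, key=lambda impl: impl.cost)

def dot_loop(xs, ys):
    total = 0
    for x, y in zip(xs, ys):
        total += x * y
    return total

def dot_array(xs, ys):
    # Stand-in for code that would run on a SIMD array.
    return sum(x * y for x, y in zip(xs, ys))

register("dot", "gpp", cost=10.0, run=dot_loop)
register("dot", "simd_array", cost=1.0, run=dot_array)

small_system = select("dot", {"gpp"})
large_system = select("dot", {"gpp", "simd_array"})
print(small_system.processor, large_system.processor)
print(large_system.run([1, 2, 3], [4, 5, 6]))
```

The application names only the operation; which implementation runs is decided when the set of available processors is known, so the same program can use a specialized processor when one is present and fall back to a GPP otherwise.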

This Document

The MagicEight system model defines how an application and the data it manipulates are represented, as well as the functionality provided by MagicEight. The view of the machine seen by the MagicEight programmer is defined in the first section, and the data structures described there are defined in the Data Structures section. It is intended that the user program be written in a higher level intensional language such as Lucid, and compiled into a MagicEight "program". The syntax of a MagicEight program object is the topic of the last section of this document.

Reference is made throughout this document to the resource manager. It is intended that this resource manager be an integral part of a MagicEight system, not a separate entity. MagicEight is the resource manager, as well as the operating system.

 



magiceight-web@media.mit.edu