(Tree-Structured) Interpreter Pattern

(from Design Patterns)


Given a language, define a representation for its grammar along with an interpreter that uses the representation to interpret sentences in the language.


Problem instances can be expressed as a few primitives assembled by a few different combination rules. That is, problem instances are sentences in a simple language. For example, regular expressions, programming languages, graphical scenes, and documents.



Use a class to represent each primitive and each combination rule. Represent a problem with a tree whose leaves are primitive objects and whose internal nodes are combination rule objects.

Solve the problem by traversing the tree. Each node has a method which receives the traversal state and performs some work, possibly modifying the traversal state or delegating some of the work to children. Calling the root node with the initial traversal state starts the traversal.



The research on implementing interpreters is quite extensive.

Shared nodes
Since the same object can be used in multiple places in the tree, nodes should avoid private state. They should also minimize public state, to promote sharing. Instead, the contextual information they need should be passed in as part of the traversal state. See the Flyweight pattern from Design Patterns.

Default actions
Adding new actions is easier if each node provides a default action behavior. This is most effective for nodes which only serve to delegate to their children, write to the traversal state, or maintain constraints.

Dependency analysis
Dependency analysis is figuring out what state elements are read or written by a node. It is a step in all of the techniques which follow.

This analysis is not possible in general since it may depend on the values of the state elements themselves. However, while underestimation of the read set can cause errors, there is no functional disadvantage to overestimation. (The opposite holds for the write set.) Thus one makes conservative approximations.

Dependency analysis can be performed as a traversal action, with each primitive conservatively reporting its dependencies and each combination rule conservatively combining the dependencies of its children. (Since dependency analysis can be used to accelerate any traversal action, it can be used to accelerate dependency analysis!)

Liveness analysis
Traversal can be accelerated by not computing elements of the traversal state that will not be read before they are changed again. This can be done by tagging each element as "dead" or "alive" based on the action being performed or the output of dependency analysis.

Effect caching
When the same action is applied repeatedly to a tree, time can be saved by caching the work done by unmodified subtrees. If neither the subtree nor the relevant traversal state has changed since last time, then the effect of traversing the subtree must also be identical, and can simply be recalled from memory. This optimization requires dependency analysis, a mechanism for recording side-effects, and a notification mechanism for detecting changes.

Partial evaluation
Partial evaluation generalizes effect caching. The idea is to create multiple implementations of a subtree, each specialized on particular conditions of the traversal state. The specialized implementations are more efficient by taking advantage of those known conditions, i.e. they have already been "partially evaluated." The specialized implementation is then used whenever the required conditions hold. Effect caching essentially uses an exact-match condition along with a full evaluation of the subtree.

Tracing causes
Dependency analysis allows interacting with the tree via its output. For example, if the tree represents a graphical scene, dependency analysis can tell you which node determines the color of this pixel. If the tree represents a program, dependency analysis can tell you which computation led to this number.

Dependency analysis permits parallelism. For example, the children of a node could be partitioned into independent sets which can be traversed in parallel. This usually means that one set does not read state modified by another set. Thus another optimization to perform on a tree is to make subtrees independent.

Lazy evaluation
Lazy evaluation generalizes liveness analysis. Instead of using conservative dependency information to decide what to compute, you can compute nothing until it is explicitly asked for. Instead of computing a value to put in the traversal state, a node simply inserts a pointer to itself. The pointer is a proxy, as in the Proxy pattern. When another node tugs on this pointer, the original node finally computes the value and uses it to replace the pointer.

Unfortunately, lazy evaluation is expensive since the original node must compute using the old traversal state (the state it was called with). Therefore, don't use lazy evaluation unless many elements of the traversal state go unused and this is not detectible by dependency analysis.

Anecdote: Lazy evaluation was the basis of a rejected proposal for VRML2 called ActiveVRML. They planned to use the highly-optimizing Haskell interpreter.

See Design Patterns for sample code. For dependency and liveness analysis, see any book on compilers, e.g. Compilers: Principles, Techniques, and Tools. For caching and partial evaluation see the book Partial Evaluation and Mixed Computation. For parallelism and lazy evaluation see Implementing Functional Languages: A Tutorial and the excellent MIT course 6.847 (Dataflow Architectures and Languages).

Dynamic programming is often used as an alternative to tree traversal. Each node (now called an instruction) has conditions which enable it to run. The idea is to maintain a queue of instructions whose conditions are satisfied (the "run queue"). The interpreter pops an instruction off of the queue, executes it (modifies the program state), then pushes any new instructions which have just been enabled and pops any instructions which no longer apply. This approach is used in microprocessors, optimizers, blackboards, rewriting systems, and backward chaining expert systems.

State machines are a further optimization where all possible queues are modeled as states and changing the queue is modeled as a state transition. Regular expression parsing is usually done with a state machine.

Known Uses

The Interpreter pattern is a very powerful and well understood pattern, yet it has not been used much outside of programming languages. Hopefully that will change.

Related patterns

(from Design Patterns)

The Composite pattern is a simple pattern that says "primitives and composites conform to the same interface." Windows uses this technique for assembling user interface elements. We also saw this in Python, where both primitive values like integers and composite values like tuples had the same interface. This allows you to build arbitrarily complex networks of objects. The Interpreter pattern specializes the Composite pattern to decoupled tree structures. The way an Interpreter uses the tree is also very stylized (divide-and-conquer).

The Visitor pattern is really just an idiom for adding methods to classes. Instead of distributing the implementation of a particular method out to different classes, you put all of the implementations in one class (call it the "method class"). The method class has one method for every implemented class. Each class then gets one "extension method" which receives a method class and calls the appropriate method.

Method class:

MethodForA(A& a) { ... }
MethodForB(B& b) { ... }
MethodForC(C& c) { ... }

Class A:

Extension(MethodClass& m) { m.MethodForA(*this); }

Unfortunately, using the Visitor pattern then makes it hard to add new classes. A better solution is to use a language with multiple dispatch, like CLOS.

Thomas Minka
Last modified: Tue Apr 25 09:59:24 GMT 2006