PLE lecture notes -- Sather

Sather is a statically type-safe language which places great emphasis on efficiency. Its support for programming-in-the-large is similar to CLU, via parameterized classes and iterator abstractions.

(Sather suffers from a mild form of Algol 68's term-itis, by using "concrete type" instead of "class", "abstract type" instead of "class type", "routine" instead of "method", "attribute" instead of "instance variable", etc. We will try to stick to the more familar terms.)

Focal points

Behavorist object-orientation
Two-level typing: classes (concrete types) and class types (abstract types)
Parameterized types
Structured coroutine abstraction ("iters")

Values

Standard classes:

BOOL

Booleans are initialized to false.

x : BOOL;
x := true or false;

CHAR

Characters are initialized to '\0'.

x : CHAR;
x := 'a';

STR

x : STR;
x := "concat" "enation";

INT, INTI, FLT, FLTI

The 'I' suffix denotes infinite precision. Numbers are initialized to 0.

x : FLT;
x := 2.5 * 2.0;

ARRAY{T}

Array indexing starts at zero.

x : ARRAY{INT};
x := | 3, 4, 5 |;

TUP{T1, T2, ...}

Tuples can have up to four elements.

x : TUP{INT, FLT, BOOL};
x := TUP{INT, FLT, BOOL}::create(2, 2.5, true);

As in Self, all values are treated as objects. For example, 2.max(3) evaluates to 3, and 2.square evaluates to 4. (Max and square are methods in the INT class.)

For the purpose of efficiency, Sather distinguishes between value classes which are immutable and may be stored on the stack and reference classes which are mutable and are stored on the heap. Value classes, like BOOL, CHAR, and INT, are automatically initialized. Reference classes, however, are initialized to NULL; the programmer must create an object for them to point to via an explicit constructor call. While Sather claims safety by omitting pointers, it admits the possibility of core dump by using an uninitialized instance of a reference class.

Sather supports a generalized form of function pointer called a "bound routine". The generalization is that some of the arguments to the function can be fixed; only the remaining arguments need to be provided. For example, the bound routine #ROUT(1.plus(_)) is the successor function. Underscores denote arguments which have yet to be filled in. Unfortunately, this bound routine cannot be called as a function; we must use the call method:

#ROUT(1.plus(_)).call(2)   -- evaluates to 3

Object-orientation

Like Self, Sather uses "behaviorist" member access: there is no distinction between accessing a variable and calling a method. However, Sather does not unify the syntax for these two; instead it makes variable access (such as A.x := 3) syntactic sugar for method call (A.x(3)). Similarly, arithmetic in Sather (such as 1 + 2) is syntactic sugar for method call (1.plus(2)). We will return to the sugar vs. uniformity issue when we compare Haskell with Scheme.

Two-level typing

Sather differs most dramatically in its use of types with object-orientation. Given that neither Python nor Self enforce static type-safety, there is a question about whether it is needed. Static typing's goal is to shrink the set of allowable programs. The use of types in Sather is argued from the standpoint of security and speed:

Security because common, small errors will produce disallowed programs.
Speed because types give the compiler extra information and restrict programs to a subset which is known to be efficiently implementable.

However, these points are arguable. Keep in mind these replies:

Security is a matter of style, and a decision by the programmer, not the language. Languages should support checks, but not make them mandatory.
The extra information given to the compiler by type annotations is mostly redundant, because these annotations have been shown to be automatically recoverable in most programs (cf. Haskell).
Program restriction is inherently fragile, because software and hardware advances continually change the notion of "efficiently implementable." Indeed, many of the "efficient" constructs in Sather were once considered too inefficient to put in a language. Chaining a language to a particular technology stamps a date on the language, essentially making it "obsolete from birth." This practice exacerbates the explosion of languages.

Sather supports typing on two levels. Instances are typed by the methods they support; these methods are specified by the instance's class. In addition, classes may be typed by the signatures of some of the methods that they specify. Class types correspond to explicit language syntax for the abstract classes in C++, hence the Satherism "abstract type." For example, string objects support methods for subscripting and for comparison; the STR class lists implementations of all of these methods. The type of STR, therefore, includes both $ELT (method signatures for subscripting) and $IS_LT (method signatures for comparison). This allows strings to be used as arrays or as sortable entities, in a statically checkable way. Classes provide information about the method implementations that an instance will use. Class types provide information about the method signatures that an instance will use. A class must provide implementations for all of the signatures in its type(s), and it may provide more. Classes and class types are distinguished syntactically; both are in uppercase but class types are led by a dollar sign.

Inheritance

Sather has two notions of inheritance, corresponding to these two levels. Since classes implement methods, classes inherit from one another by literally copying those implementations. Sather uses an include statement for this, to reinforce the textual substitution semantics. Any method overlap will be resolved in favor of the including class, and free names in the included code will be resolved with respect to the including class. This makes Sather's code inheritance resemble Algol 60's call-by-name, or higher-order functions with dynamic scoping. Like these obsolete mechanisms, it seems to be at odds with modularity. Since nearly all object-oriented languages use this inheritance semantics, these remarks apply to them as well. (There will be more discussion in the treasure hunt.)

Class types inherit from one another by a subtype relation, which is similar to the textual inclusion used for classes. However, method signatures have no free variables so there is no name capture issue. Subtyping is used for polymorphism: any class of type $S which inherits from type $T can be passed to a procedure which expects a class of type $T, because every class of type $S must implement the signatures in $T. Classes which simply borrow code from one another are not interchangeable in this way.

Polymorphism

For example, suppose we want to write a general-purpose array sorting routine. If we declare its signature as

sort(s :
ARRAY{STR})

, then we can only sort arrays of strings. An array of another class C, which happens to borrow code from STR, say for regular expression matching, will not suffice. The reason is that C is not necessarily required to implement comparison properly, just because STR did. Instead, what we want to sort is an array of "any class which implements comparison." We do this in Sather by using a class type: sort(s : $IS_LT). Any classes which claim to implement comparison properly advertise the $IS_LT type.

Thus the type graph is a programmer construction intended to provide detailed control over polymorphism. We can specify exactly which methods we require arguments to support, irrespective of how classes are implemented (i.e. how they share code). Sather supports this by allowing edges in the type graph to be added at will, even outside of the type definitions, as long as the connected types conform in their method signatures. This is in line with the purpose of typing: programmer-defined restriction, not language-defined restriction.

Efficiency vs. generality

Sather allows names to be typed using either classes or class types. Using a class means that only instances of that class are allowed, period. There can be no polymorphism. Using a class type allows an instance of any class advertizing that type. New classes which claim that type can thus be attached to the name; the name is polymorphic. For example, in

x : BOOL;
y : $BOOL;

the variable y is polymorphic; x is not.

Why is there a provision to prevent polymorphism? This seems to be a rather dangerous language feature, as it tempts programmers to write non-reusable code. The Sather documentation claims that "this is an important source of efficiency:" By declaring a variable with an abstract or a concrete type, the programmer may decide to pay the price for routine dispatch or to restrict the generality of the code by precisely specifying the object type that the variable can hold. Of course, a compiler could for itself determine whether dispatch is needed, depending on how the variable was used. Compilation "hints" like Sather's concrete typing and value vs. reference classes are reminiscent of C's obsolete register annotation, which prevents use of the address-of operator.

Parameterized types

Sather supports parameterized classes and class types, in a manner reminiscent of CLU. For example, the standard ARRAY{T} class takes one class parameter, T. Parameters may be typed, just like function arguments; ARRAY does not require T to support any methods, but for example a SORTED_ARRAY class might require its parameter to be of type $IS_LT.

Parameterized types are type-functions: given a list of types, they return a type. Similarly, their arguments can be typed, and they can be overloaded. Thus we see the development of a "type language" paralleling the value language. Increasing the complexity of the type system attempts to regain the expressiveness lost by typing. However, there remain limitations:

A parameter of a type-function cannot be restricted to being a concrete vs. abstract type.
Since functions must take a fixed number of arguments, there is no way to create a general TUP type-function. Thus Sather only allows four arguments to TUP.

Structured coroutines

A coroutine is a routine which, conceptually, runs concurrently with its caller, and can communicate back and forth. In sequential languages, coroutines are implemented by procedures which can make resumable returns in addition to normal returns. The coroutine and its caller explicitly pass control to each other, and in doing so can transfer values.

Sather supports a restricted form of coroutine, called an "iter", similar to the iteration abstraction in CLU. Iters are restricted because their lifetime is bounded by the loop ... end construct in which they are used. Iters are "structured coroutines" in the same sense that the while loop is a "structured goto". This also distinguishes iters from streams (used in Scheme) and cursors (used in C++), which do not have a bounded lifetime. The benefit of this restriction, like the structured goto, is clearer code and the potential for compiler optimization.

For example, this loop computes the dot product of two arrays, a and b:

loop
  x := sum!(a.elts! * b.elts!)
end

There are three coroutines here:

a.elts! produces the elements of a.
b.elts! produces the elements of b.
sum! accumulates its arguments, returning the current subtotal each time.

When any of these coroutines exits, the entire loop exits.

Iters are defined just like functions, except they make a resumable return using yield, and make a regular return using quit. (Why not return?) Iters are distinguished by an exclamation point suffix.

For example:

elts! : T is  -- successively yield the elements of self
  i ::= 0;
  loop
    while!(i < asize);  -- coroutine which exits if condition is false
     -- res is the implicitly declared message buffer for a coroutine
    res := self[i];
    yield;
    i := i + 1;
  end
end

Absence of reflection

Sather does not support reflection, beyond defining new classes and class types. Objects cannot be viewed as collections and functions cannot be viewed as objects. Tragically, while Sather internally constructs rich data structures to express type relations and type parameterization, none of these structures are visible to the programmer. Why is this an issue? Because it prevents objects from dynamically discovering and utilizing eachother; all interactions must be planned in advance. Since types are not first-class values, I can't give you a type and say "use this for your computations right now." It also prevents creating new types at runtime, to handle unanticipated or dynamic requirements. Finally, it reduces generality and thus increases complexity, because there is this class of special things which are sort of like values and data structures but not quite.

Separate compilation and fragile base classes

Sather's textual substitution semantics for inheritance give rise to the fragile base class problem. Suppose class D includes code from B. In order to compile D, B's code must be available. Furthermore, if B changes, D must be rebuilt. Hence, B is fragile; it needs careful handling to prevent problems in D. Pure virtual base classes, such as in Python and Self, avoid this problem. An alternative to inheritance, which also avoids this problem, is forwarding: D contains an instance of B which it sends messages. The code for B, since it remains behind an abstraction barrier, is not needed to compile D.

PLE Home page

Last modified: Sat Feb 8 16:51:52 EST 1997