PLE lecture notes -- Self

A successor to Smalltalk, Self takes object-orientation to new heights of generality.

From the notes for a Self seminar at Brown University (cs196b):

Self is a derivative of Smalltalk. Its most striking difference is the lack of classes: everything is an object, objects inherit from each other, and every action is a message send.
Self also lacks variables, scope, and structured flow-of-control statements, yet its semantics are powerful enough to provide the equivalent of each with comparable syntax.

Focal points

Small value domain
Uniform message-passing syntax
Object-orientation via prototypes and traits
Reflection with mirrors

Values

Basic values in Self:

Number

2 (30-bit signed integer), 2.5 (float)

String

e.g. 'doesn\'t'

Heap-allocated object

has the form:

(|slots| code)

where both the slots and the code are optional. For example, this is an object which evaluates to 3:

(1+2)

However, the object is not 3, only its value is. There are several kinds of slots an object can have:

Read-only slots are used to store constants and methods. Declared with an equals sign.
Read-write slots are used for variables. Declared with a left arrow and an initial value, which may be nil.
Parent slots are used for inheritance. If a slot lookup fails on the object, the parent objects stored in these slots are then scanned for the slot, in order. Declared with either the read-only or read-write syntax, using a name with a star suffix.
Keyword slots are used to store methods that take arguments. Declared with the read-only syntax, using a name where colons appear in the argument positions.
Argument slots are used for (you guessed it) passing arguments to the object via a method call. Declared with a colon prefixing a name.

For example, this is an object with a constant x, a variable y initialized to 4, an uninitialized variable z, a parent slot p, and no code:

(| x = 3. y <- 4. z <- nil. p* = lobby |)

The dot must be used to separate slots.
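The lookup rule for parent slots can be modeled outside Self. The following Python sketch is only an analogy: SelfObject, its slots dictionary, and the lookup method are hypothetical names invented for illustration, and the real Self lookup rules have more detail than the "own slots first, then parents in order" scan described above.

```python
class SelfObject:
    """Hypothetical model of a Self object: named slots plus ordered parents."""

    def __init__(self, slots=None, parents=None):
        self.slots = dict(slots or {})      # slot name -> stored value
        self.parents = list(parents or [])  # parent objects, in declaration order

    def lookup(self, name):
        # Try the object's own slots first.
        if name in self.slots:
            return self.slots[name]
        # On failure, scan the parents in order.
        for parent in self.parents:
            try:
                return parent.lookup(name)
            except KeyError:
                continue
        raise KeyError(name)


# Rough counterpart of (| x = 3. y <- 4. z <- nil. p* = lobby |)
lobby = SelfObject({"greeting": "hello"})
obj = SelfObject({"x": 3, "y": 4, "z": None}, parents=[lobby])

own = obj.lookup("x")               # found in obj itself
inherited = obj.lookup("greeting")  # found by scanning the parent
```

The model ignores the read-only versus read-write distinction; it only shows where a name is searched for.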

This object contains a keyword slot containing the doubling function:

(| double: = (| :arg | arg * 2) |)

The doubling function is simply an object with an argument slot and some code. This kind of declaration is so common that there is syntactic sugar for it (one of the few instances of sugar in Self). You just bring the colons together:

(| double: arg = (arg * 2) |)

An object with a two-argument method can likewise be declared in either form:

(| add:To: = (| :x. :y | x + y) |)

(| add: x To: y = (x+y) |)

Quirk: Slot initializers are not lexically scoped; they are evaluated in the context of a special object called the lobby. This code produces an error:

(| y = 3. f = (| x = y | x + 1) |)

because the parser only searches for the initializer y in the lobby. The only way to get a pseudo-lexically scoped initialization is to assign to the slot at runtime:

(| y = 3. f = (| x <- nil | x: y. x + 1 ) |)

Here f evaluates to 4. (See the exercise for why this is only "pseudo"-lexical scoping.)

Stack-allocated object ("block")

A block corresponds to the notion of a block in C or Scheme, except that in Self a block is really just shorthand for creating an object with a value method that evaluates its code. For example,

(| value = ('hello world' printLine) |)
[ 'hello world' printLine ]

are both objects which print "hello world" when the value message is sent to them. This is a useful shorthand since many control structures, such as if, use this message.

Blocks may have slots, including argument slots. If the block has no argument slots, the translation is

[ code ]               ---->   (| value = ( code ) |)

If the block has one argument, the translation is

[|:arg| code]          ---->   (| value: = (|:arg| code) |)

For two arguments:

[|:arg1. :arg2| code]  ---->   (| value:With: = (|:arg1. :arg2| code) |)

However, blocks also inherit from the block in which they are defined, i.e. they are lexically scoped, unlike heap-allocated objects. Blocks are stack-allocated, which means that they disappear after the enclosing block has returned. Therefore they cannot be used as return values.
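As a rough analogy, a block behaves like a small object that wraps some code and runs it when sent value. The Python sketch below assumes a hypothetical Block class (not part of Self or Python) and uses a lambda for the code. One difference should be kept in mind: a Python closure stays valid after the enclosing function returns, whereas the notes above say a Self block does not.

```python
class Block:
    """Hypothetical stand-in for a Self block: wraps code, runs it on value()."""

    def __init__(self, code):
        self.code = code

    def value(self, *args):
        # Covers value, value:, and value:With: by passing the arguments through.
        return self.code(*args)


def enclosing_method():
    total = 10
    # The block sees `total` because it is lexically scoped.
    add_to_total = Block(lambda arg: total + arg)
    return add_to_total.value(5)


greeting = Block(lambda: "hello world")
pair_sum = Block(lambda arg1, arg2: arg1 + arg2)
```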

Messages

The basic syntactical unit is the message send, corresponding to a function call in other languages. The syntax is

object message

i.e. an expression evaluating to an object, followed by a space, followed by a message expression. (Other object-oriented languages use a dot instead of a space.) There are three kinds of message expressions:

A unary message is used to evaluate the contents of a slot. The message is just the name of that slot. So
```
(| x = 3 |) x
```
evaluates to 3. This operation does not simply return the object in a slot; it evaluates that object. For example:
```
(| x = ('hello' printLine) |) x    "prints hello"
```
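The philosophy behind this is behaviorism: an object is known only by how it responds to messages, so the sender cannot tell whether x is stored data or a computed result.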
A keyword message is used for methods bound to keyword slots, i.e. methods which take arguments. The arguments are placed after the colons in the slot name, just as in the definition of the method. For example, to send a times:Plus: message we use times: 2 Plus: 3. This makes Self read somewhat like English, as opposed to timesPlus(2, 3). The answer, i.e. the resulting value, depends on which object received the message. The spaces and the capitalization are mandatory. The parser uses capitalization to disambiguate nested expressions like
```
times: 2 Plus: [|:a| a] value: 3
```
into
```
times: 2 Plus: ([|:a| a] value: 3)
```
i.e. not the three-argument message times:Plus:value: (which is an illegal message name, because the 'v' is lowercase). This expression, by the way, is the same as times: 2 Plus: 3.
A binary message is shorthand for a keyword send with only one argument. It omits the colon. It can only be used with operators like + and *. So instead of writing x +: y we write x + y. Thus we see that Self interprets arithmetic as a message send to the left operand.
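Python offers a loose parallel to the idea that arithmetic is a message send to the left operand: the expression a + b is dispatched to the left operand's __add__ method. The sketch below uses a hypothetical Money class to show the operator form and the explicit method call producing the same result. It illustrates the idea only and says nothing about how Self implements binary messages.

```python
class Money:
    """Hypothetical value type whose + is handled by the left operand."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Called for `self + other`; the left operand receives the "message".
        return Money(self.cents + other.cents)


via_operator = Money(150) + Money(250)
via_method = Money(150).__add__(Money(250))
int_sum = (3).__add__(4)  # built-in integers follow the same protocol
```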

To modify a read-write slot, use a keyword message with the new value as argument. Thus there is no distinction between setting a variable and calling a method. For example,

(| x <- 3 |) x: 4

writes 4 into the x slot. The answer to the message is the receiver of the message, so we can cascade assignments:

((| x <- 3. y <- 4 |) x: 4) y: 3

writes 4 into the x slot and 3 into the y slot. If the parentheses are dropped, we get an error, because the statement is interpreted as:

(| x <- 3. y <- 4 |) x: (4 y: 3)    "error"

but 4 does not have a y slot. In general, use parentheses liberally, to clarify who is receiving a message.
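The cascading behavior depends only on the assignment message answering its receiver. A minimal Python sketch of the same convention follows; Point, set_x, and set_y are hypothetical names chosen for illustration.

```python
class Point:
    """Hypothetical object whose setters answer the receiver, as in Self."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def set_x(self, value):
        self.x = value
        return self  # answering the receiver is what makes cascading work

    def set_y(self, value):
        self.y = value
        return self


# Counterpart of ((| x <- 3. y <- 4 |) x: 4) y: 3
p = Point(3, 4).set_x(4).set_y(3)

# Counterpart of the mis-parsed form: the integer 4 has no such slot.
four_understands_set_y = hasattr(4, "set_y")
```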

To add a slot to an object at runtime, use the _AddSlots: message. The argument is an object whose slots will be copied into the receiver. For example,

(| x = 3 |) _AddSlots: (| y = 4 |)

returns the object (| x = 3. y = 4 |).

The interpreter is a bona-fide object, called the shell, so to bind a name in the interpreter you add a slot to this object.

Conditionals

There are two unique Boolean objects, true and false, which respond to the message ifTrue:False:. The arguments to this message are blocks, exactly one of which is evaluated. For example, true implements ifTrue:False: as

ifTrue: b1 False: b2 = ( b1 value )

i.e. it always evaluates the first block. The Boolean objects are returned by comparison operators, allowing the Scheme absolute-value expression

(if (< x 0) (negate x) x)

to be written as

(x < 0) ifTrue: [ x negate ] False: [ x ]

A more complex example is factorial:

| fact = ((self < 2) ifTrue: [ 1 ] False: [ self * ((self - 1) fact)]) |
5 fact         "prints 120"

(Remember, to define this function in the Self interpreter, you must surround it with lobby _AddSlots: (| ... |).)
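The way conditionals fall out of ordinary message dispatch can be modeled in Python. In this sketch, TrueObject, FalseObject, if_true_false, and less_than are hypothetical names, and zero-argument lambdas stand in for blocks. The helper less_than uses Python's own conditional internally, so only the dispatch on the two Boolean objects mirrors the description above.

```python
class TrueObject:
    def if_true_false(self, b1, b2):
        return b1()  # always evaluates the first block


class FalseObject:
    def if_true_false(self, b1, b2):
        return b2()  # always evaluates the second block


TRUE, FALSE = TrueObject(), FalseObject()


def less_than(a, b):
    # Stand-in for a comparison operator that answers a Boolean object.
    return TRUE if a < b else FALSE


def absolute(x):
    return less_than(x, 0).if_true_false(lambda: -x, lambda: x)


def fact(n):
    return less_than(n, 2).if_true_false(lambda: 1, lambda: n * fact(n - 1))
```

Because each branch is wrapped in a lambda, the branch that is not chosen is never run, which is the role blocks play in the Self version.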

Object-orientation

Unlike Python, Self has no notion of a "class". Everything is an object; objects inherit from other objects, via parent slots. Conceptually, however, it is useful to distinguish objects which are the parents of many other objects, regardless of whether the language makes a syntactic distinction. In fact, the Self standard library makes use of several object-oriented design patterns:

Traits: A traits object defines methods for the objects which inherit from it. In Self, the sequence traits object defines the minimum behavior that all sequences must have, e.g. do: and at:.
Mixin: A mixin object is like a traits object but provides a much smaller set of methods. It is intended to be inherited in addition to some traits object. For example, the ordered mixin provides comparison operations. We don't normally make an object which is simply "ordered," though we may have an "ordered sequence".
Prototype: A prototype object is not inherited from, but cloned to make new objects. It is used to define an initial set of slots. Many object-oriented languages, e.g. C++, combine prototype and traits into a special construct called a "class." This is unnecessary and less powerful.
Oddballs: An oddball is a unique object, such as true or nil. Such objects do not need a class, since there is only one instance. Unlike class-based languages, Self allows us to create an oddball without any extra baggage.
Categorized globals: A global namespace object is inherited by all objects. This namespace is a composite of (i.e. a collection of parent slots to) namespaces for the different categories of globals: traits, mixins, prototypes, applications, etc. Thus we can selectively remove related names. For example, we can remove all objects which allow "unsafe" operations, to achieve a security effect similar to that of Safe-Tcl and Safe-Python.

Dynamic inheritance

Self allows inheritance links to change at runtime, since a parent slot can be read-write. This is a good example of orthogonality in a language. Self stores the fact that a slot is a parent slot with a bit associated with the slot. This bit can also be changed at runtime, to dynamically shut off or turn on inheritance. Python is also a dynamic interpreted language, but does not allow these kinds of operations.
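The effect of a read-write parent slot can be illustrated with a small model. The Python sketch below is an analogy under stated assumptions: Obj, its parent attribute, and send are hypothetical, and a single parent stands in for Self's parent slots. Reassigning parent changes which behavior the object inherits, and setting it to None plays the role of shutting inheritance off.

```python
class Obj:
    """Hypothetical object with named slots and one reassignable parent."""

    def __init__(self, **slots):
        self.slots = slots
        self.parent = None  # models a read-write parent slot

    def send(self, name):
        obj = self
        while obj is not None:  # walk up the parent chain
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.parent
        raise AttributeError(name)


walking = Obj(move="walk")
swimming = Obj(move="swim")
duck = Obj(name="duck")

duck.parent = walking
on_land = duck.send("move")

duck.parent = swimming  # the inheritance link changes at runtime
in_water = duck.send("move")

duck.parent = None      # inheritance switched off
```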

Closures

Self has no lexically scoped, heap-allocated objects corresponding to Scheme's closures. For example, the Scheme make-counter procedure:

(define (make-counter n)
  (lambda () (set! n (+ n 1)) n))

(define c (make-counter 4))
(c)                          ; prints 5
(c)                          ; prints 6

cannot be translated directly into Self. Instead, we must use an explicit prototype counter object, which is cloned, initialized, and returned:

lobby _AddSlots: (| 
  proto_counter = (| n <- nil. value = (n: n + 1. n). 
                     parent* = traits clonable |).
  make_counter = (proto_counter clone n: self) 
|)

_AddSlots: (| c = 4 make_counter |)
c value                             "prints 5"
c value                             "prints 6"

Of course, this is really what is happening behind the scenes in Scheme. In Self we just have to make this mechanism explicit. A benefit is that we can now access n from outside:

c n        "prints 6"
c n: 1
c value    "prints 2"

We can also use polymorphism to treat c as a block:

(1 & 2 & 3) asSequence do: c   "increment the counter 3 times"
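For comparison, the clone-and-initialize pattern used for the counter can be written in Python with copy.copy standing in for clone. This is a sketch under that assumption, not a description of how Self's clone works; the Counter class exists only because Python needs somewhere to put the value method.

```python
import copy


class Counter:
    """Hypothetical prototype: state in n, behavior in value()."""

    def __init__(self):
        self.n = None  # uninitialized, like n <- nil

    def value(self):
        self.n = self.n + 1
        return self.n


proto_counter = Counter()


def make_counter(start):
    c = copy.copy(proto_counter)  # clone the prototype
    c.n = start                   # initialize the clone
    return c


c = make_counter(4)
first = c.value()
second = c.value()
```

As in the Self version, the state is an ordinary attribute, so c.n can be read or reset from outside, and the prototype itself is left untouched.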

Reflection with mirrors

The message _Mirror, when sent to any object, will return a collection of slots called a mirror which can be used to refer to that object. The mirror acts like a table of slots (cf. Python's __dict__). Examples:

View all of the slots defined by the object: _Mirror printAll.
View the slot "x": _Mirror at: 'x'
Change the contents of slot "x" to 3: _Mirror at: 'x' Put: 3
Rename slot "x" in the shell to "y": (shell _Mirror at: 'x') name: 'y'

Since functions in Self are just objects, we can modify them, too:

_AddSlots: (| f = (| x = 3 | x) |)               "f is a function returning 3"
(shell _Mirror at: 'f') contents at: 'x' Put: 4
f                                                "prints 4"
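The comparison with Python's __dict__ can be made concrete. In the sketch below, the built-in vars() returns an instance's attribute dictionary, which can be listed, updated, and re-keyed much like the mirror operations above. Point is a hypothetical class used only for the demonstration, and the parallel is approximate: a Self mirror is a separate reflective object, while vars() exposes the underlying dictionary directly.

```python
class Point:
    """Hypothetical object with two data attributes."""

    def __init__(self):
        self.x = 1
        self.y = 2


p = Point()
table = vars(p)              # the same dictionary as p.__dict__

names = sorted(table)        # view all of the slots
old_x = table["x"]           # view the slot "x"
table["x"] = 3               # change the contents of slot "x"
table["z"] = table.pop("y")  # rename slot "y" to "z"
```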


Last modified: Sat Feb 8 16:33:34 EST 1997