PLE lecture notes -- Python

Python is one of the most advanced "scripting" languages, which makes it of great practical as well as theoretical interest. This overview covers version 1.2.

Focal points

Rich value domain
Simple, but powerful form of object-orientation
Reflection
Elimination of pointer variables

Values

Basic values in Python:

Number

2 (integer), 2L (long integer), 2.5 (float)

String

e.g. "doesn't" or 'doesn\'t'

Fixed-size array ("tuple")

x = (1, 2, (3, 3.5), "four")

x[1] is 2, x[2][0] is 3.

Variable-length array ("list")

x = ["spam", 100, 12.34, ["car", "cdr"]]

x[1] is 100, x[3][1] is "cdr".

Variable-length associative array ("table")

x = {"name":"joe", "address":(16, "bit", "pkwy"), "salary":65535}

x["name"] is "joe", x["address"][1] is "bit".

Note that

"hello"[2],
("h","e","l","l","o")[2],
["h","e","l","l","o"][2], and
{0:"h", 1:"e", 2:"l", 3:"l", 4:"o"}[2]

are all equal to "l".
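
The claim is easy to check at the interpreter. The following is a minimal sketch, written for a current Python 3 interpreter rather than version 1.2 (an assumption made so that it can be run today); the variable names are illustrative only.

```python
# One indexing syntax, four different container types.
as_string = "hello"
as_tuple = ("h", "e", "l", "l", "o")
as_list = ["h", "e", "l", "l", "o"]
as_table = {0: "h", 1: "e", 2: "l", 3: "l", 4: "o"}

# Apply the same expression, container[2], to each of them.
results = [container[2] for container in (as_string, as_tuple, as_list, as_table)]
print(results)  # ['l', 'l', 'l', 'l']
```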

Functions

add = lambda x,y: x+y

add(1,2) is 3.

Python is dynamically typed, and generally doesn't restrict what values you bind to a name. Data structure components and function arguments are not typechecked. Function arguments are passed by value, where the value passed is always a reference to an object: a function can modify a list it receives, but it cannot rebind the caller's variable. Free variables are lexically scoped, and functions always return exactly one value (which may be the special value None).
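
A short sketch may make these rules concrete. It assumes a current Python 3 interpreter (not 1.2), and the function names rebind and mutate are hypothetical, chosen only for illustration. It shows a name bound to values of several types, what argument passing means when every value is a reference to an object, and the implicit None result.

```python
# A name can be rebound to values of different types; nothing is declared.
thing = 42
thing = "forty-two"
thing = [4, 2]

def rebind(seq):
    # Rebinding the parameter only changes the local name.
    seq = ["replaced"]
    return seq

def mutate(seq):
    # Mutating the object is visible through every name bound to it.
    seq.append("appended")
    # No return statement: the call evaluates to None.

original = [1, 2]
rebind(original)
after_rebind = list(original)   # [1, 2]: the caller's binding is untouched
result = mutate(original)       # None
after_mutate = list(original)   # [1, 2, 'appended']
```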

Orthogonality

In the interest of orthogonality, arithmetic operators have meaning for all of the sequence data types: ("b" + "an"*2 + "a") is "banana". It might be nice if tables could be merged with "+", filtered with "-", etc., but that was not to be. Python does not allow redefining the behavior of the basic types.
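
As a rough illustration (run under a current Python 3 interpreter, which is an assumption; the variable names are made up), the same two operators work on strings, lists and tuples, while "+" on tables is rejected:

```python
# "+" concatenates and "*" repeats, for every sequence type.
word = "b" + "an" * 2 + "a"      # 'banana'
numbers = [1, 2] + [3] * 2       # [1, 2, 3, 3]
pair = (0,) * 2 + (1,)           # (0, 0, 1)

# Tables (dictionaries) do not take part: "+" is not defined for them.
try:
    merged = {"a": 1} + {"b": 2}
except TypeError:
    merged = None
```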

Syntax

Syntax is reminiscent of C, but variables do not need to be declared (and they aren't typechecked, either). Here is factorial (indentation is mandatory):

def fact_rec(n):
	if n < 2:
		return 1L
	else:
		return n * fact_rec(n-1)

def fact_iter(n):
	a = 1L
	for i in range(2, n+1):
		a = a * i
	return a

(Like Scheme, Python provides a shorthand (def) for creating a function value and binding it to a name in one step. This name, like all names, can be rebound, possibly to a value of a different data type.)
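
The sketch below illustrates the point about def and rebinding. It assumes a current Python 3 interpreter, where integers grow to arbitrary size automatically and the 1L suffix no longer exists; the name factorial is introduced only for this example.

```python
def fact_iter(n):
    a = 1
    for i in range(2, n + 1):
        a = a * i
    return a

# The function is an ordinary value, so it can be bound to a second name.
factorial = fact_iter
twenty = factorial(20)       # 2432902008176640000

# The original name can be rebound, even to a value of a different type.
fact_iter = "no longer a function"
still_works = factorial(5)   # 120: the function value itself is unaffected
```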

Unpacking

Python supports a simple form of pattern matching called unpacking, for multiple assignment:

a, b              = b, a           (unpack tuple)
(left, right)     = (red, blue)    (unpack tuple, parentheses are optional)
[one, two, three] = [1, 2, 3]      (unpack list)

We will see more complex forms of this when we get to Haskell.
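
A few more cases, sketched for a current Python 3 interpreter (an assumption; the names are illustrative): patterns may nest to mirror the shape of the value, and a shape mismatch is only detected at run time.

```python
a, b = 1, 2
a, b = b, a                      # swap: a is now 2, b is now 1

# Patterns may nest, mirroring the shape of the value on the right.
name, (number, street) = ("joe", (16, "pkwy"))

# The shapes must agree, or the assignment fails at run time.
try:
    one, two = [1, 2, 3]
    mismatch = False
except ValueError:
    mismatch = True
```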

Exceptions

Exceptions, as in CLU, allow functions to return "exceptional" values; i.e. something besides what they normally return. For example, we can check for negative inputs to factorial:

negfact = "factorial of negative number"
def fact_exc(n):
	if n < 0:
		raise negfact, n
	elif n < 2:
		return 1
	else:
		return n * fact_exc(n-1)

try: 
	print fact_exc(-5)
except negfact, n:
	print fact_exc(-n)
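
The example above raises a string, which current Python versions no longer accept. As a hedged sketch of the same control flow under Python 3 (an assumption; this is not the notation of version 1.2), the exception can be a small class. NegativeFactorial is a hypothetical name standing in for negfact.

```python
class NegativeFactorial(Exception):
    """Hypothetical exception class standing in for the negfact string."""

def fact_exc(n):
    if n < 0:
        raise NegativeFactorial(n)
    elif n < 2:
        return 1
    else:
        return n * fact_exc(n - 1)

try:
    answer = fact_exc(-5)
except NegativeFactorial as exc:
    # exc.args[0] carries the offending argument, like n in the original.
    answer = fact_exc(-exc.args[0])

print(answer)  # 120
```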

Object-orientation

We will first explore how object-orientation could have been done in Python, followed by how it actually is done.

Tables

Because Python includes tables as a built-in data type, we can already make packages of named variables and functions, i.e. data abstractions. For example, a complex number abstraction:

from copy import copy
def c_add(x,y):
	z = copy(complex)
	z["real"] = x["real"] + y["real"]
	z["imag"] = x["imag"] + y["imag"]
	return z
complex = {"real":0, "imag":0, "add": c_add}
x = copy(complex)
x["real"] = 5
x["imag"] = 3
y = copy(complex)
y["real"] = 2
y["imag"] = -2
z = x["add"](x,y)  # or y["add"](x,y)
z["real"]          # prints 7
z["imag"]          # prints 1

We're using a prototype-based style here, where the notion of complex number is expressed by a prototypical value that we make copies of.

Classes

Unfortunately, this syntax is rather cumbersome, and furthermore tables do not support inheritance. Therefore Python introduced a new data type, the class (though it actually resembles an object in C++, not a class). A class is like a table, except that it uses a shorter, more familiar access syntax (x.real instead of x["real"]) and can inherit from other classes. The syntax for defining a class also differs significantly from a table expression. Here is how the same example might appear using a class:

class complex:
	real = 0
	imag = 0
	def add(x,y):
		z = copy(complex)
		z.real = x.real + y.real
		z.imag = x.imag + y.imag
		return z
x = copy(complex)
x.real = 5
x.imag = 3
y = copy(complex)
y.real = 2
y.imag = -2
z = x.__dict__["add"](x,y)  # x.add(x,y) 
z.real                      # prints 7
z.imag                      # prints 1

The caveat here is that Python does not allow us to call x.add directly, so we must use a kludge. Python does, however, allow us to embed the definition of add inside the class. Here is how it might look otherwise, using an external function as before.

Syntactical differences aside, these two complex-number abstractions work in exactly the same way. In fact, Python allows us to blur the difference by treating a class like a table (but not vice-versa), via the __dict__ attribute or vars() function:

x = copy(complex)
x.real = 5
x.imag = 3
x.__dict__["real"]	# prints 5
vars(x)["imag"]		# prints 3

(This is also a handy way to print an object without defining a special print method.)

Instances

So why doesn't Python let us call x.add? Because it wants us to use another special data type, the class instance. A class instance is almost a class, but not quite. We make one by calling the class as a function. Here is the "official" Python way to define complex numbers:

class complex:
	real = 0
	imag = 0
	def add(x,y):
		z = complex()
		z.real = x.real + y.real
		z.imag = x.imag + y.imag
		return z
x = complex()	# now x is an instance, not a copy
x.real = 5
x.imag = 3
y = complex()
y.real = 2
y.imag = -2
z = x.add(y)	# yay!
z.real		# prints 7
z.imag		# prints 1

Now we can call x.add directly. In doing so, Python automatically passes the instance (x in this case) as the first argument. This is a pinch of syntactic sugar, only allowed with class instances.
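
To see that the sugar is nothing more than an implicit first argument, compare the two calls in this sketch. It assumes a current Python 3 interpreter, and the class is renamed Complex (a hypothetical name) to avoid shadowing the built-in complex type found in current versions.

```python
class Complex:
    real = 0
    imag = 0
    def add(self, other):
        z = Complex()
        z.real = self.real + other.real
        z.imag = self.imag + other.imag
        return z

x = Complex()
x.real, x.imag = 5, 3
y = Complex()
y.real, y.imag = 2, -2

sugared = x.add(y)             # the instance x is passed implicitly
explicit = Complex.add(x, y)   # the same call with the instance spelled out
```

Both calls produce a value with real part 7 and imaginary part 1.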

Other differences between classes and class instances:

A class instance retains a tie to its "mother" class, which allows it to (sometimes) respond to changes in the class. We will explore the details of this in the exercises.
A class instance cannot be instantiated by calling it as a function. That is, there is no "class instance instance" data type.
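
Both differences can be observed in a small experiment. This is only a sketch under the assumption of a current Python 3 interpreter; Account and rate are hypothetical names.

```python
class Account:
    rate = 0.01

a = Account()
b = Account()
b.rate = 0.05        # creates an instance attribute that shadows the class's

# Changing the class is seen by instances that have not shadowed the name.
Account.rate = 0.02
seen_by_a = a.rate   # 0.02: looked up in the class
seen_by_b = b.rate   # 0.05: found in the instance first

# An instance cannot itself be instantiated.
try:
    a()
    callable_instance = True
except TypeError:
    callable_instance = False
```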

Now that we have classes, do we need tables anymore? Unfortunately we do, since only tables can be indexed by an arbitrary data type, not just single words. The class instance concept, as we just saw, also has significant overlap with the class concept. What does this three-way overlap say about Python's generality? Do you think Python could have instead just introduced an additional, abbreviated syntax for regular tables, and allowed them to inherit from other tables? What would be the ramifications? Would it make programs more confusing? More powerful? We will return to this idea when we tackle Self.

Derived classes

Python supports inheritance, including multiple inheritance. In the following example, the pos_complex class is derived from a single base class, complex, and ensures that the real part is non-negative. The example also demonstrates the use of an explicit constructor function (called __init__ in Python).

class complex:
	def __init__(self,re,im):
		self.re = re    # member variables don't have to be declared
		self.im = im
	def add(self, b):
		return complex(self.re + b.re, self.im + b.im)

class pos_complex(complex):
	def __init__(self,re,im):
		if re >= 0:
			return complex.__init__(self,re,im)
		else:
			raise "NegativeRealPart"

Note, however, that adding two pos_complex numbers will return a complex number (not a pos_complex number), since the pos_complex class uses the add method from the base class. We'll ask you to fix this in the exercises.
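
The behaviour is easy to reproduce. The sketch below assumes a current Python 3 interpreter, so the classes are renamed Complex and PosComplex (hypothetical names that avoid the built-in complex type), and the string exception is replaced by the standard ValueError. It only demonstrates the problem and leaves the fix to the exercise.

```python
class Complex:
    def __init__(self, re, im):
        self.re = re
        self.im = im
    def add(self, b):
        # The result type is hard-wired to the base class.
        return Complex(self.re + b.re, self.im + b.im)

class PosComplex(Complex):
    def __init__(self, re, im):
        if re < 0:
            raise ValueError("negative real part")
        Complex.__init__(self, re, im)

total = PosComplex(1, 2).add(PosComplex(3, 4))
print(type(total).__name__)   # Complex
```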

Modules

Python allows a class definition to be saved to disk in the form of a module, where it can be loaded when needed. For example, Python has random-number functions in a module called rand, stored in Lib/rand.py (take a look). A module definition differs slightly from a class definition, in that no "class" header is required; the whole file is assumed to define a class whose name is the filename. We can load the class rand with import rand. You can use dir(rand) to see what names it defines. (dir() is a class operator similar to vars().) We see that the rand module defines three methods and includes another module, whrandom. For example, we can call choice to select a random element from an array:

rand.choice((1,2,3))       (prints 2)
rand.choice((1,2,3))       (prints 1)

We can also call the random function in the whrandom module to get random numbers:

rand.whrandom.random()     (prints 0.252201655156)
rand.whrandom.random()     (prints 0.404101825986)

The fine print: a module actually differs from a class, in that it cannot be instantiated. But a module isn't a table or class instance, either. (Sigh.)
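
The rand and whrandom modules are not part of current Python distributions, so the sketch below uses the standard random module instead (an assumption, as is the use of a Python 3 interpreter). It shows a module being inspected like a table of names, and failing when we try to instantiate it.

```python
import random
import types

names = dir(random)                        # the names the module defines
has_choice = "choice" in names
same_function = vars(random)["choice"] is random.choice
pick = random.choice((1, 2, 3))            # one of 1, 2 or 3

is_module = isinstance(random, types.ModuleType)

# Unlike a class, a module cannot be called to make an instance.
try:
    random()
    instantiable = True
except TypeError:
    instantiable = False
```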

Reflection

Just as objects can be viewed as tables, in Python we can view the current namespace as a table. We just leave out the argument to vars():

def f(x):
	return x*2
v = 3
self = vars()
self["f"](self["v"])    (prints 6)
self["v"] = 4
v                       (prints 4)

Print the value of self to see the names you have defined in the interpreter.

Since vars() is an operator on classes, it appears that the environment is itself a class. Strangely, there seems to be no way to refer to this class directly.
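
The same table view of a namespace can be reproduced in a way that does not depend on where the code is run, by handing the interpreter an explicit table to use as the environment. This is a sketch for a current Python 3 interpreter (an assumption); the names source and namespace are illustrative.

```python
source = """
def f(x):
    return x * 2
v = 3
"""

# Run the definitions with an ordinary table as their global namespace.
namespace = {}
exec(source, namespace)

doubled = namespace["f"](namespace["v"])   # 6
namespace["v"] = 4                         # assign through the table...
seen = eval("v", namespace)                # ...and the name v now yields 4
```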

Python allows user-defined classes to provide methods for almost all language operators, including procedure call. However, since these methods cannot be changed for the built-in types, this feature is more like orthogonality than reflection.
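
As a hedged illustration under a current Python 3 interpreter (an assumption), here is a user-defined class that supplies its own meaning for "+" and for procedure call, followed by a failed attempt to do the same for a built-in type. Vec is a hypothetical class invented for this sketch.

```python
class Vec:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __add__(self, other):
        # Invoked for the expression: vec + other
        return Vec(self.x + other.x, self.y + other.y)
    def __call__(self, scale):
        # Invoked for the expression: vec(scale)
        return Vec(self.x * scale, self.y * scale)

v = Vec(1, 2) + Vec(3, 4)
w = v(10)

# The built-in types cannot be given new operator methods.
try:
    int.__add__ = lambda a, b: 0
    builtin_patched = True
except TypeError:
    builtin_patched = False
```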

Elimination of pointer variables

This idea probably originated in Lisp, but Python uses it, and it is an important idea that many language designers overlook. The idea is to eliminate explicit pointers by making every variable an implicit pointer. Thus a variable does not represent a location in which data is stored, as in C; variables represent name tags. Assignments do not copy data; they just move a name tag from one piece of data to another.

Thus instead of C's pointer-laden definition of a binary tree:

struct BinaryTree {
	int value;
	struct BinaryTree *left, *right;
};

we just define an environment with three name tags, waiting to be pinned onto something:

class BinaryTree:
	value = None
	left = None
	right = None
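
To show the name tags in use, here is a small sketch (assuming a current Python 3 interpreter) that builds a tree and walks it without any pointer syntax. The helpers node and total are hypothetical and not part of the original example.

```python
class BinaryTree:
    value = None
    left = None
    right = None

def node(value, left=None, right=None):
    # Hypothetical helper: pin the three name tags onto actual values.
    t = BinaryTree()
    t.value, t.left, t.right = value, left, right
    return t

def total(tree):
    # None plays the role of C's NULL pointer.
    if tree is None:
        return 0
    return tree.value + total(tree.left) + total(tree.right)

leaf = node(3)
root = node(1, node(2), leaf)

alias = root.right    # no copy is made: a second name tag on the same node
alias.value = 30      # so the change is visible through leaf and root too
```

After the last line, total(root) is 33, because alias, leaf and root.right all name the same object.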

Quirk: Local vs. global variables

Variables in a function by default refer to global names, but if a variable is assigned to anywhere in the function, it becomes local. (This allows the allocation of locals to be determined at parse-time, without declaration.) If you want to assign to a global variable, you must use a global declaration. This is an instruction to the parser that applies to all of the references and assignments of that variable which come syntactically after the declaration. Once made, there is no way to "undo" a global declaration in the body of the function. Examples:

y=0
def f():
	print y  # prints 0

def f2():
	y = 2    # set local y
	print y  # prints 2; global y is not modified

# but if you swap the order, gives an error
def f3():
	print y   # error: y used before defined
	y = 2     # makes Python think y is local (future affects past!)

def f4():
	y = 1     # set local y
	global y  # declare y to be global
	print y   # prints 0 (not 1)
	y = 2     # change global value of y to 2

def f5():
	print y   # this produces an error, because the global
	          # statement hasn't taken effect yet
	y = 2     # this causes Python to think y is local
	global y  # until this declaration
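
For comparison, here is a sketch of how the same quirk shows up in a current Python 3 interpreter (an assumption; the behaviour of version 1.2 is as described above). The function names are hypothetical. Reading a name before assigning to it in the same function still fails, while a global declaration placed after a use of the name, as in f4 and f5, is rejected outright when the function is compiled in recent Python 3 versions.

```python
y = 0

def reads_global():
    return y            # no assignment in this function, so y is global

def shadows_global():
    y = 2               # the assignment makes y local to this function
    return y

def reads_before_assignment():
    try:
        return y        # y is local here, but not yet bound
    except UnboundLocalError:
        return "error"
    y = 2               # never reached, yet it still makes y local

def writes_global():
    global y            # declared before any use of y
    y = 5

# A global declaration after a use of the name no longer compiles.
try:
    compile("def f4():\n    y = 1\n    global y\n", "<sketch>", "exec")
    late_global_allowed = True
except SyntaxError:
    late_global_allowed = False

first = reads_global()               # 0
second = shadows_global()            # 2, and the global y is still 0
third = reads_before_assignment()    # 'error'
writes_global()                      # the global y is now 5
```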

Last modified: Wed Jan 8 22:42:29 EST 1997