An Automatic C++ to Scheme Interface Generator

Header2Scheme version 1.4 is now available, and contains a few bug fixes and enhancements. The source code is available for download.

The Problem

There are many situations in which a C or C++ program needs more text-based interactivity than simple printf and scanf calls can provide. Unfortunately, creating a full-fledged interpreter is a challenging and time-consuming task. Tcl is one solution that many people choose, but it has several disadvantages, among them that it is slow and stores all data types as character strings. Scheme is a very elegant interpreted language for which many interpreter implementations exist. Until now, however, Scheme bindings have been created for relatively few useful libraries, so acceptance of Scheme as a general interpreted-language solution has been minimal.

One solution

Header2Scheme is a program which reads in a directory tree full of C++ header files and compiles them into C++ code. This new code, when compiled and linked with SCM (Aubrey Jaffer's Scheme interpreter), implements the back end for a Scheme interface to the classes defined by these header files.

Why SCM?

I looked at several Scheme interpreters before deciding on SCM, among them Scheme48 and libScheme. Each had its advantages, but SCM had the distinction of being the fastest of the pack. Since my work focuses mainly on real-time graphics applications, speed made SCM the obvious choice. (However, libScheme had a very elegant, object-oriented programming interface which was very appealing.)

Basic syntax defined by Header2Scheme

Given a C++ class:
	class Foo {
	public:
		int memberFunction();
		Yabba *class_variable;
	};
The Scheme interface becomes:
	(define my-foo (new-foo))
	(define my-int (-> my-foo 'memberfunction))
	(define my-yabba (-> my-foo 'class_variable))
An alternative calling interface is:
	(foo::memberfunction my-foo)
	(foo::class_variable my-foo)

More on the backend

When interfacing C++ and Scheme, there are several issues to take into consideration. I have tried to address all of them in Header2Scheme.

Conversion of data structures

Basic C types often have analogous Scheme data structures. Header2Scheme generates code to convert fixed-length C arrays into Scheme vectors and back (see the section below on pointers and references). Strings have essentially the same representation in Scheme and C, modulo a type tag in Scheme. Floats and ints are likewise similarly represented, and Header2Scheme generates all the necessary conversion code for them.

More sophisticated C data structures (i.e. the ones which Header2Scheme is designed to provide access to) are represented in the Scheme backend as pointers with a type tag.
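
The array conversion described above can be sketched roughly as follows. This is only an illustration, not the code Header2Scheme emits: SchemeValue, make_real, array_to_scheme and scheme_to_array are hypothetical names, and SCM's real cell layout is different.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a tagged Scheme value. SCM's real cell layout is
// different; only the "value plus type tag" idea is modeled here.
struct SchemeValue {
    enum class Tag { Real, Vector };
    Tag tag;
    double real;                        // meaningful when tag == Tag::Real
    std::vector<SchemeValue> elements;  // meaningful when tag == Tag::Vector
};

SchemeValue make_real(double d) {
    return SchemeValue{SchemeValue::Tag::Real, d, {}};
}

// C array -> Scheme vector: each float becomes a tagged real.
template <std::size_t N>
SchemeValue array_to_scheme(const float (&values)[N]) {
    SchemeValue result{SchemeValue::Tag::Vector, 0.0, {}};
    result.elements.reserve(N);
    for (std::size_t i = 0; i < N; ++i) {
        result.elements.push_back(make_real(values[i]));
    }
    assert(result.elements.size() == N);
    return result;
}

// Scheme vector -> C array. A value of the wrong shape raises an error
// instead of writing past the end of the array.
template <std::size_t N>
void scheme_to_array(const SchemeValue& value, float (&out)[N]) {
    if (value.tag != SchemeValue::Tag::Vector || value.elements.size() != N) {
        throw std::invalid_argument("wrong type: expected a vector of matching length");
    }
    for (const SchemeValue& element : value.elements) {
        if (element.tag != SchemeValue::Tag::Real) {
            throw std::invalid_argument("wrong type: expected a real number");
        }
    }
    // All checks passed, so the output array is only written on success.
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = static_cast<float>(value.elements[i].real);
    }
}
```

Passing a float[3] through array_to_scheme and back through scheme_to_array returns the original three values, while a length mismatch is reported as an exception rather than as memory corruption.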

Pointers and references

All "complicated" C structures are represented in the Scheme backend essentially as C pointers with a type tag. On the Scheme side of things, there are no notions of pointers, references or dereferencing; everything is an "object".

When making a function call (see the section below on type checking), pointers are automatically dereferenced as needed.
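
A minimal sketch of the "pointer with a type tag" idea, under stated assumptions: TaggedPointer, TypeName, wrap, unwrap and call_read_value are hypothetical names invented for this illustration, and the tag is modeled as a plain string rather than whatever the generated backend actually stores.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical handle: how a "complicated" C++ object might appear on the
// Scheme side. The generated code's real representation is not shown here.
struct TaggedPointer {
    std::string type_tag;  // names the C++ class the address refers to
    void* address;         // the raw C++ pointer
};

// Two illustrative classes (the names are borrowed from the syntax example).
struct Yabba { int value; };
struct Foo { Yabba* class_variable; };

// Hypothetical compile-time map from a C++ type to its tag string.
template <typename T> struct TypeName;
template <> struct TypeName<Yabba> { static const char* get() { return "Yabba"; } };
template <> struct TypeName<Foo> { static const char* get() { return "Foo"; } };

// Wrap a C++ pointer so that its type travels with it.
template <typename T>
TaggedPointer wrap(T* object) {
    assert(object != nullptr);
    return TaggedPointer{TypeName<T>::get(), object};
}

// Check the tag, then dereference. A mismatch is an error, not a crash.
template <typename T>
T& unwrap(const TaggedPointer& handle) {
    if (handle.type_tag != TypeName<T>::get()) {
        throw std::invalid_argument("wrong type: expected " +
                                    std::string(TypeName<T>::get()));
    }
    return *static_cast<T*>(handle.address);
}

// A C++ function that takes a reference; the wrapper dereferences the
// handle on the caller's behalf, so Scheme code never sees a pointer.
int read_value(const Yabba& y) { return y.value; }

int call_read_value(const TaggedPointer& handle) {
    return read_value(unwrap<Yabba>(handle));
}
```

The caller only ever holds the opaque handle. Whether the underlying C++ function wants a pointer, a reference or a value is decided inside the wrapper, which is where the automatic dereferencing takes place.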

Type checking and overloaded functions

One of the most elegant aspects of Scheme is that there is no static type checking; everything is considered to be "data". Because Header2Scheme is fundamentally dealing with C, however, it necessarily enforces some type checking.

When you call a member function of a class via Scheme, the code that Header2Scheme produced checks the Scheme arguments' types and attempts to find the appropriate (possibly overloaded) C++ function. If one can't be found, the interpreter produces a wrong type error (as opposed to a segmentation fault). Otherwise, the arguments are converted to their C equivalents, the function is called, and the return value, if any, is converted to Scheme format and returned to the interpreter.
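
The dispatch step described above might look something like the following sketch. The Arg struct, the Label class and call_label_set are all hypothetical and exist only for this example; the real generated code works on SCM values, not on this struct.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical tagged argument as it might arrive from the interpreter.
struct Arg {
    enum class Tag { Integer, Real, String };
    Tag tag;
    long integer;      // meaningful when tag == Tag::Integer
    double real;       // meaningful when tag == Tag::Real
    std::string text;  // meaningful when tag == Tag::String
};

Arg make_integer(long n) { return Arg{Arg::Tag::Integer, n, 0.0, ""}; }
Arg make_real(double d) { return Arg{Arg::Tag::Real, 0, d, ""}; }
Arg make_string(const std::string& s) { return Arg{Arg::Tag::String, 0, 0.0, s}; }

// Illustrative C++ class with an overloaded member function.
class Label {
public:
    void set(int n) { text_ = "int:" + std::to_string(n); }
    void set(const std::string& s) { text_ = "string:" + s; }
    const std::string& text() const { return text_; }
private:
    std::string text_;
};

// Generated-style wrapper: inspect the argument tags, pick the matching
// overload, convert the arguments, and call it. If nothing matches, report
// a "wrong type" error instead of calling C++ with bad data.
void call_label_set(Label& object, const std::vector<Arg>& args) {
    if (args.size() == 1 && args[0].tag == Arg::Tag::Integer) {
        object.set(static_cast<int>(args[0].integer));
        return;
    }
    if (args.size() == 1 && args[0].tag == Arg::Tag::String) {
        object.set(args[0].text);
        return;
    }
    throw std::invalid_argument("wrong type in argument to Label::set");
}
```

In this sketch an integer argument selects set(int), a string selects set(const std::string&), and a real number matches neither overload, so the wrapper throws and leaves the object untouched.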

In Scheme, all arguments to functions are passed by value. C++, however, allows pointers and references to variables to be passed in function calls, and many C functions would become unusable if they could not modify their parameters as a side effect. Because of this, the backend that Header2Scheme generates allows side effects to propagate, where possible, to all arguments that are passed by reference (either via a pointer or a reference). (Scheme integers and characters cannot be mutated because they are always passed by value.) Note that this can cause unpleasant and unexpected effects in code:

	class SbVec3f {
	public:
		void getValue(float &x, float &y, float &z);
	};

	> (define x 0.0)
	> (define y 0.0)
	> (define z 0.0)
	> (define my-list (list x y z))
	> my-list
	(0.0 0.0 0.0)
	> (define my-vec (new-SbVec3f 3 4 5))
	> (-> my-vec 'getValue x y z)
	> x
	3.0
	> y
	4.0
	> z
	5.0
	> my-list
	(3.0 4.0 5.0)
If, on the other hand, x had instead been changed with the standard Scheme set!:
	> (set! x 2.0)
	> my-list
	(0.0 0.0 0.0)
my-list is (correctly) not mutated in this example.
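
One way to picture why the two cases differ is the sketch below. It rests on a loud assumption: a Scheme real is modeled as a shared heap-allocated box, which is an illustration of aliasing and not a description of SCM's internals. Box, Vec3, call_get_value and scheme_set are hypothetical names, and Vec3 is a stand-in, not the real Open Inventor class.

```cpp
#include <memory>
#include <vector>

// Assumption for illustration only: a Scheme real is a shared, heap-allocated
// box, so a variable and a list slot can both name the same cell.
using Box = std::shared_ptr<double>;

Box make_box(double value) { return std::make_shared<double>(value); }

// Stand-in for the class in the example (not the real Open Inventor class).
class Vec3 {
public:
    Vec3(float x, float y, float z) : x_(x), y_(y), z_(z) {}
    void getValue(float& x, float& y, float& z) { x = x_; y = y_; z = z_; }
private:
    float x_, y_, z_;
};

// Generated-style wrapper: convert the arguments to C floats, call the C++
// function, then write the results back into the SAME boxes. Every holder
// of those boxes observes the change.
void call_get_value(Vec3& vec, const Box& x, const Box& y, const Box& z) {
    float cx = static_cast<float>(*x);
    float cy = static_cast<float>(*y);
    float cz = static_cast<float>(*z);
    vec.getValue(cx, cy, cz);
    *x = cx;
    *y = cy;
    *z = cz;
}

// Model of (set! variable value): the variable is rebound to a fresh box,
// so other holders of the old box are unaffected.
void scheme_set(Box& variable, double value) { variable = make_box(value); }
```

With x, y and z also stored in a vector of boxes, call_get_value changes what the vector's elements contain, whereas scheme_set on x leaves the vector exactly as it was. That is the same asymmetry the transcript shows.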


Consider the following two classes:
	class a {
	public:
		void foo();
	};

	class b : public a {
	public:
		void bar();
	};
The member function "foo" can be called on an object of type "a" or "b". However, the member function "bar" can only be called on an object of type "b". Standard Scheme has no built-in notion of inheritance, so the question remains of how to decide at run time whether a given member function can correctly be called for a given Scheme object.

Header2Scheme solves this problem by building its own model of the class hierarchy in a preprocessing step, while the backend to the Scheme interface is being generated. This hierarchy is then rebuilt when the Scheme interpreter starts, and is consulted to resolve run-time type-checking questions.
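
A run-time hierarchy lookup of this kind could be sketched as below. ClassHierarchy, add_class and is_a are hypothetical names, and the sketch assumes single inheritance only; how Header2Scheme actually stores the hierarchy or treats multiple base classes is not described here.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical run-time table of "class -> parent class" (single inheritance
// only, as an assumption of this sketch). An empty parent marks a root class.
class ClassHierarchy {
public:
    void add_class(const std::string& name, const std::string& parent = "") {
        parent_of_[name] = parent;
    }

    // True if `type` is `ancestor` itself or derives from it.
    bool is_a(const std::string& type, const std::string& ancestor) const {
        std::string current = type;
        // The step bound guards against an accidentally cyclic table.
        for (std::size_t steps = 0; steps <= parent_of_.size(); ++steps) {
            std::map<std::string, std::string>::const_iterator it =
                parent_of_.find(current);
            if (it == parent_of_.end()) {
                return false;  // unknown class, or walked past a root
            }
            if (current == ancestor) {
                return true;
            }
            current = it->second;  // climb to the parent class
        }
        return false;
    }

private:
    std::map<std::string, std::string> parent_of_;
};
```

After registering "a" as a root and "b" with parent "a", a call to "foo" (declared in "a") on an object tagged "b" passes the check because is_a("b", "a") holds, while a call to "bar" (declared in "b") on an object tagged "a" is rejected because is_a("a", "b") does not.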

What has been done with Header2Scheme?

I have used Header2Scheme to create a Scheme binding for Open Inventor, a 3D graphics toolkit developed by Silicon Graphics. This package is called Ivy, and is available from the Ivy home page.

Changes in version 1.4

Downloading Header2Scheme

h2s-1.4.tar.gz (290K) contains the source code and documentation for Header2Scheme. Header2Scheme is Copyright 1995 Kenneth B. Russell and is covered under the GNU General Public License.

Other links

GUILE is a collaborative effort to make embedded Scheme more ubiquitous.

SWIG is a more up-to-date glue code generator than Header2Scheme which supports multiple language bindings.


Please send me email if you have any comments, questions or suggestions, or have problems with the distribution.

Kenneth B. Russell - kbrussel@media.mit.edu

$Id: index.html,v 1.15 2001/06/02 05:20:33 kbrussel Exp $