This work was a class project for MIT course 6.338J: Parallel Scientific Computing, Spring 1998.
There are a lot of idle machines out there on the net. Wouldn't it be nice to be able to take advantage of that wasted CPU time? The goal of this project was to build a prototype system that allows users to submit computational jobs to the network.
The prototype system was built in two parts:
The main issue in building this system was security. If no one will allow jobs to be run on their machine because it will jeopardize their ability to use it, then the efficiency of an implementation won't matter. Security is the stated concern regarding this sort of activity on Athena machines.
A second major issue is portability of code. If you don't even know what host your code is going to be run on, then you probably don't know what kind of machine it is either. Keeping binaries for every conceivable architecture is not only inconvenient, but also requires access to each architecture for compilation and possibly even architecture-dependent code development.
The main reason to submit jobs to the network is the hope that by doing so you'll get an answer sooner than you would if you ran the job locally, so efficiency is obviously important.
The network is a big place, without central management. Machines, or whole sections of the net, can drop out of sight for a variety of reasons. Algorithm robustness in this environment is a very hard problem.
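One small piece of the robustness problem, a host dropping out mid-job, can be illustrated with a failover loop. This is only a minimal sketch in present-day Java, not code from the prototype: the class `FailoverSketch`, the method `runWithFailover`, and the use of a `Callable` to stand in for "submit the job to one remote host" are all hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical sketch: try a job on each candidate host until one returns a result. */
public class FailoverSketch {

    /**
     * Runs the job against each candidate in turn and returns the first result.
     * Each Callable stands in for submitting the job to one remote host (an
     * assumption for this sketch); an exception models that host being unreachable.
     */
    public static <T> T runWithFailover(List<Callable<T>> candidates) throws Exception {
        Exception last = null;
        for (Callable<T> attempt : candidates) {
            try {
                return attempt.call();
            } catch (Exception e) {
                last = e; // this host failed or vanished: fall through to the next one
            }
        }
        // Every candidate failed (or the list was empty); report the last failure seen.
        throw new Exception("no host completed the job", last);
    }

    public static void main(String[] args) throws Exception {
        // Simulated hosts: the first has dropped off the net, the second answers.
        Callable<Integer> deadHost = () -> { throw new IOException("host unreachable"); };
        Callable<Integer> liveHost = () -> 6 * 7;
        System.out.println(runWithFailover(List.of(deadHost, liveHost)));
    }
}
```

Retrying elsewhere only covers jobs that can safely be restarted from scratch. Partial results, hosts that return wrong answers, and whole network segments disappearing are the harder parts of the problem.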
We chose Java as the implementation language for several reasons:
Along with poor efficiency, the lack of good development environments for Java is also an issue. As the Java language matures, good development tools will become more widely available. In the near term, the large array of supplied classes and the strong object-oriented nature of the language offset this problem by making large project development relatively easy (compared to C or C++).
This system for remote computation is viable, and will become more viable as the Java language, compilers, and VMs mature. Designing algorithms that use this sort of distributed, loosely organized collection of processors efficiently and robustly is a very hard problem worthy of serious research.
Scott and I presented our final project at the end of Spring term 1998. A little over two months later, Sun put up a web page describing the current state of a project they had begun in 1994 called Jini.
The Jini pages spend a lot of time selling the idea that networks can now self-configure. The idea is that appliances on the net provide other devices with portable Java code to use as device drivers, and that this all happens in a dynamic, self-configuring way. They call this idea Spontaneous Networking.
Spontaneous Networking is certainly a powerful concept, but there's a lot more there. If you look deeper into the technical information, you'll find that exactly the kind of distributed computation we explored in this project seems to be implicitly supported. Very cool!
Watch for the free source-code download opportunities for researchers and enthusiasts, available soon.
Don't miss the Top500 Supercomputer Graph-O-Matic on this same server! Coded up after the first lecture to give you a feel for trends in supercomputing.