Dist-sys-97 Notes for September 26

Homogeneous vs Heterogeneous

The big thing I learned today came out of our conversation about homogeneous vs. heterogeneous systems. We got started by talking about the reading from Mullender's book, the one about homogenizing all the machines online to provide an effective distributed system. This is traditional engineering practice (it simplifies design a lot if everything's the same), but I find it unrealistic and limiting. We talked some about heterogeneous distributed systems.

Heterogeneous Name Spaces

One example we talked about was heterogeneous name spaces. Most name space proposals involve one global name space that's valid everywhere, usually organized hierarchically. This eliminates conflicts: the problem of two people using the same name for different things (although it doesn't handle aliases: two different names for the same thing). It also allows global accessibility. But what about a name space that wasn't globally homogeneous? Unfortunately, we didn't come up with many good arguments in favour of heterogeneous name spaces. I suspect in some cases it might be really difficult to have a full, homogeneous name space for practical reasons - i.e., labelling every atom with a unique, global name is likely to be hard because there are so many atoms. But that's not a very convincing argument. If you do have heterogeneous names, then there are two ways to deal with them. One is to provide some sort of lingua franca that all names can be translated into (in effect, homogenizing all names). Another is to build translators between different name spaces.
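A minimal sketch of those two approaches, under loud assumptions: the two name spaces here ("dns" style dotted names and "path" style slash names), the canonical tuple form, and every function name are invented for illustration and are not from the discussion.

```python
# Approach 1: a lingua franca. Every name space supplies one function into
# a shared canonical form and one function back out of it.
# (Hypothetical name spaces; the canonical form is a root-first tuple.)

def dns_to_canonical(name):
    # "media.mit.edu" -> ("edu", "mit", "media")
    return tuple(reversed(name.split(".")))

def canonical_to_dns(parts):
    return ".".join(reversed(parts))

def path_to_canonical(name):
    # "/edu/mit/media" -> ("edu", "mit", "media")
    return tuple(p for p in name.split("/") if p)

def canonical_to_path(parts):
    return "/" + "/".join(parts)

CODECS = {
    "dns": (dns_to_canonical, canonical_to_dns),
    "path": (path_to_canonical, canonical_to_path),
}

def translate_via_canonical(name, src, dst):
    to_canonical, _ = CODECS[src]
    _, from_canonical = CODECS[dst]
    return from_canonical(to_canonical(name))

# Approach 2: direct translators, one per ordered pair of name spaces.
PAIRWISE = {
    ("dns", "path"): lambda n: "/" + "/".join(reversed(n.split("."))),
    ("path", "dns"): lambda n: ".".join(reversed([p for p in n.split("/") if p])),
}

def translate_pairwise(name, src, dst):
    return PAIRWISE[(src, dst)](name)
```

The trade-off shows up in the bookkeeping: with n name spaces the lingua franca needs 2n conversion functions, while direct translation needs one function for each of the n(n-1) ordered pairs, but it never forces every name through a single common form.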

Heterogeneous Services

I think maybe the name space example was a red herring in the discussion of homogeneous vs. heterogeneous. A better example might be online services. I brought up searching AltaVista vs. Hotbot. In a lot of ways these two services are the same, and servers like Metacrawler homogenize them. But there are subtle differences - AltaVista returns more useful replies, Hotbot is more up to date. And I use those differences effectively - I choose one service or another depending on my needs. The brand name can be useful in distinguishing services; homogenizing those two services misses out on some power.

Ontologies for Describing Services

There are several ways people find a service they need. One is to consult an ontology - I need to buy a bicycle so I go to the Yellow Pages (our friendly ontological source) and look under B. Another is to know about brand names - I want to buy some clothes, so I know to go to Nordstrom's and I just walk there (much like I go to Hotbot when I'm looking for recent web pages). Another way is to look in the neighbourhood - Kwin just walks down to Newbury St. to buy a fancy new coat. And another is to ask friends for suggestions (although this probably requires an ontology to reference).

We observed that a lot of differences between services online can be quantified. For example, Jango deals with the heterogeneity of different online CD stores by laying them out on the "price" axis, a criterion that's easy to manipulate and compare. "Availability", "accuracy", etc. are other simple quantifiable criteria. If a difference in a service can be quantified, then an ontology can probably capture that difference fairly effectively (although you've always got that problem of specifying the ontology). With this sort of ontology available, even a stupid agent can probably make an intelligent choice. But I worry that some differences are ineffable and will never be categorized. Brand name choices will probably have a role in the future. All this analysis might be too complicated, though, and some practical ad-hoc solutions might be enough to manage a distributed world of heterogeneous services. You can always ask a human being for help, too; we're good at dealing with ontological confusion.
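To make the "stupid agent" point concrete, here is a rough sketch assuming each service advertises the quantifiable criteria mentioned above (price, availability, accuracy). The `ServiceOffer` record, the `choose` function, and the store names and numbers are all hypothetical; this is not how Jango or any real shopping agent is known to work.

```python
from dataclasses import dataclass

@dataclass
class ServiceOffer:
    # Hypothetical record of one service, described only by
    # criteria that can be put on a numeric axis.
    name: str
    price: float          # lower is better
    availability: float   # fraction between 0 and 1
    accuracy: float       # fraction between 0 and 1

def choose(offers, min_availability=0.0, min_accuracy=0.0):
    """Drop offers below the thresholds, then pick the cheapest.

    Returns None when nothing satisfies the thresholds.
    """
    eligible = [
        o for o in offers
        if o.availability >= min_availability and o.accuracy >= min_accuracy
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda o: o.price)

# Made-up offers, purely for illustration.
offers = [
    ServiceOffer("store-a", price=11.0, availability=0.90, accuracy=0.95),
    ServiceOffer("store-b", price=9.5, availability=0.60, accuracy=0.95),
    ServiceOffer("store-c", price=12.5, availability=0.99, accuracy=0.99),
]
best = choose(offers, min_availability=0.8)
```

The agent needs no understanding of the services beyond comparing numbers, which is also the limitation: anything ineffable, such as a brand preference, has no field in the record.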

Distributed Service Finding

Kwin had a nice idea (that I hope he tells us more about) for resolving requests. If you have 15 criteria for a service, one agent doesn't have to resolve them all. Your upstream agent might be able to help resolve 3 or 4 of them, and then pass the request on to another agent it knows that can help resolve a few more. So multiple agents can work together to find a service that matches your requirements. It's a nice intuition about distributing the binding of a service to a specification.
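One possible reading of that idea, sketched as a chain of agents. This is a guess at the shape of it and not Kwin's actual design: the `Agent` class, the criteria names, and the sample services are all assumptions made up for the example.

```python
class Agent:
    """Hypothetical agent that can check only some criteria itself."""

    def __init__(self, name, knows, downstream=None):
        self.name = name
        self.knows = set(knows)       # criteria this agent can evaluate
        self.downstream = downstream  # next agent to hand leftovers to

    def resolve(self, candidates, criteria):
        """Filter candidates by the criteria this agent knows about and
        forward the rest. `criteria` maps a criterion name to a predicate.
        Returns (matching candidates, names of criteria nobody resolved)."""
        remaining = {}
        for key, predicate in criteria.items():
            if key in self.knows:
                candidates = [c for c in candidates if predicate(c[key])]
            else:
                remaining[key] = predicate
        if remaining and self.downstream is not None:
            return self.downstream.resolve(candidates, remaining)
        return candidates, sorted(remaining)

# Made-up services and criteria, for illustration only.
services = [
    {"name": "s1", "price": 10, "region": "us", "format": "cd"},
    {"name": "s2", "price": 20, "region": "us", "format": "cd"},
    {"name": "s3", "price": 8, "region": "eu", "format": "cd"},
]
criteria = {
    "price": lambda p: p <= 15,
    "region": lambda r: r == "us",
    "format": lambda f: f == "cd",
}
second = Agent("second", knows=["region", "format"])
first = Agent("first", knows=["price"], downstream=second)
matches, unresolved = first.resolve(services, criteria)
```

Each agent narrows the candidate set using what it knows and forwards only the criteria it could not check, so no single agent has to understand the whole specification. Any criteria still unresolved at the end of the chain are reported back rather than silently ignored.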

Identity of Machines (a Network of Heterogeneity)

One last little bit that fits in here somewhere is the idea of "identity" of machines, what heterogeneous vs. homogeneous really means. Right now my PC is a highly individual thing - I've spent months configuring it, installing software on it, making it *mine*, different from the PC next door. In contrast, something like WebTV or a NetworkPC (or a dumb terminal) strives to be completely homogeneous, to work exactly the same as all other machines of the same brand. NetworkPCs do different things, but they derive all their services from some (typically centralized) server that uploads code into them (via Marimba or whatever). The idea is that this homogenization of the workstation makes system administration simpler. I'm interested in combining these two approaches - leave machines heterogeneous, but allow machines to import services from each other (preferably peer to peer, not centralized). So each machine on the network has its own idiosyncratic services (nicely described in an ontology), and agents can request these services from multiple machines. Everyone helps everyone else out. It'd be a nice world. In some sense, we live there already with all the distributed things we do now - DNS, email, Web servers, etc.
Nelson Minar <nelson@media.mit.edu>
Last modified: Tue Oct 14 17:47:21 EDT 1997