Dist-sys-97 Notes for September 26
Homogeneous vs Heterogeneous
The big thing I learned today was in our conversation about
homogeneous vs heterogeneous systems. We started on this by talking
about the reading from Mullender's book, the one that talked about
homogenizing all the machines online to provide an effective
distributed system. This is traditional engineering practice (it
simplifies design a lot if everything's the same), but I find it
unrealistic and limiting. We talked some about heterogeneous
distributed systems.
Heterogeneous Name Spaces
One example we talked about was heterogeneous name spaces. Most name
space proposals involve one global name space that's valid everywhere,
usually organized hierarchically. This eliminates conflicts: the
problem of two people using the same name for different things
(although it doesn't handle aliases: two different names for the same
thing). It also allows global accessibility.
But what about a name space that wasn't globally homogeneous?
Unfortunately, we didn't come up with many good arguments in favour of
heterogeneous name spaces. I suspect in some examples it might be
really difficult to have a full, homogeneous name space for practical
reasons - e.g., labelling every atom with a unique, global name is
likely to be hard because there are so many atoms. But that's not a
very convincing argument.
If you do have heterogeneous names, then there are two ways to deal
with them. One is to provide some sort of lingua franca that all names
can be translated into (in effect, homogenizing all names). Another is
to build translators between different namespaces.
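
The two approaches can be sketched roughly as follows. This is only an
illustration: the "unix" and "dotted" name spaces, the tuple used as the
canonical form, and every function name here are assumptions made up for
the example, not anything from the discussion.

```python
# Strategy 1: a lingua franca. Each name space supplies one pair of
# functions, to and from a shared canonical form (here, a tuple of
# components ordered from most general to most specific).
def unix_to_canonical(name):
    # "/home/nelson/notes" becomes ("home", "nelson", "notes")
    return tuple(part for part in name.split("/") if part)

def canonical_to_unix(parts):
    return "/" + "/".join(parts)

def dotted_to_canonical(name):
    # "notes.nelson.home" becomes ("home", "nelson", "notes")
    return tuple(reversed(name.split(".")))

def canonical_to_dotted(parts):
    return ".".join(reversed(parts))

# Hypothetical registry: name space -> (to_canonical, from_canonical)
CODECS = {
    "unix": (unix_to_canonical, canonical_to_unix),
    "dotted": (dotted_to_canonical, canonical_to_dotted),
}

def translate_via_canonical(name, source, target):
    to_canonical, _ = CODECS[source]
    _, from_canonical = CODECS[target]
    return from_canonical(to_canonical(name))

# Strategy 2: pairwise translators. Each ordered pair of name spaces
# needs its own function, with no shared intermediate form.
def unix_to_dotted(name):
    return ".".join(reversed([part for part in name.split("/") if part]))

TRANSLATORS = {("unix", "dotted"): unix_to_dotted}

def translate_direct(name, source, target):
    return TRANSLATORS[(source, target)](name)
```

With n name spaces the lingua franca needs 2n functions, while full
pairwise coverage needs n(n-1) translators. The price of the lingua franca
is that anything the canonical form can't express is lost in translation.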
Heterogeneous Services
I think maybe the name space example was a red herring in the
discussion of homogeneous vs. heterogeneous. A better example might be
online services.
By way of illustration, I described searching AltaVista vs.
Hotbot. In a lot of ways these two services are the same, and servers
like Metacrawler homogenize them. But there are subtle differences -
AltaVista returns more useful replies, Hotbot is more up to date. And
I use those differences effectively - I choose one service or another
depending on my needs. The brand name can be useful in distinguishing
services; homogenizing those two services misses out on some power.
Ontologies for Describing Services
There are several ways people find a service they need. One is to
consult an ontology - I need to buy a bicycle so I go to the Yellow
Pages (our friendly ontological source) and look under B. Another is
to know about brand names - I want to buy some clothes, so I know to
go to Nordstrom's and I just walk there (much like I go to Hotbot when
I'm looking for recent web pages). Another way is to look in the
neighbourhood - Kwin just walks down to Newbury St. to buy a fancy new
coat. And another is to ask friends for suggestions (although this
probably requires an ontology to reference).
We observed that a lot of differences between services online can be
quantified. For example, Jango deals with the heterogeneity of different
online CD stores by laying them out on the "price" axis, a criterion
that's easy to manipulate and compare. "Availability", "accuracy", etc. are
other simple quantifiable criteria.
If a difference in a service can be quantified, then an ontology can
probably capture that difference fairly effectively (although you've
always got that problem of specifying the ontology). With this sort of
ontology available, even a stupid agent can probably make an
intelligent choice. But I worry that some differences are ineffable
and will never be categorized. Brand name choices will probably have a
role in the future.
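
Here is a minimal sketch of how simple such an agent could be once the
differences are quantified. The store names, numbers, and field names are
invented for illustration, and this is not how Jango actually works; it
only shows that filtering on thresholds and then sorting on one axis is
enough to make a reasonable pick.

```python
# Hypothetical service descriptions using two quantifiable criteria.
OFFERS = [
    {"store": "store-a", "price": 13.99, "availability": 0.90},
    {"store": "store-b", "price": 11.49, "availability": 0.60},
    {"store": "store-c", "price": 12.25, "availability": 0.95},
]

def choose(offers, minimums, minimize):
    """Drop offers that fall below any minimum, then return the offer
    with the smallest value on the `minimize` axis (or None)."""
    acceptable = [
        offer for offer in offers
        if all(offer[key] >= floor for key, floor in minimums.items())
    ]
    if not acceptable:
        return None
    return min(acceptable, key=lambda offer: offer[minimize])
```

Asking for the cheapest store with availability of at least 0.8 picks
store-c, while asking for the cheapest store outright picks store-b. None
of this helps with the ineffable differences, which have no axis to sort on.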
All this analysis might be too complicated, though, and some
practical ad-hoc solutions might be enough to manage a distributed
world of heterogeneous services. You can always ask a human being for
help, too; we're good at dealing with ontological confusion.
Distributed Service Finding
Kwin had a nice idea (that I hope he tells us more about) for
resolving requests. If you have 15 criteria for a service, one agent
doesn't have to resolve them all. Your upstream agent might be able to
help resolve 3 or 4 of them and then pass the request on to another agent
it knows, which can help resolve a few more. So multiple agents can
work together to find a service that matches your requirement. It's a
nice intuition about distributing the binding of a service to a
specification.
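
My rough reading of the idea, as a sketch: each agent filters the
candidates on the criteria it understands and forwards the rest of the
request to the next agent it knows. The Agent class, the criteria, and
the exact-match rule are all assumptions of mine, not Kwin's design.

```python
class Agent:
    """A hypothetical agent that understands only some criteria."""

    def __init__(self, name, known_criteria, next_agent=None):
        self.name = name
        self.known_criteria = set(known_criteria)
        self.next_agent = next_agent

    def resolve(self, candidates, criteria):
        """Return (matching candidates, criteria nobody could resolve)."""
        # Split the request into the part this agent can handle and the rest.
        mine = {k: v for k, v in criteria.items() if k in self.known_criteria}
        rest = {k: v for k, v in criteria.items() if k not in self.known_criteria}
        # Keep only the candidates that satisfy the criteria handled here.
        remaining = [
            c for c in candidates
            if all(c.get(k) == v for k, v in mine.items())
        ]
        # Pass whatever is left to the next agent, if there is one.
        if rest and self.next_agent is not None:
            return self.next_agent.resolve(remaining, rest)
        return remaining, rest

# Hypothetical services and a two-agent chain.
SERVICES = [
    {"name": "printer-1", "kind": "printer", "color": True, "floor": 3},
    {"name": "printer-2", "kind": "printer", "color": False, "floor": 3},
    {"name": "scanner-1", "kind": "scanner", "color": True, "floor": 3},
]
downstream = Agent("downstream", ["color"])
upstream = Agent("upstream", ["kind", "floor"], next_agent=downstream)
```

Calling upstream.resolve(SERVICES, {"kind": "printer", "color": True,
"floor": 3}) narrows the list to printer-1, with the upstream agent
handling "kind" and "floor" and the downstream one handling "color". A
criterion no agent in the chain understands comes back as unresolved
rather than being silently dropped.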
Identity of Machines (a Network of Heterogeneity)
One last little bit that fits in here somewhere is the idea of
"identity" of machines, what heterogeneous vs. homogeneous really
means. Right now my PC is a highly individual thing - I've spent
months configuring it, installing software on it, making it *mine*,
different from the PC next door. In contrast, something like WebTV or
a NetworkPC (or a dumb terminal) strives to be completely homogeneous,
to work exactly the same as all other machines of the same brand.
NetworkPCs do different things, but they derive all their services
from some (typically centralized) server that uploads code into them
(via Marimba or whatever). The idea is this homogenization of the
workstation makes system administration simpler.
I'm interested in combining these two approaches - leave machines
heterogeneous, but allow machines to import services from each other.
(Preferably peer to peer, not centralized). So each machine on the
network has its own idiosyncratic services (nicely described in an
ontology), and agents can request these services from multiple
machines. Everyone helps everyone else out. It'd be a nice world. In
some sense, we live there already with all the distributed things we
do now - DNS, email, Web servers, etc.
Nelson Minar <nelson@media.mit.edu>
Last modified: Tue Oct 14 17:47:21 EDT 1997