1. Introduction

1.1. Scenario

In the novel Neuromancer, science-fiction writer William Gibson let his imagination run wild, envisioning the global computer network as an immersive space, much like a parallel dimension, into which people could jack via neural implants (Gibson 1984). This was a shared graphical space, not constrained by the laws of physical reality, allowing people to interact with remote programs, objects and other people as if they were locally present. The novel stirred many minds and is frequently referred to as the origin of the term Cyberspace.

Another influential science-fiction novel is Snow Crash, by Neal Stephenson, in which a near-future scenario describes the Metaverse, a computer-generated universe where people can go about their digital business clad in 3D graphical bodies, termed avatars (Stephenson 1992). (The word avatar comes from Sanskrit and means incarnation.)

Fiction turned real, but not quite

In 1985, Lucasfilm created Habitat, an online service in which each user was represented as an avatar that could be moved around a common graphical space using the keys on a keyboard (see Figure 1). Users could manipulate the environment as if they were playing a computer game, or they could interact with other users through text messages displayed along with the figures.

Now, as the Internet embraces sophisticated graphics, dozens of similar Internet-based systems have emerged. Some are 2D in nature, like Habitat, while others plunge into the third dimension, partly fueled by the VRML standardization of 3D graphics interchange on the Internet. While still not using Gibson’s neural implants or Stephenson’s goggles, these environments provide windows into a shared, visually rich universe in which you can see remote users float around. However, when you step up to an avatar to start a conversation, the spell is broken because current avatars don’t exploit embodiment in the discourse. At best they move their lips while a user is speaking, but behaviors such as shifting the gaze or gesturing with the hands are either absent or unrelated to the conversation.

My contribution

I use a model derived from work in discourse theory, dealing with multiple modes of communication, to animate communicative visual behavior in avatars. I have built a working prototype of a multi-user system, BodyChat, in which users are represented by cartoon-like 3D animated figures. Users interact through a standard text chat interface. The new contribution is that the visual communicative signals carried by gaze and facial expression are animated automatically, as are body functions such as breathing and blinking. The animation is based on parameters that reflect the intention of the user in control as well as on the text messages that are passed between users. For instance, when you approach an avatar, you will see from its gaze behavior whether you are invited to start a conversation, and while you speak your avatar will take care of animating its face and, to some extent, its body. In particular, it animates functions such as salutations, turn-taking behavior and back-channel feedback.
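As a rough illustration of the idea described above, the following minimal sketch shows how one parameter reflecting the user's intention, together with chat events, could select which behaviors an avatar plays. It is not the BodyChat implementation: the names `Intent`, `Event` and `select_behaviors`, and all of the behavior labels, are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Event(Enum):
    """Hypothetical events that can trigger avatar behavior."""
    APPROACHED = auto()     # another avatar walks up to this one
    MESSAGE_SENT = auto()   # the controlling user sent a text message
    MESSAGE_HEARD = auto()  # a message from the conversation partner arrived


@dataclass
class Intent:
    """Hypothetical parameter set reflecting the controlling user's intention."""
    available: bool  # is the user open to starting a conversation?


def select_behaviors(intent: Intent, event: Event) -> list[str]:
    """Return the (made-up) names of the animations to play for one event."""
    # Body functions such as breathing and blinking run regardless of context.
    behaviors = ["breathe", "blink"]
    if event is Event.APPROACHED:
        # Gaze signals whether the approaching user is invited to talk.
        if intent.available:
            behaviors += ["gaze_at_other", "smile"]
        else:
            behaviors += ["glance_at_other", "look_away"]
    elif event is Event.MESSAGE_SENT:
        # While "speaking", animate the face and hand over the turn at the end.
        behaviors += ["move_lips", "gaze_at_other_when_done"]
    elif event is Event.MESSAGE_HEARD:
        # Back-channel feedback from the listener.
        behaviors += ["gaze_at_other", "nod"]
    return behaviors
```

Calling `select_behaviors(Intent(available=False), Event.APPROACHED)` would, under these assumptions, produce an avatar that briefly glances at the newcomer and then looks away, signaling that no conversation is invited without the user typing anything.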

1.2. Application Domain

Virtual Bodies

This work introduces an approach to animating virtual bodies that represent communicating people. The following sections present three types of applications in which the avatar technology presented here could be employed to enhance the experience. The existence and popularity of these applications serve as motivation for the current work.

Chatting

Pavel Curtis, one of the creators of LambdaMOO (Curtis 1992), advocates that the Internet "killer app of the 90’s" is people. His point is that whatever business we go about on the global network, we shouldn’t have to be alone unless we want to. You should be able to see and communicate with people strolling the aisles of a supermarket, hanging out in the café or waiting in line, be it in an old-fashioned mall or an on-line shopping center. A new era in technology is upon us: the age of social computing (Braham and Comerford 1997).

Systems that allowed users to see who was on-line and then exchange typed messages in real time date back to the first time-sharing computers of the 1960s (Rheingold 1994). Later systems, such as the Internet Relay Chat (IRC), have been widely popular as a way to convene informal discussions among geographically distant people, "but the continuing popularity of IRC appears to be primarily a function of its appeal as a psychological and intellectual playground" (Rheingold 1994, 179). The IRC and, more recently, various Distributed Virtual Environments (DVEs) seem to serve as public meeting places analogous to their real-world counterparts, but not constrained by physical distance.

Telecommuting

In today’s global village, where multi-national companies keep growing and research institutions in different countries join forces to address major issues, the demand for efficient channels of communication across long distances has never been greater. The field of Computer Supported Cooperative Work (CSCW) is exploring ways to create systems and techniques that help distributed workgroups jointly perform a task and share experience.

One aspect of such a system is real-time communication between the members of the group in the form of a virtual meeting. There it is important to incorporate some mechanisms to assist in managing the flow of turns to avoid a chaotic situation. For dealing with this and other issues of mediating presence, representing participants visually is a powerful approach. Consider a large meeting where half of the participants are physically present in the room but the other half is participating through a speakerphone. The remote people are soon dominated by the others, and often reduced to mere overhearers (according to personal communication with various sponsors).

Gaming

Computer gaming has until recently been mostly a solitary experience, but with the sudden surge in household Internet connectivity the global network is fast becoming a sprawling playground for all kinds of game activity. Text-based games and role-playing environments, such as the popular MUD (Multi-User Dungeon), which has been around for almost two decades, have been on-line for a while. But now a wide selection of simulations, war games, action games, classic games and different versions of role-playing games offer graphically rich environments in which you can interact with other players across continents. Although many of those games pit players head-to-head in combat, others encourage group co-operation and interaction. These games already provide captivating virtual worlds to inhabit, and they often represent users as avatars adapted to the environment and the particular game experience.

1.3. Overview of Thesis

The preceding sections have served as an introduction to the domain of this work and motivated the subject by presenting some applications. The remainder of the thesis is divided into four chapters that present, in order, the problem, the theoretical tools for working on the problem, how this work applies the tools, and conclusions. Chapter 2 starts by describing an existing system in detail and then discusses the shortcomings of current systems with regard to avatars. Chapter 3 reviews relevant work from various research areas related to and supporting this work, establishing a solid foundation. Chapter 4 discusses the working prototype, how it starts to address the stated problems, its system architecture and how it is theoretically rooted. Finally, Chapter 5 gives a summary and evaluation and suggests future directions.