hugo :: brainstorms in sociable media (mas.961 '03)
"only as an æsthetic phenomenon is
existence and the world justified"

- nietzsche


::: who am i? :::
You are what you eat - Anonymous


Who Am I?
Mapping a Person into the Space of Identities

A key competency that a sociable computer must have is a deep understanding of its user. Unfortunately, progress in the person modeling literature in artificial intelligence is slow and often short-sighted. Two prominent approaches are behavior modeling and demographic profiling. In behavior modeling, a person is represented as a history of behaviors, i.e., the actions they took in the context of some application domain. For example, intelligent tutoring systems track a person's test performance (Sison & Shimura, 1998), and collaborative filtering systems track user purchasing and browsing habits and compare them with those of like-minded people to make predictions about the user's attitudes (Shardanand & Maes, 1995). The chief drawbacks of behavior modeling are that 1) knowledge of user action sequences is generally only meaningful in the context of a particular application, and 2) a statistical behavior model of a person is just a vector of numbers, uninterpretable by itself.

A second approach to person modeling is demographic profiling, in which demographic information gathered about a user is used to draw generalized conclusions about that user's preferences and behavior. A drawback of this approach is that it tends to overgeneralize people by the categories they fit into, and it often requires additional user action such as filling out a user profile. This is not to say that generalizing about persons is a bad approach; when we mentally model other people, we, like computers, also overgeneralize. The difference is that people have much richer knowledge-based and experience-based vocabularies for generalization than a demographic profiling system acting on a sparse rule set with rules like: "age 18-25 --> likes Britney Spears."

Meanwhile, an important aspect of a person has been overlooked: personal identity. A person's identity is strongly correlated with their beliefs, goals, and desires, which in turn drive their behavior. A person's behaviors are often the predictable product of his/her identity. Whereas behavior modeling views a person in terms of a history of behaviors and uses this history to predict future behaviors, it would seem much more productive to model the source of a person's beliefs, goals, desires, and behaviors. Perhaps demographic profiling hopes to do this, but its approach is weak. Demographic categories do not have much breadth or depth, and there is no inherent empirical basis for the inferences made from these categories. An emerging field of marketing research called psychographics (i.e., psychology + demographics) is taking a more promising direction in trying to see how people are more realistically grouped in society (e.g., "soccer moms" vs. "pickups and shotguns"). This research project is sympathetic to the aims of psychographics.

How could we possibly hope to model something as grandiose as the totality of a person's identity? The German sociologist Georg Simmel suggested that identity is not monolithic, but rather fragmented, and dually public and private in nature (1908). In addition, the public fragments of a person's identity are determined by social roles. For example, being a police officer confers something on my identity, as does being a dog owner or a student. As we begin to deconstruct identity, we find that a compelling picture of a person can already be painted by understanding the different social archetypes by which that person can be described. There is also evidence from psychology about the centrality of archetypes: as people begin to think in terms of these linguistic signifiers, they naturally begin to fashion themselves solely in terms of the signifiers available to them (Lacan, 1986), making knowledge of these archetypes an even more powerful predictor of people.

If we can computationally characterize the beliefs, desires, and goals of a large enough set of social archetypes (call this collection of archetypes an identity map), and if we can partially classify a person into a neighborhood in this identity map, then we will find ourselves with a substantial model of a person's beliefs, desires, goals, and behaviors. People certainly possess such models, and employ this knowledge to characterize the identities of others on the basis of social cues expressed and given off by these others (Goffman, 1959).

This is the rationale behind the proposed "Who Am I?" (WAI) project. The goal is for WAI to be able to characterize a person's identity by analyzing his/her personal text, perhaps something on the order of a person's homepage, or better yet, their weblog. The output will consist of a list of social archetypes the person is thought to belong to, along with associated confidence scores.

One of the major challenges is how WAI can acquire a substantial and meaningful collection of social archetypes. Online special interest groups provide a potential source of data. The Dmoz Open Directory Project has clusters of personal web pages dedicated to various interests (e.g. croquet), subcultures (e.g. raver), and professions (e.g. nurses). There are 1400 clusters in all, each with between 15 and 200 personal pages dedicated to the topic. Within each cluster, an automatic computer reader skims each website and extracts simple beliefs, interests, disinterests, and goals. The extracted information will likely take the form of pairs of a concept and an affective valence, e.g. ("Britney Spears", -100%), ("make money", 85%). These attitudes will be mined using a methodology similar to that of (Liu & Maes, in press). The beliefs, interests, disinterests, and goals most common to each cluster are then extracted. These would form the "essence" of a social archetype. Of course, we don't propose that each of the 1400 clusters will be a very meaningful archetype, because we want a less granular definition of archetype than a single interest or subculture (although, in today's sick sad materialistic culture, people often define themselves by laundry lists of interests... see Friendster if you need convincing). We can classify each of these 1400 archetypes into meta-archetypes, which can be more meaningful. For example, the kayaking, bungee jumping, and snowboarding archetypes can form a "thrill-seeker" meta-archetype. We can either manually classify archetypes into meta-archetypes (1400 isn't so bad), or we can use machine learning to learn these groupings from online communities like weblog blogrings.
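To make the idea of a cluster "essence" a little more concrete, here is a minimal Python sketch. It assumes that each page in a cluster has already been reduced to a dictionary mapping a concept to a valence in [-1.0, 1.0] (so -100% becomes -1.0). The function names `build_archetype` and `merge_archetypes`, the `min_support` threshold, and the averaging scheme are all hypothetical choices made for illustration; the actual attitude mining and archetype construction may well look different.

```python
from collections import defaultdict


def build_archetype(pages, min_support=0.5):
    """Summarize a cluster of pages as one archetype "essence".

    pages: list of dicts, each mapping concept -> valence in [-1.0, 1.0].
    min_support: hypothetical cutoff; a concept is kept only if it
    appears on at least this fraction of the cluster's pages.
    Returns a dict mapping each kept concept to its mean valence.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for page in pages:
        for concept, valence in page.items():
            totals[concept] += valence
            counts[concept] += 1
    needed = min_support * len(pages)
    return {c: totals[c] / counts[c] for c in counts if counts[c] >= needed}


def merge_archetypes(essences):
    """Average several archetype essences into one meta-archetype.

    A concept missing from an archetype counts as valence 0 there,
    so concepts shared by many archetypes dominate the result.
    """
    totals = defaultdict(float)
    for essence in essences:
        for concept, valence in essence.items():
            totals[concept] += valence
    return {c: v / len(essences) for c, v in totals.items()}


# Made-up data standing in for three personal pages in a "kayaking" cluster.
kayaking_pages = [
    {"river": 1.0, "make money": 0.5},
    {"river": 0.5, "office work": -0.5},
    {"river": 0.75, "office work": -0.5},
]
kayaker = build_archetype(kayaking_pages)

# A manually chosen grouping, as in the "thrill-seeker" example.
snowboarder = {"adrenaline": 0.5, "snow": 1.0}
bungee_jumper = {"adrenaline": 1.0}
thrill_seeker = merge_archetypes([snowboarder, bungee_jumper])
```

In this sketch "make money" drops out of the kayaker archetype because only one of the three pages mentions it, which is the sense in which the essence keeps what is "most common" to a cluster.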

Having built a map of social archetypes, WAI automatically analyzes a person based on some personal text, like a homepage or a Friendster profile. WAI generates its hypotheses about a person's current identity composition based on his/her current beliefs, interests, disinterests, and goals.
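One way this classification step might look is sketched below in Python. It assumes that both the person and every archetype are represented as dictionaries mapping a concept to a valence in [-1.0, 1.0], and it uses cosine similarity as a stand-in for the confidence score. That choice of measure, the names `cosine` and `classify`, and the sample data are assumptions made for illustration, not a settled design.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse concept -> valence dicts."""
    dot = sum(a[c] * b[c] for c in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(person, identity_map, top_n=3):
    """Rank archetypes by similarity to a person's extracted attitudes.

    person: dict mapping concept -> valence, mined from personal text.
    identity_map: dict mapping archetype name -> essence dict.
    Returns up to top_n (name, score) pairs with score > 0, best first.
    """
    scored = [(name, cosine(person, essence))
              for name, essence in identity_map.items()]
    positive = [(name, score) for name, score in scored if score > 0]
    return sorted(positive, key=lambda pair: pair[1], reverse=True)[:top_n]


# Made-up identity map and person, for illustration only.
identity_map = {
    "kayaker": {"river": 1.0},
    "thrill-seeker": {"river": 1.0, "snow": 1.0},
    "banker": {"make money": 1.0},
}
person = {"river": 1.0, "snow": 1.0}
ranking = classify(person, identity_map)
```

Here the person matches "thrill-seeker" most closely, "kayaker" second, and "banker" not at all, so the result is a short ranked list of archetypes with scores, which is the shape of output WAI is meant to produce.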

"Who Am I?" would have many interesting applications. Just bubbling the computer's guess of a person's identity composition directly into a visualization could be useful and entertaining reflective feedback. I also envision computer applications that use a person's identity profile to drive customized interactions. An application could talk to runners using journey metaphors, like "on the road to success," or offer a different interface to people who are "wild" versus "reserved", "liberal" versus "conservative", or "sporty" versus "bookwormish".

I propose to implement a collection of computational models of social archetypes, as described above, and an application for visualizing a person's perceived identity composition given a textual input like a homepage. Time permitting, I will also prototype an application that uses WAI to tailor an interaction with a user based on knowledge of the person's social identity composition.



Goffman, Erving (1959). The Presentation of Self in Everyday Life: Introduction

Lacan, Jacques (1986). Seminar XI: Four Fundamental Concepts of Psychoanalysis. Trans. Alan Sheridan. Penguin.

Liu, Hugo and Maes, Pattie (in press). What Would They Think? A Computational Model of Attitudes. To appear in Proceedings of the 2004 International Conference on Intelligent User Interfaces, IUI 2004, January 13–16, 2004, Madeira, Funchal, Portugal. ACM 2004, ISBN 1-58113-815-6

Shardanand, U. and Maes, P. (1995). Social information filtering: Algorithms for automating "word of mouth", Proceedings of CHI'95, 210-217.

Simmel, Georg (1908). How is Society Possible?

Sison, R. and Shimura, M. (1998). Student modeling and machine learning. International Journal of Artificial Intelligence in Education, 9:128-158.




H U G O . . L I U ...

program in comparative media studies, mit

the media laboratory, mit
if you like my work, please link to me
hugo at media dot mit dot edu