Communicative Humanoids

A Computational Model of Psychosocial Dialogue Skills

Kristinn R. Thórisson

Submitted to the Program in Media Arts & Sciences,
School of Architecture & Planning on July 19, 1996 in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology


Face-to-face interaction between people is generally effortless and effective. We exchange glances, take turns speaking and make facial and manual gestures to achieve the goals of the dialogue. Endowing computers with such an interaction style marks the beginning of a new era in our relationship with machines — one that relies on communication, social convention and dialogue skills. This thesis presents a computational model of psychosocial dialogue expertise, bridging between perceptual analysis of multimodal events and multimodal action generation, supporting the creation of interfaces that afford full-duplex, real-time face-to-face interaction between a human and autonomous computer characters. The architecture, called Ymir, has been implemented in software, and a prototype humanoid created. The humanoid, named Gandalf, commands a graphical model of the solar system, and can interact with people using speech, manual and facial gesture. Gandalf has been tested in interaction with users and has been shown capable of fluid face-to-face dialogue. The prototype demonstrates several new ideas in the creation of communicative computer agents, including perceptual integration of multimodal events, distributed processing and decision making, layered input analysis and motor control, and the integration of reactive and reflective perception and action. Applications of the work presented in this thesis can be expected in such diverse fields as education, psychological and social research, work environments, and entertainment.

Assistant Professor of Media Arts & Sciences, MIT Program in Media Arts & Sciences
Associate Professor of Media Arts & Sciences, Sony Corporation Career Development Professor of Media Arts & Sciences
Research Scientist, AT&T Labs Research


 Key parts of my thesis that have been published in peer-reviewed journals and conferences
  A Mind Model for Multimodal Communicative Creatures and Humanoids
International Journal of Applied Artificial Intelligence, 13(4-5): 449-486

Thórisson, K. R. (1999)
Machine Perception of Multimodal Natural Dialogue
In P. McKevitt, S. Ó Nulláin, C. Mulvihill (Eds.), Language, Vision & Music, 97-115. Amsterdam: John Benjamins
Thórisson, K. R. (2002)
Real-Time Decision Making in Multimodal Face to Face Communication
Second ACM International Conference on Autonomous Agents, Minneapolis, Minnesota, May 11-13, 16-23
Thórisson, K. R. (1998)
Layered Modular Action Control for Communicative Humanoids
Computer Animation '97, Geneva, Switzerland, June 5-6, 134-143

Thórisson, K. R. (1997)
Turntaking & Dialog
Natural Turn-Taking Needs No Manual: Computational Theory and Model, from Perception to Action
In B. Granström, D. House, I. Karlsson (Eds.), Multimodality in Language and Speech Systems, 173-207. Dordrecht, The Netherlands: Kluwer Academic Publishers.
Thórisson, K. R. (2002)
Architectural Methodology
Constructionist Design Methodology for Interactive Intelligences
AI Magazine, 25(4):77-90
Thórisson, K. R., H. Benko, D. Abramov, A. Arnold, S. Maskey, A. Vaseekaran (2002)

 Follow-on publications that build on my thesis work

Cognitive Map Architecture: Facilitation of Human-Robot Interaction in Humanoid Robots
IEEE Robotics & Automation Magazine, March, 16(1):55-66
Ng-Thow-Hing, V., K. R. Thórisson, R. K. Sarvadevabhatla, J. Wormer & T. List (2009)

Design and Evaluation of Communication Middleware in a Humanoid Robot Architecture
IROS 2007 Workshop on Measures and Procedures for the Evaluation of Robot Architectures and Middleware, Oct. 29, San Diego, CA.
Ng-Thow-Hing, V., T. List, K. R. Thórisson, J. Lim & J. Wormer (2007)

Evaluating Multimodal Human-Robot Interaction: A Case Study of an Early Humanoid Prototype
In A.J. Spinks, F. Grieco, O.E. Krips, I.W.S. Loijens, I.P.J.J. Noldus and P.H. Zimmerman (eds.), Measuring Behavior 2010: Proceedings of the 7th International Conference on Methods and Techniques in Behavioral Research, 273-276. ACM New York, NY, USA
Jonsson, G. K. & K. R. Thórisson (2010)

A YARP-Based Architectural Framework for Robotic Vision Applications
Proc. of International Conference on Computer Vision Theory and Applications (VISAPP), Lisboa, Portugal, Feb. 5-8, 1:65-68
Stefánsson, S. F., B. Th. Jonsson & K. R. Thórisson (2009)

Cognitive Architecture

A Distributed Architecture for Real-Time Dialogue and On-Task Learning of Efficient Cooperative Turn-Taking
In M. Rojc and N. Campbell (eds.), Speech, Gaze and Affect, Ch. 12, 293-324. Boca Raton, Florida, US: Taylor & Francis

Jonsdottir, G.R., & K. R. Thórisson (2013)

A Multiparty Multimodal Architecture for Realtime Turntaking
Proceedings of Intelligent Virtual Agents 2010
K. R. Thórisson, O. Gislason, G. R. Jonsdottir & H. Th. Thorisson (2010)

A Granular Architecture for Dynamic Realtime Dialogue
Proc. of Intelligent Virtual Agents (IVA), Tokyo, Japan, September 1-3
Thórisson, K. R. & G. R. Jonsdottir (2008)

Towards a Neurocognitive Model of Realtime Turntaking in Face-to-Face Dialogue
In I. Wachsmuth, M. Lenzen, G. Knoblich (eds.), Embodied Communication in Humans And Machines. U.K.: Oxford University Press

Bonaiuto, J. & K. R. Thórisson (2008)

Modeling Multimodal Communication as a Complex System
In I. Wachsmuth, M. Lenzen, G. Knoblich (eds.), Springer Lecture Series in Computer Science: Modeling Communication with Robots and Virtual Humans, 143-168. New York: Springer

Thórisson, K. R. (2008)

Dragons, Bats & Evil Knights: A Three-Layer Design Approach to Character-Based Creative Play
Virtual Reality, Special Issue on Intelligent Virtual Agents, 5(2):57-71. Heidelberg: Springer-Verlag
Bryson, J. & K. R. Thórisson (2000)

Dialogue & Turntaking

Autonomous Acquisition of Natural Language
In A. P. dos Reis, P. Kommers & P. Isaías (eds.), Proceedings of the IADIS International Conference on Intelligent Systems & Agents 2014 (ISA-14), 58-66, Lisbon, Portugal, July 15-17
E. Nivel, K. R. Thórisson, B. R. Steunebrink, H. Dindo, G. Peluzo, M. Rodriguez, C. Hernandez, D. Ognibene, J. Schmidhuber, R. Sanz, H. P. Helgason, A. Chella & G. Jonsson (2014)

Teaching Computers to Conduct Spoken Interviews: Breaking the Realtime Barrier With Learning
Proceedings of IVA '09; Springer Lecture Notes in Artificial Intelligence 5773, 446-459
Jonsdottir, G. R. & K. R. Thórisson (2009)

Learning Smooth, Human-Like Turntaking in Realtime Dialogue
Proc. of Intelligent Virtual Agents (IVA), Tokyo, Japan, September 1-3

Jonsdottir, G. R., K. R. Thórisson & E. Nivel (2008)

Fluid Semantic Back-Channel Feedback in Dialogue: Challenges & Progress
Proc. of 7th International Conference on Intelligent Virtual Agents, 154-160, September. Paris, France
Jonsdottir, G. R., J. Gratch, E. Fast, & K. R. Thórisson (2007)


(yes - it matters!)


Integrated A.I. Systems
Invited paper at The Dartmouth Artificial Intelligence Conference: The Next 50 Years — Commemorating the 1956 Founding of AI as a Research Discipline, July 13-15, 2006, Dartmouth, New Hampshire, U.S.A.
Minds & Machines, 17:11-25, 2007

Thórisson, K. R. (2007)

Methods for Complex Single-Mind Architecture Designs
In Padgham, Parkes, Müller and Parsons (eds.), Proceedings of the Autonomous Agents & Multiagent Systems (AAMAS 2008), May 12-16, Estoril, Portugal, 1273-1276

Thórisson, K. R., G. R. Jonsdottir & E. Nivel (2008)

A Brief History of Function Representation from Gandalf to SAIBA
Proc. of the 1st Function Markup Language Workshop at AAMAS, Portugal, June 12-16, 2008

Vilhjálmsson, H. & K. R. Thórisson (2008)

Towards a Common Framework for Multimodal Generation: The Behavior Markup Language
Proc. of Intelligent Virtual Agents (IVA '06), August 21-23
Also published in Springer Lecture Notes in Computer Science

Kopp, S., B. Krenn, S. Marsella, A. N. Marshall, C. Pelachaud, H. Pirker, K. R. Thórisson & H. Vilhjálmsson (2006)


Modular Simulation of Knowledge Development in Industry: A Multi-Level Framework
Proc. of WEHIA – 1st International Conference on Economic Sciences with Heterogeneous Interacting Agents, 15-17 June, University of Bologna, Italy
Saemundsson, R. J., K. R. Thórisson, G. R. Jonsdottir, M. Arinbjarnar, H. Finnsson, H. Gudnason, V. Hafsteinsson, G. Hannesson, J. Ísleifsdóttir, Á. Th. Jóhannsson, G. Kristjánsson & S. Sigmundarson (2006)

Applying Constructionist Design Methodology to Agent-Based Simulation Systems
3rd International KES Symposium on Agents and Multi-agent Systems – Technologies and Applications
Thórisson, K. R., R. J. Saemundsson, G. R. Jonsdottir, B. Reynisson, C. Pedica, P. R. Thrainsson and P. Skowronski (2009)





Table of Contents

0. Abstract & Table of Contents [PDF]

1. Introduction [PDF]

2. Face-to-face Interface [PDF]

3. Multimodal Dialogue: Psychological and Interface Research [PDF] [Table 1 (ps)]

4. Agents, Robots and Artificial Intelligence [PDF]

5. Computational Characteristics of Multimodal Dialogue [PDF]

6. J.Jr.: A Study in Reactivity [PDF]

QuickTime of J.Jr.gz 4.5MB

7. Ymir: A Generative Model of Psychosocial Dialogue Skills [PDF]

8. Ymir: An Implementation in LISP [PDF]

9. Gandalf: Humanoid One [PDF]

Black-and-White QuickTime of Gandalf 2MB
QuickTime of Gandalf 2 20MB
QuickTime of Gandalf 3 25.3MB

10. Ymir / Gandalf: An Evaluation in Three Parts [PDF]

11. Designing Humanoid Agents: Some High-Level Issues [PDF]

12. Conclusions & Future Work [PDF]

Appendix 1. Character Animation [PDF]
Appendix 2. System Specifications [PDF]

Appendix 3. Questionnaires & Scoring [PDF]

References [PDF]




Copyright (c) K.R.Thórisson. All rights reserved.