Christopher R. Wren
Dynamic Models for Smart Rooms


My name is Christopher Wren, and I am involved with the Smart Rooms project. The goal of Smart Rooms is to create environments that respond intelligently to the actions of their inhabitants. A Smart Room should be able to help the people within it.

If you're going to help someone, you need to know what they're trying to do, or even whether they're there at all. We use cameras and other sensors to detect, track, and interpret the actions of people in our spaces. So far, working with bottom-up frameworks, we have developed fast, robust feature trackers, and we have shown that some interpretation of human motion is possible with this information alone.
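
To make the bottom-up idea concrete, here is a minimal sketch of one such feature tracker: background subtraction over grayscale frames, reporting the centroid of the foreground blob. The numpy pipeline and the threshold value are illustrative assumptions, not the trackers we actually built.

    import numpy as np

    def track_foreground_centroid(frame, background, threshold=30.0):
        """Return the (row, col) centroid of the foreground blob, or None
        if nothing in the frame differs from the background model."""
        # Per-pixel difference against a static background image
        # (both arrays are assumed to be 2D grayscale images).
        diff = np.abs(frame.astype(float) - background.astype(float))
        mask = diff > threshold              # candidate "person" pixels
        ys, xs = np.nonzero(mask)
        if ys.size == 0:
            return None                      # nobody in view
        return ys.mean(), xs.mean()          # centroid of the blob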

However, people are governed by the laws of physics and by the limitations of their own nerves and muscles. We believe that we may gain an even better understanding of human motion by modeling these constraints. Toward this goal, we have created a system that recovers a description of the user's physical state within a recursive estimation framework.
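
The text does not name the estimator, but a standard way to realize such a recursive framework is a Kalman-style predict/correct loop driven by a simple dynamic model. The sketch below assumes a constant-velocity model of a single tracked coordinate; the frame rate and noise covariances are illustrative guesses, not values from our system.

    import numpy as np

    dt = 1.0 / 30.0                          # assumed 30 Hz video
    F = np.array([[1.0, dt], [0.0, 1.0]])    # constant-velocity dynamics
    H = np.array([[1.0, 0.0]])               # we observe position only
    Q = np.eye(2) * 1e-3                     # process noise (assumed)
    R = np.array([[1e-2]])                   # measurement noise (assumed)

    def predict(x, P):
        """Project the state estimate forward through the dynamic model."""
        return F @ x, F @ P @ F.T + Q

    def correct(x, P, z):
        """Fold a new measurement z into the predicted state."""
        S = H @ P @ H.T + R                  # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
        x = x + K @ (z - H @ x)
        P = (np.eye(2) - K @ H) @ P
        return x, P

    # Alternate predict and correct once per video frame.
    x, P = np.array([0.0, 0.0]), np.eye(2)
    for t in range(5):
        z = np.array([t * dt]) + np.random.normal(scale=0.1, size=1)
        x, P = predict(x, P)
        x, P = correct(x, P, z)

The dynamic model supplies the prediction step; the richer the model of the body's physics, the more informative those predictions become.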

This description makes it possible to reject certain kinds of noise that are inconsistent with the realities of the human body. It also allows us to make predictions about what we expect to see in the near future. Those predictions, combined with the probabilistic nature of the tracker, help us recover from ambiguities that would otherwise confuse a purely bottom-up tracker.
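
One generic way such prediction-based noise rejection can work is a validation gate: a measurement is accepted only if it is plausible under the predicted state. This is a sketch of the general technique, not our tracker's actual test; H, R, and the gate threshold are assumptions carried over from the sketch above.

    import numpy as np

    H = np.array([[1.0, 0.0]])                # observe position only
    R = np.array([[1e-2]])                    # measurement noise (assumed)

    def gate(x_pred, P_pred, z, gamma=9.0):
        """Accept measurement z only if it is consistent with the prediction:
        a chi-square test on the innovation (gamma ~ a 3-sigma gate)."""
        v = z - H @ x_pred                    # innovation: how surprising z is
        S = H @ P_pred @ H.T + R              # innovation covariance
        d2 = float(v @ np.linalg.inv(S) @ v)  # squared Mahalanobis distance
        return d2 <= gamma                    # reject implausible jumps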

In short, the computer now has a set of expectations about how people move, and it mediates its perception of the user with those expectations.

This video shows the tracking system in action, followed by a clip of the 3D model responding to user motion. Finally, a short sequence shows a case where the tracker would have failed without the model.

An interesting example of what this makes possible is Dave Becker's Sensei system, which teaches the user T'ai Chi movements. This short video shows the system in action. First, the user is shown the moves by a virtual actor; then the user tries them. The Sensei analyzes the user's motions, recognizes which gesture the user is attempting, isolates problems, and gives appropriate feedback.
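
The narration does not describe Sensei's recognizer, so purely for illustration, here is one simple way gesture recognition over tracked trajectories could work: nearest-template matching under dynamic time warping. The feature representation and the template set are hypothetical.

    import numpy as np

    def dtw_distance(a, b):
        """Dynamic-time-warping distance between two trajectories, each an
        (n_frames, n_features) array of per-frame tracking features."""
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]

    def recognize(trajectory, templates):
        """Label the user's motion with the closest stored gesture template
        (templates maps gesture names to example trajectories)."""
        return min(templates, key=lambda name: dtw_distance(trajectory, templates[name]))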

In conclusion, by modeling the constraints that govern the actions of the user, we are better able to create environments that can interpret and assist their inhabitants.


Christopher R. Wren, wren@media.mit.edu
Last modified: Thu Oct 9 13:54:58 EDT 1997