Interactions Can be Recorded
Many activities in the playhouse can
be recorded on standard VHS tape for later viewing by the child or to be
used as keepsakes by their parents or given as gifts to friends and relatives.
Example activities which may be recorded include the child's interactions
with virtual characters, the child telling stories in the virtual world,
or the child singing songs (e.g., "Happy Birthday") accompanied by virtual
Wide Range of Activities
Since the playhouse characters are intended
to be used as long-term, live-in playmates for a child, it is important
that the system provide a wide range of open-ended, non-repetitive activities.
In addition to the "companion" function (which is intended to interact
with a child playing with objects within the house), other possible activities
include character-based story-telling (using the electronic book metaphor),
sing-alongs, and directed-character games.
To keep the system novel and age-appropriate
for a child, to avoid technical obsolescence, and to provide a steady stream
of income for the inventors, the system must be extensible via add-on software
or physical toys that the characters can interact with.
Some of the activities, especially add-ons,
should have an educational aspect to them to increase the appeal of the
system to the target market. In addition to conventional educational game
software, educational cartridges could include "companion-to-child" discussions
and activities on topics such as toilet training, tidiness, scholastic
achievement, and other themes important to parents.
The target age range for users (3-8)
imposes several constraints on the design of the system:
The physical size of the house and the
ergonomics of the camera and display must accommodate a normal-sized child
within the specified age range.
The vision processing software (for
location and gesture recognition) must accommodate this size range of children.
The system must not assume literacy.
Thus, all system prompts must be either verbal requests by a character,
or clearly understandable icons which can be gestured at.
The speech recognition system must work
for children in the specified age range. This implies that the system should
not expect fully grammatical sentences as input; the system should rely
either on keyword spotting or terse or elliptical answers to direct questions
(e.g., "yes", "no").
The house door and VCR deck must be
workable by children in the target age range.
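The keyword-spotting constraint above can be illustrated with a minimal sketch. It assumes the speech recognizer has already produced a text transcript; the function name `spot_answer` and the keyword sets are hypothetical, since the design only gives "yes" and "no" as example answers.

```python
import re

# Hypothetical keyword sets; the design only names "yes" and "no" as examples.
YES_WORDS = {"yes", "yeah", "yep", "okay", "ok"}
NO_WORDS = {"no", "nope", "nah"}


def spot_answer(transcript):
    """Return "yes", "no", or None for a loosely worded reply.

    Only isolated keywords are inspected, so ungrammatical or partial
    sentences are acceptable input.
    """
    words = re.findall(r"[a-z']+", transcript.lower())
    saw_yes = any(w in YES_WORDS for w in words)
    saw_no = any(w in NO_WORDS for w in words)
    if saw_yes == saw_no:
        # Neither keyword, or both: treat as unclear so the character can re-ask.
        return None
    return "yes" if saw_yes else "no"
```

Returning `None` for unclear input leaves it to the character to repeat the question, which fits the requirement that prompts be verbal requests rather than text.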
The schematics below show the
basic design and installation of the Magic PlayHouse. The dimensions of
the house are 44" x 41" x 56" tall. This standard playhouse size
ensures a 3-to-8-year-old child will feel comfortable playing in the house.
The outside look and the internal decoration of the house will have the
same style as Pooh's house in the Disney movies. Normal playhouses
have windows on each wall. However, to get better control of lighting for
vision processing, we have included only one small window in one
wall of the house.
The sensing and computing devices
are invisible to the child. Most of them are installed in the 6-inch-wide
electronics bay behind the screen wall. The major components of the Magic
Custom processing boards will be designed
to perform central control and computing tasks. The two major chips in
the circuits are a powerful CPU with DSP capability and a graphics chip
with fast 3D rendering capability. Moreover, to support recording and replaying
of interactions, we need to integrate A/D and D/A converters. A 1-gigabyte hard
disk is used to store digitized video and other control information.
LCD display inlaid in the wall, decorated
as a mirror. Active matrix LCD displays are currently available in several
sizes. However, because of cost, we chose to use two small screens
(12 inches) tiled together.
Speakers installed below the LCD screen.
We use two stereo speakers with a subwoofer to produce positional sound effects.
VCR deck installed below the LCD screen
and between the stereo speakers. This VCR player is used to record and
replay the interaction experiences of the child. The deck has no external
controls; it is completely driven by the same processor that is running
the interactive experience, allowing the virtual characters to "control"
it as appropriate. VCR tapes will also be used as an upgrade mechanism
for the system software.
Cameras: the main sensing devices. There
are two cameras installed in the house. One color QuickCam is hidden between
the two LCD panels. It is used to perceive the gestures and presence of
the child and to capture video input. A second, grey-scale QuickCam is installed
on the ceiling and is used to get position information about the child.
Microphone hidden in the wall, above
the LCD screen. It is used to get speech input and record sound in the
Light: two major light bulbs are installed
in the screen wall and the ceiling respectively to improve the lighting
conditions for the computer vision program.
Penny Tag Receiver: Penny tags provide
a mechanism for tagging any physical object with a low-priced, miniature
tag (the size of a small piece of wire) which enables the object to be
identified and located in space when brought near the receiver (Fletcher
et al., 1996). Tags will be used to identify when a child brings a toy into
the playhouse so that the virtual characters can provide appropriate comments
We present several specific scenarios
for children's interaction within the Magic Playhouse environment. In the
following sections, we describe possible interactions, necessary perceptions
and the display design for four typical scenarios -- companion Pooh, child
in virtual world, game playing, and interactive storytelling. In addition
to these activities, the PlayHouse VCR can also be used by the child to
simply watch videotaped movies.
The system will be the child's companion or playmate most of the time.
In the morning, the Pooh bear will wake the child up and direct them through
getting dressed, brushing their teeth, and doing morning exercises. In
the evening, Pooh tells stories, sings lullabies, etc. to "wind down"
the child before bedtime. During playtime, the Pooh bear "notices" when
the child comes into the house. After a short greeting conversation, Pooh
leads the child into other activities. If the child would rather play
alone (e.g., having a tea party), Pooh will just sit there,
watch the child, and from time to time invite the child to play a game or
hear a story. Pooh also has "memories" of the child's name, habits,
favorite games, stories already heard, etc., so that it can prompt appropriately.
In the morning exercises, Pooh teaches
the child some simple exercises, then leads them. The vision system
observes the child and offers praise when the child does well or gives
further instruction when the child does not.
In this scenario, the Pooh bear is
life size (22 inches tall). It occupies most of the screen, standing
or sitting in the virtual world hugging a hunny pot. Its head and eyes
follow changes in the child's position, as if it is looking at the
The vision system uses both cameras
to detect the child entering and leaving the house. The ceiling camera
keeps track of the child's position, while the front camera is used to
perform simple gesture recognition. The conversation in this interactive
scenario is highly constrained. Most conversations are planned in advance and
initiated by the Pooh bear, so that the system can use a template-based
speech recognition system to get responses from the child. By recognizing
gestures, the Pooh bear can also handle turn-taking behaviors properly.
The system will use Penny Tags to
enable the virtual characters to notice tagged toys that the child brings
into the playhouse. The child can tag other toys and tell the Pooh bear
their names. Pooh can then notice when the child brings them into the house
to play and make appropriate remarks about the toys. Alternatively, a line
of pre-tagged Pooh toys could be sold (e.g., most of the Disney-brand merchandise)
that the system would already know about. In this way, the child's interaction
within the house is even more personalized.
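A minimal sketch of the tag-to-toy lookup described above, assuming the Penny Tag receiver reports a string identifier for each detected tag. The class `ToyRegistry`, the tag IDs, and the spoken line are all hypothetical; the design does not specify the receiver's interface.

```python
class ToyRegistry:
    """Maps tag IDs to toy names so a character can comment on known toys."""

    def __init__(self, pretagged=None):
        # Pre-tagged toys the system would "already know about" (IDs are made up).
        self._names = dict(pretagged or {})

    def register(self, tag_id, name):
        """Record the name the child gives to a newly tagged toy."""
        self._names[tag_id] = name

    def on_tag_detected(self, tag_id):
        """Return a line for the character to say, or None for unknown tags."""
        name = self._names.get(tag_id)
        if name is None:
            return None
        return f"Oh, you brought {name}!"
```

For example, a registry created with `ToyRegistry({"tag-001": "Tigger"})` covers the pre-tagged product line, and `register("tag-917", "Mr. Bunny")` covers a toy the child tags and names later.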
Child in Virtual World
At the beginning of this interaction scenario, the LCD display acts as a
magic mirror. It displays exactly the same room as the inside of the playhouse,
except that everything in the display is drawn in cartoon style. The child's image
is projected into the corresponding place in the cartoon world. The Pooh bear is
aware of the child in the virtual world. It stands or sits beside the child.
When the child moves, Pooh moves accordingly. The child can tell stories
to Pooh, sing songs along with Pooh, and do some simple ALIVE-like
interactions in the virtual world.
The Pooh bear in this scenario looks
smaller than it does in the companion scenario. As shown in the picture,
it is the same size as the child in the virtual world.
The whole interactive process can
be saved and replayed on Pooh's virtual TV. The child can also make a video
tape of this experience, so that they can send the tape to friends or relatives
as a gift.
The vision system keeps track of
the child's position, so that the child can be mapped to a proper place
in the virtual world. There is also speech recognition in this scenario.
The front camera is mainly used to take video input of the child.
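One simple way to place the tracked child in the virtual room is a linear mapping from floor coordinates to virtual-room coordinates. The sketch below assumes the ceiling camera yields a floor position in inches with the origin in one corner, and it assumes which side of the 44" x 41" floor counts as width; the function `floor_to_virtual` is hypothetical.

```python
# Floor dimensions from the design section; which side is "width" is an assumption.
FLOOR_WIDTH_IN = 44.0
FLOOR_DEPTH_IN = 41.0


def floor_to_virtual(x_in, y_in, room_width=1.0, room_depth=1.0):
    """Linearly map a floor position in inches to virtual-room coordinates."""
    # Clamp so tracking noise near the walls cannot place the child outside the room.
    x = min(max(x_in, 0.0), FLOOR_WIDTH_IN)
    y = min(max(y_in, 0.0), FLOOR_DEPTH_IN)
    return (x / FLOOR_WIDTH_IN * room_width, y / FLOOR_DEPTH_IN * room_depth)
```

A child standing in the middle of the floor, at (22.0, 20.5) inches, maps to the middle of the virtual room.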
The games designed for the Magic
Playhouse are mostly role-playing and exploration games, in which the child
is in control of directable characters. The look-and-feel of the display
is changed to signify the change in mode, with the characters appearing
much smaller and more caricatured than they do in other modes. The games
are also closely tied to the Pooh stories. For example, the "Blustery
Day" story can support the following kinds of games:
Finding hunny for Pooh (Role playing
Finding house for Owl (Role playing
Fighting with flood (steering, Role playing Pooh, Piglet)
Pass rescue message to Pooh and Piglet (Role playing Owl)
The look and feel of the display
differs from game to game. For example, in the "Finding house
for Owl" game, the display will be an explorable Hundred Acre Wood. All
characters except the one the child is playing are autonomous agents. Instead
of using a joystick or a stuffed animal to control the character, the child
becomes the control device. The child can navigate
through the woods using simple left/right gestures, observing other characters'
activities, or approaching a character for hints about direction.
Template-based speech recognition
is used at the beginning of the game to obtain information about which
game to play and which character the child wants to be. During actual game
play, gesture recognition is used frequently. The ceiling camera continues
to track the child's position. Here we employ the idea of intentional
control instead of direct mapping of input and output to obtain better
At the beginning of this scenario,
the Pooh bear looks the same as in the companion mode. As it begins to tell
a story, the Pooh bear image fades away and a story book appears with
pictures in it. The pictures gradually fill the whole screen, which becomes
the Hundred Acre Wood scene.
The story proceeds from scene to
scene. The child can explore each scene following the narrative. However,
the theme and development of the story cannot be changed. For example,
at a certain point in the story, the Pooh bear will ask the child, "Do
you want to wander around this place?" If the answer is "no", the story
continues. Otherwise, the scene becomes a limited explorable world, within
which the child can use simple gestures to look around. When the child
says "continue the story, Pooh", or after a certain period, the story continues.
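The scene flow described above (narrate, ask, optionally explore, then resume on request or after a timeout) can be sketched as a small state machine. The class `StoryScene`, its state names, and the 60-second default are hypothetical; the text specifies only "a certain period".

```python
class StoryScene:
    """Tracks whether a scene is narrating, asking, or being explored."""

    def __init__(self, explore_limit_s=60.0):
        # explore_limit_s is a made-up default; the text says only "a certain period".
        self.explore_limit_s = explore_limit_s
        self.state = "narrating"
        self._explore_started = None

    def ask_to_wander(self):
        """Pooh asks whether the child wants to wander around."""
        self.state = "asking"

    def on_answer(self, answer, now):
        """Handle the child's reply; anything other than "no" starts exploration."""
        if self.state != "asking":
            return
        if answer == "no":
            self.state = "narrating"
        else:
            self.state = "exploring"
            self._explore_started = now

    def on_phrase(self, phrase):
        """Resume the story when the child asks for it during exploration."""
        if self.state == "exploring" and "continue the story" in phrase.lower():
            self.state = "narrating"

    def tick(self, now):
        """Resume the story once the exploration period has elapsed."""
        if self.state == "exploring" and now - self._explore_started >= self.explore_limit_s:
            self.state = "narrating"
```

Because every path returns to `"narrating"`, the child can explore a scene without being able to change the theme or development of the story.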
Depending on the contents of the
stories, there can be other types of interactions. For example, in
the "Blustery Day" story, Pooh's voice may ask the child, "Mary, can
you help Piglet to get to Christopher Robin's house safely?" The child
can use steering gestures to avoid stones or waterfalls in the river and
stay on course.
The perception task in this interaction
scenario is the same as that in the game playing scenario.
For a commercial product, the two most
important technical risk drivers are cost and reliability. We estimate
the manufacturing cost of each part of the house as well as the total cost
in the next section. In this section, we discuss the reliability problem
from a technical point of view.
While the vision and speech recognition
components of the playhouse contribute significantly to the natural interaction
via non-invasive perception techniques, they represent relatively
new and unreliable technologies. Since a commercial product must
work for all children in almost all situations, the risks imposed by this
unreliability must be mitigated.
To maximize the reliability of the vision
recognition system, we plan to control the lighting in the house and ensure
that only one child at a time uses the house when the vision system is
in use. This will be achieved by first detecting when more than one child
is in the house (using the overhead camera) and then having Pooh or one
of the other characters request that only one child participate in the
activity at a time. Whenever possible, redundant control mechanisms will
be used (e.g., vision and speech) to increase the overall reliability of
To obtain acceptable recognition results,
many current speech-recognition technologies require a series of training
sessions which last about half an hour. This process is boring for a child
aged 3-8. Also, a general-purpose speech recognition engine requires a
list of valid sentences that can be spoken and a large dictionary. Such
a list can run to thousands or even a million entries, which slows
recognition and increases the likelihood of recognition errors.
In the Magic Playhouse,
there could be some simple conversations between the Pooh bear and the child.
All the conversations should be controlled, i.e., the Pooh bear takes the
initiative in the conversation most of the time and expects certain
kinds of responses from the child. We can implement grammar-based speech
recognition on top of a popular speech recognition system such as IBM ViaVoice.
By applying a context-free grammar to the speech recognition engine (Microsoft),
we can use rules that predict the next words that might possibly follow
the word just spoken, reducing the number of candidates to evaluate in
order to recognize the next word. The advantages of this method are: a)
no need to train; b) high accuracy; c) high speed; d) few resources needed
to run; e) understanding tags can be added so that the system can "understand"
what the user says from the output of the speech recognition system. The
disadvantages include: a) the number and pattern of sentences that can be
spoken are limited; b) it may raise the child's expectations about the power of the
house. Therefore, there must be some trade-off among the number of grammar
rules, control, and running speed.
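The idea of using grammar rules to limit the candidates for the next word can be shown with a toy example. This is an illustrative sketch only: it does not use the Microsoft Speech API or IBM ViaVoice, and the grammar, the angle-bracket notation, and the function names are hypothetical. The grammar is deliberately non-recursive so that it can be fully enumerated.

```python
from itertools import product

# Hypothetical toy grammar; nonterminals are written in angle brackets.
GRAMMAR = {
    "<reply>": [["<answer>"], ["<answer>", "pooh"], ["continue", "the", "story", "pooh"]],
    "<answer>": [["yes"], ["no"]],
}


def expand(symbol):
    """Return every word sequence the symbol can produce (grammar must be non-recursive)."""
    if symbol not in GRAMMAR:
        return [[symbol]]  # a terminal word produces itself
    sentences = []
    for rule in GRAMMAR[symbol]:
        parts = [expand(s) for s in rule]
        for combo in product(*parts):
            sentences.append([w for part in combo for w in part])
    return sentences


def next_words(prefix, start="<reply>"):
    """Words the grammar allows immediately after the words heard so far."""
    n = len(prefix)
    return {s[n] for s in expand(start) if s[:n] == prefix and len(s) > n}
```

After the words "continue the", the only candidate the recognizer would need to score is "story", which is the reduction in candidates the paragraph describes. The same example also shows the stated disadvantage: any reply outside the grammar yields no candidates at all.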
In the table below, we give the estimated
manufacturing cost of Pooh's Magic PlayHouse. All costs listed are
estimated mass-manufacturing costs (about 50-70% of the retail price). The
total manufacturing cost is approximately $1,380. This may vary depending
on the specific parts selected.
molded hunny pots and furniture
Two 12" LCD displays
(two stereo with subwoofer)
with 3D acceleration & video capturing & TV in/out
reading equipment (VCR)
Penny Tag Receiver
The LCD display is the major cost
driver. Currently, there are three major sizes of LCD display available
on the market: 12"~13", 15", and 20". They cost $300, $1,000 and $5,400
respectively. Price does not scale linearly with size. We chose to use two
12" displays, which together are still cheaper than one 15" display. However, even
with this option, the cost of the LCD displays is almost half of the total.
We expect to see a 20" product for less than $500 in two years.
Due to the low-volume / high-end
market positioning, the whole playhouse could retail for $2,600 - $3,500.
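The cost figures above can be cross-checked with a few lines of arithmetic. The sketch assumes the quoted $300 is the price of a single 12" panel (so two panels cost $600) and takes the statement that manufacturing cost is about 50-70% of retail at face value; the function name is hypothetical.

```python
MANUFACTURING_COST = 1380.0  # total estimate from the text, in dollars
LCD_COST = 2 * 300.0         # two 12" panels, assuming $300 is the per-panel price


def implied_retail_range(cost, low_share=0.50, high_share=0.70):
    """Retail prices at which `cost` is between low_share and high_share of retail."""
    return (cost / high_share, cost / low_share)


low, high = implied_retail_range(MANUFACTURING_COST)
lcd_share = LCD_COST / MANUFACTURING_COST
```

Under these assumptions the implied retail price is roughly $1,970 to $2,760, so the quoted retail range starts near the top of that band and extends above it, in keeping with the high-end positioning. The two panels come to about 43% of the manufacturing cost, which matches the "almost half" figure.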
The target market for Pooh's Magic PlayHouse
is upper-middle to upper-class American households with a single child.
We feel that this market is especially well-suited for the PlayHouse, since
we can play on parents' guilt of not spending enough time with their child,
and on their desire for discipline and education. Households with
well-educated parents (which will likely have a large overlap with the
previously described market) are also a likely target, since educated parents
make up a large share of the educational toy market.
Market channels for the PlayHouse
will likely include high-end toy stores such as FAO Schwarz, due to the
anticipated pricing of the product. FAO's on-line catalog contains toys
priced up to $40,000.00 (for a gasoline powered Lamborghini), so a $2,500
playhouse should fit into their catalog without any problem.
We do not see any direct competition
for the PlayHouse currently in the market. Barriers to competitive entry include
market lead time and one or more patents on the entire PlayHouse system.
The closest competitors are:
My Interactive Pooh (Mattel Media) -
This is an animated, stuffed Pooh bear which interacts with software on
a desktop PC. The bear can be personalized with the child's name and a
few other pieces of information, and plays games, sings songs and tells
stories. Perceptual input is limited to a pressure switch in one of Pooh's
paws, and the mouse and keyboard for PC-based games. The bear can be detached
from the PC for limited interactions away from the computer. Recommended
age range is 2+ years; retail pricing is currently $99. The major advantage
this product has over the playhouse is portability, although its functionality
and perception are extremely limited, and there are no video-based activities
as provided in the playhouse. Other products in this general category are
Barney ActiMates (Microsoft; recommended age range 2-5 years, retail
pricing $99) and Furbies (recommended age range 5+ years, pricing $35).
Interactive CDROMs (various manufacturers,
recommended ages 3+, pricing $25 and up) - Relative to Pooh's Magic PlayHouse
these lack the tangible aspects of the interaction, and do not provide
any of the video functions.
Color television with VCR deck (ages
3+, pricing $400 and up) - Purely passive interaction style and lack any
Conventional play houses (Little Tikes
and other manufacturers, 1-1/2 yrs and up, pricing $130-$250 plastic indoor
models, up to $5,000 for outdoor wooden models) - Rely entirely on children's
unreliable imagination for interaction with virtual characters, no
Adopted siblings (various domestic and
import sources; recommended age range 0-adult, pricing $10,000 and up)
- disadvantages include long procurement lead time and high maintenance
cost. No built-in video functions.
Pooh's Magic PlayHouse is a technically and commercially viable interactive
environment which provides a child with a virtual playmate in addition
to an extensible range of activities through add-on software cartridges
and an accompanying line of Pooh toys. The novelty of the system lies in
the combination of a physical playspace with embedded multimedia capabilities,
the use of a virtual Pooh bear as a child's live-in companion, and the
ability of the system to react to the child's presence and physical activities
within the playspace using non-invasive sensing technologies.
Blumberg, B. and T. Galyean, "Multi-level
Direction of Autonomous Creatures for Real-Time Virtual Environments,"
In: Proceedings of SIGGRAPH 95.
Bobick, A., S. Intille, J. Davis, F.
Baird, C. Pinhanez, L. Campbell, Y. Ivanov, A. Schutte, A. Wilson, "The
KidsRoom: A Perceptually-based Interactive and Immersive Story
Fletcher, R., J. Levitan, J. Rosenberg,
and N. Gershenfeld, "Application of Smart Materials to Wireless ID Tags
and Remote Sensors," presented at the Materials Research Society Fall Meeting,
Microsoft Speech API References: http://microsoft.com/iit/onlinedocs/