SIGGRAPH Asia 2009

BiDi Screen

A Thin, Depth-Sensing LCD for 3D Interaction using Light Fields

Matthew Hirsch1      Douglas Lanman2      Henry Holtzman1      Ramesh Raskar1
1MIT Media Lab         2Brown University

FAQ

Section 1: BiDi Screen Concept

Q: What is the BiDi Screen?

Q: How does it work? How does it sense 3D information from a thin display?

Q: How is this different from Surface, SecondLight, Gesturetek, Perceptive Pixel and xyz?

Q: Will this replace webcams?

Q: Will this replace flat-bed scanners?

Q: What are some other applications of this technology?

Q: What are the limitations?

Q: What are the future directions?

Q: Isn't this the same as something Apple did?

Section 2: About the current prototype

Q: How did you implement the light-sensing layer of the BiDi Screen?

Q: Isn't this the same as some other projects that use cameras in the back?

Q: What are the parameters of the current prototype?

Q: What components did you use for the prototype?

Q: Why isn't the BiDi Screen prototype as thin as the LCD panel it uses?

Q: How does the tiled-MURA mask used in the BiDi screen produce equivalent imagery to a pinhole array?

Q: How does the BiDi screen estimate depth?

Q: Is flicker noticeable in the BiDi screen prototype?

Q: Is there noticeable lag between movement and display update?

Q: What type of TIE fighter is featured in the videos?

Q: Does the BiDi Screen herald a dystopic Orwellian future, in which sinister government bureaucrats monitor our every move, and personal privacy is a distant memory?

Section 1: BiDi Screen Concept

Q: What is the BiDi Screen?
A: The BiDi Screen is an example of a new type of I/O device that can both capture images and display them. This thin, bidirectional screen extends the latest trend in LCD devices, which has seen the incorporation of photodiodes into every display pixel. Using a novel optical masking technique developed at the Media Lab, the BiDi Screen can capture light-field-like quantities, unlocking a wide array of applications, from 3-D gesture interaction with CE devices to seamless video communication.

Q: How does it work? How does it sense 3D information from a thin display?
A: The BiDi Screen uses a sensor layer, separated by a small distance from a normal LCD display. A mask image is displayed on the LCD. When the bare sensor layer views the world through the mask, information about the distance to objects in front of the screen can be captured and decoded by a computer.
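To make the sensing principle concrete, here is a minimal sketch of the geometry behind mask-based capture: a sensor pixel offset behind an opening in the mask records light arriving from a specific angle, which is what lets depth be decoded later. The separation and offsets below are hypothetical illustration values, not the prototype's actual dimensions.

```python
import math

# Toy model: an opening in the LCD mask, with a bare sensor a small
# distance behind it. A sensor pixel offset x behind the opening records
# light arriving at angle theta = atan(x / separation), so the bare
# sensor samples both position and angle.
# (Hypothetical numbers, not the prototype's geometry.)
separation_mm = 2.5
pixel_offsets_mm = [-1.0, 0.0, 1.0]

angles_deg = [math.degrees(math.atan2(x, separation_mm)) for x in pixel_offsets_mm]
for x, a in zip(pixel_offsets_mm, angles_deg):
    print(f"pixel offset {x:+.1f} mm -> ray angle {a:+.1f} deg")
```

Each pixel behind an opening thus acts like one angular sample; collecting these samples across the whole screen is what yields the depth-bearing data described above.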

Q: How is this different from Surface, SecondLight, Gesturetek, Perceptive Pixel and xyz?
A: The BiDi Screen shares many attributes with projects that seek to capture gesture information. The difference is that the BiDi Screen can be implemented without cameras, projectors, or lenses, allowing it to be as thin as existing LCD screens.

Q: Will this replace webcams?
A: It is possible to use the BiDi Screen technology to capture imagery similar to what might be captured from a webcam. Our current prototype does not capture sufficiently high-resolution images to explore this option yet, but future implementations could improve on the video-chat experience of a webcam. Since the camera is the display, the BiDi Screen will allow both parties to make eye contact.

Q: Will this replace flat-bed scanners?
A: Optical multi-touch displays, which use sensors embedded in an LCD matrix, can already replace the function of flat-bed scanners. The BiDi Screen is a similar, but slightly different configuration, and some additional technical challenges would need to be met for this to happen.

Q: What are some other applications of this technology?
A: The BiDi Screen technology can be used for gestural interaction, but it can also render a scene that responds to real-world lighting, track a user's head in front of the display for parallax display, and support other applications that make use of a depth map.

Q: What are the limitations?
A: Since the BiDi Screen uses an optical technique to capture information about the world, it relies on scene lighting. Without sufficient lighting the technology cannot capture depth. The distance over which the BiDi Screen can capture depth information varies roughly in proportion to the separation of the sensor and LCD components of the BiDi screen. This means that a very thin BiDi Screen would not be sensitive to objects far from the display.
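The proportionality between working range and sensor-LCD separation can be illustrated with a back-of-envelope parallax argument. With mask-opening pitch (baseline) b, sensor pixel pitch p, and mask-sensor separation d, the smallest resolvable angular step is roughly p/d, while an object at depth z shifts by an angle of about b/z between views under adjacent openings, giving a maximum measurable depth of roughly b*d/p, which grows in proportion to d. All numbers below are hypothetical, not the prototype's.

```python
# Back-of-envelope depth range for a mask-based depth sensor:
#   smallest resolvable angular step ~ pixel_pitch / separation
#   parallax angle of an object at depth z ~ baseline / z
# Parallax is measurable while baseline/z >= pixel_pitch/separation,
# so z_max ~ baseline * separation / pixel_pitch -- proportional to
# the separation, as stated above. (Hypothetical numbers.)
def max_depth_mm(baseline_mm, separation_mm, pixel_pitch_mm):
    return baseline_mm * separation_mm / pixel_pitch_mm

print(max_depth_mm(baseline_mm=5.0, separation_mm=2.5, pixel_pitch_mm=0.05))  # -> 250.0
print(max_depth_mm(baseline_mm=5.0, separation_mm=5.0, pixel_pitch_mm=0.05))  # -> 500.0
```

Doubling the separation doubles the estimated range, which is why a very thin BiDi Screen would only be sensitive to objects near the display.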

Q: What are the future directions?
A: The BiDi Screen uses the LCD component to create an optical mask to image the world. Right now, the mask used is always the same. We are beginning to explore modifying the mask to better suit the scene being imaged.

Q: Isn't this the same as something Apple did?
A: No. Apple Computer has a patent application describing a few techniques for creating a display that can image objects. Unlike the BiDi Screen, which uses a single large sensor behind an LCD and no lenses, the display proposed by Apple would tile cameras, each using a tiny lens or lens section, within an LCD. One reason you don't see such a product on the market today is that tiling cameras and lenses behind or within an LCD will almost certainly create visual artifacts on the display, either by interfering with the display's backlight or by increasing the separation between display pixels.

Another key difference between the work described in the Apple patent application and the BiDi Screen is that the BiDi Screen is used as a gestural interface, while the Apple display is used only to capture conventional images. While it is possible to configure a camera array to capture the same type of light-field data as the BiDi Screen, it is not proposed in the patent application from Apple, and it appears they have not configured their display to make this possible.

Section 2: About the current prototype

Q: How did you implement the light-sensing layer of the BiDi Screen?
A: The BiDi Screen is inspired in part by optical multi-touch displays. This new type of multi-touch device embeds a sensor in every pixel of an LCD. Since large-area sensors such as these were not available at the time of this work, we used a diffuser-and-camera system to simulate one. Much like a movie screen shows us an image at a particular place in space, the diffuser shows the camera, optically, what we'd like to capture electronically with a sensor.

Q: Isn't this the same as some other projects that use cameras in the back?
A: No. Other projects use cameras to directly image objects of interest. These projects could never function without those cameras. We use the cameras only as a means of simulating a large-area bare sensor.

Q: What are the parameters of the current prototype?
A: The current prototype captures data at approximately 10fps. The display updates at approximately 30fps. Objects can be measured up to 50cm from the screen. The screen captures a light field that has 19x19 angular resolution and approximately 80x100 spatial resolution.

Q: What components did you use for the prototype?
A: The BiDi screen uses an off-the-shelf 20.1" 1680x1050 LCD. Our first prototype used a Sceptre display, and our second uses a faster LG display. The sensor is composed of two Point Grey Flea2 cameras, and the diffuser from the LCD backlight. The processing for the screen is done on an 8-core Intel Xeon.

Q: Why isn't the BiDi Screen prototype as thin as the LCD panel it uses?
A: The BiDi screen is motivated in part by upcoming optical multi-touch technology, which embeds optical sensing elements in every pixel of an LCD display. This technology is not yet available for us to work with, so we've found another way to test our ideas. We use a camera and diffuser (much like a movie screen) to simulate the large-area sensor that will be available in optical multi-touch LCDs, measuring optically what we would like to measure electronically. Unfortunately, this requires a large separation between the camera and the LCD of the BiDi screen.

Q: How does the tiled-MURA mask used in the BiDi screen produce equivalent imagery to a pinhole array?
A: We use a technique demonstrated earlier by the Camera Culture group at the MIT Media Lab called Spatial Heterodyning. Spatial Heterodyning is a general, mask-based technique for measuring a Light Field that relies on amplitude modulation, much like AM radio. Shifted copies of the light field frequency spectrum are produced on different parts of the BiDi screen sensor, allowing us to reconstruct the original, band-limited light field that enters the BiDi screen.
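The AM-radio analogy can be shown in one dimension with a toy example: multiplying a band-limited signal by a periodic, non-negative mask creates shifted copies of its spectrum at the mask's carrier frequency, which can then be separated and demodulated. This is an illustration of the amplitude-modulation idea only, not the actual tiled-MURA processing; all frequencies are arbitrary.

```python
import numpy as np

# 1-D illustration of spatial heterodyning's AM principle:
# signal content sits at frequency bin 4; a raised-cosine "mask" at
# carrier bin 32 produces copies of that content at 32 +/- 4.
n = 256
t = np.arange(n)
signal = np.cos(2 * np.pi * 4 * t / n)                       # band-limited content
carrier_freq = 32
mask = 0.5 * (1 + np.cos(2 * np.pi * carrier_freq * t / n))  # non-negative mask

spectrum = np.abs(np.fft.rfft(signal * mask))
peaks = np.argsort(spectrum)[::-1][:3]   # three strongest frequency bins
print(sorted(int(p) for p in peaks))     # -> [4, 28, 36]
```

The baseband copy (bin 4) and the two shifted copies (bins 28 and 36) are the 1-D analogue of the shifted light-field spectrum copies that land on different parts of the BiDi screen sensor.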

Q: How does the BiDi screen estimate depth?
A: The BiDi screen measures the position and incident angle of each of the rays striking the surface of the screen. This quantity is known as a Light Field. For each frame of captured data we use a technique known as synthetic aperture refocusing to generate a stack of images focused at different distances in front of the screen. For an output pixel in our depth map, we traverse a column in this stack of images, looking for the pixel with the highest contrast. We assign the depth value in the depth map to the location where the image containing the pixel with the highest contrast was refocused. This technique is generally known as depth from focus.
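The steps above (refocus into a stack, then pick the highest-contrast slice per pixel) can be sketched as follows, using local variance as a simple contrast measure. This is an illustrative toy, assuming the focal stack has already been produced by synthetic aperture refocusing; it is not the prototype's pipeline.

```python
import numpy as np

# Depth from focus over a focal stack: for each output pixel, pick the
# stack slice with the highest local contrast (variance in a small
# window) and report that slice's focus distance.
def depth_from_focus(stack, depths, win=3):
    """stack: (num_slices, H, W) refocused images; depths: focus distance per slice."""
    contrast = np.empty_like(stack)
    pad = win // 2
    for i, img in enumerate(stack):
        padded = np.pad(img, pad, mode='edge')
        windows = np.lib.stride_tricks.sliding_window_view(padded, (win, win))
        contrast[i] = windows.var(axis=(-1, -2))   # local variance = contrast
    best = contrast.argmax(axis=0)                 # slice index of max contrast
    return np.asarray(depths)[best]                # per-pixel depth map

# Tiny demo: slice 1 is "sharp" (a checkerboard), slices 0 and 2 are flat,
# so every pixel should map to slice 1's focus distance.
stack = np.zeros((3, 8, 8))
stack[1] = np.indices((8, 8)).sum(axis=0) % 2
depth_map = depth_from_focus(stack, depths=[10.0, 25.0, 50.0])
print(np.unique(depth_map))  # -> [25.]
```

Real focal stacks need a more robust contrast measure and smoothing, but the argmax-over-slices structure is the core of depth from focus.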

Q: Is flicker noticeable in the BiDi screen prototype?
A: Yes. The BiDi screen does not currently switch faster than the flicker-fusion rate of human vision. This is due to the update rate of current off-the-shelf LCD panels, which limit our refresh rate to 30Hz: the current BiDi screen prototype requires the entire LCD panel to display a tiled-MURA code for a few milliseconds, so that the BiDi screen sensor can capture data uniformly modulated by the code. Current high-end 60Hz LCDs take about 1/60th of a second to redraw the screen (see a 1000fps video of a high-end LCD here). This redraw behavior can be controlled in the manufacturing process, and is a non-technical limitation based on current market demand.

Q: Is there noticeable lag between movement and display update?
A: Some users will notice lag when using the BiDi screen prototype. This is because our processing pipeline currently runs at about 10fps. Our pipeline runs on a general purpose CPU, and its performance can be significantly increased with further software optimization, GPU processing, or special purpose hardware. The algorithms we use are highly parallel.

Q: What type of TIE fighter is featured in the videos?
A: There are two types of TIE fighters in the videos. The one you're asking about is Darth Vader's personal model.

Q: Does the BiDi Screen herald a dystopic Orwellian future, in which sinister government bureaucrats monitor our every move, and personal privacy is a distant memory?
A: No. Cameras already exist. Screens already exist. Cameras in screens already exist and are pervasive. Every tool can be abused. The value is in how we use them.