In this section we have described a vision-steered microphone array for use in a virtual environment that supports full-body interaction without body-mounted sensing equipment. A preliminary implementation for the ALIVE space has shown that the system performs well for constrained grammars of 10-20 commands. The system illustrates the advantages of cross-modal integration of sensory input: steering the array from the vision system's estimate of the user's position combines the desirable properties of a fixed array with the steerability of an adaptive one.
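
To make the steering mechanism concrete, the following is a minimal delay-and-sum sketch (not the ALIVE implementation; the array geometry, sample rate, and function names are illustrative assumptions). The vision system's position estimate supplies per-microphone delays that time-align the user's speech across a fixed array, giving the steerability described above without any body-mounted hardware.

    import numpy as np

    SPEED_OF_SOUND = 343.0  # m/s; nominal room-temperature value
    FS = 16000              # sample rate in Hz (illustrative)

    def steering_delays(mic_positions, source_position):
        """Per-microphone delays (s) that time-align a near-field source
        at source_position across the array (spherical-wave model)."""
        dists = np.linalg.norm(mic_positions - source_position, axis=1)
        # Reference the closest microphone so all delays are non-negative.
        return (dists - dists.min()) / SPEED_OF_SOUND

    def delay_and_sum(frames, mic_positions, source_position):
        """Steer a fixed array toward a vision-tracked source position.
        frames: (n_mics, n_samples) array of synchronized mic signals."""
        delays = steering_delays(mic_positions, source_position)
        n_mics, n_samples = frames.shape
        out = np.zeros(n_samples)
        for ch in range(n_mics):
            # Integer-sample shift for simplicity; a real system would
            # use fractional-delay interpolation.
            shift = int(round(delays[ch] * FS))
            # Advance the later-arriving (farther) channels to align them.
            out[: n_samples - shift] += frames[ch, shift:]
        return out / n_mics

    # Example: steer a 4-mic linear array toward a user position as
    # reported by the vision system (coordinates are hypothetical).
    mics = np.array([[x, 0.0, 0.0] for x in (0.0, 0.2, 0.4, 0.6)])
    user = np.array([1.5, 0.0, 2.0])
    enhanced = delay_and_sum(np.random.randn(4, FS), mics, user)

Because the steering delays are recomputed from each new vision estimate rather than adapted from the audio itself, the array retains the predictable spatial response of a fixed beamformer while still following the user.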