Theses/Proposals

Sumit Basu. Conversational Scene Analysis. Ph.D. Thesis. MIT Department of EECS. September, 2002. (PDF) Defense (slides)

Sumit Basu. A Three-Dimensional Model of Human Lip Motions. M.Eng. Thesis, MIT Department of EECS. 1997. (PDF)

Sumit Basu. Hyperacuity Sensing for Image Processing. S.B. Thesis, MIT Department of EECS. 1995. (PDF)

Peer-Reviewed Journal Articles

Sumit Basu, Nuria Oliver, and Alex Pentland. "3D Lip Shapes from Video: A Combined Physical-Statistical Model." Speech Communication 26, 1998. pp. 131-148. (PDF)

Note: I am currently in the process of preparing two journal articles based on my thesis work.

Peer-Reviewed Conference/Workshop Papers

Nebojsa Jojic, Sumit Basu, Nemanja Petrovic, Brendan Frey, and Thomas Huang. "Joint Design of Data Analysis Algorithms and User Interface for Video Applications." In Proceedings of the Machine Learning Meets the User Interface Workshop (MLUI) at NIPS 2003. Vancouver, BC. December, 2003. (PDF)

Tanzeem Choudhury, Brian Clarkson, Sumit Basu, and Alex Pentland. "Learning Communities: Connectivity and Dynamics of Interacting Agents." In Proceedings of the International Joint Conference on Neural Networks (IJCNN'03), Special Session on Autonomous Mental Development. July 2003. (PDF)

Sumit Basu. "A Two Layer Model for Voicing and Speech Detection." In Proceedings of the IEEE Conf. on Acoustics, Speech, and Signal Processing (ICASSP 2003). Hong Kong. May, 2003. [Note: this meeting was canceled due to SARS, but the proceedings were published as usual.] (PDF)

Sumit Basu*, Tanzeem Choudhury*, Brian Clarkson*, and Alex Pentland. "Towards Measuring Human Interactions in Conversational Settings." In Proceedings of the IEEE Int'l Workshop on Cues in Communication (CUES 2001) at CVPR 2001. Kauai, Hawaii. *The first three authors contributed equally to this work and are listed alphabetically. (PDF)

Sumit Basu, Brian Clarkson, and Alex Pentland. "Smart Headphones: Enhancing Auditory Awareness through Robust Speech Detection and Source Localization." In Proceedings of the IEEE Conf. on Acoustics, Speech, and Signal Processing (ICASSP '01). Salt Lake City, Utah. 2001. (PDF) (poster)

Sumit Basu, Brian Clarkson, and Alex Pentland. "Smart Headphones." In Proceedings of the Conference on Human Factors in Computing Systems (CHI '01). (Short Paper). Seattle, Washington. April, 2001. (PDF) (slides)

Sumit Basu, Steve Schwartz, and Alex Pentland. "Wearable Phased Arrays for Sound Localization and Enhancement." In Proceedings of the IEEE Int'l Symposium on Wearable Computing (ISWC '00). Atlanta, Georgia. October, 2000. pp. 103-110. (PDF) (slides)

Jacob Strom, Tony Jebara, Sumit Basu, and Alex Pentland. "Real Time Tracking and Modeling of Faces: An EKF-based Analysis by Synthesis Approach." In Proceedings of the IEEE Modeling People Workshop at ICCV'99. Kerkyra, Greece. September, 1999. (PDF)

Christopher R. Wren, Sumit Basu, Flavia Sparacino, and Alex Pentland. "Combining Audio and Video in Perceptive Spaces." In Proceedings of the First Int'l Workshop on Managing Interactions in Smart Environments. Dublin, Ireland. 1999. (PDF)

Sumit Basu, Nuria Oliver, and Alex Pentland. "Coding Human Lip Motions with a Learned 3D Model." In Proceedings of the Int'l Workshop on Very Low Bitrate Video Coding (VLBV '98). Urbana, Illinois. October, 1998. (PDF) (poster)

Sumit Basu, Nuria Oliver, and Alex Pentland. "3D Modeling and Tracking of Human Lips." In Proceedings of the IEEE Int'l Conf. on Computer Vision. Mumbai, India. January, 1998. pp. 337-343. (PDF) (poster)

Christopher R. Wren, Sumit Basu, and Alex Pentland. "Perceptive Spaces: Learning Dynamic Models of Human Behavior." In Proceedings of the Workshop on Perceptual User Interfaces (PUI '97). Banff, Canada. 1997.

Sumit Basu and Alex Pentland. "Recovering 3D Lip Structure from 2D Observations Using a Model Trained from Video." In Proceedings of the ESCA Workshop on Audio-Visual Speech Processing (AVSP'97). Rodos, Greece. 1997. (email)

Sumit Basu and Alex Pentland. "A Three-Dimensional Model of Human Lip Motion." In Proceedings of the IEEE Non-Rigid and Articulated Motion Workshop at CVPR '97. San Juan, Puerto Rico. June, 1997. (PDF)

Irfan Essa, Sumit Basu, Trevor Darrell, and Alex Pentland. "Modeling, Tracking, and Interactive Animation of Faces and Heads Using Input from Video." In Proceedings of Computer Animation '96. Geneva, Switzerland. 1996. (PDF)

Sumit Basu, Irfan Essa, and Alex Pentland. "Motion Regularization for Model-Based Head Tracking." In Proceedings of the 13th IEEE Int'l Conf. on Pattern Recognition (ICPR '96). Vienna, Austria. September, 1996. pp. 611-616. (PDF)

Sumit Basu, Michael Casey, Bill Gardner, Ali Azarbayejani, and Alex Pentland. "Vision-Steered Audio for Interactive Environments." In Proceedings of IMAGE'COM '96. Bordeaux, France. 1996. (PDF)

Michael Casey, William G. Gardner, and Sumit Basu. "Vision Steered Beamforming and Transaural Rendering for the Artificial Life Interactive Video Environment (ALIVE)." In Proceedings of the 99th Audio Engineering Society Conference (AES '95). 1995. (PDF)

Invited Abstracts

Sumit Basu. “Learning Variations in Conversational Patterns.” Snowbird Learning Workshop ’03. Snowbird, UT. April 2003.

Sumit Basu. “Conversational Scene Analysis.” Snowbird Learning Workshop ’02. Snowbird, UT. April 2002.

Patents Granted

Warren B. Jackson, David A. Jared, Sumit Basu, and David K. Biegelsen. "Macrodetector-Based Image Conversion System." US Patent No. 5,790,699. Granted August 4, 1998.

Warren B. Jackson, David A. Jared, Sumit Basu, and David K. Biegelsen. "Position Sensitive Detector Based Image Conversion System Capable of Preserving Subpixel Information." US Patent No. 5,754,690. Granted May 19, 1998.

Patents Filed/Disclosed

Sumit Basu.  [MSR Patent 5 – the topic/title cannot be listed for IP protection]. Disclosed 2003.

Sumit Basu.  [MSR Patent 4 – the topic/title cannot be listed for IP protection]. Disclosed 2003.

[1 co-author] and Sumit Basu.  [MSR Patent 3 – the topic/title cannot be listed for IP protection]. Disclosed 2003.

[7 co-authors] and Sumit Basu.  [MSR Patent 2 – the topic/title cannot be listed for IP protection].  Disclosed 2003 (Pending).

Sumit Basu.  [MSR Patent 1 – the topic/title cannot be listed for IP protection]. Filed 2003 (Pending).

Julian L. Center, Jr., Christopher R. Wren, Alex Pentland, Trevor Darrell, Sumit Basu, and Evgeniy Gusvatin. "Method of Establishing a Communications Link Using Perceptual Sensing of a User's Presence." Filed November 10, 2000. (Pending)

Julian L. Center, Jr., Christopher R. Wren, Alex Pentland, Trevor Darrell, and Sumit Basu. "Method of Extending Image-Based Face Recognition Systems to Utilize Multi-View Image Sequences and Audio Information." Filed November 10, 2000. (Pending)

Technical Reports

Sumit Basu*, Tanzeem Choudhury*, and Brian Clarkson*. "Learning Human Interactions with the Influence Model." Vismod Tech Report #539. June, 2001. (PDF)

Sumit Basu. "ICA: A Critical Review of Three Prominent Approaches." Technical Report. April, 2000. (PDF) (slides)

Sumit Basu. "Empirical Results on the Generalization Capabilities and Convergence Properties of the Bayes Point Machine." December, 1999. (PDF)

Sumit Basu, Kentaro Toyama, and Alex Pentland. "A Consistent Method for Function Approximation in Mesh-based Applications." Vismod Technical Report #486. January, 1999. (PDF)

Sumit Basu. "Efficient Multiscale Template Matching with Orthogonal Wavelet Decompositions." May, 1997. (PDF)

Trevor Darrell, Sumit Basu, Christopher Wren, and Alex Pentland. "Perceptually-Driven Avatars and Interfaces: Active Methods for Direct Control." Vismod Technical Report #416. 1997. (PDF)

Tutorials/Invited Talks

Sumit Basu. “Conversational Scene Analysis.”  Invited talk at the Mitsubishi Electric Research Laboratory (MERL), Cambridge, MA, Sept 3, 2002.

Sumit Basu. “Machine Audition for Interactive Environments.” Invited talks at the Georgia Tech Department of Computer Science, the Purdue Computer Science Department, the Department of Information and Computer Science at the University of California, Irvine, and Microsoft Research, from February through May, 2002.

Sumit Basu and Alex Pentland. "Concept Formation in Multi-Modal Learning." In Alex Pentland, Tony Jebara, Brian Clarkson, and Sumit Basu, Learning Techniques in Audio-Visual Information Processing, a tutorial at the Int'l Conf. on Pattern Recognition (ICPR '00). Barcelona, Spain. September 3, 2000. (slides)

Sumit Basu. "Empirical Results on the Generalization Capabilities and Convergence Properties of the Bayes Point Machine." Invited talk at Tomaso Poggio's group meeting, MIT AI Lab/CBCL. May 12, 2000. (slides)

Sumit Basu, Deb Roy, Brian Clarkson, and Alex Pentland. "Learning the Structure of Human Behavior from Sensory Inputs: Language, Daily Patterns, and Conversations." At Grounded Intersensory Language Learning in Sign and Speech (GILLS '00). Grenoble, France. March 24, 2000. (slides)