Curriculum Vitae

Personal Data



Date of Birth:

February 12, 1974



Educational Background

Massachusetts Institute of Technology

Ph.D. in Electrical Engineering and Computer Science, minor in Brain and Cognitive Sciences

Advisor: Professor Alex (Sandy) Pentland, MIT Media Laboratory

Thesis: Conversational Scene Analysis

GPA: 5.0/5.0




Massachusetts Institute of Technology

Master of Engineering in Electrical Engineering and Computer Science

Advisor: Professor Alex (Sandy) Pentland, MIT Media Laboratory

Thesis: A Three-Dimensional Model of Human Lip Motions

GPA: 5.0/5.0



Massachusetts Institute of Technology

Bachelor of Science in Electrical Science and Engineering, Phi Beta Kappa

Thesis: Hyperacuity Sensing for Image Processing

GPA: 5.0/5.0


Research/Industrial Experience

Microsoft Research, Interactive Visual Media Group

Post-Doctoral Researcher

Supervisor: P. Anandan, Senior Researcher

Topics:  conversational analysis for call centers, auditory/speech analysis and sensor fusion for security applications, interfaces for menu-driven audio conversations, personal audio (storing and browsing conversational audio from our daily lives), audio-visual browsing and scene analysis, audio synthesis for limited audio outputs, musical structure analysis, and learning/synthesis of musical style.



MIT Media Laboratory, Vision and Modeling Group

Research Assistant

Advisor: Professor Alex (Sandy) Pentland

Topics: modeling conversational interactions from low-level features, prosodic feature estimation, active interfaces, Bayesian networks (exact and approximate inference), speech detection, wearable phased arrays, source localization, maximum likelihood tracking for deformable meshes (applied to lip tracking), finite element priors for 3D meshes, optical-flow regularization with 3D models (applied to head tracking), vision-steered beamforming.


1995 – 2002

Perceptive Network Technologies

Perceptual Engineer

Supervisor: Dr. Julian Center, Jr.

Topic: developed a real-time speech detection module.


August 2000

Microsoft Research, Vision Group

Research Consultant

Advisor: Kentaro Toyama

Topic: developed a real-time version of my 3D lip tracking work, began the development of a mesh-based smoothing algorithm.


August 1998

MIT Media Laboratory, Speech Interfaces Group

Undergraduate Research Assistant

Advisor: Chris Schmandt

Topics: audio interfaces, speaker identification and segmentation.



Xerox PARC, Electronic Materials Laboratory (EML)

Research Intern

Advisors: Warren Jackson, David Biegelsen, David Jared

Topic: image enhancement using position-sensitive detectors (PSD) and their mechanisms as applied to standard scanning elements.


June, 1994 – August, 1994

Xerox PARC, Computer Science Laboratory (CSL)

Research Intern

Advisor: David Goldberg

Topics: handwritten character recognition for a palmtop device, pen-based interfaces


June, 1993 – August, 1993

Iowa Department of Transportation, Information Systems


Supervisors: Ron Laird, Robert Klopping

Topics: developed a C++ application, BIAS (Bid Item Automation System), for entering/editing contract items, to be used by all contractors working for the state of Iowa.  A modified version of this software is still in use today.

May, 1992 – August, 1992

Research Interests

A full statement of my research interests is available separately.  My core area is Communicative Computing, which includes:

·   Machine Perception

·   Machine Learning

·   Human-Computer Interfaces

Teaching/Advising Experience

Microsoft Research

Intern Mentor

Worked with intern Ian Simon, along with researchers David Salesin and Maneesh Agrawala, to develop new algorithms for music analysis and synthesis.  Worked very closely with Ian to teach him the necessary basics in signal processing and dynamic programming; interacted with him on a daily basis to discuss algorithmic choices and results.   I am continuing in an advisory role for Ian along with David, his advisor at the University of Washington, and Maneesh to help guide the work on this project.


Summer, 2003 – present


MIT Department of Electrical Engineering and Computer Science

Graduate Teaching Assistant

Class: Probabilistic Systems Analysis (6.041/6.431)

Supervisor: Professor Dimitri Bertsekas

Description: Undergraduate/graduate introduction to probability, going from basic axioms through central limit theorems, Markov chains, and basic stochastic processes.

Duties: Each week, taught one 40-student recitation (undergraduates) and eight 5-student tutorials (interactive problem solving), held two to four office hours (undergraduates/graduates), and attended a two-hour staff meeting with other TAs.  Additionally graded papers, led several quiz reviews for the entire class (300+ students), and developed handouts for the students.  Received an excellent evaluation in the students’ “Underground Guide”: “Recitation instructor S. Basu was extremely well-liked by his students, receiving comments such as "Sumit rocks the house!" from his students. He was very available, very friendly, and very helpful, taking the time to help his students whenever it was needed. He was considered an excellent teacher, who was always prepared and had good examples to clarify his points. His tutorials were "great fun" and very helpful for learning. His explanations were clear, and very helpful.”


Fall, 1998

MIT Media Laboratory

UROP (Undergraduate Research Opportunities Program) Supervisor

Supervisor: Alex (Sandy) Pentland

Description: Held regular meetings with undergraduate researchers under my supervision, taught relevant theory, gave guidance in choosing research goals and coursework/career paths, engaged in many one-on-one help sessions, evaluated their progress, wrote recommendations for graduate school.  Over the course of graduate school, I worked with a total of eight undergraduates.


Fall, 1995 – Summer, 2002

MIT F/ASIP (Freshmen/Alumni Summer Internship Program)


Supervisor/Organizer: Professor Arthur Steinberg

Duties: Worked with groups of freshmen in the F/ASIP program. Led resume workshops, simulated work scenarios, helped students develop simulated consulting reports.  Also advised students in course selection and career strategies.


Spring, 1998; Spring, 1999

MIT ISP (Integrated Studies Program)

Guest Lecturer

Supervisor: Professor Arthur Steinberg

Gave a guest lecture on “Artificial Intelligence and the Future of Computing” to a group of 40 freshmen.


Spring, 2001

MIT ESP (Educational Studies Program)

Teacher for Songwriting Workshop

Supervisor: MIT ESP

Co-taught (with Regalp Sen) a two-hour workshop on songwriting skills for 25 high school students.  Led warm-up exercises, discussed elements of rhyme and meter, helped students combine words and music in group activities.

Fall, 2001

Teaching Interests

This is a subset of the possible courses I would be interested in teaching.  Further details are in my Statement of Teaching Interests.

·   Introduction to Probability and Statistics (Undergraduate/Graduate)

·   Introduction to Pattern Recognition/Machine Learning (Undergraduate/Graduate)

·   Computational Perception – Vision and Audition (Undergraduate/Graduate)

·   Signal Processing for Voice and Music Applications (Undergraduate/Graduate)

·   Speech Processing/Speech Recognition (Graduate)

·   Machine Perception Seminar (Graduate)

·   Adaptive Interfaces (Graduate)

·   Bayesian Networks (Graduate)

·   Machine Learning Seminar (Graduate)

Awards, Honor Societies, and Fellowships

·   NSF Graduate Research Fellowship – Sept., 1995 - Aug., 1998

·   Member, Phi Beta Kappa (Undergraduate Honor Society) – May, 1995-present

·   Member, Sigma Xi  (Scientific Research Society) – May, 1995-present

·   Member, Tau Beta Pi (Engineering Honor Society) – March, 1994 - present

·   Member, Eta Kappa Nu  (EECS Honor Society) – March, 1994 - present

·   Winner, MIT 6.270 Robotics Competition (team: Loren Shih, myself) – January, 1993

·   United States Presidential Scholar - 1991

Patents Disclosed/Filed

Sumit Basu.  [MSR Patent 5 – the topic/title cannot be listed for IP protection]. Disclosed 2003.


Sumit Basu.  [MSR Patent 4 – the topic/title cannot be listed for IP protection]. Disclosed 2003.


[1 co-author] and Sumit Basu.  [MSR Patent 3 – the topic/title cannot be listed for IP protection]. Disclosed 2003.


[7 co-authors] and Sumit Basu.  [MSR Patent 2 – the topic/title cannot be listed for IP protection].  Filed 2003 (Pending).


Sumit Basu.  [MSR Patent 1 – the topic/title cannot be listed for IP protection]. Filed 2003 (Pending).


Julian L. Center, Jr., Christopher R. Wren, Alex Pentland, Trevor Darrell, Sumit Basu, and Evgeniy Gusvatin. "Method of Establishing a Communications Link Using Perceptual Sensing of a User's Presence." Filed November 10, 2000 (Pending).


Julian L. Center, Jr., Christopher R. Wren, Alex Pentland, Trevor Darrell, and Sumit Basu. "Method of Extending Image-Band Face Recognition Systems to Utilize Multi-View Image Sequences and Audio Information." Filed November 10, 2000 (Pending).

Patents Granted

Warren B. Jackson, David A. Jared, Sumit Basu, and David K. Biegelsen. "Macrodetector-Based Image Conversion System." US Patent No. 5,790,699. Granted August 4, 1998.


Warren B. Jackson, David A. Jared, Sumit Basu, and David K. Biegelsen. "Position Sensitive Detector Based Image Conversion System Capable of Preserving Subpixel Information." US Patent No. 5,754,690. Granted May 19, 1998.

Refereed Journal Publications

Sumit Basu, Nuria Oliver, and Alex Pentland. "3D Lip Shapes from Video: A Combined Physical-Statistical Model." Speech Communication 26, 1998. pp. 131-148.


Note: I am in the process of preparing two additional journal articles based on my thesis work.


Refereed Conference/Workshop Publications

Nebojsa Jojic, Sumit Basu, Nemanja Petrovic, Brendan Frey, and Thomas Huang.  Joint design of Data Analysis Algorithms and User Interface for Video Applications.  In Proceedings of the Machine Learning Meets the User Interface Workshop (MLUI) at NIPS 2003.  Vancouver, BC.


Tanzeem Choudhury, Brian Clarkson, Sumit Basu, and Alex Pentland.  Learning Communities: Connectivity and Dynamics of Interacting Agents.  Proceedings of the International Joint Conference on Neural Networks (IJCNN'03), Special Session on Autonomous Mental Development. July 2003.


Sumit Basu. "A Two Layer Model for Voicing and Speech Detection." In Proceedings of the IEEE Conf. on Acoustics, Speech, and Signal Processing (ICASSP '03). Hong Kong.


Sumit Basu*, Tanzeem Choudhury*, Brian Clarkson*, and Alex Pentland. "Towards Measuring Human Interactions in Conversational Settings." In Proceedings of the IEEE Int'l Workshop on Cues in Communication (CUES 2001) at CVPR 2001. Kauai, Hawaii. *The first three authors contributed equally to this work and are listed alphabetically.


Sumit Basu, Brian Clarkson, and Alex Pentland. "Smart Headphones: Enhancing Auditory Awareness through Robust Speech Detection and Source Localization." In Proceedings of the IEEE Conf. on Acoustics, Speech, and Signal Processing (ICASSP '01). Salt Lake City, Utah. 2001.


Sumit Basu, Brian Clarkson, and Alex Pentland. "Smart Headphones." In Proceedings of the Conference on Human Factors in Computing Systems (CHI '01). (Short Paper). Seattle, Washington. April, 2001.


Sumit Basu, Steve Schwartz, and Alex Pentland. "Wearable Phased Arrays for Sound Localization and Enhancement." In Proceedings of the IEEE Int'l Symposium on Wearable Computing (ISWC '00). Atlanta, Georgia. October, 2000. pp. 103-110.


Jacob Strom, Tony Jebara, Sumit Basu, and Alex Pentland. "Real Time Tracking and Modeling of Faces: An EKF-based Analysis by Synthesis Approach." In Proceedings of the IEEE Modeling People Workshop at the IEEE Int’l Conf. on Computer Vision 1999 (ICCV '99). Kerkyra, Greece. September, 1999.


Christopher R. Wren, Sumit Basu, Flavia Sparacino, and Alex Pentland. "Combining Audio and Video in Perceptive Spaces." In Proceedings of the First Int'l Workshop on Managing Interactions in Smart Environments. Dublin, Ireland. 1999.


Sumit Basu, Nuria Oliver, and Alex Pentland. "Coding Human Lip Motions with a Learned 3D Model." In Proceedings of the Int'l Workshop on Very Low Bitrate Video Coding (VLBV '98). Urbana, Illinois. October, 1998.


Sumit Basu, Nuria Oliver, and Alex Pentland. "3D Modeling and Tracking of Human Lips." In Proceedings of the IEEE Int'l Conf. on Computer Vision (ICCV ’98). Mumbai, India. January, 1998. pp. 337-343.


Christopher R. Wren, Sumit Basu, and Alex Pentland. "Perceptive Spaces: Learning Dynamic Models of Human Behavior." In Proceedings of the Workshop on Perceptual User Interfaces (PUI '97). Banff, Canada. 1997.


Sumit Basu and Alex Pentland. "Recovering 3D Lip Structure from 2D Observations Using a Model Trained from Video." In Proceedings of the ESCA Workshop on Audio-Visual Speech Processing (AVSP'97). Rodos, Greece. 1997.


Sumit Basu and Alex Pentland. "A Three-Dimensional Model of Human Lip Motion." In Proceedings of the IEEE Non-Rigid and Articulated Motion Workshop at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '97). San Juan, Puerto Rico. June, 1997.


Irfan Essa, Sumit Basu, Trevor Darrell, and Alex Pentland. "Modeling, Tracking, and Interactive Animation of Faces and Heads Using Input from Video." In Proceedings of Computer Animation '96. Geneva, Switzerland. 1996.


Sumit Basu, Irfan Essa, and Alex Pentland. "Motion Regularization for Model-Based Head Tracking." In Proceedings of the 13th IEEE Int'l Conf. on Pattern Recognition (ICPR '96). Vienna, Austria. September, 1996. pp. 611-616.


Sumit Basu, Michael Casey, Bill Gardner, Ali Azarbayejani, and Alex Pentland. "Vision-Steered Audio for Interactive Environments." In Proceedings of the 1996  Image Communications Conference (IMAGE'COM '96). Bordeaux, France. 1996.


Michael Casey, William G. Gardner, and Sumit Basu. "Vision Steered Beamforming and Transaural Rendering for the Artificial Life Interactive Video Environment (ALIVE)." In Proceedings of the 99th Convention of the Audio Engineering Society. 1995.

Invited Abstracts


Sumit Basu. “Learning Variations in Conversational Patterns.”  Snowbird Learning Workshop ’03.  Snowbird, UT.  April 2003.


Sumit Basu.  “Conversational Scene Analysis.”  Snowbird Learning Workshop ’02.  Snowbird, UT.  April 2002.


Selected Technical Reports

Sumit Basu*, Tanzeem Choudhury*, Brian Clarkson*, and Alex Pentland. "Learning Human Interactions with the Influence Model." Vismod Technical Report #539. June, 2001.  *The first three authors contributed equally to this work and are listed alphabetically.


Sumit Basu. “ICA: A Critical Review of Three Prominent Approaches.” Technical Report. April, 2000.


Sumit Basu. "Empirical Results on the Generalization Capabilities and Convergence Properties of the Bayes Point Machine." December, 1999.


Sumit Basu, Kentaro Toyama, and Alex Pentland. "A Consistent Method for Function Approximation in Mesh-based Applications." Vismod Technical Report #486. 1999.


Sumit Basu. "Efficient Multiscale Template Matching with Orthogonal Wavelet Decompositions." May, 1997.


Trevor Darrell, Sumit Basu, Christopher Wren, and Alex Pentland. "Perceptually-Driven Avatars and Interfaces: Active Methods for Direct Control." Vismod Technical Report #416. 1997.

Selected Invited Talks

Sumit Basu. “Conversational Scene Analysis.”  Invited talk at the Mitsubishi Electric Research Laboratory, Cambridge, MA, Sept 3, 2002.


Sumit Basu.  “Machine Audition for Interactive Environments.”  Invited talks at the Georgia Tech Department of Computer Science, the Purdue Computer Science Department, the Department of Information and Computer Science at the University of California, Irvine, and Microsoft Research, from February through May, 2002.


Sumit Basu and Alex Pentland. "Concept Formation in Multi-Modal Learning." In Alex Pentland, Tony Jebara, Brian Clarkson, and Sumit Basu, Learning Techniques in Audio-Visual Information Processing, a tutorial at the Int'l Conf. on Pattern Recognition (ICPR '00). Barcelona, Spain. September 3, 2000.


Sumit Basu. "Empirical Results on the Generalization Capabilities and Convergence Properties of the Bayes Point Machine." Invited talk at Tomaso Poggio's group meeting, MIT AI Lab/CBCL. May 12, 2000.


Sumit Basu, Deb Roy, Brian Clarkson, and Alex Pentland. "Learning the Structure of Human Behavior from Sensory Inputs: Language, Daily Patterns, and Conversations." At Grounded Intersensory Language Learning in Sign and Speech (GILLS '00). Grenoble, France. March 24, 2000.

Professional Activities

I have been a reviewer for a variety of conferences and journals, including CVPR, ICCV, IEEE PAMI, IEEE Transactions on Speech and Audio Processing, and more.  I served on the program committee for the 2003 International Workshop on Multimedia Technologies in E-Learning and Collaboration (WOMTEC 2003) at ICCV 2003.  I am also a member of the IEEE.


Musical Projects

I am an avid songwriter/singer/keyboardist/guitarist, and am involved in a number of musical projects:


08:29:06, a solo album I released in June 2000 under the name deepoceanblue, originally available online with over 1,700 unique downloads. [The hosting site has since gone bankrupt and ceased to exist.]


Sonovar/Bodybeat, an electronic music project experimenting with convolution and natural sounds as musical building blocks (joint work with Brian Clarkson).  We received an MIT Council for the Arts Grant for this project  in February 2000.  Samples are available on my website.


Additional projects are listed at


Listed in a separate document.