Curriculum Vitae
|
|
Email: |
|
|
WWW: |
|
|
Date of Birth: |
February 12, 1974 |
|
Citizenship: |
USA |
|
Massachusetts Institute of Technology Ph.D. in Electrical Engineering and Computer Science, minor in Brain and Cognitive Sciences Advisor: Professor Alex (Sandy) Pentland, MIT Media Laboratory Thesis: Conversational Scene Analysis GPA: 5.0/5.0
|
1997-2002
|
|
Massachusetts Institute of Technology Master of Engineering in Electrical Engineering and Computer Science Advisor: Professor Alex (Sandy) Pentland, MIT Media Laboratory Thesis: A Three-Dimensional Model of Human Lip Motions GPA: 5.0/5.0
|
1995-1997 |
|
Massachusetts Institute of Technology Bachelor of Science in Electrical Science and Engineering, Phi Beta Kappa Thesis: Hyperacuity Sensing for Image Processing GPA: 5.0/5.0 |
1991-1995 |
|
Microsoft Research, Interactive Visual Media Group Post-Doctoral Researcher Supervisor: P. Anandan, Senior Researcher Topics: conversational analysis for call centers, auditory/speech analysis and sensor fusion for security applications, interfaces for menu-driven audio conversations, personal audio (storing and browsing conversational audio from our daily lives), audio-visual browsing and scene analysis, audio synthesis for limited audio outputs, musical structure analysis, and learning/synthesis of musical style.
|
9/2002-Present |
|
MIT Media Laboratory, Vision and Modeling Group Research Assistant Advisor: Professor Alex (Sandy) Pentland Topics: modeling conversational interactions from low-level features, prosodic feature estimation, active interfaces, Bayesian networks (exact and approximate inference), speech detection, wearable phased arrays, source localization, maximum likelihood tracking for deformable meshes (applied to lip tracking), finite element priors for 3D meshes, optical-flow regularization with 3D models (applied to head tracking), vision-steered beamforming.
|
1995 – 2002 |
|
Perceptive Network Technologies Perceptual Engineer Supervisor: Dr. Julian Center, Jr. Topic: developed a real-time speech detection module.
|
August 2000 |
|
Microsoft Research, Vision Group Research Consultant Advisor: Kentaro Toyama Topic: developed a real-time version of my 3D lip tracking work, began the development of a mesh-based smoothing algorithm.
|
August 1998 |
|
MIT Media Laboratory, Speech Interfaces Group Undergraduate Research Assistant Advisor: Chris Schmandt Topics: audio interfaces, speaker identification and segmentation.
|
1993-1995 |
|
Xerox PARC, Electronic Materials Laboratory (EML) Research Intern Advisors: Warren Jackson, David Biegelsen, David Jared Topic: image enhancement using position-sensitive detectors (PSD) and their mechanisms as applied to standard scanning elements.
|
June, 1994- August, 1994 |
|
Xerox PARC, Computer Science Laboratory (CSL) Research Intern Advisor: David Goldberg Topics: handwritten character recognition for a palmtop device, pen-based interfaces.
|
June, 1993- August, 1993 |
|
Iowa Department of Transportation, Information Systems Developer Supervisors: Ron Laird, Robert Klopping Topics: developed a C++ application, BIAS (Bid Item Automation System), for entering/editing contract items, to be used by all contractors working for the state of Iowa. A modified version of this software is still in use today. |
May, 1992- August, 1992 |
Please see http://www.media.mit.edu/~sbasu for a full statement of my research interests. My core area is Communicative Computing, which includes:
· Machine Perception
· Machine Learning
· Human-Computer Interfaces
|
Microsoft Research Intern Mentor Worked with intern Ian Simon, along with researchers David Salesin and Maneesh Agrawala, to develop new algorithms for music analysis and synthesis. Worked very closely with Ian to teach him the necessary basics in signal processing and dynamic programming; interacted with him on a daily basis to discuss algorithmic choices and results. I am continuing in an advisory role for Ian along with David, his advisor at the University of Washington, and Maneesh to help guide the work on this project.
|
Summer, 2003 – present
|
|
MIT Department of Electrical Engineering and Computer Science Graduate Teaching Assistant Class: Probabilistic Systems Analysis (6.041/6.431) Supervisor: Professor Dimitri Bertsekas Description: Undergraduate/Graduate introduction to probability, going from basic axioms through central limit theorems, Markov chains, and basic stochastic processes. Duties: Each week, taught one 40-student recitation (undergraduates) and eight 5-student tutorials (interactive problem solving), held two to four office hours (undergraduates/graduates), and attended a two-hour staff meeting with other TAs. Additionally, graded papers, led several quiz reviews for the entire class (300+ students), and developed handouts for the students. Received an excellent evaluation in the students’ “Underground Guide”: “Recitation instructor S. Basu was extremely well-liked by his students, receiving comments such as "Sumit rocks the house!" from his students. He was very available, very friendly, and very helpful, taking the time to help his students whenever it was needed. He was considered an excellent teacher, who was always prepared and had good examples to clarify his points. His tutorials were "great fun" and very helpful for learning. His explanations were clear, and very helpful.”
|
Fall, 1998 |
|
MIT Media Laboratory UROP (Undergraduate Research Opportunities Program) Supervisor Supervisor: Alex (Sandy) Pentland Description: Held regular meetings with undergraduate researchers under my supervision, taught relevant theory, gave guidance in choosing research goals and coursework/career paths, engaged in many one-on-one help sessions, evaluated their progress, wrote recommendations for graduate school. Over the course of graduate school, I worked with a total of eight undergraduates.
|
Fall 1995 – Summer, 2002 |
|
MIT F/ASIP (Freshmen/Alumni Summer Internship Program) Counselor Supervisor/Organizer: Professor Arthur Steinberg Duties: Worked with groups of freshmen in the F/ASIP program. Led resume workshops, simulated work scenarios, helped students develop simulated consulting reports. Also advised students in course selection and career strategies.
|
Spring, 1998; Spring, 1999 |
|
MIT ISP (Integrated Studies Program) Guest Lecturer Supervisor: Professor Arthur Steinberg Gave a guest lecture on “Artificial Intelligence and the Future of Computing” to a group of 40 freshmen.
|
Spring, 2001 |
|
MIT ESP (Educational Studies Program) Teacher for Songwriting Workshop Supervisor: MIT ESP Co-taught (with Regalp Sen) a two-hour workshop on songwriting skills for 25 high school students. Led warm-up exercises, discussed elements of rhyme and meter, helped students combine words and music in group activities. |
Fall, 2001 |
This is a subset of the possible courses I would be interested in teaching. Further details are in my Statement of Teaching Interests.
· Introduction to Probability and Statistics (Undergraduate/Graduate)
· Introduction to Pattern Recognition/Machine Learning (Undergraduate/Graduate)
· Computational Perception – Vision and Audition (Undergraduate/Graduate)
· Signal Processing for Voice and Music Applications (Undergraduate/Graduate)
· Speech Processing/Speech Recognition (Graduate)
· Machine Perception Seminar (Graduate)
· Adaptive Interfaces (Graduate)
· Bayesian Networks (Graduate)
· Machine Learning Seminar (Graduate)
· NSF Graduate Research Fellowship – Sept. 1995 - Aug. 1998
· Member, Phi Beta Kappa (Undergraduate Honor Society) – May, 1995-present
· Member, Sigma Xi (Scientific Research Society) – May, 1995-present
· Member, Tau Beta Pi (Engineering Honor Society) – March, 1994 - present
· Member, Eta Kappa Nu (EECS Honor Society) – March, 1994 - present
· Winner, MIT 6.270 Robotics Competition (team: Loren Shih, myself) – January, 1993
· United States Presidential Scholar - 1991
|
Sumit Basu. [MSR Patent 5 – the topic/title cannot be listed for IP protection]. Disclosed 2003.
|
|
Sumit Basu. [MSR Patent 4 – the topic/title cannot be listed for IP protection]. Disclosed 2003.
|
|
[1 co-author] and Sumit Basu. [MSR Patent 3 – the topic/title cannot be listed for IP protection]. Disclosed 2003.
|
|
[7 co-authors] and Sumit Basu. [MSR Patent 2 – the topic/title cannot be listed for IP protection]. Filed 2003 (Pending).
|
|
Sumit Basu. [MSR Patent 1 – the topic/title cannot be listed for IP protection]. Filed 2003 (Pending).
|
|
Julian L. Center, Jr., Christopher R. Wren, Alex Pentland, Trevor Darrell, Sumit Basu, and Evgeniy Gusvatin. "Method of Establishing a Communications Link Using Perceptual Sensing of a User's Presence." Filed November 10, 2000 (Pending).
|
|
Julian L. Center, Jr., Christopher R. Wren, Alex Pentland, Trevor Darrell, and Sumit Basu. "Method of Extending Image-Band Face Recognition Systems to Utilize Multi-View Image Sequences and Audio Information." Filed November 10, 2000 (Pending). |
|
Warren B. Jackson, David A. Jared, Sumit Basu, and David K. Biegelsen. "Macrodetector-Based Image Conversion System." US Patent No. 5,790,699. Granted August 4, 1998.
|
|
Warren B. Jackson, David A. Jared, Sumit Basu, and David K. Biegelsen. "Position Sensitive Detector Based Image Conversion System Capable of Preserving Subpixel Information." US Patent No. 5,754,690. Granted May 19, 1998. |
|
Sumit Basu, Nuria Oliver, and Alex Pentland. "3D Lip Shapes from Video: A Combined Physical-Statistical Model." Speech Communication 26, 1998. pp. 131-148.
|
|
Note: I am in the process of preparing two additional journal articles based on my thesis work.
|
|
Nebojsa Jojic, Sumit Basu, Nemanja Petrovic, Brendan Frey, and Thomas Huang. "Joint Design of Data Analysis Algorithms and User Interface for Video Applications." In Proceedings of the Machine Learning Meets the User Interface Workshop (MLUI) at NIPS 2003. Vancouver, BC.
|
|
Tanzeem Choudhury, Brian Clarkson, Sumit Basu, and Alex Pentland. "Learning Communities: Connectivity and Dynamics of Interacting Agents." In Proceedings of the International Joint Conference on Neural Networks (IJCNN'03), Special Session on Autonomous Mental Development. July 2003.
|
|
Sumit Basu. “A Two Layer Model for Voicing and Speech Detection.” In Proceedings of the IEEE Conf. on Acoustics, Speech, and Signal Processing (ICASSP ’03). Hong Kong.
|
|
Sumit Basu*, Tanzeem Choudhury*, Brian Clarkson*, and Alex Pentland. "Towards Measuring Human Interactions in Conversational Settings." In Proceedings of the IEEE Int'l Workshop on Cues in Communication (CUES 2001) at CVPR 2001. Kauai, Hawaii. *The first three authors contributed equally to this work and are listed alphabetically.
|
|
Sumit Basu, Brian Clarkson, and Alex Pentland. "Smart Headphones: Enhancing Auditory Awareness through Robust Speech Detection and Source Localization." In Proceedings of the IEEE Conf. on Acoustics, Speech, and Signal Processing (ICASSP '01). Salt Lake City, Utah. 2001.
|
|
Sumit Basu, Brian Clarkson, and Alex Pentland. "Smart Headphones." In Proceedings of the Conference on Human Factors in Computing Systems (CHI '01). (Short Paper). Seattle, Washington. April, 2001.
|
|
Sumit Basu, Steve Schwartz, and Alex Pentland. "Wearable Phased Arrays for Sound Localization and Enhancement." In Proceedings of the IEEE Int'l Symposium on Wearable Computing (ISWC '00). Atlanta, Georgia. October, 2000. pp. 103-110.
|
|
Jacob Strom, Tony Jebara, Sumit Basu, and Alex Pentland. "Real Time Tracking and Modeling of Faces: An EKF-based Analysis by Synthesis Approach." In Proceedings of the IEEE Modeling People Workshop at the IEEE Int’l Conf. on Computer Vision 1999 (ICCV '99). Kerkyra, Greece. September, 1999.
|
|
Christopher R. Wren, Sumit Basu, Flavia Sparacino, and Alex Pentland. "Combining Audio and Video in Perceptive Spaces." In Proceedings of the First Int'l Workshop on Managing Interactions in Smart Environments. Dublin, Ireland. 1999.
|
|
Sumit Basu, Nuria Oliver, and Alex Pentland. "Coding Human Lip Motions with a Learned 3D Model." In Proceedings of the Int'l Workshop on Very Low Bitrate Video Coding (VLBV '98). Urbana, Illinois. October, 1998.
|
|
Sumit Basu, Nuria Oliver, and Alex Pentland. "3D Modeling and Tracking of Human Lips." In Proceedings of the IEEE Int'l Conf. on Computer Vision (ICCV ’98). Mumbai, India. January, 1998. pp. 337-343.
|
|
Christopher R. Wren, Sumit Basu, and Alex Pentland. "Perceptive Spaces: Learning Dynamic Models of Human Behavior." In Proceedings of the Workshop on Perceptual User Interfaces (PUI '97). Banff, Canada. 1997.
|
|
Sumit Basu and Alex Pentland. "Recovering 3D Lip Structure from 2D Observations Using a Model Trained from Video." In Proceedings of the ESCA Workshop on Audio-Visual Speech Processing (AVSP'97). Rodos, Greece. 1997.
|
|
Sumit Basu and Alex Pentland. "A Three-Dimensional Model of Human Lip Motion." In Proceedings of the IEEE Non-Rigid and Articulated Motion Workshop at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '97). San Juan, Puerto Rico. June, 1997.
|
|
Irfan Essa, Sumit Basu, Trevor Darrell, and Alex Pentland. "Modeling, Tracking, and Interactive Animation of Faces and Heads Using Input from Video." In Proceedings of Computer Animation '96. Geneva, Switzerland. 1996.
|
|
Sumit Basu, Irfan Essa, and Alex Pentland. "Motion Regularization for Model-Based Head Tracking." In Proceedings of the 13th IEEE Int'l Conf. on Pattern Recognition (ICPR '96). Vienna, Austria. September, 1996. pp. 611-616.
|
|
Sumit Basu, Michael Casey, Bill Gardner, Ali Azarbayejani, and Alex Pentland. "Vision-Steered Audio for Interactive Environments." In Proceedings of the 1996 Image Communications Conference (IMAGE'COM '96). Bordeaux, France. 1996.
|
|
Michael Casey, William G. Gardner, and Sumit Basu. "Vision Steered Beamforming and Transaural Rendering for the Artificial Life Interactive Video Environment (ALIVE)." In Proceedings of the 99th Convention of the Audio Engineering Society. 1995. |
|
Sumit Basu. “Learning Variations in Conversational Patterns.” Snowbird Learning Workshop ’03. Snowbird, UT. April 2003.
|
|
Sumit Basu. “Conversational Scene Analysis.” Snowbird Learning Workshop ’02. Snowbird, UT. April 2002.
|
|
Sumit Basu*, Tanzeem Choudhury*, Brian Clarkson*, and Alex Pentland. "Learning Human Interactions with the Influence Model." Vismod Technical Report #539. June, 2001. *The first three authors contributed equally to this work and are listed alphabetically.
|
|
Sumit Basu. “ICA: A Critical Review of Three Prominent Approaches.” Technical Report. April, 2000.
|
|
Sumit Basu. "Empirical Results on the Generalization Capabilities and Convergence Properties of the Bayes Point Machine." December, 1999.
|
|
Sumit Basu, Kentaro Toyama, and Alex Pentland. "A Consistent Method for Function Approximation in Mesh-based Applications." Vismod Technical Report #486. 1999.
|
|
Sumit Basu. "Efficient Multiscale Template Matching with Orthogonal Wavelet Decompositions." May, 1997.
|
|
Trevor Darrell, Sumit Basu, Christopher Wren, and Alex Pentland. "Perceptually-Driven Avatars and Interfaces: Active Methods for Direct Control." Vismod Technical Report #416. 1997. |
|
Sumit Basu. “Conversational Scene Analysis.” Invited talk at the Mitsubishi Electric Research Laboratory, Cambridge, MA, Sept 3, 2002.
|
|
Sumit Basu. “Machine Audition for Interactive Environments.” Invited talks at the Georgia Tech Department of Computer Science, the Purdue Computer Science Department, the Department of Information and Computer Science at the University of California, Irvine, and Microsoft Research, February through May 2002.
|
|
Sumit Basu and Alex Pentland. "Concept Formation in Multi-Modal Learning." In Alex Pentland, Tony Jebara, Brian Clarkson, and Sumit Basu, Learning Techniques in Audio-Visual Information Processing, a tutorial at the Int'l Conf. on Pattern Recognition (ICPR '00). Barcelona, Spain. September 3, 2000.
|
|
Sumit Basu. "Empirical Results on the Generalization Capabilities and Convergence Properties of the Bayes Point Machine." Invited talk at Tomaso Poggio's group meeting, MIT AI Lab/CBCL. May 12, 2000.
|
|
Sumit Basu, Deb Roy, Brian Clarkson, and Alex Pentland. "Learning the Structure of Human Behavior from Sensory Inputs: Language, Daily Patterns, and Conversations." At Grounded Intersensory Language Learning in Sign and Speech (GILLS '00). Grenoble, France. March 24, 2000. |
|
I have been a reviewer for a variety of conferences and journals, including CVPR, ICCV, IEEE PAMI, IEEE Transactions on Speech and Audio Processing, and more. I served on the program committee for the 2003 International Workshop on Multimedia Technologies in E-Learning and Collaboration (WOMTEC 2003) at ICCV 2003. I am also a member of the IEEE.
|
I am an avid songwriter/singer/keyboardist/guitarist, and am involved in a number of musical projects:
|
08:29:06, a solo album I released in June 2000 under the name deepoceanblue, originally available at http://www.mp3.com/deepoceanblue with over 1,700 unique downloads. [mp3.com has since gone bankrupt and ceased to exist.]
|
|
Sonovar/Bodybeat, an electronic music project experimenting with convolution and natural sounds as musical building blocks (joint work with Brian Clarkson). We received an MIT Council for the Arts Grant for this project in February 2000. Samples are available on my website.
|
|
Additional projects are listed at http://www.media.mit.edu/~sbasu/music.html |
Listed in a separate document.