At Honda, I provided technical leadership in Machine Learning and AI. I worked on urban autonomous driving, with a focus on path planning and control using sensor data and high-resolution maps. Other past work includes car-related projects in Spoken Dialog, Belief Tracking, and Hybrid Fuel Efficiency, and projects for the ASIMO humanoid robot in Knowledge Discovery, Natural Language Processing, and SLAM.
Team lead for a free-form dialog system for setting the destination in a
car. We built a belief maintenance and update system to track user goals
during spoken interaction.
Based on this belief, the system asked additional questions or performed the requested action.
We developed Dynamic Probabilistic Ontology Trees
(POT), a new probabilistic model to track dialog state. Our model
captured both the user goal and the history of user dialog acts using
a unified Bayesian Network. We performed efficient inference using a
form of blocked Gibbs sampling designed to exploit the structure of
the model.
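The blocked-Gibbs idea can be illustrated on a much smaller model than the POT itself. Below is a minimal sketch with invented distributions and variable names, not the published model: a latent goal G, per-turn latent dialog acts D_t that depend on G, and noisy observations O_t. Because the dialog acts are conditionally independent given the goal, the whole block of acts can be resampled in one pass, which is the structural trick blocked Gibbs exploits.

```python
import random

# Toy model (NOT the POT model; a minimal stand-in):
# goal G in {0,1}, dialog acts D_t depending on G,
# noisy observations O_t depending on D_t.
P_D_GIVEN_G = 0.9   # P(D_t == G)
P_O_GIVEN_D = 0.8   # P(O_t == D_t)

def sample_goal(d_acts, rng):
    """Sample G given all dialog acts (its Markov blanket)."""
    w = []
    for g in (0, 1):
        p = 1.0  # uniform prior over goals
        for d in d_acts:
            p *= P_D_GIVEN_G if d == g else 1 - P_D_GIVEN_G
        w.append(p)
    return 0 if rng.random() < w[0] / (w[0] + w[1]) else 1

def sample_act_block(g, obs, rng):
    """Blocked step: the D_t are conditionally independent given G,
    so the whole block is sampled in one pass."""
    block = []
    for o in obs:
        w = []
        for d in (0, 1):
            p = P_D_GIVEN_G if d == g else 1 - P_D_GIVEN_G
            p *= P_O_GIVEN_D if o == d else 1 - P_O_GIVEN_D
            w.append(p)
        block.append(0 if rng.random() < w[0] / (w[0] + w[1]) else 1)
    return block

def posterior_goal(obs, sweeps=20000, burn=1000, seed=0):
    """Estimate P(G=0 | observations) by alternating the two steps."""
    rng = random.Random(seed)
    g, d = 0, list(obs)
    hits = 0
    for i in range(sweeps):
        g = sample_goal(d, rng)
        d = sample_act_block(g, obs, rng)
        if i >= burn and g == 0:
            hits += 1
    return hits / (sweeps - burn)
```

For two observations agreeing with goal 0, the sampler's estimate converges to the exact posterior (about 0.89 under these toy parameters).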
Later, we combined this POT semantic belief tracker for categorical
concepts with a kernel density estimator. The kernel density estimator
incorporated landmark evidence into a posterior
probability over candidate destination locations. We demonstrated our
system via an Android app.
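A kernel density estimator of this kind fits in a few lines. The 2D coordinates, Gaussian kernel, and bandwidth below are illustrative assumptions, not the deployed system:

```python
import math

def kde_posterior(candidates, landmark_obs, bandwidth=1.0):
    """Score each candidate destination by the kernel density of
    landmark evidence points around it, then normalize to a
    posterior over candidates."""
    scores = []
    for cx, cy in candidates:
        s = 0.0
        for lx, ly in landmark_obs:
            d2 = (cx - lx) ** 2 + (cy - ly) ** 2
            s += math.exp(-d2 / (2 * bandwidth ** 2))  # Gaussian kernel
        scores.append(s)
    z = sum(scores)
    return [s / z for s in scores]
```

Landmark mentions clustered near one candidate push nearly all posterior mass onto it; the categorical belief tracker then handles the non-spatial concepts separately.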
I also built a hybrid system for sending text messages by voice. I used
the Google Speech API to recognize general messages, falling back to
Nuance with my own language model when Google Speech confidence was
low, such as when a correction started with "No."
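The fallback logic amounts to a confidence gate. A minimal sketch, with both recognizers passed in as hypothetical callables returning (text, confidence) rather than the real Google or Nuance APIs, and an illustrative threshold:

```python
def recognize(audio, general_asr, fallback_asr, threshold=0.6):
    """Confidence-gated recognition: try the general-purpose
    recognizer first; fall back to the domain-tuned one when
    confidence is low (e.g. short corrections like "No, ...").
    Both recognizers are callables returning (text, confidence)."""
    text, confidence = general_asr(audio)
    if confidence >= threshold:
        return text, "general"
    return fallback_asr(audio)[0], "fallback"
```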
In joint work with Adam Vogel and Deepak Ramachandran, we built a
probabilistic driving route prediction system. We trained
our system with Inverse Reinforcement Learning to optimize the
battery and engine power mix for fuel efficiency.
We predicted the routes that the
driver was likely to take and probabilistically optimized engine
and battery power for them. Our approach increased
vehicle power efficiency without any hardware
modifications or change in driver behavior. We
outperformed a standard hybrid control policy, yielding an average
of 1.2% fuel savings.
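One way to sketch the idea: a MaxEnt-style distribution over candidate routes (the kind of route model IRL typically induces), and a battery/engine split chosen to minimize expected fuel under that distribution. All fuel coefficients and the quadratic battery-loss term below are invented toy numbers, not Honda's model:

```python
import math

def route_probs(route_costs, beta=1.0):
    """MaxEnt-style route distribution: lower-cost routes are
    exponentially more likely."""
    weights = [math.exp(-beta * c) for c in route_costs]
    z = sum(weights)
    return [w / z for w in weights]

def expected_fuel(battery_fraction, routes, probs):
    """Expected fuel (arbitrary units) for one battery/engine split.
    Each route is (city_km, highway_km); the battery is assumed to
    substitute for the engine only in city driving, with a quadratic
    loss term standing in for charging/discharge inefficiency."""
    total = 0.0
    for (city, hwy), p in zip(routes, probs):
        engine_city = city * (1 - battery_fraction)
        battery_km = city * battery_fraction
        fuel = 0.08 * engine_city + 0.05 * hwy + 0.01 * battery_km ** 2
        total += p * fuel
    return total

def best_split(routes, costs, grid=11):
    """Grid search over battery fractions, weighted by route probability."""
    probs = route_probs(costs)
    return min((i / (grid - 1) for i in range(grid)),
               key=lambda b: expected_fuel(b, routes, probs))
```

With two candidate routes of different city/highway mix, the optimizer picks an interior battery fraction rather than all-engine or all-battery, which is the qualitative behavior the real controller exploits.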
Team lead for design and implementation of autonomous path planning
and assistive features in a Personal Mobility Vehicle (PMV). We added a
motor and hardware modifications to a four-wheeled vehicle for
auto-steering. We developed a Region-based hierarchical model for
object delivery with mobile robots. To make tracking and decision-making
more efficient, we took advantage of the fact that only discrete
topological representations of entity locations are needed for
decision-making. We detected entities using depth and vision sensors.
We introduced a novel reinforcement learning algorithm called Smoothed
Sarsa, which learned a policy for these delivery tasks by delaying the
backup step until the uncertainty estimate of the state improved.
The state space was modeled by a Dynamic Bayesian Network and updated
using our Region-based Particle Filter.
Our experiments showed that policy search led to faster task
completion times as well as higher total rewards compared to a manually
crafted policy. Smoothed Sarsa learned a policy orders of magnitude
faster than previous policy search algorithms. We demonstrated our
algorithms on the Pioneer robot and the Player/Stage simulator.
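The delayed-backup idea can be shown as a gated Sarsa update. This is a toy, dict-based rendering of the concept with assumed parameter values, not the published Smoothed Sarsa algorithm in full:

```python
def smoothed_sarsa_update(Q, s, a, r, s2, a2, unc_prev, unc_now,
                          alpha=0.1, gamma=0.95):
    """One Sarsa backup, performed only once the state-uncertainty
    estimate has improved (unc_now < unc_prev) -- the core idea of
    delaying backups until the belief is sharper.  Q is a dict
    mapping (state, action) to value; returns True if a backup ran."""
    if unc_now >= unc_prev:
        return False  # belief not yet sharper: postpone the backup
    td = r + gamma * Q.get((s2, a2), 0.0) - Q.get((s, a), 0.0)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * td
    return True
```

Gating backups this way avoids reinforcing value estimates computed from a noisy belief state, which is what made policy search converge faster in our experiments.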
Figure: Dynamic Bayesian Network (DBN); location modeled by a discrete region variable R and a position variable.
In Knowledge Discovery, we integrated knowledge from Wikipedia, Yahoo Question/Answers, Open
Directory Project, and OpenMind to improve topic recognition. We showed a large error reduction over the
previous state of the art on the Google Answers and Switchboard datasets. Later, we extended our system to
conversations, predicting the correct dialog turn using lexical and semantic features.
Created the OpenMind Indoor Common
Sense project to collect text data from volunteers. The data was used in-house and by Intel Research, the MIT
Media Lab, and Technische Universität München.
In joint work with Ming-Hsuan Yang and Jason Meltzer, we developed a method for learning
feature descriptors for mobile robot navigation and localization.
We used small-baseline tracking in image sequences to develop feature descriptors suitable
for the challenging task of wide-baseline matching across significant
viewpoint changes. The variations in the appearance of each
feature were learned using kernel principal component analysis (KPCA)
over the image sequences. An approximate version of KPCA was
applied to reduce the computational complexity of the algorithms and
yield a compact representation. Our experiments demonstrated robustness
to wide appearance variations on non-planar surfaces, including changes
in illumination, viewpoint, scale, and geometry of the scene.
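Kernel PCA over a feature's tracked appearances can be sketched as follows. This uses exact (not approximate) KPCA with an RBF kernel over flattened patches; the kernel choice and parameters are illustrative, and a reduced-set or subsampling scheme would stand in for the approximation step the text mentions:

```python
import numpy as np

def kpca_descriptor(patches, n_components=3, gamma=0.1):
    """Learn a compact descriptor basis from tracked appearances of one
    feature (rows of `patches` are flattened image patches) via
    RBF-kernel PCA, and return each patch's coordinates in that basis."""
    X = np.asarray(patches, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    n = K.shape[0]
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one  # center in feature space
    vals, vecs = np.linalg.eigh(Kc)             # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas  # (n_patches, n_components) descriptor coordinates
```

At matching time, a new patch would be projected onto the same components by evaluating its kernel values against the training patches.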
Our system incorporated a single camera on a mobile robot
in an extended Kalman filter framework. We developed a 3D map
of the environment and determined egomotion. At the same
time, our feature descriptors were generated from the video
sequence and were used to localize the robot when it
returned to a mapped location.
In separate joint work with James Davis, we created globally consistent 3D maps from depth fields using an active
structured-light space-time stereo system. We implemented a point-to-point variant of Iterative Closest Point (ICP) for local
alignment, with a novel outlier rejection strategy, to create a 3D map of a room.
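A point-to-point ICP step with distance-based outlier rejection can be sketched as below. The median-based rejection rule is a generic choice for illustration, not necessarily the strategy from that work, and the brute-force nearest-neighbour search is for clarity only:

```python
import numpy as np

def icp_point_to_point(src, dst, iters=20, reject_mult=2.5):
    """Point-to-point ICP with simple outlier rejection: pairs farther
    than reject_mult * median distance are dropped each iteration.
    `src`, `dst` are (N,3)/(M,3) arrays; returns R, t aligning src to dst."""
    R, t = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        # nearest-neighbour correspondences (brute force, for clarity)
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        nn = d2.argmin(axis=1)
        dists = np.sqrt(d2[np.arange(len(cur)), nn])
        keep = dists <= reject_mult * np.median(dists)  # reject outliers
        p, q = cur[keep], dst[nn[keep]]
        # closed-form rigid alignment of the kept pairs (Kabsch / SVD)
        pc, qc = p - p.mean(0), q - q.mean(0)
        U, _, Vt = np.linalg.svd(pc.T @ qc)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R_step = Vt.T @ D @ U.T
        t_step = q.mean(0) - R_step @ p.mean(0)
        cur = cur @ R_step.T + t_step
        R, t = R_step @ R, R_step @ t + t_step  # compose with prior steps
    return R, t
```

Chaining such local alignments and then closing loops is what turns pairwise registrations into a globally consistent map.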
I wrote software for the Graphics and Modeling group at Schlumberger Austin Product Center. I worked on the 3D Common Modeler project for creation and
visualization of geometric models of geological structures, which find applications in oil exploration. These structures consist of faults and horizons with water-tight geometric partitioning. We created models by extracting fault and horizon surfaces from field data. The surfaces were triangulated and intersected topologically to compute volume properties. I also worked on volume visualization of attribute data.
I also managed an offshore team of 12 developers working on software components during my last two years at Schlumberger.
At MIT, I designed and implemented software for the Officer of the Deck training task in the Navy-sponsored
"Virtual Environment Technology for Training" project. We developed a C++ object-oriented framework and a
submarine dynamics model with members of the team at BBN Inc. Our system communicated via sockets
with a speech recognition system, a head-mounted display, and a Beachtron sound spatializer. I wrote a user-level
device driver and low-level parallel communication software, and maintained and enhanced the lab's C software
system embedded in Tcl/Tk.
As part of my Ph.D. work, I developed a multimodal Virtual Environment system with visual, haptic, and auditory displays. I built a unified physically-based simulation for modeling dynamic interactions among objects. The human designer sensed the virtual environment through multimodal displays and controlled it through a haptic interface: the designer saw a visual representation of the objects, heard sounds when objects hit one another, and felt the objects through haptic interface devices with force feedback.
I conducted experiments with human subjects using a physical 2D peg-in-hole apparatus and a simulation of the same apparatus. The simulation duplicated, as closely as possible, the weight, shape, size, peg-hole clearance, and frictional characteristics of the physical apparatus. My experiments showed that the multimodal VE replicated experimental results in which task completion times increased with task difficulty, where difficulty was varied through increased friction and handling distance combined with decreased peg-hole clearance.
Configuration of the Multimodal Virtual Environment system
For my Master's thesis, I developed a shape representation and algorithms for the simulation and shape synthesis of kinematic higher pairs. Given the shapes of the two members, their initial configuration, and the motion of one member (the driver), the motion of the other member (the driven) is determined via simulation.
I described two procedures for shape synthesis. Given the shape of one member and the desired functional kinematic relationship between the pair, the first procedure generated the shape of the second member from a blank of material. The second procedure modified the shapes of two interacting members to change their contact characteristics during a kinematic interaction.