At Honda, I provided technical leadership in Machine Learning and AI. I worked on urban autonomous driving, with a focus on path planning and control using sensor data and high-resolution maps. Other past work includes car-related projects in Spoken Dialog, Belief Tracking, and Hybrid Fuel Efficiency, and projects for the ASIMO humanoid robot in Knowledge Discovery, Natural Language Processing, and SLAM.
Team lead for a free-form dialog system for setting the destination in a
car. We built a belief maintenance and update system to track user goals
during spoken interaction.
Based on this belief, the system asked additional questions or performed the requested action.
We developed Dynamic Probabilistic Ontology Trees
(POT), a new probabilistic model to track dialog state. Our model
captured both the user goal and the history of user dialog acts using
a unified Bayesian Network. We performed efficient inference using a
form of blocked Gibbs sampling designed to exploit the structure of
the model.
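The blocked-Gibbs idea can be illustrated on a much smaller model than the POT itself. Below is a minimal sketch with invented distributions and variable names, not the published model: a latent goal G, per-turn latent dialog acts D_t that depend on G, and noisy observations O_t. Because the dialog acts are conditionally independent given the goal, the whole block of acts can be resampled in one pass, which is the structural trick blocked Gibbs exploits.

```python
import random

# Toy model (NOT the POT model; a minimal stand-in):
# goal G in {0,1}, dialog acts D_t depending on G,
# noisy observations O_t depending on D_t.
P_D_GIVEN_G = 0.9   # P(D_t == G)
P_O_GIVEN_D = 0.8   # P(O_t == D_t)

def sample_goal(d_acts, rng):
    """Sample G given all dialog acts (its Markov blanket)."""
    w = []
    for g in (0, 1):
        p = 1.0  # uniform prior over goals
        for d in d_acts:
            p *= P_D_GIVEN_G if d == g else 1 - P_D_GIVEN_G
        w.append(p)
    return 0 if rng.random() < w[0] / (w[0] + w[1]) else 1

def sample_act_block(g, obs, rng):
    """Blocked step: the D_t are conditionally independent given G,
    so the whole block is sampled in one pass."""
    block = []
    for o in obs:
        w = []
        for d in (0, 1):
            p = P_D_GIVEN_G if d == g else 1 - P_D_GIVEN_G
            p *= P_O_GIVEN_D if o == d else 1 - P_O_GIVEN_D
            w.append(p)
        block.append(0 if rng.random() < w[0] / (w[0] + w[1]) else 1)
    return block

def posterior_goal(obs, sweeps=20000, burn=1000, seed=0):
    """Estimate P(G=0 | observations) by alternating the two steps."""
    rng = random.Random(seed)
    g, d = 0, list(obs)
    hits = 0
    for i in range(sweeps):
        g = sample_goal(d, rng)
        d = sample_act_block(g, obs, rng)
        if i >= burn and g == 0:
            hits += 1
    return hits / (sweeps - burn)
```

For two observations agreeing with goal 0, the sampler's estimate converges to the exact posterior (about 0.89 under these toy parameters).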
Later, we combined this POT semantic belief tracker for categorical
concepts with a kernel density estimator. The kernel density estimator
incorporated landmark evidence into a posterior
probability over candidate destination locations. We demonstrated our
system via an Android app.
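A kernel density estimator of this kind fits in a few lines. The 2D coordinates, Gaussian kernel, and bandwidth below are illustrative assumptions, not the deployed system:

```python
import math

def kde_posterior(candidates, landmark_obs, bandwidth=1.0):
    """Score each candidate destination by the kernel density of
    landmark evidence points around it, then normalize to a
    posterior over candidates."""
    scores = []
    for cx, cy in candidates:
        s = 0.0
        for lx, ly in landmark_obs:
            d2 = (cx - lx) ** 2 + (cy - ly) ** 2
            s += math.exp(-d2 / (2 * bandwidth ** 2))  # Gaussian kernel
        scores.append(s)
    z = sum(scores)
    return [s / z for s in scores]
```

Landmark mentions clustered near one candidate push nearly all posterior mass onto it; the categorical belief tracker then handles the non-spatial concepts separately.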
I also built a hybrid system for sending text messages by voice. I used
the Google Speech API to recognize general messages, falling back to
Nuance with my own language model when Google Speech confidence was
low, such as when a correction started with "No."
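The fallback logic amounts to a confidence gate. A minimal sketch, with both recognizers passed in as hypothetical callables returning (text, confidence) rather than the real Google or Nuance APIs, and an illustrative threshold:

```python
def recognize(audio, general_asr, fallback_asr, threshold=0.6):
    """Confidence-gated recognition: try the general-purpose
    recognizer first; fall back to the domain-tuned one when
    confidence is low (e.g. short corrections like "No, ...").
    Both recognizers are callables returning (text, confidence)."""
    text, confidence = general_asr(audio)
    if confidence >= threshold:
        return text, "general"
    return fallback_asr(audio)[0], "fallback"
```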
In joint work with Adam Vogel and Deepak Ramachandran, we built a
probabilistic driving route prediction system. We trained
our system with Inverse Reinforcement Learning to optimize the
battery and engine power mix for fuel efficiency.
We predicted the routes that the
driver was likely to take and probabilistically optimized engine
and battery power for them. Our approach increased
vehicle power efficiency without any hardware
modifications or change in driver behavior. We
outperformed a standard hybrid control policy, yielding an average
of 1.2% fuel savings.
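One way to sketch the idea: a MaxEnt-style distribution over candidate routes (the kind of route model IRL typically induces), and a battery/engine split chosen to minimize expected fuel under that distribution. All fuel coefficients and the quadratic battery-loss term below are invented toy numbers, not Honda's model:

```python
import math

def route_probs(route_costs, beta=1.0):
    """MaxEnt-style route distribution: lower-cost routes are
    exponentially more likely."""
    weights = [math.exp(-beta * c) for c in route_costs]
    z = sum(weights)
    return [w / z for w in weights]

def expected_fuel(battery_fraction, routes, probs):
    """Expected fuel (arbitrary units) for one battery/engine split.
    Each route is (city_km, highway_km); the battery is assumed to
    substitute for the engine only in city driving, with a quadratic
    loss term standing in for charging/discharge inefficiency."""
    total = 0.0
    for (city, hwy), p in zip(routes, probs):
        engine_city = city * (1 - battery_fraction)
        battery_km = city * battery_fraction
        fuel = 0.08 * engine_city + 0.05 * hwy + 0.01 * battery_km ** 2
        total += p * fuel
    return total

def best_split(routes, costs, grid=11):
    """Grid search over battery fractions, weighted by route probability."""
    probs = route_probs(costs)
    return min((i / (grid - 1) for i in range(grid)),
               key=lambda b: expected_fuel(b, routes, probs))
```

With two candidate routes of different city/highway mix, the optimizer picks an interior battery fraction rather than all-engine or all-battery, which is the qualitative behavior the real controller exploits.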
Team lead for design and implementation of autonomous path planning
and assistive features in a Personal Mobility Vehicle (PMV). We added a
motor and hardware modifications to a four-wheeled vehicle for
auto-steering. We developed a Region-based hierarchical model for
object delivery with mobile robots. To make tracking and decision-making
more efficient, we took advantage of the fact that only discrete
topological representations of entity locations are needed for
decision-making. We detected entities using depth and vision sensors.
We introduced a novel reinforcement learning algorithm called Smoothed
Sarsa, which learned a policy for these delivery tasks by delaying the
backup step until the uncertainty estimate of the state improved.
The state space was modeled by a Dynamic Bayesian Network and updated
using our Region-based Particle Filter.
Our experiments showed that policy search led to faster task
completion times as well as higher total rewards compared to a manually
crafted policy. Smoothed Sarsa learned a policy orders of magnitude
faster than previous policy search algorithms. We demonstrated our
algorithms on the Pioneer robot and the Player/Stage simulator.
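The delayed-backup idea can be shown as a gated Sarsa update. This is a toy, dict-based rendering of the concept with assumed parameter values, not the published Smoothed Sarsa algorithm in full:

```python
def smoothed_sarsa_update(Q, s, a, r, s2, a2, unc_prev, unc_now,
                          alpha=0.1, gamma=0.95):
    """One Sarsa backup, performed only once the state-uncertainty
    estimate has improved (unc_now < unc_prev) -- the core idea of
    delaying backups until the belief is sharper.  Q is a dict
    mapping (state, action) to value; returns True if a backup ran."""
    if unc_now >= unc_prev:
        return False  # belief not yet sharper: postpone the backup
    td = r + gamma * Q.get((s2, a2), 0.0) - Q.get((s, a), 0.0)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * td
    return True
```

Gating backups this way avoids reinforcing value estimates computed from a noisy belief state, which is what made policy search converge faster in our experiments.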
Figure: Dynamic Bayesian Network (DBN); location modeled by a discrete region variable R and a position variable.
In Knowledge Discovery, we integrated knowledge from Wikipedia, Yahoo Question/Answers, Open
Directory Project, and OpenMind to improve topic recognition. We showed a large error reduction over the
previous state of the art on the Google Answers and Switchboard datasets. Later, we extended our system to
conversations, predicting the correct dialog turn using lexical and semantic features.
Created the OpenMind Indoor Common
Sense project to collect text data from volunteers. The data was used in-house and by Intel Research, the MIT
Media Lab, and Technische Universität München.
In joint work with Ming-Hsuan Yang and Jason Meltzer, we developed a method for learning
feature descriptors for mobile robot navigation and localization.
We used small-baseline tracking in image sequences to develop feature descriptors suitable
for the challenging task of wide-baseline matching across significant
viewpoint changes. The variations in the appearance of each
feature were learned using kernel principal component analysis (KPCA)
over the image sequences. An approximate version of KPCA was
applied to reduce the computational complexity of the algorithms and
yield a compact representation. Our experiments demonstrated robustness
to wide appearance variations on non-planar surfaces, including changes
in illumination, viewpoint, scale, and geometry of the scene.
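Kernel PCA over a feature's tracked appearances can be sketched as follows. This uses exact (not approximate) KPCA with an RBF kernel over flattened patches; the kernel choice and parameters are illustrative, and a reduced-set or subsampling scheme would stand in for the approximation step the text mentions:

```python
import numpy as np

def kpca_descriptor(patches, n_components=3, gamma=0.1):
    """Learn a compact descriptor basis from tracked appearances of one
    feature (rows of `patches` are flattened image patches) via
    RBF-kernel PCA, and return each patch's coordinates in that basis."""
    X = np.asarray(patches, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))
    n = K.shape[0]
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one  # center in feature space
    vals, vecs = np.linalg.eigh(Kc)             # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas  # (n_patches, n_components) descriptor coordinates
```

At matching time, a new patch would be projected onto the same components by evaluating its kernel values against the training patches.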
Our system incorporated a single camera on a mobile robot
in an extended Kalman filter framework. We developed a 3D map
of the environment and determined egomotion. At the same
time, our feature descriptors were generated from the video
sequence and were used to localize the robot when it
returned to a mapped location.
In separate joint work with James Davis, we created globally consistent 3D maps from depth fields using an active
structured-light space-time stereo system. We implemented a point-to-point variant of Iterative Closest Point (ICP) for local
alignment, with a novel outlier rejection strategy, to create a 3D map of a room.
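A point-to-point ICP step with distance-based outlier rejection can be sketched as below. The median-based rejection rule is a generic choice for illustration, not necessarily the strategy from that work, and the brute-force nearest-neighbour search is for clarity only:

```python
import numpy as np

def icp_point_to_point(src, dst, iters=20, reject_mult=2.5):
    """Point-to-point ICP with simple outlier rejection: pairs farther
    than reject_mult * median distance are dropped each iteration.
    `src`, `dst` are (N,3)/(M,3) arrays; returns R, t aligning src to dst."""
    R, t = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        # nearest-neighbour correspondences (brute force, for clarity)
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        nn = d2.argmin(axis=1)
        dists = np.sqrt(d2[np.arange(len(cur)), nn])
        keep = dists <= reject_mult * np.median(dists)  # reject outliers
        p, q = cur[keep], dst[nn[keep]]
        # closed-form rigid alignment of the kept pairs (Kabsch / SVD)
        pc, qc = p - p.mean(0), q - q.mean(0)
        U, _, Vt = np.linalg.svd(pc.T @ qc)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R_step = Vt.T @ D @ U.T
        t_step = q.mean(0) - R_step @ p.mean(0)
        cur = cur @ R_step.T + t_step
        R, t = R_step @ R, R_step @ t + t_step  # compose with prior steps
    return R, t
```

Chaining such local alignments and then closing loops is what turns pairwise registrations into a globally consistent map.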
I wrote software for the Graphics and Modeling group at Schlumberger Austin Product Center. I worked on the 3D Common Modeler project for creation and
visualization of geometric models of geological structures, which find applications in oil exploration. These structures consist of faults and horizons with water-tight geometric partitioning. We created models by extracting fault and horizon surfaces from field data. The surfaces were triangulated and intersected topologically to compute volume properties. I also worked on volume visualization of attribute data.
I also managed an offshore team of 12 developers working on software components during my last two years at Schlumberger.
At MIT, I designed and implemented software for the Officer of the Deck training task in the Navy-sponsored
"Virtual Environment Technology for Training" project. We developed a C++ object-oriented framework and a
submarine dynamics model with members of the team at BBN Inc. Our system communicated via sockets
with a speech recognition system, a head-mounted display, and a Beachtron sound spatializer. I wrote a user-level
device driver and low-level parallel communication software, and maintained and enhanced the lab's C software
system embedded in Tcl/Tk.
As part of my Ph.D. work, I developed a multimodal Virtual Environment system with visual, haptic, and auditory displays. I built a unified physically-based simulation for modeling dynamic interactions among objects. The human designer sensed the virtual environment through multimodal displays and controlled it through a haptic interface: the designer saw a visual representation of the objects, heard sounds when objects hit one another, and felt the objects through haptic interface devices with force feedback.
I conducted experiments with human subjects using a physical 2D peg-in-hole apparatus and a simulation of the same apparatus. The simulation duplicated, as closely as possible, the weight, shape, size, peg-hole clearance, and frictional characteristics of the physical apparatus. My experiments showed that the multimodal VE replicated experimental results in which task completion times increased with task difficulty, where difficulty was varied through increased friction and handling distance combined with decreased peg-hole clearance.
Configuration of the Multimodal Virtual Environment system
For my Master's thesis, I developed a shape representation and algorithms for the simulation and shape synthesis of kinematic higher pairs. Given the shapes of the two members, their initial configuration, and the motion of one member (the driver), the motion of the other member (the driven) is determined via simulation.
I described two procedures for shape synthesis. Given the shape of one member and the desired functional kinematic relationship between the pair, the first procedure generated the shape of the second member from a blank of material. The second procedure modified the shapes of two interacting members to change their contact characteristics during a kinematic interaction.