Automating the art of interruption
Today’s robots are heavily programmed and don’t learn very much. This can work well enough in factories, but if robots are going to operate in offices, schools, and homes, they must be able to adapt to particular environments and to the preferences of particular users, points out Sonia Chernova, an assistant professor in Georgia Tech’s School of Interactive Computing, who specializes in machine learning and human-robot interaction.
Among projects in Chernova’s lab, graduate students are investigating interruptibility, an area where little robotics research has been conducted.
Interactive robots are more effective if they can occasionally ask for direction or verify whether something should be done, Chernova points out. “Suppose you have a robot that cleans houses. Not everyone has the same type of house — and not everyone wants their house cleaned the same way.”
“Yet while humans are amazingly good at knowing when to interrupt, robots can be pretty rude. If robots perceive a human in the environment, they will approach, regardless of what the person is doing. And that interruption can take a toll if the person is engaged in a challenging task.”
With that in mind, the researchers are looking at when robots should interrupt a human and whom they should interrupt if multiple people are nearby.
To collect training and testing data for the project, the researchers asked five people to engage in a series of activities such as drinking coffee, talking, or working on laptops. During each session, the robot moved through a series of waypoints that enabled it to observe the group from different perspectives. Based on social cues such as body position and face gaze, along with sounds and the kinds of objects the humans were holding, the robot rated how interruptible each person was on a scale of one to four.
Four temporal machine-learning models were tested: three types of conditional random fields (CRFs) and a hidden Markov model (HMM). Of these, the researchers showed that latent dynamic CRFs (LDCRFs) did the best job of predicting interruptibility. CRFs were expected to have the upper hand because they are discriminative models, which tend to perform better on classification tasks.
Robots designed to work with humans must know the right time to ask questions. School of Interactive Computing graduate student Siddhartha Banerjee takes a phone call while a robot watches his activities to determine if he can be interrupted.
“Yet we were surprised how badly the HMMs did, because they have been used in a lot of prior work,” said graduate student Siddhartha Banerjee, who led the study. “The LDCRFs, which have hidden states, are useful in modeling unknown dependencies between what you observe and what you are trying to classify,” he explained. “So they can help you classify a sub-sequence of whatever movements are observed, and based on the classification of the sequences, you classify the actual interruptibility state.”
Another novel contribution was the use of object labels, which enabled the robot to identify objects either being used by a person or located near them. These labels provided an extra boost in performance, improving accuracy by up to 10 percent. “If a person is holding a cell phone versus a cup of coffee, it can help suggest what activity they are engaged in and how interruptible they might be,” Chernova said. “This may sound intuitive, but it’s never been demonstrated before.”
Moving forward, the researchers are now trying to gauge the impact of having interruption-savvy robots around. Among other variables, they are measuring how much better these interactive robots might perform and how much longer people might need to complete their tasks after being interrupted.
Funded by NASA, the research could have implications in joint human-robot space missions. “Figuring out when to interrupt a busy astronaut is an important consideration,” Chernova pointed out.
Smarter, safer cars
Operating safely in the external environment is especially critical for robots on wheels, and researchers in IC and the Daniel Guggenheim School of Aerospace Engineering (AE) are investigating advanced perception and control for autonomous cars.
In one project, Assistant Professor Byron Boots (IC), Professor James Rehg (IC), Assistant Professor Evangelos Theodorou (AE), and graduate students Grady Williams, Nolan Wagener, Brian Goldfain, and Paul Drews are developing technologies that enable robotic cars to perform better on rough terrain.
The small autonomous cars, known as AutoRally Robots, are about three feet long and equipped with electric motors, onboard computing resources, cameras, GPS, and a variety of sensors. “They have to be able to see the environment, calculate information about how they relate to the environment, and then make a sequence of decisions extremely quickly in order to drive as fast as possible and stay upright,” said Boots, director of Georgia Tech’s Robot Learning Lab. No easy feat, because the surface of the dirt test track causes the cars to drift and slide and have a lot of nonlinear interactions, which are tough to model.
The team combined traditional machine-learning algorithms with a control system that enabled the robotic cars to continuously sample data from sensors and generate optimal trajectories for both performance and control. “The algorithm allows the robot to predict the future,” explained Williams, lead author on two published papers. “Assessing its state, the robot tries to predict what will happen if it tries a set of actions — something it does millions of times per second.”
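The idea of predicting what will happen for many candidate action sequences can be illustrated with a minimal sketch. It is not the team's algorithm: the one-dimensional point-mass model in `step`, the cost function, and the `horizon`, `samples`, and `temperature` values are all assumptions chosen for illustration. The sketch samples random action sequences, simulates each one forward, and averages their first actions with more weight given to lower-cost futures.

```python
import math
import random

def step(state, action, dt=0.1):
    """Toy forward model (an assumption): a 1-D point mass, state = (position, velocity)."""
    position, velocity = state
    velocity += action * dt
    position += velocity * dt
    return (position, velocity)

def rollout_cost(state, actions, target):
    """Predict the future for one candidate action sequence and score it."""
    cost = 0.0
    for action in actions:
        state = step(state, action)
        # Penalize distance from the target plus a small control-effort term.
        cost += (state[0] - target) ** 2 + 0.01 * action ** 2
    return cost

def plan(state, target, horizon=15, samples=500, temperature=20.0, seed=0):
    """Sample action sequences, roll each one out, and return a cost-weighted
    average of their first actions (lower cost means higher weight)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    candidates = [[rng.uniform(-1.0, 1.0) for _ in range(horizon)]
                  for _ in range(samples)]
    costs = [rollout_cost(state, seq, target) for seq in candidates]
    best = min(costs)  # subtract the minimum cost for numerical stability
    weights = [math.exp(-(c - best) / temperature) for c in costs]
    return sum(w * seq[0] for w, seq in zip(weights, candidates)) / sum(weights)

# Replan from the newest state at every control step, then apply the first action.
state = (0.0, 0.0)
for _ in range(3):
    state = step(state, plan(state, target=5.0))
```

Replanning at every step is what lets a controller of this kind react to new sensor readings; a real vehicle would need a far richer model and many more rollouts, typically computed in parallel.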
In a more recent advance, the team replaced the physics-based model (a linear regression approach) with multi-layer neural networks to handle the prediction. “This is a significant step forward because it enables a purely data-driven approach to model learning,” said Williams. “Instead of writing down equations of motion, we apply a deep neural network, and it learns the forward model from data.”
With this approach, the researchers achieved better results, increasing speed without sacrificing control. The team’s most recent paper was presented in early 2017 at the IEEE’s International Conference on Robotics and Automation in Sweden; out of 900 submissions, it was one of five nominees for the best paper award.
The researchers are now upping the ante as they test the AutoRally cars racing against each other instead of driving solo on the track. “This becomes a much more complicated problem because not only does the car have to reason with itself and its own interactions with the environment, but also about other agents in the same environment,” said Boots.
Sponsored by the National Science Foundation (NSF), the research has applications both in the automotive industry and the military.
“When it comes to machine learning, there hasn’t been a huge amount of work done on how to apply artificial intelligence to physical systems that must go out in the world and learn to act in an optimal way,” Boots added. “We’re trying to use machine learning to take robotics to the next level.”
Mind over matter — sort of
Another relatively new sphere for machine learning is brain-machine interfaces (BMIs), a field where researchers are trying to restore movement for people who are paralyzed due to spinal cord injuries or diseases like amyotrophic lateral sclerosis (ALS).
“When paralyzed people think about moving, nothing happens because there is an interruption between their brains and bodies,” explained Chethan Pandarinath, a new assistant professor in Georgia Tech’s and Emory University’s joint Wallace H. Coulter Department of Biomedical Engineering. “But they can still think about movement, which generates electrical activity in the brain as millions of neurons start talking to each other. Our goal is to decode this activity and do something useful for people, such as restoring their ability to drink by being able to control a robotic arm and pick up a glass of water.”
In an ongoing multi-institutional clinical trial known as BrainGate, Pandarinath and other researchers have created a high-performance BMI for communications that enables people with paralysis to type words and messages. To develop the system, small electrode arrays the size of baby aspirins were implanted into the brains of three research participants with paralysis. These electrodes allowed the system to record and decode electrical activity in real time as the individuals thought about moving, which allowed them to move and click a cursor on a computer screen. This sort of cerebral typing produced six to eight words per minute — three times faster than any previous system.
The key to BrainGate’s success is running two algorithms in parallel. A Kalman filter decodes what direction the cursor should move, while a hidden Markov model decodes whether the person wants to stop and click or continue to move the cursor. Results were extremely accurate, with people clicking on their intended target more than 98 percent of the time.
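The two-decoder arrangement can be sketched in a few lines. This is an illustration under stated assumptions, not BrainGate's implementation: the scalar random-walk model in `kalman_step`, the noise values, the two-state transition matrix, and the `click_threshold` are all hypothetical. At each time step, a Kalman filter smooths a noisy one-dimensional velocity reading while an HMM forward step tracks the probability that the user intends to click.

```python
def kalman_step(estimate, variance, measurement,
                process_noise=0.01, measurement_noise=0.25):
    """One predict/update cycle of a scalar Kalman filter (random-walk model)."""
    # Predict: the velocity is assumed to persist, with growing uncertainty.
    predicted_variance = variance + process_noise
    # Update: blend the prediction with the new measurement.
    gain = predicted_variance / (predicted_variance + measurement_noise)
    estimate = estimate + gain * (measurement - estimate)
    variance = (1.0 - gain) * predicted_variance
    return estimate, variance

def hmm_step(belief, likelihoods, transition):
    """One forward-algorithm step over two hidden states: (move, click)."""
    predicted = [sum(belief[i] * transition[i][j] for i in range(2))
                 for j in range(2)]
    unnormalized = [predicted[j] * likelihoods[j] for j in range(2)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def decode(measurements, state_likelihoods, click_threshold=0.8):
    """Run both decoders side by side; yields (velocity, click?) per time step."""
    transition = [[0.9, 0.1], [0.1, 0.9]]  # assumed: states tend to persist
    estimate, variance = 0.0, 1.0
    belief = [0.5, 0.5]
    outputs = []
    for z, likelihoods in zip(measurements, state_likelihoods):
        estimate, variance = kalman_step(estimate, variance, z)
        belief = hmm_step(belief, likelihoods, transition)
        outputs.append((estimate, belief[1] > click_threshold))
    return outputs
```

Keeping the two decoders separate means the continuous estimate (where to move) and the discrete decision (whether to click) can each use a model suited to its own kind of signal.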
Although a big step forward, the point-and-click control of the computer cursor is fairly simple compared to all the things people do when they move, Pandarinath points out. “Our goal is to accurately manipulate more complex devices under brain control, such as robotic arms or ultimately the person’s own arm.”
Deep-learning techniques may pave the way for such devices. In collaboration with research scientist David Sussillo at Google Brain, Pandarinath’s team is trying to understand how recurrent neural networks (deep-learning algorithms used in language translation and speech processing) could be adapted for neuroscience.
“With our current electrode arrays we’re only recording 100 to 200 neurons, which is a small fraction of all the millions of neurons involved when people think about moving,” Pandarinath said. “Yet from this tiny snapshot, deep learning tools might help us model what the entire network is doing, which could help us restore movement.”