Researchers from UT-Austin and the U.S. Army Research Laboratory have teamed up to develop a new technique, called Deep TAMER, to teach robots and computer programs how to perform tasks with feedback from human trainers.
Deep TAMER — which stands for Training an Agent Manually via Evaluative Reinforcement — is a method that relies on deep learning, a class of powerful machine learning techniques that allow computers to learn by example, said Garrett Warnell, a U.S. Army researcher and visiting researcher at UT-Austin. What makes it so effective is that it allows machines to represent data in new ways, he said.
“Deep learning techniques are currently the state of the art for many tasks such as face recognition and natural language processing,” Warnell said.
This method relies on a human trainer who provides feedback to a robot in the form of a signal, such as “good job” or “bad job,” similar to how an animal trainer might train an animal, Warnell said. After receiving feedback, the robot uses deep learning to build a model that predicts which behaviors its human trainer wants.
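The idea above — an agent that learns to predict a trainer's "good job"/"bad job" signals and then acts to maximize that predicted feedback — can be sketched in a few lines. This is a deliberately minimal, hypothetical illustration (a linear model and a toy two-action task, not the deep network or the environment used in the actual Deep TAMER work):

```python
class RewardModel:
    """Toy TAMER-style learner: predicts human feedback H(state, action)
    and acts greedily with respect to that prediction.
    Linear weights stand in for the deep network of the real method."""

    def __init__(self, n_features, n_actions, lr=0.1):
        self.w = [[0.0] * n_features for _ in range(n_actions)]
        self.lr = lr

    def predict(self, state):
        # Predicted trainer feedback for each available action.
        return [sum(wi * si for wi, si in zip(row, state)) for row in self.w]

    def act(self, state):
        # Pick the action the trainer is predicted to like most.
        preds = self.predict(state)
        return preds.index(max(preds))

    def update(self, state, action, feedback):
        # Squared-error gradient step toward the trainer's actual signal.
        error = feedback - self.predict(state)[action]
        self.w[action] = [wi + self.lr * error * si
                          for wi, si in zip(self.w[action], state)]

# Toy training loop: the (simulated) trainer says "good job" (+1) only
# when the agent picks action 1 in this state, and "bad job" (-1) otherwise.
model = RewardModel(n_features=3, n_actions=2)
state = [1.0, 0.5, -0.2]
for _ in range(50):
    a = model.act(state)
    feedback = 1.0 if a == 1 else -1.0
    model.update(state, a, feedback)

print(model.act(state))  # after training, the agent prefers the rewarded action
```

The key design point, which this sketch shares with the method described in the article, is that the agent never sees a task reward from the environment: it learns only from the trainer's evaluative signals, and good behavior emerges because it chases predicted human approval.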
Although it might be tempting to draw parallels between the way the robot learns and how humans process information, the two processes are still distinct, Warnell said. The computer programs and robots might initially see images and videos just like humans do, but the way they apply this information is different.
“The algorithms and models used here were only very loosely inspired by the brain,” Warnell said. “In certain ways, they aim to emulate neurons, but in most ways, they are actually very different.”
Warnell said current state-of-the-art machine learning algorithms learn relatively slowly, but previous research has shown that human input can make these models substantially more efficient.
“Ultimately, we’d like to create algorithms that allow humans to train and retrain machines to do all kinds of things very quickly,” Warnell said.
Although most people have never interacted with autonomous robots, nearly everyone has interacted with computers in one form or another, so the experiments were designed around that familiarity. Specific experiments involved training a computer agent to play a video game, Warnell said.
“That said, it was extremely entertaining to do the training. I really felt like the computer player was listening to me and it was exciting to see it start to do better in the game because of my advice,” Warnell said.
The experience of interacting with a robot or program such as the one Warnell used varies with the individual’s expectations going in, said Peter Stone, a computer science professor who was also one of the researchers working on this project.
“If your point of reference is the real world, it can be fun and surprising to see what the robot can do,” Stone said. “If your point of reference is the movies, you’re going to be disappointed.”
One useful application for Deep TAMER would be in training autonomous robots to work alongside humans in the Army. However, it is not yet clear when that will happen or what it will look like, Warnell said.
“One specific team scenario we have been working towards is that of multiple robots assisting humans by performing reconnaissance tasks for things like search and rescue,” Warnell said.
As artificial intelligence improves, it will be critical for machines to learn from, adapt to and work with humans, Warnell said.
“One of the most fascinating things about (AI) is that it’s always new,” Stone said. “What I’m working on now is different from what I was working on five years ago, and what I’ll be working on in five years. For the goal of creating intelligent agents, there are always new challenges. It never gets boring!”