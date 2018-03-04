

AI Just Took a Big Step Towards Becoming More Human

They're now capable of training themselves.

DOM GALEON, FUTURISM
4 MAR 2018
 

In recent months, researchers at OpenAI have been focusing on developing artificial intelligence (AI) that learns better.

Their machine learning algorithms are now capable of training themselves, so to speak, thanks to the reinforcement learning methods of their OpenAI Baselines.

 

Now, a new algorithm lets their AI learn from its own mistakes, almost as human beings do.

The development comes from a new open-source algorithm called Hindsight Experience Replay (HER), which OpenAI researchers released earlier this week.

As its name suggests, HER helps an AI agent "look back" in hindsight, so to speak, as it completes a task. Specifically, the AI reframes failures as successes, according to OpenAI's blog.

"The key insight that HER formalizes is what humans do intuitively: Even though we have not succeeded at a specific goal, we have at least achieved a different one," the researchers wrote.

"So why not just pretend that we wanted to achieve this goal to begin with, instead of the one that we set out to achieve originally?"

Simply put, this means that every failed attempt an AI makes while working towards a goal counts as achieving another, unintended "virtual" goal.

Think back to when you learned how to ride a bike. On the first couple of tries, you actually failed to balance properly.

Even so, those attempts taught you how not to ride, and what to avoid when balancing on a bike. Every failure brought you closer to your goal, because that's how human beings learn.

 

Rewarding every failure

With HER, OpenAI wants their AI agents to learn the same way.

At the same time, this method will become an alternative to the usual rewards system involved in reinforcement learning models. For an AI to learn on its own, it has to work with a rewards system: either the AI reaches its goal and gets an algorithm "cookie" or it doesn't.

Another model gives out cookies depending on how close an AI is to achieving a goal.
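To make the two reward schemes concrete, here is a minimal Python sketch. It is an illustration only, not OpenAI's code: the one-dimensional "position", the function names, the tolerance, and the 1.0/0.0 cookie values are all assumptions chosen for this example.

```python
# Hypothetical toy setup: the agent's state is a single number (its position)
# and the goal is another number. None of these names come from OpenAI.

def sparse_reward(position: float, goal: float, tolerance: float = 0.05) -> float:
    """All-or-nothing "cookie": 1.0 only when the agent is within tolerance of the goal."""
    return 1.0 if abs(position - goal) <= tolerance else 0.0


def shaped_reward(position: float, goal: float) -> float:
    """Distance-based reward: the closer to the goal, the higher (less negative) the value."""
    return -abs(position - goal)


goal = 1.0
for position in (0.0, 0.5, 0.98):
    # The sparse reward stays at 0.0 until the agent is almost exactly on target,
    # while the shaped reward changes with every step closer.
    print(position, sparse_reward(position, goal), shaped_reward(position, goal))
```

In this toy case the distance-based reward is easy to write down. For a real robot, deciding what "close" means is much harder, which is the difficulty with the second approach.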

Neither method is perfect. The first one stalls learning, because an AI either gets it or it doesn't.

The second one, on the other hand, can be quite tricky to implement, according to IEEE Spectrum.

By treating every attempt as a goal in hindsight, HER gives an AI agent a reward even when it actually failed to accomplish the specified task. This helps the AI learn faster and more effectively.

"By doing this substitution, the reinforcement learning algorithm can obtain a learning signal since it has achieved some goal; even if it wasn't the one that you meant to achieve originally. If you repeat this process, you will eventually learn how to achieve arbitrary goals, including the goals that you really want to achieve," according to OpenAI's blog.
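The substitution described in the quote can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not OpenAI's implementation: the number-line world, the `Transition` record, the helper names, and the choice to use the final state of the episode as the substitute goal are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Transition:
    """One step of experience, tagged with the goal it was judged against."""
    state: int
    action: int
    next_state: int
    goal: int
    reward: float


def sparse_reward(achieved: int, goal: int) -> float:
    """Assumed all-or-nothing reward: 1.0 if the goal was hit, else 0.0."""
    return 1.0 if achieved == goal else 0.0


def run_episode(start: int, actions: List[int], goal: int) -> List[Transition]:
    """Walk along a number line, recording each step against the intended goal."""
    state, episode = start, []
    for action in actions:
        next_state = state + action
        episode.append(
            Transition(state, action, next_state, goal, sparse_reward(next_state, goal))
        )
        state = next_state
    return episode


def relabel_with_hindsight(episode: List[Transition]) -> List[Transition]:
    """Pretend the state actually reached at the end was the goal all along."""
    achieved = episode[-1].next_state
    return [
        replace(t, goal=achieved, reward=sparse_reward(t.next_state, achieved))
        for t in episode
    ]


# The agent wanted to reach 5 but only got to 3, so every original reward is 0.0.
episode = run_episode(start=0, actions=[1, 1, 1], goal=5)
hindsight = relabel_with_hindsight(episode)

# Storing both versions gives the learner at least one rewarded transition.
replay_buffer = episode + hindsight
print([t.reward for t in episode])    # [0.0, 0.0, 0.0]
print([t.reward for t in hindsight])  # [0.0, 0.0, 1.0]
```

The failed episode on its own carries no learning signal. The relabeled copy does, because the agent really did reach position 3, even though that was not the goal it was given.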

Here's an example of how HER works with OpenAI's Fetch simulation.

This doesn't mean that HER makes it entirely easy for AI agents to learn specific tasks.

"Learning with HER on real robots is still hard since it still requires a significant amount of samples," OpenAI's Matthias Plappert told IEEE Spectrum.

In any case, as OpenAI's simulations demonstrated, HER can be quite helpful at "encouraging" AI agents to learn even from their mistakes, pretty much as we all do. The major difference is that AIs don't get frustrated like the rest of us feeble folks.

This article was originally published by Futurism.

 
