DeepMind's New Robots: An AI Revolution!

Fellow Scholars, I think you are going to love this. This is a follow-up to an amazing project

where AI agents learn to play football/soccer.  Initially, they start out like this. Well,  

learning is a strong word to use here. But now,  let’s jump 5 years in time, and look! They are  

now competent players, but wait a second, was  this really 5 years of training? Yes it was,  

however, not in our time, but in their time.  You see, if you have a powerful computer,  

you can simulate these little AIs  much, much faster than real time.
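
Here is a quick back-of-the-envelope sketch of how "years in their time" can fit into hours of our time. Please note that the function name and every number in it are illustrative assumptions, not figures from the paper: we simply multiply the wall-clock time by how much faster than real time one simulation runs, and by how many copies of the simulation run in parallel.

```python
HOURS_PER_YEAR = 24 * 365


def simulated_years(wall_clock_hours, speedup, parallel_envs):
    """Estimate the experience gathered, in simulated years.

    wall_clock_hours: real time spent training (hypothetical value)
    speedup: how many times faster than real time one simulation runs
    parallel_envs: number of simulations running side by side
    """
    simulated_hours = wall_clock_hours * speedup * parallel_envs
    return simulated_hours / HOURS_PER_YEAR


# Hypothetical setup: 10 hours of training, each simulation running
# 10x faster than real time, 500 simulations in parallel.
print(f"{simulated_years(10, 10, 500):.1f} simulated years")
```

With these made-up numbers, ten hours on the computer already add up to more than five years of practice for the little AIs.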

And now, we are going to look at a  similar, really exciting sim2real project,  

which means that first, the agents learn to play  in a simulation, in a computer game if you will,  

and then come out into the real world as real  robots to shoot a real ball. We talked a bit  

about football robots too before, but the new  results are truly something else. However,  

I am a little worried. I hear you asking,  Károly, why are you worried? Well, look  

at this. In this earlier software project,  there was no referee, and therefore there  

was no penalty for basically completely destroying each other, and look at that.

Absolute pandemonium ensued. Now I am worried that these little Robot Scholars are also going to beat

each other up really badly. So, do they? I will tell you in a moment why this is not the case.

Dear Fellow Scholars, this is Two Minute  Papers with Dr. Károly Zsolnai-Fehér.

So, first, they learn in a video game world,  and I already see that this is going to be  

quite tough. Look, 20 degrees of freedom. That  means 20 controllable joints. That gives us at  

least 20 things that can go wrong. Ouch. This is  not going to be glorious, not in the slightest!  

This means that we have to start out from humble beginnings. Controlling the arms

and legs of this robot will not be easy, so much so that they first need to learn to stand,

walk, and of course, get up after falling.  Then, basic training in football. Wow,  

this is not looking great. I wonder if enough  learning can happen here so we get to see anything  

interesting? Not sure. Then, soccer training  against an opponent. This is not too impressive  

at this point, and it will probably get even  worse upon uploading this AI into a real robot.

Especially since these poor little robots barely see anything, and when they start running,

oh my goodness. It gets even worse. So now  we see that this is a super difficult task,  

so let’s see if it can be done at all.

First, a penalty kick is simulated in the  video game world, and now, hold on to your  

papers Fellow Scholars, and…yes! That was a  good kick! Finally! Learning is happening!

But this is when things are going well. So what  if things are not going well at all? Well, after  

a bit more learning, look how robust it became  against perturbations. What does that mean? Well,  

this is a fancy way of saying that there was  lots of fun to be had in the lab that day. And  

it can recover from all this. Wait! Not only  that, but it can even get up and score! Bravo!

But it gets worse. I know that for a fact. Why  do I know that for a fact? Because I had the  

huge honor of recording this piece of footage  myself in person at the Google DeepMind lab.  

Look at that! Ouch! I did not expect that at  all, and it really shows how these robots can  

fail in the most spectacular manner. Wow. And,  a big thank you to Google DeepMind for the trip.

So, are they destroying each other? Not quite. Why not? Because even though we don’t have

a referee in person, they were told that the rule  is that if you get too close to the other robot,  

you get a penalty. Whew! So we can expect peaceful  matches after this insanity from a previous work.

And then, something incredible happened. After  playing more and more against themselves,  

they learned new things, and now they  have 7 absolutely incredible new skills.

One, they were also advised to avoid high knee  torques during the video game phase. Essentially,  

they were taught to take it easy or  otherwise that knee is not going to  

make it. I love that as soon as they are  locked into a somewhat human-like body,  

they now have knee pain of sorts.  Welcome to the real world, little AI!  

And, over time, yes! They learned to  walk and run a bit softer. So good!
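
To make these two penalties a bit more concrete, here is a minimal sketch of what such a shaped reward could look like. To be clear, this is not the code from the paper: the names `StepInfo` and `shaped_reward`, the thresholds, and the weights are all hypothetical and chosen only for illustration. The idea is simply that the agent gets a reward for scoring, loses some of it for getting too close to the other robot, and loses a little more whenever a knee torque goes above a comfortable limit.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class StepInfo:
    """Hypothetical summary of one simulation step."""
    scored_goal: bool
    distance_to_opponent: float      # metres
    knee_torques: Tuple[float, ...]  # newton-metres, one entry per knee


def shaped_reward(info, min_distance=0.5, torque_limit=5.0,
                  proximity_weight=1.0, torque_weight=0.01):
    """Illustrative reward: goal bonus minus two penalty terms.

    All default values are assumptions made for this sketch.
    """
    reward = 1.0 if info.scored_goal else 0.0

    # The "referee" rule: a fixed penalty for crowding the other robot.
    if info.distance_to_opponent < min_distance:
        reward -= proximity_weight

    # The "knee pain" rule: penalize only the torque above the limit,
    # quadratically, so gentle movements cost nothing.
    for torque in info.knee_torques:
        excess = max(0.0, abs(torque) - torque_limit)
        reward -= torque_weight * excess ** 2

    return reward


# A goal scored with one knee pushed 2 N·m over the assumed limit.
print(shaped_reward(StepInfo(True, 2.0, (7.0, 0.0))))
```

Because the torque term only kicks in above the limit, an agent trained against a reward like this has a reason to move more softly without being punished for moving at all.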

Two, they can now kick a moving ball. Three,  they learned to block the other robot’s shot  

with their bodies, that is amazing. Four, they can now turn much better, and look. Five, they can also

do something that I did not expect at all. And  that is getting up…but from the back. So cool!

Six, they can also anticipate what is about to happen and position themselves

accordingly. And seven, their ball control  skills are also a thing of beauty. Bravo!

And now, here comes the absolute crown jewel  result. The manufacturer of this robot has  

little handcrafted scripts, programs that  control the robot for these movements. This  

is what it looks like. These are carefully  crafted by engineers who know these models  

really well. And now, let’s compare it to  their learned behavior in this project.  

Oh my goodness. This is so much better!  And they learned all this by themselves.

The AI agent now walks and turns 2-3 times faster,  

takes only about half as much time to get up, and get this, Fellow Scholars,

it even kicks a ball 34% faster than this  manufacturer baseline. Holy mother of papers!  

This is an absolutely incredible paper and I feel  honored to witness it together with you. Wow.

And note that this training could not have been done entirely in the real world.

It had to start in a video game world because real-world training would have taken too long,

and the robots could also have hurt  themselves in the process. But,  

after learning in a video game, we get this  absolute miracle. And remember, in the game,  

they can put in years and years of training  in just a few hours. What a time to be alive!

