DeepMind’s New Robots: An AI Revolution!

Time: 0.08

Fellow Scholars, I think you are going to love this. This is a follow-up to an amazing project

Time: 6.56

where AI agents learn to play football/soccer.  Initially, they start out like this. Well,  

Time: 13.72

learning is a strong word to use here. But now,  let’s jump 5 years in time, and look! They are  

Time: 21.76

now competent players, but wait a second, was  this really 5 years of training? Yes it was,  

Time: 29.68

however, not in our time, but in their time.  You see, if you have a powerful computer,  

Time: 36.56

you can simulate these little AIs  much, much faster than real time.
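
To get a feel for what "much faster than real time" means, here is a minimal back-of-the-envelope sketch. The 5 simulated years come from the narration; the 72 wall-clock hours are purely a hypothetical figure chosen for illustration, and `speedup_factor` is a made-up helper name. The factor counts total simulated experience, which in practice may well be gathered across many simulations running in parallel rather than one very fast one.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def speedup_factor(simulated_years: float, wall_clock_hours: float) -> float:
    """Return how many times faster than real time the simulation must run."""
    if wall_clock_hours <= 0:
        raise ValueError("wall_clock_hours must be positive")
    return simulated_years * HOURS_PER_YEAR / wall_clock_hours


# Hypothetical scenario: 5 simulated years squeezed into 72 wall-clock hours.
factor = speedup_factor(simulated_years=5, wall_clock_hours=72)
print(f"{factor:.0f}x faster than real time")  # prints "608x faster than real time"
```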

Time: 42.32

And now, we are going to look at a  similar, really exciting sim2real project,  

Time: 48.4

which means that first, the agents learn to play  in a simulation, in a computer game if you will,  

Time: 55.24

and then come out into the real world as real  robots to shoot a real ball. We talked a bit  

Time: 62.64

about football robots too before, but the new  results are truly something else. However,  

Time: 69.32

I am a little worried. I hear you asking,  Károly, why are you worried? Well, look  

Time: 76.72

at this. In this earlier software project,  there was no referee, and therefore there  

Time: 82.72

was no penalty for basically completely destroying each other, and look at that.

Time: 90.04

Absolute pandemonium ensued. Now I am worried that  these little Robot Scholars are also going to beat  

Time: 98.28

each other up really badly. So, do they? I will tell you in a moment why this is not the case.

Time: 106.12

Dear Fellow Scholars, this is Two Minute  Papers with Dr. Károly Zsolnai-Fehér.

Time: 110.64

So, first, they learn in a video game world,  and I already see that this is going to be  

Time: 117.36

quite tough. Look, 20 degrees of freedom. That  means 20 controllable joints. That gives us at  

Time: 125.44

least 20 things that can go wrong. Ouch. This is  not going to be glorious, not in the slightest!  

Time: 132.56

This means that we have to start out from humble beginnings. Controlling the arms

Time: 139.08

and the limbs of this robot will not be easy, so  much so that they first need to learn to stand,  

Time: 146.16

walk, and of course, get up after falling.  Then, basic training in football. Wow,  

Time: 155.48

this is not looking great. I wonder if enough  learning can happen here so we get to see anything  

Time: 162.08

interesting? Not sure. Then, soccer training  against an opponent. This is not too impressive  

Time: 170.08

at this point, and it will probably get even  worse upon uploading this AI into a real robot.

Time: 178.04

Especially since these poor little robots barely see anything, and when they start running,

Time: 184.68

oh my goodness. It gets even worse. So now  we see that this is a super difficult task,  

Time: 192.2

so let’s see if it can be done at all.

Time: 195.44

First, a penalty kick is simulated in the  video game world, and now, hold on to your  

Time: 201.36

papers Fellow Scholars, and…yes! That was a  good kick! Finally! Learning is happening!

Time: 210.96

But this is when things are going well. So what  if things are not going well at all? Well, after  

Time: 218.28

a bit more learning, look how robust it became  against perturbations. What does that mean? Well,  

Time: 225.48

this is a fancy way of saying that there was  lots of fun to be had in the lab that day. And  

Time: 231.76

it can recover from all this. Wait! Not only  that, but it can even get up and score! Bravo!

Time: 240.72

But it gets worse. I know that for a fact. Why  do I know that for a fact? Because I had the  

Time: 247.96

huge honor of recording this piece of footage  myself in person at the Google DeepMind lab.  

Time: 254.56

Look at that! Ouch! I did not expect that at  all, and it really shows how these robots can  

Time: 262.04

fail in the most spectacular manner. Wow. And,  a big thank you to Google DeepMind for the trip.

Time: 269.36

So, are they destroying each other? Not quite. Why not? Because even though we don’t have

Time: 276.92

a referee in person, they were told that the rule  is that if you get too close to the other robot,  

Time: 283.44

you get a penalty. Whew! So we can expect peaceful matches after the insanity of the previous work.

Time: 292.52

And then, something incredible happened. After  playing more and more against themselves,  

Time: 299

they learned new things, and now they  have 7 absolutely incredible new skills.

Time: 305.96

One, they were also advised to avoid high knee  torques during the video game phase. Essentially,  

Time: 312.92

they were taught to take it easy or  otherwise that knee is not going to  

Time: 318.16

make it. I love that as soon as they are  locked into a somewhat human-like body,  

Time: 324.4

they now have knee pain of sorts.  Welcome to the real world, little AI!  

Time: 330.16

And, over time, yes! They learned to  walk and run a bit softer. So good!

Time: 336.6

Two, they can now kick a moving ball. Three,  they learned to block the other robot’s shot  

Time: 344.04

with their bodies, that is amazing. Four, they can now turn much better, and look. Five, they can also

Time: 353.32

do something that I did not expect at all. And  that is getting up…but from the back. So cool!

Time: 360.6

Six, they can also anticipate what is about to happen and position themselves

Time: 366.68

accordingly. And seven, their ball control  skills are also a thing of beauty. Bravo!
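
The two training-time rules mentioned earlier, the penalty for getting too close to the other robot and the advice to avoid high knee torques, can be pictured as extra terms subtracted from the reward in the simulation. The sketch below is only an illustration of that idea, not the reward actually used in the project: the function name `shaped_reward`, every threshold and weight, and the flat form of the proximity penalty are all assumptions.

```python
import math
from typing import Sequence, Tuple


def shaped_reward(
    task_reward: float,
    knee_torques: Sequence[float],
    own_xy: Tuple[float, float],
    opponent_xy: Tuple[float, float],
    torque_limit: float = 2.0,       # hypothetical "too high" torque threshold
    min_distance: float = 0.3,       # hypothetical "too close" distance
    torque_weight: float = 0.1,      # hypothetical weight of the torque term
    proximity_penalty: float = 1.0,  # hypothetical flat penalty
) -> float:
    """Task reward minus illustrative knee-torque and proximity penalties."""
    # Penalize only the part of each knee torque that exceeds the limit.
    excess = sum(max(0.0, abs(t) - torque_limit) for t in knee_torques)
    reward = task_reward - torque_weight * excess

    # Subtract a flat penalty when the robot is too close to its opponent.
    if math.dist(own_xy, opponent_xy) < min_distance:
        reward -= proximity_penalty
    return reward


# Gentle knees and a safe distance leave the task reward untouched.
print(shaped_reward(1.0, [1.0, -1.5], (0.0, 0.0), (1.0, 0.0)))  # prints 1.0
```

With terms like these, an agent that stomps around or crowds its opponent earns less than one that moves softly and keeps its distance, which matches the behavior described above.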

Time: 374.48

And now, here comes the absolute crown jewel  result. The manufacturer of this robot has  

Time: 381.68

little handcrafted scripts, programs that  control the robot for these movements. This  

Time: 387.36

is what it looks like. These are carefully  crafted by engineers who know these models  

Time: 393.16

really well. And now, let’s compare it to  their learned behavior in this project.  

Time: 399.24

Oh my goodness. This is so much better!  And they learned all this by themselves.

Time: 406.16

The AI agent now walks and turns 2-3 times faster,  

Time: 411.4

takes only close to half as much time  to get up, and get this Fellow Scholars,  

Time: 417.28

it even kicks a ball 34% faster than this  manufacturer baseline. Holy mother of papers!  

Time: 425.92

This is an absolutely incredible paper and I feel  honored to witness it together with you. Wow.

Time: 434.4

And note that this training could not  have been done entirely in the real world,  

Time: 439.92

but had to start in a video game world, because real-world training would have taken too long,

Time: 445.92

and the robots could also have hurt  themselves in the process. But,  

Time: 450.88

after learning in a video game, we get this  absolute miracle. And remember, in the game,  

Time: 457.72

they can put in years and years of training  in just a few hours. What a time to be alive!
