27. Oktober 2019
Open AI trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand
The neural networks are trained entirely in simulation, using the same reinforcement learning code as OpenAI Five paired with a new technique called Automatic Domain Randomization (ADR). The system can handle situations it never saw during training, such as being prodded by a stuffed giraffe. This shows that reinforcement learning isn’t just a tool for virtual tasks, but can solve physical-world problems requiring unprecedented dexterity.
Human hands let us solve a wide variety of tasks. For the past 60 years of robotics, hard tasks which humans accomplish with their fixed pair of hands have required designing a custom robot for each task. As an alternative, people have spent many decades trying to use general-purpose robotic hardware, but with limited success due to their high degrees of freedom. In particular, the hardware we use here is not new—the robot hand we use has been around for the last 15 years—but the software approach is.
Since May 2017, we’ve been trying to train a human-like robotic hand to solve the Rubik’s Cube. We set this goal because we believe that successfully training such a robotic hand to do complex manipulation tasks lays the foundation for general-purpose robots. We solved the Rubik’s Cube in simulation in July 2017. But as of July 2018, we could only manipulate a block on the robot. Now, we’ve reached our initial goal. Solving a Rubik’s Cube one-handed is a challenging task even for humans, and it takes children several years to gain the dexterity required to master it. Our robot still hasn’t perfected its technique though, as it solves the Rubik’s Cube 60% of the time (and only 20% of the time for a maximally difficult scramble).
We train neural networks to solve the Rubik’s Cube in simulation using reinforcement learning and Kociemba’s algorithm for picking the solution steps. Domain randomization enables networks trained solely in simulation to transfer to a real robot.
Images: Eric Haines