Google DeepMind has unveiled a significant advance in robotics that could transform how machines interact with the physical world. The company’s latest AI models enable robots to tackle far more intricate tasks while drawing on digital resources like Google Search to inform their actions. The new system also allows robots to share learned skills with other machines, regardless of their design or configuration. The progress centers on Gemini Robotics 1.5 and Gemini Robotics-ER 1.5, the latest versions of DeepMind’s robotics-oriented AI models. The ‘ER’ in the latter stands for embodied reasoning: the model’s capacity to assess situations, make informed decisions, and act in physical contexts with foresight.
These developments build on the original Gemini Robotics models launched in March of this year. Carolina Parada, head of robotics at Google DeepMind, stressed the importance of the advance, noting that robots can now handle practical household tasks such as sorting laundry by color or packing a suitcase to suit the weather in London. ‘With this update, we’re moving from simple instructions to actual comprehension and problem-solving for physical tasks,’ Parada remarked, as reported by The Verge. One of the most notable new capabilities is the robots’ ability to consult the internet for help.
For instance, a robot tasked with sorting waste into recyclables, compost, and trash can look up local recycling regulations online before deciding where each item belongs. Previously, robots were adept at carrying out single, predefined tasks but could not adapt to multi-step challenges. Here’s how the system works: when given a command, the robot first uses Gemini Robotics-ER 1.5 to assess its environment and, if necessary, consult online resources such as Google Search. The information it gathers is then converted into clear, step-by-step natural-language instructions for Gemini Robotics 1.5, which carries out the plan in the physical world. This division of labor between the two models lets robots both understand a task and act on it intelligently.
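The pattern described here (a reasoning model that plans and calls tools, feeding natural-language steps to an action model) can be sketched in a few lines of Python. Everything below is illustrative: the function names, the canned search result, and the sorting logic are placeholders standing in for the two Gemini models and the Google Search tool, not DeepMind’s actual interfaces.

```python
# A minimal, hypothetical sketch of the two-model loop described above.
# All names are illustrative placeholders, not Google's actual SDK:
# plan_with_er_model stands in for Gemini Robotics-ER 1.5, and
# execute_with_vla_model stands in for Gemini Robotics 1.5.

def web_search(query: str) -> str:
    """Stand-in for the Google Search tool the reasoning model can call."""
    # Canned answer so the sketch runs offline; the real model would
    # issue this query itself as a tool call.
    return "Locally, glass is recyclable and food scraps go to compost."

def plan_with_er_model(task: str, scene: list[str]) -> list[str]:
    """Role of Gemini Robotics-ER 1.5: assess the scene, consult outside
    information if needed, and emit step-by-step natural-language orders."""
    guidance = web_search(f"local disposal rules for: {task}")
    steps = []
    for item in scene:
        if "glass" in item and "glass is recyclable" in guidance:
            bin_name = "recycling"
        elif item == "banana peel" and "compost" in guidance:
            bin_name = "compost"
        else:
            bin_name = "trash"
        steps.append(f"Place the {item} in the {bin_name} bin.")
    return steps

def execute_with_vla_model(step: str) -> None:
    """Role of Gemini Robotics 1.5: turn one natural-language instruction
    into motor actions. Here we only log the instruction."""
    print(f"[robot] {step}")

if __name__ == "__main__":
    scene = ["glass bottle", "banana peel", "candy wrapper"]
    for step in plan_with_er_model("sort this waste", scene):
        execute_with_vla_model(step)
```

The notable design choice the article describes is the clean seam between the two roles: the planner’s only output is plain language, which is what allows the executing model to remain agnostic about where the plan came from.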
Perhaps the most groundbreaking element is how these models enable skill sharing among robots. DeepMind engineers demonstrated that a task learned by an ALOHA2 robot, which is equipped with dual mechanical arms, could also be executed effectively by an Apptronik Apollo humanoid robot. ‘This enables us to achieve two goals,’ explained Google DeepMind engineer Kanishka Rao. ‘First, we can control very different robots — including humanoids — with a single model. Secondly, skills acquired by one robot can now be transferred to another.’ The potential implications are immense: a future in which the experience of one robot immediately enhances the abilities of many others could hasten the adoption of robots in homes, factories, and even healthcare settings.
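To make the ‘one model, many bodies’ idea concrete, here is a loose, hypothetical sketch of what such an interface could look like: a single policy object conditioned on a per-robot embodiment description. The class names, field names, and joint counts are invented for illustration; nothing here reflects DeepMind’s actual architecture.

```python
# Hypothetical illustration of the "one model, many bodies" idea: a single
# policy checkpoint paired with a per-robot embodiment description, so a
# skill learned on one platform can be invoked unchanged on another.
# All names and joint counts below are assumptions, not real specs.

from dataclasses import dataclass

@dataclass(frozen=True)
class Embodiment:
    name: str
    arms: int
    joints_per_arm: int  # assumed figures for illustration only

ALOHA2 = Embodiment("ALOHA2", arms=2, joints_per_arm=6)
APOLLO = Embodiment("Apptronik Apollo", arms=2, joints_per_arm=7)

class SharedPolicy:
    """One checkpoint; the embodiment spec conditions how actions decode."""

    def perform(self, skill: str, body: Embodiment) -> None:
        # A real model would emit joint targets; we only show that the
        # same skill string maps onto differently shaped action spaces.
        dof = body.arms * body.joints_per_arm
        print(f"{body.name}: '{skill}' -> {dof}-dimensional action vector")

policy = SharedPolicy()
for body in (ALOHA2, APOLLO):
    policy.perform("fold the shirt", body)  # same skill, two bodies
```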
For now, these enhancements mark a clear step in the evolution of AI, moving robots beyond basic command-following and into the domain of genuine understanding and collaboration.