
Google's DeepMind Reveals How Robots Adapt and Learn Environment Using Gemini AI

Written by
ArticleGPT

Reviewed and fact-checked by the HIX.AI Team

2 min read | Jul 12, 2024

In a Nutshell

DeepMind employs video tours and Gemini 1.5 Pro to train robots for navigation and task completion.

Google DeepMind's robotics team has recently published a research paper showing how it is teaching Google's RT-2 robots to learn about and adapt to their environment using Gemini AI.

Rather than relying solely on traditional programming methods, the team is using videos to train the robots, allowing them to learn in a way similar to human interns. Researchers record video tours of designated areas, such as homes or offices, and the robots watch them to absorb information about their surroundings.

Google's Robots Navigate with Gemini AI

The model supports verbal and visual outputs, so the robots can perform tasks based on what they already know about their surroundings. This showcases the potential for robots to interact with their environment in ways that resemble human behavior.

In practical tests, the Gemini-powered robots operated in a 9,000-square-foot area and successfully followed more than 50 different user instructions with a 90 percent success rate.

This high level of accuracy opens up numerous real-world applications for AI-powered robots, including assisting with household chores or performing more complex tasks in the workplace.

The robots are equipped with the Gemini 1.5 Pro generative AI model, which has a long context window. This allows the AI to multi-task and process information efficiently, so the robots can learn about their environment in detail.

For example, if a user asks if a specific drink is available, the robot can navigate to the refrigerator, visually assess its contents, and then provide an answer based on that information. This level of understanding and execution represents a significant advancement in the capabilities of AI-powered robots.
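The drink example can be read as three steps: find the relevant place in what the robot saw on the video tour, go there and look, then answer. The sketch below is a toy illustration of that flow only, not DeepMind's method. It replaces Gemini 1.5 Pro with simple label lookup, and every name in it (TourFrame, find_goal_frame, answer_availability, fake_inspect) and all of the data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TourFrame:
    index: int            # position of the frame in the video tour
    location: str         # hypothetical label for where the frame was recorded
    visible_objects: set  # hypothetical labels for what the frame shows


def find_goal_frame(tour, target):
    """Return the first tour frame that shows the target object, or None."""
    for frame in tour:
        if target in frame.visible_objects:
            return frame
    return None


def answer_availability(tour, container, item, inspect):
    """Locate the container from the tour, inspect it, and build an answer."""
    goal = find_goal_frame(tour, container)
    if goal is None:
        return f"I have not seen a {container} in the tour."
    contents = inspect(goal.location)  # stand-in for the robot's camera
    if item in contents:
        return f"Yes, there is {item} in the {container} ({goal.location})."
    return f"No, I did not find {item} in the {container} ({goal.location})."


# Made-up tour data, assumed purely for illustration.
tour = [
    TourFrame(0, "lobby", {"sofa", "plant"}),
    TourFrame(1, "kitchen", {"refrigerator", "sink"}),
    TourFrame(2, "office", {"desk", "whiteboard"}),
]


def fake_inspect(location):
    # Hypothetical sensor reading: only the kitchen has anything to report.
    return {"orange juice", "milk"} if location == "kitchen" else set()


print(answer_availability(tour, "refrigerator", "orange juice", fake_inspect))
```

In a real system the lookup and the inspection would both be handled by the model and the robot's sensors; the point here is only the order of the steps.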

Despite the promising results achieved with Gemini 1.5 Pro, there are still challenges to overcome. The robots currently take between 10 and 30 seconds to process each instruction, which is slower than a human in most cases.

Additionally, the complexities and unpredictability of real-world environments pose challenges for the robots' navigation abilities.

Although the Gemini-powered robots are not yet ready for mass commercialization, their potential impact across various industries is promising. Integrating AI models such as Gemini 1.5 Pro into robotics could transform sectors like healthcare, shipping, and janitorial services.

