Gemini 2.5 Computer Use is a new AI model from Google that navigates the web through a virtual browser, carrying out tasks such as filling out forms, scrolling, clicking, and typing much as a human would. The model is built on the Gemini 2.5 Pro architecture, and Google claims it outperforms its competitors, with lower latency and better benchmark results. In a blog post announcing the model’s launch in 2025, Google said that Gemini 2.5 Computer Use can effectively follow user instructions for web tasks that require intricate navigation and interaction. Developers can access the model through Google AI Studio and Vertex AI.
Gemini 2.5 Computer Use controls a virtual web browser and can type, click, scroll, open dropdown menus, move the cursor, and use keyboard shortcuts. Google has also shared demo videos showing the model in action, though the footage runs at three times normal speed. In one demonstration, users instructed Gemini to organize tasks on a specified website. Google asserts that the model surpasses leading alternatives across various web and mobile benchmarks. However, the model currently supports only 13 actions and is limited to web browser interactions; desktop operating system controls are not yet available.
Within Google, teams are using the model for UI testing, significantly reducing the time required for software testing. Variants of the model also enhance Gemini’s agentic features in AI Mode in Search, the Firebase Testing Agent, and Project Mariner, a platform that lets users assign AI agents tasks such as research, planning, and data entry using natural language.