Model Release

Google Integrates Computer Control into Gemini 3.5 Flash

Google's Gemini 3.5 Flash model can now see and operate computer screens. It scores 78.4 on the OSWorld benchmark, ahead of Gemini 3 Flash at 65.1.

Google has integrated computer control directly into its Gemini 3.5 Flash model, allowing it to see, understand, and interact with computers, browsers, and mobile devices autonomously. This capability was previously available only through a separate Gemini 2.5 model. Developers can now build agents that function across browser, mobile, and desktop environments for tasks such as software testing and office automation. The new feature is accessible via the Gemini API and the Gemini Enterprise Agent Platform, with a Browserbase demo and a GitHub reference implementation also available. Source: The Decoder
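To make the idea of a screen-operating agent concrete, here is a minimal sketch of the observe, propose, act loop such agents typically follow. It does not use the Gemini API: `Action`, `run_agent`, and `scripted_model` are hypothetical names, and the "model" is a hard-coded stub standing in for a real call that would receive a screenshot and return the next UI action.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One UI step proposed by the model (hypothetical schema)."""
    kind: str          # e.g. "click", "type", or "done"
    target: str = ""   # element the action applies to
    text: str = ""     # text to enter for "type" actions


def run_agent(
    goal: str,
    observe: Callable[[], str],
    propose: Callable[[str, str, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Loop: capture the screen, ask the model for an action, carry it out."""
    history: List[Action] = []
    for _ in range(max_steps):
        screen = observe()                       # screenshot or UI dump
        action = propose(goal, screen, history)  # model decides the next step
        history.append(action)
        if action.kind == "done":                # model signals completion
            break
        execute(action)                          # apply the step to the environment
    return history


def scripted_model(goal: str, screen: str, history: List[Action]) -> Action:
    """Stub standing in for a real model call; replays a fixed script."""
    script = [
        Action("click", target="search box"),
        Action("type", text=goal),
        Action("done"),
    ]
    return script[len(history)]


if __name__ == "__main__":
    performed: List[Action] = []
    steps = run_agent("find invoices", lambda: "fake screenshot",
                      scripted_model, performed.append)
    print([a.kind for a in steps])
```

The `max_steps` cap is a deliberate design choice in this sketch: an agent that never reports completion stops after a bounded number of actions instead of running indefinitely.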

On the OSWorld benchmark, Gemini 3.5 Flash scores 78.4, outperforming Gemini 3 Flash (65.1) and GPT-5.4 mini (72.1). GPT-5.5 scores slightly higher at 78.7, while Anthropic's Opus 4.8 leads at 83.4. Sonnet 4.6 also scores 78.4, and Gemini 3.1 Pro lands at 76.2.

To protect against prompt injection attacks, Google employs adversarial training and two optional enterprise safeguards. One requires user confirmation for sensitive or irreversible actions, while the other automatically halts tasks when indirect prompt injections are detected.
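The two safeguards can be pictured as checks that run before each action is executed. The sketch below is an illustration under stated assumptions, not Google's implementation: `guarded_execute`, `SENSITIVE_KINDS`, and `TaskHalted` are hypothetical names, and `naive_injection_check` is a placeholder keyword match, since the article does not describe how detection actually works.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical list of action types treated as sensitive or irreversible.
SENSITIVE_KINDS = {"purchase", "delete", "send_email"}


class TaskHalted(Exception):
    """Raised to stop the whole task when injected instructions are suspected."""


@dataclass
class ProposedAction:
    kind: str
    detail: str = ""


def naive_injection_check(page_text: str) -> bool:
    """Placeholder detector: flags one well-known injection phrase only."""
    return "ignore previous instructions" in page_text.lower()


def guarded_execute(
    action: ProposedAction,
    page_text: str,
    confirm: Callable[[ProposedAction], bool],
    looks_injected: Callable[[str], bool],
    execute: Callable[[ProposedAction], None],
) -> bool:
    """Run one action behind both safeguards; return True if it was executed."""
    # Safeguard 2: halt the task if the observed content looks like an injection.
    if looks_injected(page_text):
        raise TaskHalted("possible indirect prompt injection in page content")
    # Safeguard 1: sensitive actions need explicit user confirmation.
    if action.kind in SENSITIVE_KINDS and not confirm(action):
        return False
    execute(action)
    return True
```

In this sketch a declined confirmation skips only that action, while a suspected injection raises an exception so the caller can abandon the entire task.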

Google recommends sandboxing, human oversight, and strict access controls, with more details in its best practices documentation. The company emphasized the importance of security measures to ensure safe and responsible use of the model's capabilities.
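One simple form of access control is an allowlist that limits which sites a browser agent may open. The following sketch assumes a hypothetical `ALLOWED_HOSTS` set and `is_allowed` helper; it illustrates the general idea and is not drawn from Google's best practices documentation.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_HOSTS = frozenset({"intranet.example.com", "docs.example.com"})


def is_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Permit navigation only to HTTPS URLs whose host is exactly allowlisted."""
    parts = urlsplit(url)
    # Exact host match: a lookalike such as
    # "intranet.example.com.evil.test" does not pass.
    return parts.scheme == "https" and parts.hostname in allowed_hosts
```

An agent runner could call `is_allowed` before every navigation and refuse or escalate to a human when it returns `False`.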

Key points

Google integrated computer control directly into Gemini 3.5 Flash.
Gemini 3.5 Flash scores 78.4 on the OSWorld benchmark, surpassing Gemini 3 Flash (65.1) and GPT-5.4 mini (72.1).
Google uses adversarial training and two enterprise safeguards to guard against prompt injection attacks.
The feature is available through the Gemini API and the Gemini Enterprise Agent Platform.
Google recommends sandboxing, human oversight, and strict access controls for secure use.

Source: The Decoder

WRITTEN BY

Alex Lindgren

LLMs & Frontier Models

Alex covers large language models and their impact on society.