Google has integrated computer control directly into its Gemini 3.5 Flash model, allowing it to see, understand, and interact with computers, browsers, and mobile devices autonomously. This capability was previously available only through a separate Gemini 2.5 model. Developers can now build agents that function across browser, mobile, and desktop environments for tasks such as software testing and office automation. The new feature is accessible via the Gemini API and the Gemini Enterprise Agent Platform, with a Browserbase demo and a GitHub reference implementation also available. Source: thedecoder

On the OSWorld benchmark, Gemini 3.5 Flash scores 78.4, ahead of Gemini 3.1 Pro (76.2), GPT-5.4 mini (72.1), and its predecessor Gemini 3 Flash (65.1). Anthropic's Sonnet 4.6 matches it at 78.4, GPT-5.5 scores slightly higher at 78.7, and Anthropic's Opus 4.8 leads at 83.4. To protect against prompt injection attacks, Google relies on adversarial training plus two optional enterprise safeguards. One requires user confirmation before sensitive or irreversible actions; the other automatically halts a task when an indirect prompt injection is detected. Source: thedecoder
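To make the two safeguards concrete, here is a minimal sketch of how an agent loop might apply them. It is not Google's implementation: the `Action` type, the `SENSITIVE_KINDS` set, the `injection_suspected` flag, and `run_task` are all hypothetical names invented for illustration, and the sketch assumes some upstream detector has already flagged suspicious steps.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Hypothetical categories; the source does not list which actions count as sensitive.
SENSITIVE_KINDS = {"purchase", "delete", "send_email"}


@dataclass
class Action:
    kind: str                          # e.g. "click", "type", "delete"
    target: str                        # element, file, or URL the action applies to
    injection_suspected: bool = False  # assumed to be set by an upstream detector


def run_task(actions: Iterable[Action], confirm: Callable[[Action], bool]) -> List[str]:
    """Walk through proposed actions, applying both safeguards."""
    log: List[str] = []
    for action in actions:
        # Safeguard 2: halt the whole task on a suspected indirect injection.
        if action.injection_suspected:
            log.append(f"halted: suspected injection at {action.kind}")
            break
        # Safeguard 1: sensitive or irreversible actions need user confirmation.
        if action.kind in SENSITIVE_KINDS and not confirm(action):
            log.append(f"skipped: {action.kind} (not confirmed)")
            continue
        log.append(f"executed: {action.kind}")
    return log
```

In this sketch `confirm` is whatever callback asks the user, for example a dialog or a chat prompt. A declined confirmation skips only that action, while a suspected injection stops the task entirely.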

Google recommends that developers run agents in a sandbox, keep a human in the loop, and enforce strict access controls, describing these measures as necessary for safe and responsible use of the model's capabilities. Its best practices documentation has more details. Source: thedecoder
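One common form of strict access control for a browser agent is a navigation allowlist. The sketch below is only an illustration of that idea, not a recommendation taken from Google's documentation; the host names and the `is_navigation_allowed` helper are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_HOSTS = frozenset({"intranet.example.com", "docs.example.com"})


def is_navigation_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only for HTTPS URLs whose exact host is on the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    # Exact host match, so "docs.example.com.evil.net" is rejected.
    return parsed.hostname in allowed_hosts
```

An agent wrapper would call this check before every navigation the model proposes and refuse or escalate to a human when it returns `False`.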