Model Release

Alibaba Launches Qwen3.7-Plus Multimodal AI Agent

Alibaba released Qwen3.7-Plus, a multimodal AI model capable of autonomously operating apps and generating 10,000 lines of code in 11 hours.

Image: The Decoder

Alibaba has released Qwen3.7-Plus, a new AI model that integrates visual understanding with agent capabilities, allowing it to operate graphical user interfaces and applications independently. The model is designed to recognize real-world scenes, read screen content, and generate code from visual templates. It is available as a proprietary offering through Alibaba Cloud, with pricing significantly lower than its text-based counterpart, Qwen3.7-Max.

In testing, the system demonstrated its ability to recreate desktop applications, perform cloud tasks, and independently program a complete app with 10,000 lines of code. A hybrid agent system built using Qwen3.7-Plus developed an English vocabulary learning app, running for over eleven hours and producing more than 10,000 lines of code across more than 1,000 agent calls. The process included requirements documentation, automated code generation, installation, test case creation, and independent version management.

The model excels at operating graphical interfaces, outperforming competitors like GPT-5.4 (xhigh), Opus 4.6 Max, and Gemini 3.1 Pro on AndroidWorld and ScreenSpot Pro benchmarks. However, it falls short in pure logic benchmarks, such as MedXpertQA-MM, where it trails behind Gemini 3.1 Pro and GPT-5.4. On the text side, its performance is described as on par with max-tier models without surpassing them across the board.

Source: thedecoder

Key points

Alibaba released Qwen3.7-Plus, a multimodal AI model capable of autonomously operating apps and generating 10,000 lines of code in 11 hours.
Qwen3.7-Plus is available as a proprietary offering through Alibaba Cloud, with pricing significantly lower than its text-based counterpart, Qwen3.7-Max.
The model excels at operating graphical interfaces, outperforming competitors like GPT-5.4 (xhigh), Opus 4.6 Max, and Gemini 3.1 Pro on AndroidWorld and ScreenSpot Pro benchmarks.
Qwen3.7-Plus falls short in pure logic benchmarks, such as MedXpertQA-MM, where it trails behind Gemini 3.1 Pro and GPT-5.4.
The model supports the Anthropic API protocol and works directly with Claude Code, OpenClaw, and Alibaba's own Qwen Code.

Source: The Decoder Read the original →

WRITTEN BY

Alex Lindgren

LLMs & Frontier Models

Alex covers the large language models and their impact on society.

Alibaba Launches Qwen3.7-Plus Multimodal AI Agent

Key points

Related articles

Xiaomi-Robotics-1 Demonstrates Data Over Model Size for Robot Training

Adobe Adds AI Features to Project Indigo Camera App

Neill Blomkamp Releases AI-Generated Short Film Using Seedance 2.0

China's AI Models Challenge U.S. Tech Dominance