
NVIDIA Unveils Cosmos 3: Open Omni-model for Physical AI

NVIDIA launches Cosmos 3, an open omni-model for physical AI, on Hugging Face. The model supports multiple modalities and comes in two versions: Nano and Super.


NVIDIA has released Cosmos 3, its first open omni-model for physical AI, on Hugging Face. The model combines world generation, physical reasoning, and action generation in a single architecture, removing the need for separate models and inference pipelines. It ships in two versions, Cosmos 3 Nano and Cosmos 3 Super, each optimized for a different deployment scenario.

Cosmos 3 is built on a Mixture-of-Transformers (MoT) architecture that processes text, image, video, audio, and action modalities in one framework. Each modality has a dedicated encoder whose output is projected into a shared representation space. The model splits input sequences into autoregressive subsequences for reasoning and diffusion subsequences for generation. As a result, the same model can act as a vision language model, a video generator, a forward/inverse dynamics model, or a robot policy without architectural changes.
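The split between reasoning and generation can be illustrated with a toy routing function. This is only a sketch of the idea as described here, not NVIDIA's implementation: the `Token` class, the `generate` flag, and `split_sequence` are hypothetical names, and a real model would route encoded embeddings rather than labels.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Modalities named in the article; the labels themselves are assumptions.
MODALITIES = {"text", "image", "video", "audio", "action"}


@dataclass(frozen=True)
class Token:
    modality: str   # which dedicated encoder would have produced this token
    generate: bool  # hypothetical flag: True if the token is a generation target


def split_sequence(tokens: List[Token]) -> Tuple[List[int], List[int]]:
    """Return (autoregressive_positions, diffusion_positions).

    Tokens used for reasoning go to the autoregressive subsequence;
    tokens to be generated go to the diffusion subsequence.
    """
    autoregressive: List[int] = []
    diffusion: List[int] = []
    for position, token in enumerate(tokens):
        if token.modality not in MODALITIES:
            raise ValueError(f"unknown modality: {token.modality}")
        if token.generate:
            diffusion.append(position)
        else:
            autoregressive.append(position)
    return autoregressive, diffusion


# Example: a text instruction and an image as context,
# with video frames and an action as generation targets.
sequence = [
    Token("text", generate=False),
    Token("image", generate=False),
    Token("video", generate=True),
    Token("video", generate=True),
    Token("action", generate=True),
]
ar_positions, diffusion_positions = split_sequence(sequence)
print(ar_positions, diffusion_positions)
```

In this toy setup, changing which tokens are marked as generation targets changes the model's role (for example, generating only action tokens resembles a robot policy) while the routing code stays the same.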

Cosmos 3 Nano is a 16B-parameter model optimized for efficient inference; Cosmos 3 Super is a 64B-parameter model designed for large-scale synthetic data generation and research. Both are available on Hugging Face. Nano runs on workstation-grade GPUs, while Super runs on NVIDIA Hopper and Blackwell GPUs.
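A rough back-of-the-envelope calculation helps explain the hardware split. The sketch below assumes 16-bit weights (2 bytes per parameter), which the article does not state, and counts only the weights, not activations or caches. The function name `weight_memory_gb` is illustrative.

```python
def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Assumes every parameter is stored at the same precision;
    2 bytes per parameter corresponds to 16-bit weights.
    """
    return params_billions * bytes_per_param


for name, size_b in [("Cosmos 3 Nano", 16), ("Cosmos 3 Super", 64)]:
    print(f"{name}: ~{weight_memory_gb(size_b):.0f} GB of weights at 16-bit precision")
```

Under that assumption, the weights alone come to about 32 GB for Nano and about 128 GB for Super, which is consistent with Nano fitting on workstation-class hardware and Super needing data-center GPUs.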

Source: Hugging Face

Key points

NVIDIA has released Cosmos 3, its first open omni-model for physical AI, now available on Hugging Face.
Cosmos 3 combines world generation, physical reasoning, and action generation in a single architecture.
The model is built on a Mixture-of-Transformers (MoT) architecture that processes text, image, video, audio, and action modalities in one framework.
Cosmos 3 Nano is a 16B-parameter model optimized for efficient inference, while Cosmos 3 Super is a 64B-parameter model for large-scale synthetic data generation and research.
Both versions of Cosmos 3 are available on Hugging Face, with Nano running on workstation-grade GPUs and Super on NVIDIA Hopper and Blackwell GPUs.


WRITTEN BY

Alex Lindgren

LLMs & Frontier Models

Alex covers large language models and their impact on society.
