Model Release

NVIDIA Introduces Nemotron-Labs Diffusion Language Models

NVIDIA released Nemotron-Labs Diffusion models with 3B, 8B, and 14B parameters, offering faster inference speeds and new generation modes.


NVIDIA has introduced Nemotron-Labs Diffusion, a new family of diffusion language models (DLMs) designed to make text generation more efficient. Instead of emitting one token per forward pass, as traditional autoregressive models do, these models generate multiple tokens in parallel and iteratively refine them. The family includes text models at 3B, 8B, and 14B scales, along with an 8B vision-language model (VLM), all available under NVIDIA licenses.
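To make the idea of parallel generation with iterative refinement concrete, here is a toy Python sketch. It assumes a confidence-ranked unmasking scheme, which is one common approach in masked diffusion language models and is not confirmed to be what Nemotron-Labs Diffusion uses. The names `toy_denoiser` and `diffusion_decode` are hypothetical, and the "model" simply looks up a known target sequence with random confidence scores.

```python
import random

MASK = None  # sentinel for a position that has not been decoded yet


def toy_denoiser(seq, target, rng):
    """Hypothetical stand-in for a model.

    For every masked position it proposes a token (looked up from the known
    target, which a real model would have to predict) and a random confidence.
    """
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(seq)
        if tok is MASK
    }


def diffusion_decode(target, tokens_per_step=3, seed=0):
    """Fill a fully masked sequence, committing several tokens per step."""
    rng = random.Random(seed)
    seq = [MASK] * len(target)
    steps = 0
    while any(tok is MASK for tok in seq):
        proposals = toy_denoiser(seq, target, rng)
        # Commit only the most confident proposals; the remaining positions
        # stay masked and are revisited on the next step.
        ranked = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)
        for i in ranked[:tokens_per_step]:
            seq[i] = proposals[i][0]
        steps += 1
    return seq, steps


if __name__ == "__main__":
    words = "diffusion models can commit several tokens per step".split()
    decoded, steps = diffusion_decode(words, tokens_per_step=3)
    print(" ".join(decoded), f"({steps} steps for {len(words)} tokens)")
```

Each loop iteration stands in for one forward pass, so committing three tokens per iteration is what lets the sketch finish in fewer passes than there are tokens.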

The models support three generation modes: autoregressive, diffusion, and self-speculation. The diffusion mode achieves 2.6× higher tokens per forward pass (TPF) than autoregressive models, while self-speculation reaches 6.4× TPF with comparable accuracy. NVIDIA also released training code through the Megatron Bridge framework, enabling developers to train and fine-tune the models.
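As a rough illustration of what the TPF figures imply, the sketch below computes how many forward passes each mode would need for a given output length. It assumes the autoregressive baseline emits exactly one token per forward pass, so the quoted multipliers translate to 2.6 and 6.4 TPF; the function names are hypothetical. TPF counts passes, not wall-clock time, so these numbers are not a direct latency estimate.

```python
import math

# TPF values taken from the article's multipliers, assuming an
# autoregressive baseline of 1 token per forward pass.
MODES = {"autoregressive": 1.0, "diffusion": 2.6, "self-speculation": 6.4}


def forward_passes(num_tokens: int, tpf: float) -> int:
    """Forward passes needed to emit num_tokens at a given TPF."""
    if tpf <= 0:
        raise ValueError("tpf must be positive")
    return math.ceil(num_tokens / tpf)


def passes_by_mode(num_tokens: int) -> dict:
    """Map each generation mode to its estimated forward-pass count."""
    return {mode: forward_passes(num_tokens, tpf) for mode, tpf in MODES.items()}


if __name__ == "__main__":
    for mode, passes in passes_by_mode(1000).items():
        print(f"{mode:>16}: {passes} forward passes")
```

For a 1,000-token response, this works out to 1,000 passes for autoregressive decoding, 385 for diffusion mode, and 157 for self-speculation under the stated assumption.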

Deployment of the models is supported in the main branch of SGLang, with inference available through a GitHub issue tracker. The models can be served in any of the three generation modes by changing a single line in the algorithm configuration, so developers can switch modes without altering the rest of the deployment.

Source: Hugging Face

Key points

NVIDIA released Nemotron-Labs Diffusion models with 3B, 8B, and 14B parameters.
Nemotron-Labs Diffusion models offer faster inference speeds compared to autoregressive models.
The diffusion mode achieves 2.6× higher tokens per forward pass (TPF) than autoregressive models.
Self-speculation mode reaches 6.4× TPF with comparable accuracy across evaluated tasks.
NVIDIA released training code through the Megatron Bridge framework for these models.
Deployment of the models is supported in the main branch of SGLang.
The models can be served in three different ways by adjusting a single line in the algorithm configuration.


WRITTEN BY

Alex Lindgren

LLMs & Frontier Models

Alex covers large language models and their impact on society.
