AMD Ryzen AI Max+ Enables 100B+ Parameter LLM Inference

AMD Ryzen AI Max+ processors with 128 GB unified memory allow running 100B+ parameter models locally, eliminating the need for multiple GPUs or cloud services.

AMD has introduced the Ryzen AI Max+ processor, whose unified memory architecture (UMA) enables local large language model (LLM) inference on models exceeding 100 billion parameters. Paired with Radeon 8060S integrated graphics, the processor provides up to 128 GB of memory shared between the CPU and GPU, so models such as the 122B-parameter Qwen3.5 can run on a single system without multiple GPUs or cloud services. The setup supports model sizes from 9B to 122B parameters, with varying degrees of GPU utilization.

The 9B and 35B models run with 100% GPU offloading, while the 122B model switches to mixed CPU/GPU loading when it exceeds the 64 GB GPU-accessible memory limit. For optimal performance, the setup calls for Ubuntu 24.04 LTS, AMD ROCm 7.2.1, and Ollama 0.20.x. Because the CPU and GPU share the same physical memory pool, memory can be allocated flexibly between them for different tasks.
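As a rough illustration of why the largest model spills past the GPU-accessible allocation, the sketch below estimates weight memory from parameter count and compares it with the 64 GB limit. The 4.5 bits-per-weight figure is an assumption chosen for illustration (the article does not state which quantization was used), and the estimate covers weights only, ignoring the KV cache and runtime buffers.

```python
GPU_BUDGET_GB = 64.0  # GPU-accessible memory limit cited in the article


def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in decimal GB (weights only, no KV cache)."""
    return params_billion * bits_per_weight / 8


def gpu_fraction(params_billion: float, bits_per_weight: float,
                 gpu_budget_gb: float = GPU_BUDGET_GB) -> float:
    """Fraction of the weights that fits inside the GPU-accessible budget."""
    needed = weight_gb(params_billion, bits_per_weight)
    return min(1.0, gpu_budget_gb / needed)


if __name__ == "__main__":
    ASSUMED_BITS = 4.5  # hypothetical quantization level, not from the article
    for size in (9, 35, 122):
        gb = weight_gb(size, ASSUMED_BITS)
        frac = gpu_fraction(size, ASSUMED_BITS)
        print(f"{size}B: ~{gb:.1f} GB of weights, {frac:.0%} fits in GPU budget")
```

Under these assumptions the 9B and 35B models fit entirely within the 64 GB budget, while the 122B model slightly exceeds it, which is consistent with the mixed CPU/GPU loading described above.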

Source: AMD

Key points

AMD Ryzen AI Max+ processors with 128 GB unified memory allow running 100B+ parameter models locally.
The Ryzen AI Max+ processor uses a unified memory architecture where the CPU and GPU share the same physical memory pool.
Qwen3.5 models with up to 122B parameters can run on a single Ryzen AI Max+ processor-based system.
The 9B and 35B models run with 100% GPU offloading, while the 122B model switches to mixed CPU/GPU loading when it exceeds the 64 GB GPU-accessible memory allocation.
Ollama provides a streamlined installation and model management experience for Ryzen AI Max+ processors.
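For readers who want to try a locally served model, here is a minimal sketch of a request to Ollama's local REST endpoint using only the Python standard library. It assumes an Ollama server is running on its default port and that the model has already been pulled. The model tag in the usage comment is hypothetical, since the article does not give the exact tag.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 600.0) -> str:
    """Send the request and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running server; the tag below is a hypothetical example):
# print(generate("qwen3.5:122b", "Summarize unified memory in one sentence."))
```

The generous timeout is a deliberate choice: a model split between CPU and GPU may take noticeably longer to load and respond than one that is fully offloaded.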

WRITTEN BY

Sam Bergstrom

AI Infrastructure & Hardware

Sam specializes in AI chips, data centers, and training infrastructure.
