Hugging Face has released Optimum Intel 2.0, a major update to its toolkit for deploying open models on Intel hardware. This version centers on OpenVINO as the primary framework, with a simpler installation process and enhanced support for the latest open models. It streamlines the workflows for exporting, quantizing, and running models with OpenVINO, making it easier for developers to deploy models on Intel CPUs, Arc GPUs, and Core Ultra NPUs.
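To make the export-and-run workflow concrete, here is a minimal sketch based on the optimum-intel API as it existed before 2.0. `OVModelForCausalLM`, `from_pretrained(..., export=True)`, and `.to(device)` are existing calls in that API; whether 2.0 still needs `export=True` is an assumption. The helper names, the model id, and the restriction to three device strings are hypothetical and serve illustration only.

```python
# Sketch only: assumes optimum-intel (with OpenVINO) and transformers are installed.
# `normalize_device` and `generate_text` are hypothetical helper names.

# OpenVINO device names for the hardware mentioned in the article.
# OpenVINO accepts other strings too (for example "AUTO"); this sketch
# deliberately limits itself to these three.
SUPPORTED_DEVICES = ("CPU", "GPU", "NPU")


def normalize_device(device: str) -> str:
    """Validate a device string and return it in OpenVINO's upper-case form."""
    name = device.strip().upper()
    if name not in SUPPORTED_DEVICES:
        raise ValueError(
            f"unsupported device {device!r}; expected one of {SUPPORTED_DEVICES}"
        )
    return name


def generate_text(model_id: str, prompt: str, device: str = "CPU",
                  max_new_tokens: int = 32) -> str:
    """Export a Hub checkpoint to OpenVINO, move it to a device, and generate."""
    # Imported lazily so the device helper works without the heavy dependencies.
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # export=True converts the original checkpoint to OpenVINO IR at load time
    # (assumption: 2.0 keeps this flag or detects the need automatically).
    model = OVModelForCausalLM.from_pretrained(model_id, export=True)
    model.to(normalize_device(device))

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate_text("<hub-model-id>", "Hello", device="GPU")` would target an Arc GPU, with the model id left as a placeholder for any supported text generation checkpoint.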

The update removes the deprecated integrations with Intel Neural Compressor and Intel Extension for PyTorch; OpenVINO and NNCF are now installed by default. This change simplifies the package and provides a unified approach to model optimization. Users can install the toolkit with a single command, pip install --upgrade optimum-intel, which includes all the components needed for model deployment. The release also introduces improved quantization options, including data-aware AWQ with optimized configurations for specific models such as Qwen3-30B, and enhanced calibration features intended to yield better compression results.
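As a rough illustration of data-aware AWQ, the sketch below builds a 4-bit weight quantization config using `OVWeightQuantizationConfig` from the pre-2.0 optimum-intel API. The values shown (group size 128, the `wikitext2` calibration set, a ratio of 1.0) are illustrative assumptions, not the tuned Qwen3-30B configuration the release refers to, and the helper names are hypothetical.

```python
# Sketch only: assumes optimum-intel with OpenVINO and NNCF is installed.
# `awq_config_kwargs` and `load_awq_quantized` are hypothetical helper names.


def awq_config_kwargs(group_size: int = 128, ratio: float = 1.0,
                      dataset: str = "wikitext2") -> dict:
    """Build keyword arguments for a 4-bit, data-aware AWQ weight config.

    All values are illustrative defaults, not settings from the release.
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in the interval (0, 1]")
    if group_size <= 0:
        raise ValueError("group_size must be a positive integer")
    return {
        "bits": 4,              # AWQ is applied to 4-bit weight compression
        "sym": False,           # asymmetric quantization
        "group_size": group_size,
        "ratio": ratio,         # share of layers compressed to 4-bit
        "quant_method": "awq",
        "dataset": dataset,     # calibration data makes the method data-aware
    }


def load_awq_quantized(model_id: str, **overrides):
    """Export a checkpoint and compress its weights with AWQ at load time."""
    # Imported lazily so the config helper works without the heavy dependencies.
    from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig

    config = OVWeightQuantizationConfig(**awq_config_kwargs(**overrides))
    return OVModelForCausalLM.from_pretrained(
        model_id, export=True, quantization_config=config
    )
```

Keeping the settings in a plain dictionary makes them easy to log or compare across models before committing to a long calibration run.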

The release notes emphasize the toolkit’s broad support for recent model types, including text generation, MoE, vision-language, ASR, TTS, and video understanding models. The aim is to keep pace with current open model development while exposing Intel hardware through a single API. The update also brings runtime improvements, such as Transformers v5 compatibility and support for hybrid and recurrent architectures, which are meant to improve the performance of newer model families in production settings.

Source: huggingface