Perplexity has launched a hybrid AI inference system that dynamically decides whether to process each task locally or in the cloud. The system aims to balance accuracy, privacy, and energy efficiency by keeping sensitive data on local devices and sending compute-heavy tasks to cloud models. The new framework is integrated into Perplexity's Always-on agent product for personal computers, which was released in March 2026. According to Perplexity, the system supports data sovereignty by keeping sensitive information, such as financial or health records, on local hardware.
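Perplexity has not published how the routing decision is made, so the following is only a minimal sketch of the general idea, written in Python. The names `Task`, `Target`, `route`, and the `local_compute_limit` threshold are all hypothetical and are not taken from Perplexity's product; the policy simply mirrors the two rules described above: sensitive data stays local, and compute-heavy work goes to the cloud.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    LOCAL = "local"
    CLOUD = "cloud"


@dataclass
class Task:
    prompt: str
    contains_sensitive_data: bool  # e.g. financial or health records
    estimated_compute: float       # hypothetical cost score from 0.0 to 1.0


def route(task: Task, local_compute_limit: float = 0.5) -> Target:
    """Pick where a task runs (illustrative policy, not Perplexity's)."""
    # Rule 1: sensitive data never leaves the device, whatever the cost.
    if task.contains_sensitive_data:
        return Target.LOCAL
    # Rule 2: non-sensitive work that exceeds the local budget goes to the cloud.
    if task.estimated_compute > local_compute_limit:
        return Target.CLOUD
    # Everything else is routine work and stays local.
    return Target.LOCAL
```

In this sketch the privacy check runs before the compute check, so a heavy task that touches health records would still run locally. A real system would presumably weigh accuracy and energy use as well, which this example leaves out.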

"The race for local compute is on," the company stated in its announcement. The hybrid system is designed to reduce reliance on centralized computing infrastructure by shifting routine tasks to local devices, which could simplify data management and compliance. Perplexity also emphasized that its business model rewards correct answers rather than high compute usage, which gives the company a natural incentive to be efficient. The company introduced the system in collaboration with Intel, though it is also compatible with other hardware, including Nvidia's RTX Spark.

The system's model-agnostic design allows it to work with various AI models, ensuring flexibility across different use cases.

Source: thedecoder