Model Release

Amazon Nova 2 Lite and Claude Sonnet 4.6 Enable Cost-Optimized Document Digitization

A two-model pipeline using Amazon Nova 2 Lite and Claude Sonnet 4.6 processed 336 scanned yearbook pages, with 93 percent of its name-to-face associations scoring at or above 0.95 confidence.

Image: AWS Machine Learning

Amazon and Anthropic have introduced a cost-optimized solution for digitizing scanned documents that combines Amazon Nova 2 Lite with Claude Sonnet 4.6. The system extracts names from scanned yearbook pages and matches them to faces using spatial reasoning. According to the source, the pipeline handled 336 pages and produced 3,122 name-to-face associations, 93 percent of which scored at or above 0.95 confidence. The two-model approach costs about two-thirds less per page than a single-model alternative. It is designed to cope with complex document layouts and to keep costs predictable for large-scale processing. Source: AWS Machine Learning
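To make the confidence figure concrete, here is a minimal sketch of how the share of associations at or above a threshold could be computed. The `Association` record, its field names, and the sample scores are hypothetical and are not taken from the source; only the 0.95 threshold comes from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """Hypothetical record for one name-to-face match."""
    name: str
    face_id: str
    confidence: float  # assumed to lie in [0.0, 1.0]


def share_at_or_above(associations, threshold=0.95):
    """Return the fraction of associations with confidence >= threshold."""
    if not associations:
        return 0.0
    hits = sum(1 for a in associations if a.confidence >= threshold)
    return hits / len(associations)


# Made-up sample data, for illustration only.
sample = [
    Association("A. Example", "face-1", 0.99),
    Association("B. Example", "face-2", 0.97),
    Association("C. Example", "face-3", 0.95),
    Association("D. Example", "face-4", 0.80),
]
share = share_at_or_above(sample)
```

In a real run, associations that fall below the threshold would be natural candidates for manual review.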

The pipeline consists of two stages, each assigned to a different model. In the first stage, Amazon Nova 2 Lite performs native multimodal extraction: it detects photos and returns their bounding boxes, extracts visible names with approximate positions, and returns page-level metadata such as titles and categories. In the second stage, Claude Sonnet 4.6 uses spatial reasoning to match names to faces based on the page layout. Tests across all 336 pages showed no meaningful accuracy difference between the LOW, MEDIUM, and HIGH reasoning levels for structured extraction, and the source states that LOW is the cheapest setting for this task. Source: AWS Machine Learning
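The hand-off between the two stages can be pictured with a small sketch. It assumes the first stage yields photo bounding boxes and name positions in page coordinates; the `Photo` and `NameLabel` types are hypothetical. The matching step here is a plain nearest-neighbor heuristic standing in for the second stage. It is not how Claude Sonnet 4.6 reasons about layout, and it makes no model or API calls.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Photo:
    """Hypothetical stage-one output: a detected photo and its bounding box."""
    photo_id: str
    x: float
    y: float
    width: float
    height: float

    def bottom_center(self):
        # Assumes y grows downward, as in typical image coordinates.
        return (self.x + self.width / 2, self.y + self.height)


@dataclass(frozen=True)
class NameLabel:
    """Hypothetical stage-one output: a name and its approximate position."""
    text: str
    x: float
    y: float


def match_names_to_photos(photos, names):
    """Greedy stand-in for stage two: give each photo the closest unused name,
    measured from the bottom-center of the photo's bounding box."""
    remaining = list(names)
    matches = {}
    for photo in photos:
        if not remaining:
            break
        cx, cy = photo.bottom_center()
        best = min(remaining, key=lambda n: hypot(n.x - cx, n.y - cy))
        matches[photo.photo_id] = best.text
        remaining.remove(best)
    return matches


# Made-up layout: two portraits side by side, captions just below each one.
photos = [Photo("p1", 0, 0, 100, 120), Photo("p2", 150, 0, 100, 120)]
names = [NameLabel("Jane Doe", 200, 130), NameLabel("John Roe", 50, 130)]
matches = match_names_to_photos(photos, names)
```

A heuristic like this breaks down on irregular layouts such as group shots or captions listed to the side, which is the kind of case the article assigns to a model with spatial reasoning.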

Amazon Nova 2 Lite bills image and document page inputs at a fixed per-image rate, regardless of resolution or file size, which makes cost forecasting straightforward for large-scale document processing. For a full page extraction, the per-page cost breaks down into image tokens, prompt tokens, and output tokens, and totals approximately $0.0027 at published input-token rates. Because the image input cost scales linearly with page count and is independent of resolution, cost projections for yearbook-scale workloads are simple. Source: AWS Machine Learning
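Because the per-page figure is fixed, a projection is a single multiplication. The sketch below uses the article's approximate $0.0027 per page for the extraction stage; the function name is hypothetical, and the result excludes the second-stage matching cost, which the article does not break out.

```python
from decimal import Decimal

# Approximate per-page extraction cost reported in the article (USD).
EXTRACTION_COST_PER_PAGE = Decimal("0.0027")


def project_extraction_cost(page_count, cost_per_page=EXTRACTION_COST_PER_PAGE):
    """Linear projection: total = pages * per-page cost.

    Covers only the extraction stage; page resolution and file size do not
    enter the calculation because the per-image rate is fixed.
    """
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return cost_per_page * page_count


yearbook_cost = project_extraction_cost(336)    # the 336-page test set
archive_cost = project_extraction_cost(10_000)  # a hypothetical larger archive
```

At that rate, the 336-page test set comes to about $0.91 for extraction, and a hypothetical 10,000-page archive to about $27.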

Key points

Amazon Nova 2 Lite and Claude Sonnet 4.6 processed 336 scanned yearbook pages.
The pipeline produced 3,122 name-to-face associations, 93 percent of which scored at or above 0.95 confidence.
The two-model approach costs about two-thirds less per page than a single-model alternative.
Amazon Nova 2 Lite bills image and document page inputs at a fixed per-image rate, regardless of resolution or file size.
The per-page cost for full extraction breaks down into image tokens, prompt tokens, and output tokens, totaling approximately $0.0027.
Fixed per-image pricing simplifies cost forecasting for yearbook-scale workloads as image input cost scales linearly with page count.
Claude Sonnet 4.6 uses adaptive thinking to adjust reasoning depth based on input complexity.

Source: AWS Machine Learning

WRITTEN BY

Alex Lindgren

LLMs & Frontier Models

Alex covers large language models and their impact on society.