IBS Software developed a bilingual Named Entity Recognition (NER) system that extracts critical information from cargo logistics emails written in English and Japanese. The system identifies 23 entity types, including AWB numbers, flight details, and delivery instructions. Built with Amazon Bedrock’s knowledge distillation capabilities, the solution cut operational costs by a factor of 14 while maintaining high accuracy. The project involved annotating 500 bilingual email messages and training a custom model on them. The resulting system processes incoming emails in real time, improving efficiency for IBS Software’s logistics operations.
The team first tried open-source frameworks but found it difficult to configure distillation pipelines for bilingual data and had no managed infrastructure to run them on. Switching to Amazon Bedrock gave them managed training and token-level distillation, which simplified the process. The solution used Amazon Nova Pro as the teacher model and Nova Lite as the student model, with training loss falling from 0.05 to 0.008 over 70 training steps. The distilled student retained 98% of the teacher model’s performance while significantly lowering inference costs. The final model achieved an F1 score of 95.085% on the test set, demonstrating its effectiveness for real-world logistics applications.
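For readers less familiar with the metric, the F1 score reported above is the harmonic mean of precision and recall. The following is a minimal sketch of entity-level scoring, assuming exact matches on (entity type, text) pairs; the source does not say which matching criterion or averaging scheme the team used, and the function name `entity_f1` is hypothetical.

```python
def entity_f1(gold, predicted):
    """Entity-level F1 over (entity_type, text) pairs.

    Assumption: a prediction counts as correct only if both the type
    and the extracted text match a gold annotation exactly.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Illustrative data only, not taken from the IBS Software test set.
gold = [
    ("awb_number", "020-12345675"),
    ("flight_number", "JL123"),
    ("origin", "NRT"),
    ("destination", "SIN"),
]
predicted = [
    ("awb_number", "020-12345675"),
    ("flight_number", "JL123"),
    ("origin", "NRT"),
    ("destination", "SYD"),  # wrong value, counts as a miss
]
score = entity_f1(gold, predicted)
```

Here three of the four predictions match, so precision and recall are both 0.75 and so is the F1 score.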
In deployment, .eml files pass through Amazon S3, AWS Lambda, and an Amazon Bedrock inference endpoint. Extracted entities are validated with post-processing rules and confidence thresholds before being stored in Amazon DynamoDB. Because inference runs on Bedrock, the system can scale to large volumes without custom hosting infrastructure. The case study highlights the benefits of using Amazon Bedrock for complex NER tasks in multilingual environments.
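The parsing and validation stages described above can be sketched roughly as follows. This is an illustration under stated assumptions, not IBS Software's implementation: the threshold values, the entity dictionary shape, and the function names are all hypothetical, and the calls to S3, Bedrock, and DynamoDB are left out. The AWB rule relies on the standard air waybill format (a 3-digit airline prefix plus an 8-digit serial whose last digit is the first seven digits modulo 7), which is one plausible post-processing rule; the source does not list the actual rules.

```python
import re
from email import message_from_bytes, policy

# Hypothetical thresholds; the source does not state the real values.
THRESHOLDS = {"awb_number": 0.90, "flight_number": 0.85}
DEFAULT_THRESHOLD = 0.80

AWB_PATTERN = re.compile(r"^(\d{3})-?(\d{8})$")


def extract_body(raw_eml: bytes) -> str:
    """Return the plain-text body of a raw .eml message, or '' if none."""
    msg = message_from_bytes(raw_eml, policy=policy.default)
    part = msg.get_body(preferencelist=("plain",))
    return part.get_content() if part is not None else ""


def is_valid_awb(value: str) -> bool:
    """Check AWB format: the 8th serial digit is the first seven mod 7."""
    match = AWB_PATTERN.match(value.strip())
    if not match:
        return False
    serial = match.group(2)
    return int(serial[:7]) % 7 == int(serial[7])


def filter_entities(entities):
    """Keep entities that clear their confidence threshold and rule checks.

    Assumes each entity is a dict with 'type', 'text', and 'confidence'.
    """
    accepted = []
    for entity in entities:
        threshold = THRESHOLDS.get(entity["type"], DEFAULT_THRESHOLD)
        if entity["confidence"] < threshold:
            continue
        if entity["type"] == "awb_number" and not is_valid_awb(entity["text"]):
            continue
        accepted.append(entity)
    return accepted


# Illustrative input only.
raw = (
    b"From: agent@example.com\r\n"
    b"Subject: Booking\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"AWB 020-12345675 on flight JL123\r\n"
)
body = extract_body(raw)

candidates = [
    {"type": "awb_number", "text": "020-12345675", "confidence": 0.97},
    {"type": "awb_number", "text": "020-12345678", "confidence": 0.99},
    {"type": "flight_number", "text": "JL123", "confidence": 0.50},
]
kept = filter_entities(candidates)
```

In this example only the first candidate survives: the second fails the check-digit rule despite its high confidence, and the third falls below its threshold. Combining a score cutoff with a deterministic format rule is a common way to catch confident but malformed extractions before they reach the database.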
Source: awsml