Huntington Bank, a top 10 U.S. bank, needed to redact sensitive customer data from hundreds of millions of documents stored on-premises. The task called for a scalable solution that could process the documents quickly and securely while meeting strict compliance standards. By using AWS services, the bank cut the estimated processing time from years to a few months. The project required data to be encrypted at rest and in transit and redaction accuracy to remain high. The solution moved the documents to Amazon S3, used Amazon Textract to detect sensitive data, and automated redaction with AWS Step Functions and AWS Lambda. This approach handled large document volumes efficiently without compromising compliance or data security. Source: awsml

Huntington used AWS DataSync, AWS Direct Connect, and AWS Key Management Service (AWS KMS) to securely transfer over 400 million documents from on-premises storage to Amazon S3. Data was encrypted in transit and at rest, in line with the bank's strict access and compliance requirements. A DataSync agent deployed in the on-premises data center monitored and transferred the documents, and it supported data movement both to and from AWS. Processed data was also replicated back to on-premises storage for availability and compliance. The team organized the documents into a JSON collection and used AWS Step Functions in distributed mode (the Distributed Map state) to maximize concurrency and throughput. This approach let the bank process millions of documents daily while keeping request rates under control and avoiding throttling. Source: awsml
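
The JSON collection and distributed-mode fan-out described above can be sketched as a state machine definition. This is a minimal illustration, not Huntington's actual workflow: the function name, the state names, the bucket, key, and Lambda ARN arguments, and the default concurrency of 500 are all assumptions. It shows how a Distributed Map state can read a JSON manifest from Amazon S3 and how a concurrency cap plus retries with backoff can keep request rates under control.

```python
import json


def build_redaction_state_machine(manifest_bucket, manifest_key, lambda_arn,
                                  max_concurrency=500):
    """Build an Amazon States Language definition as a Python dict.

    All argument values are hypothetical placeholders supplied by the caller.
    """
    return {
        "Comment": "Sketch: fan out redaction over a JSON manifest of documents",
        "StartAt": "RedactDocuments",
        "States": {
            "RedactDocuments": {
                "Type": "Map",
                # Read the list of documents from a JSON array stored in S3.
                "ItemReader": {
                    "Resource": "arn:aws:states:::s3:getObject",
                    "ReaderConfig": {"InputType": "JSON"},
                    "Parameters": {"Bucket": manifest_bucket, "Key": manifest_key},
                },
                # DISTRIBUTED mode runs each item in its own child execution.
                "ItemProcessor": {
                    "ProcessorConfig": {
                        "Mode": "DISTRIBUTED",
                        "ExecutionType": "STANDARD",
                    },
                    "StartAt": "RedactOne",
                    "States": {
                        "RedactOne": {
                            "Type": "Task",
                            "Resource": "arn:aws:states:::lambda:invoke",
                            "Parameters": {
                                "FunctionName": lambda_arn,
                                "Payload.$": "$",
                            },
                            # Back off and retry when Lambda reports throttling.
                            "Retry": [{
                                "ErrorEquals": ["Lambda.TooManyRequestsException"],
                                "IntervalSeconds": 2,
                                "MaxAttempts": 5,
                                "BackoffRate": 2.0,
                            }],
                            "End": True,
                        }
                    },
                },
                # Upper bound on child executions running at the same time.
                "MaxConcurrency": max_concurrency,
                "End": True,
            }
        },
    }


if __name__ == "__main__":
    definition = build_redaction_state_machine(
        "example-manifest-bucket",   # hypothetical bucket name
        "manifests/batch-0001.json",  # hypothetical manifest key
        "arn:aws:lambda:us-east-1:123456789012:function:redact-document",
    )
    print(json.dumps(definition, indent=2))
```

Lowering `max_concurrency` is the simplest lever in this sketch for staying below downstream service quotas; the retry policy handles the throttling that still occurs.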

Huntington established core requirements for the project: data encryption at rest and in transit, strict access controls, PCI DSS compliance for every service used, and redaction accuracy above 95% to meet compliance standards. The bank used Amazon Textract to detect sensitive data such as Social Security numbers and account details; Textract returns the coordinates and metadata of the detected text in JSON format. AWS Step Functions orchestrated the redaction process, which reduced manual review time and improved accuracy across large document volumes. Amazon CloudWatch dashboards tracked response times, throttle counts, and error rates so the team could sustain high throughput and reliability. Source: awsml
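
To make the role of the coordinates concrete, here is a minimal sketch of turning Textract-style JSON into redaction rectangles. It assumes the response contains `WORD` blocks with `Text` and `Geometry.BoundingBox` fields whose values are ratios of the page size, which is how Textract reports geometry. The Social Security number regular expression, the function name, and the sample block are illustrative assumptions, not the bank's actual detection logic.

```python
import re

# Assumed pattern for illustration: SSNs written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_redaction_boxes(blocks, page_width, page_height):
    """Return pixel rectangles (left, top, right, bottom) covering SSN-like words.

    `blocks` is a list of Textract-style block dicts. Bounding box values are
    fractions of the page width and height, so they are scaled to pixels here.
    """
    boxes = []
    for block in blocks:
        if block.get("BlockType") != "WORD":
            continue
        if not SSN_PATTERN.search(block.get("Text", "")):
            continue
        bb = block["Geometry"]["BoundingBox"]
        left = round(bb["Left"] * page_width)
        top = round(bb["Top"] * page_height)
        right = round((bb["Left"] + bb["Width"]) * page_width)
        bottom = round((bb["Top"] + bb["Height"]) * page_height)
        boxes.append((left, top, right, bottom))
    return boxes


# Hypothetical sample input shaped like a Textract response fragment.
sample_blocks = [
    {"BlockType": "WORD", "Text": "SSN:",
     "Geometry": {"BoundingBox": {"Left": 0.125, "Top": 0.5,
                                  "Width": 0.0625, "Height": 0.0625}}},
    {"BlockType": "WORD", "Text": "123-45-6789",
     "Geometry": {"BoundingBox": {"Left": 0.25, "Top": 0.5,
                                  "Width": 0.125, "Height": 0.0625}}},
]

if __name__ == "__main__":
    print(find_redaction_boxes(sample_blocks, page_width=800, page_height=1600))
```

The resulting rectangles could then be passed to an image or PDF library to draw opaque boxes over the page; that rendering step is outside this sketch.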