Reducing costs by the healthy application of machine learning and AI
Industry
Financial Services
Teams & Services
Data Engineering / Data Sciences
Tech & Tools
Amazon Textract / Amazon Comprehend / Amazon SageMaker / AWS Glue / Amazon Athena / Amazon QuickSight
Key Data Points
The Vision
To adopt an Intelligent Document Processing pipeline
The Goal
To leverage AWS AI/ML services to automate the identification of cost-saving opportunities for customers in the energy space.
The Challenge
Our client provides their partners in the energy industry with a variety of financial solutions to uncover overlooked savings. One of their primary systems that identifies those savings requires data from hundreds of different document types to be consolidated into a single common format before performing analysis. This required an entire team dedicated to reading the financial documents and manually entering the data into this common format.
As they evaluated the future growth of their products and services, they quickly realized that the manual process for capturing data from the various document types would not scale. They looked to Protagona to design and build an automated solution that accurately captures the relevant data from hundreds of document types and consolidates it into a centralized data lake.
The Solution
Protagona worked closely with the client to quickly identify an appropriately sized sample of these complex document formats on which to begin training models. Proofs of concept were then run against various AI/ML services within AWS to validate the raw data output and to design an automated data pipeline that integrates each service into its corresponding stage of the data lake. The fully built data pipeline now allows the client to upload documents to S3, where a series of Textract, Comprehend and Glue jobs take the raw data from an image and transform it into the common format their systems need in order to identify cost savings.
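To make the "raw data to common format" step more concrete, here is a minimal Python sketch of one way such a transformation could look. It is not the client's actual pipeline code. It assumes the input follows the block structure that Textract returns for form analysis (KEY_VALUE_SET blocks linked to WORD blocks through relationships), and the alias table, the common field names and the sample response are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical alias table: maps labels seen on different document types
# to the field names of an assumed common format.
FIELD_ALIASES: Dict[str, str] = {
    "amount due": "total_amount",
    "invoice total": "total_amount",
    "account no.": "account_number",
}


def _child_text(block: dict, blocks_by_id: Dict[str, dict]) -> str:
    """Join the text of all WORD blocks that are children of `block`."""
    words: List[str] = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_key_values(response: dict) -> Dict[str, str]:
    """Collect key/value pairs from a Textract-style form analysis response."""
    blocks_by_id = {b["Id"]: b for b in response["Blocks"]}
    pairs: Dict[str, str] = {}
    for block in response["Blocks"]:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        key = _child_text(block, blocks_by_id)
        value = ""
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                value = " ".join(
                    _child_text(blocks_by_id[i], blocks_by_id) for i in rel["Ids"]
                )
        pairs[key] = value
    return pairs


def to_common_format(pairs: Dict[str, str]) -> Dict[str, str]:
    """Rename recognized labels to common field names; drop everything else."""
    record: Dict[str, str] = {}
    for raw_key, value in pairs.items():
        normalized = raw_key.strip().rstrip(":").lower()
        field = FIELD_ALIASES.get(normalized)
        if field is not None:
            record[field] = value.strip()
    return record


# Hand-written sample in the shape of a Textract response (illustrative only).
SAMPLE_RESPONSE = {
    "Blocks": [
        {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
         "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                           {"Type": "CHILD", "Ids": ["w1", "w2"]}]},
        {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
         "Relationships": [{"Type": "CHILD", "Ids": ["w3"]}]},
        {"Id": "w1", "BlockType": "WORD", "Text": "Amount"},
        {"Id": "w2", "BlockType": "WORD", "Text": "Due:"},
        {"Id": "w3", "BlockType": "WORD", "Text": "$1,250.00"},
    ]
}

if __name__ == "__main__":
    print(to_common_format(extract_key_values(SAMPLE_RESPONSE)))
```

In a real pipeline the response would come from the Textract service rather than a hand-written dictionary, and a mapping like this would more likely live in a Glue job than in a standalone script. A plain lookup table is used here only because it keeps the idea of "many document layouts, one target schema" easy to see.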