Oil & Gas - Intelligent Document Processing

How a financial services firm uses AI-powered intelligent document processing (IDP) to save 6,000 FTE hours annually and dramatically reduce data processing time.

Industry

Financial Services

Teams & Services

/DevOps /Back-end /Front-end /Microservices /Data Lake /Big Data /Data Analytics /UI + UX

Tech & Tools

/AWS /DynamoDB /S3 /Lambda /Glue /Comprehend /Textract /SES /Terraform /Python

Key Data Points

720,000,000 data fields ETL’d (Extracted, Transformed and Loaded) with 99.96% accuracy
2,000,000 pages processed from 20+ unique document formats (different tables, no tables, styles, delimiters, structures, nested structures, etc.)
Work that previously required 6 full-time employees now takes only minutes of validation, limited to the 0.04% of records that need manual review
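
To put these figures in perspective, the short calculation below derives two numbers that are not stated in the case study itself. It is a back-of-the-envelope sketch that assumes the 99.96% accuracy rate applies uniformly at the field level; the variable names are illustrative only.

```python
# Figures taken from the key data points above.
TOTAL_FIELDS = 720_000_000
TOTAL_PAGES = 2_000_000
ACCURACY_BASIS_POINTS = 9_996  # 99.96% expressed in hundredths of a percent

# Assumption: the remaining 0.04% maps directly onto individual fields.
fields_needing_review = TOTAL_FIELDS * (10_000 - ACCURACY_BASIS_POINTS) // 10_000

# Average number of extracted fields per processed page.
average_fields_per_page = TOTAL_FIELDS // TOTAL_PAGES

print(fields_needing_review)    # 288000
print(average_fields_per_page)  # 360
```

Integer arithmetic is used here so the result is exact rather than subject to floating-point rounding.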

The Vision

One of Protagona’s financial services customers aimed to process and track their clients’ tax and royalty payment information, pulled from various physical and digital document types, through a fully automated process that would maintain data integrity while consolidating all client information into a single data format.

The Goal

Customer wanted to eliminate the full-time employee effort required to collect data from both physical documents and digital spreadsheets produced by partner companies within the industry. At the same time, Customer sought to improve the quality of the data and to ensure that it was available to key decision makers and technical staff for consumption and analytics, in support of both client and business goals.

The Challenge

  • Physical documents used by Customer varied widely in format and structure, with some having very little structure
  • Digital documents used by Customer contained similar information across formats, but varied heavily by producer and needed to be procured from the web
  • Both physical and digital documents were processed manually by full-time employees, which was not a scalable business model
  • Data from both physical and digital formats would require heavy pre-processing and post-processing to normalize the data and make it suitable for business purposes
  • Customer did not have any technical staff capable of maintaining cloud-native development or infrastructure upon project commencement

The Solution

  • The solution reads physical documents, recognizes and transcribes the contents, determines the data type, and stores the data in a normalized format that the business can use.
  • The architecture is built entirely from cloud-native components that scale near-infinitely and are economical. All of it is deployed and maintained as Infrastructure as Code (IaC), so that testing and deployment can be done programmatically.
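
The transcribe, classify, and normalize steps described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the customer's implementation: it assumes OCR output shaped like an Amazon Textract text-detection response (a "Blocks" list whose "LINE" entries carry "Text" and "Confidence"). The "Label: amount" line layout, the REVIEW_THRESHOLD value, and the names NormalizedField, extract_lines, normalize_line, and process are all hypothetical. The storage step is omitted.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import List, Optional, Tuple

# Hypothetical cut-off: lines below this OCR confidence are flagged for manual review.
REVIEW_THRESHOLD = 90.0


@dataclass
class NormalizedField:
    name: str           # snake_case field name derived from the printed label
    value: Decimal      # numeric amount with currency symbols removed
    needs_review: bool  # True when OCR confidence is below the threshold


def extract_lines(ocr_response: dict) -> List[Tuple[str, float]]:
    """Collect (text, confidence) pairs from LINE blocks of a Textract-style response."""
    return [
        (block["Text"], block["Confidence"])
        for block in ocr_response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def normalize_line(text: str, confidence: float) -> Optional[NormalizedField]:
    """Parse an assumed 'Label: $1,234.56' layout; return None if the line does not match."""
    label, separator, raw_value = text.partition(":")
    if not separator:
        return None
    cleaned = raw_value.strip().replace("$", "").replace(",", "")
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    name = "_".join(label.strip().lower().split())
    return NormalizedField(name, value, confidence < REVIEW_THRESHOLD)


def process(ocr_response: dict) -> List[NormalizedField]:
    """Run every transcribed line through normalization, dropping non-matching lines."""
    parsed = (normalize_line(text, conf) for text, conf in extract_lines(ocr_response))
    return [field for field in parsed if field is not None]


# Hand-written sample response used only for illustration.
sample = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Royalty Payment: $1,234.56", "Confidence": 99.1},
        {"BlockType": "LINE", "Text": "Tax Withheld: 78.90", "Confidence": 62.4},
        {"BlockType": "LINE", "Text": "Statement of account", "Confidence": 98.0},
    ]
}

fields = process(sample)
for field in fields:
    print(field)
```

Amounts are parsed as Decimal rather than float so that currency values are not affected by binary rounding. Flagging low-confidence lines instead of discarding them mirrors the idea of routing only a small share of records to manual validation.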

OUTCOMES
