Nonprofit Transforms Affiliate Financial Oversight with Automated Form 990 Data Lake in 4 Weeks

Nonprofit partners with Protagona to build AWS-powered data lake, delivering 5x performance improvements and self-service analytics for 1,000+ affiliates

Industry

Teams & Services

Tech & Tools

AWS Fargate, Step Functions, Glue, S3, DynamoDB, Lake Formation, QuickSight, Amazon Q, Athena, Bedrock Claude 3.5, ProPublica API

Key Data Points

~3M files processed monthly, filtered to ~5K relevant records in 10-11 minutes
400+ XML columns extracted with AI-generated user-friendly names
Delivered in 4 weeks, on time and on budget

The Vision

A nonprofit set out to transform how financial oversight works across a network of 1,000+ affiliates—replacing hours of manual IRS Form 990 analysis with a centralized, automated system that delivers real-time performance visibility and data integrity at scale.

The Goal

This nonprofit needed to consolidate financial oversight for 1,000+ affiliates into a single automated platform—reducing hours of manual Form 990 analysis to zero, establishing systematic data verification against official IRS filings, and giving leadership an always-current view of affiliate financial health.

The Challenge

The nonprofit needed visibility into financial health across 1,000+ affiliate organizations. Staff spent hours manually downloading and analyzing IRS Form 990 tax documents with no centralized view of affiliate performance or systematic verification of reported data against official filings.

Technical Complexity:

  • Processing ~3 million IRS XML files monthly to find relevant records
  • Unreliable IRS index files requiring full ZIP processing
  • Form 990, 990-EZ, and 18+ schedule types with inconsistent XML schemas
  • 400+ cryptic XML columns requiring transformation for usability
  • Self-service analytics needed for non-technical users
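
To make the first two items concrete, here is a minimal Python sketch of scanning a full ZIP archive and keeping only the returns filed by known affiliates. It is an illustration, not Protagona's implementation. The function name `filter_returns`, the XML namespace, and the element paths (`ReturnHeader/Filer/EIN`, `ReturnHeader/ReturnTypeCd`) are assumptions about the IRS e-file layout and should be checked against the schema versions actually present in the data.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumption: IRS e-file returns use this namespace and store the filer EIN
# under ReturnHeader/Filer/EIN. Older schema versions may differ.
IRS_NS = {"irs": "http://www.irs.gov/efile"}


def filter_returns(zip_bytes, affiliate_eins):
    """Yield (member_name, ein, return_type) for returns filed by known affiliates.

    Every XML member of the archive is opened, because the sketch assumes the
    index files cannot be trusted to say which returns are inside.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".xml"):
                continue  # ignore index files and other non-XML members
            with archive.open(name) as member:
                try:
                    root = ET.parse(member).getroot()
                except ET.ParseError:
                    continue  # skip malformed files rather than abort the batch
            ein = root.findtext("irs:ReturnHeader/irs:Filer/irs:EIN", namespaces=IRS_NS)
            if ein is None or ein.strip() not in affiliate_eins:
                continue  # not one of the affiliates we track
            return_type = root.findtext(
                "irs:ReturnHeader/irs:ReturnTypeCd",
                default="unknown",
                namespaces=IRS_NS,
            )
            yield name, ein.strip(), return_type
```

Keeping the affiliate EINs in a set makes each membership check constant-time, which matters when millions of files are reduced to a few thousand relevant records.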

The Solution

Protagona built an AWS-powered medallion data lake that automatically ingests, cleans, and surfaces IRS Form 990 data across 1,000+ affiliates. AI-powered transformation via Amazon Bedrock converts 400+ cryptic XML fields into plain-language names, while QuickSight dashboards and Amazon Q deliver self-service analytics with no SQL required. ETL performance improved roughly 5x, from 30 minutes to just over 5 minutes.
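
The column-renaming step could look something like the following Python sketch. It is a hedged illustration only: the case study does not describe the prompt, the response format, or the validation logic. The names `build_prompt` and `friendly_names` are hypothetical, and the model call is passed in as a plain callable (`invoke_model`) so the example runs without AWS credentials. In a real pipeline that callable would wrap a Bedrock request.

```python
import json


def build_prompt(columns):
    """Build a prompt asking the model for a JSON mapping of old name to label."""
    return (
        "Rename each IRS Form 990 XML column to a short, plain-language label. "
        "Reply with a JSON object mapping each original name to its label.\n"
        + json.dumps(columns)
    )


def friendly_names(columns, invoke_model):
    """Return {original_name: label}, falling back to the original name.

    invoke_model is a hypothetical callable: it takes a prompt string and
    returns the model's text reply.
    """
    raw = invoke_model(build_prompt(columns))
    try:
        proposed = json.loads(raw)
    except json.JSONDecodeError:
        proposed = {}  # unusable reply: keep every original name
    if not isinstance(proposed, dict):
        proposed = {}

    names = {}
    used = set()
    for col in columns:
        label = proposed.get(col)
        label = label.strip() if isinstance(label, str) else ""
        if not label or label in used:
            label = col  # missing, empty, or duplicate label: keep the original
        used.add(label)
        names[col] = label
    return names
```

Validating the reply this way means a malformed or incomplete model response degrades to the original column names instead of breaking the downstream tables.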

OUTCOMES
