Skip to content

aws-samples/sample-ai-receipt-processing-methods

AI Receipt Processing Demo

A comprehensive demonstration of AI-powered receipt processing using AWS services, comparing four different methods for intelligent data extraction from receipts.

⚠️ Sample Data Only: This is a demonstration application. Do not upload receipts containing real sensitive data, PII, or financial information. Use sample/test receipts only.

What is AI Receipt Processing?

This sample demonstrates how AI and machine learning transform receipt processing. Unlike basic OCR that only converts images to text, these methods understand document context, structure, and meaning to extract structured business data automatically.

Four AI Processing Methods Compared

This sample compares four distinct AWS approaches to intelligent receipt processing, each representing different levels of AI sophistication and implementation complexity:

  1. Amazon Textract - Document API: Basic OCR with key-value detection, requiring heavy custom parsing
  2. Amazon Textract - Expense API: Pre-trained receipt/invoice processing with structured output
  3. Amazon Bedrock Data Automation: Custom blueprints with AI-enhanced field extraction
  4. Vision LLMs (Nova/Claude): Direct image-to-JSON conversion using large language models with vision capabilities

The comparison table below analyzes these methods across parsing requirements, field completeness, output quality, processing time, cost, and flexibility to help you choose the right approach for your use case.

Deployment Guide

Prerequisites

  • Node.js 18+ and npm - Required for CDK, React frontend, and dependency management
    • Download from nodejs.org (npm is included with Node.js)
    • Verify installation: node --version and npm --version
  • AWS CLI - For AWS account access and configuration
  • AWS Account - With appropriate permissions for creating resources
  • Git - For cloning the repository

AWS Setup (First Time Only)

  1. Configure AWS CLI

    aws configure
    # Enter your AWS Access Key ID, Secret Access Key, Region (us-east-1), and output format (json)
  2. Install AWS CDK CLI

    npm install -g aws-cdk
    # Verify installation
    cdk --version

Deployment Steps

  1. Clone Repository

    git clone <repository-url>
    cd sample-ai-receipt-processing-methods
  2. Install Dependencies

    npm install
    cd src/frontend && npm install && cd ../..
  3. Build Frontend

    cd src/frontend && npm run build && cd ../..
  4. Bootstrap CDK (one-time per AWS account/region)

    cdk bootstrap aws://YOUR-ACCOUNT-ID/us-east-1
  5. Deploy

    cdk deploy

    For other regions: cdk deploy -c region=us-west-2

    ⚠️ Verify Amazon Bedrock Data Automation and foundation model availability in your target region before deployment

  6. Access Application

    • Use the CloudFront URL from deployment output
    • Create account via Cognito sign-up
    • Start testing the four AI processing methods

Required AWS Permissions

Your AWS user/role needs permissions for: AWS CloudFormation, AWS IAM, AWS Lambda, Amazon API Gateway, Amazon S3, Amazon DynamoDB, Amazon Cognito, Amazon CloudFront, AWS Step Functions, Amazon Textract, and Amazon Bedrock.

Technology Stack

Frontend: React PWA with Vite, Amazon CloudFront CDN, Amazon S3 hosting, Camera API for mobile capture

Backend: Amazon API Gateway (REST), AWS Lambda (Node.js 18.x, Python 3.11), AWS Step Functions for workflow orchestration

Storage: Amazon S3 (receipt images, BDA output), Amazon DynamoDB (users, expenses, receipts tables)

AI/ML: Amazon Textract (Document & Expense APIs), Amazon Bedrock Data Automation, Amazon Bedrock (Nova, Claude vision models)

Security: Amazon Cognito (authentication), AWS KMS (encryption), AWS IAM (least privilege roles), Amazon CloudWatch & AWS X-Ray (monitoring)

Architecture

Architecture Diagram

This serverless application uses AWS managed services for scalable, secure receipt processing. Users upload receipts via Amazon CloudFront/Amazon S3, triggering AWS Step Functions workflows that route to one of four AI processing methods (Amazon Textract Document/Expense, BDA, or LLM Vision). Extracted data is stored in Amazon DynamoDB, with Amazon Cognito handling authentication and Amazon API Gateway managing all API requests.

Testing the AI Processing Methods

  1. Access the Application - Use the Amazon CloudFront URL from AWS CDK deployment output
  2. Sign Up/Sign In - Create an account via Amazon Cognito
  3. Upload Test Receipts - Try different receipt types and formats
  4. Compare Results - See how each processing method handles the same document
  5. Analyze Performance - Check processing time and accuracy differences

Key Project Structure

sample-ai-receipt-processing-methods/
├── src/
│   ├── frontend/                   # React demo UI
│   └── lambda/
│       ├── ocr-functions/          # Four AI processing methods
│       │   ├── ocr_with_textract.py           # Textract Expense API
│       │   ├── ocr_with_textract_document.py  # Textract Document API
│       │   ├── ocr_with_bda.py                # BDA Custom Blueprint
│       │   └── ocr_with_llm.py                # LLM Vision Models
│       ├── expense-management/     # Expense CRUD operations
│       ├── user-management/        # User profile operations
│       ├── receipt-presigned-url/  # S3 upload URL generation
│       ├── receipt-status/         # Processing status API
│       ├── receipt-processing/     # S3 event handler
│       └── receipt-metadata-extractor/ # Receipt metadata extraction
└── infrastructure/                 # AWS CDK deployment code
    └── lib/stacks/compute-stack.ts # Includes custom BDA blueprint definition

OCR Function Comparison

Metric Textract Document Textract Expense BDA Custom Blueprint LLM Vision
Parsing Logic ❌ Heavy ⚠️ Moderate ✅ Minimal ✅ Minimal
Setup Required None None Custom Blueprint Prompt Engineering
Field Completeness ❌ Incomplete ⚠️ Partial (missing title/category/description) ✅ Complete (7 fields) ✅ Complete (7 fields)
Output Quality ❌ Basic (OCR only) ⚠️ Good (structured) ✅ Excellent (AI-enhanced) ✅ Excellent (AI-enhanced)
Processing Time ~2-5s ~2-5s ~15-25s ~5-10s
Cost per 1K Receipts* ~$65.00 ~$10.00 ~$5.00 ~$0.10 (Nova Lite)
Flexibility ❌ Low ⚠️ Medium ✅ High ✅ Very High
Conclusion Most expensive (~$65/1K), heavy parsing, incomplete data - not recommended for receipt scanning Moderate cost (~$10/1K), fast processing, good for structured receipt data but missing AI-enhanced fields Lower cost (~$5/1K), complete extraction with AI enhancement, but slower processing and requires blueprint setup Lowest cost (~$0.10/1K), excellent quality, complete fields, flexible - best balance of cost, speed, and intelligence

Cost Disclaimer: Costs are approximate estimates based on US East (Ohio) region pricing as of January 2025. Actual costs may vary based on region, volume tier, document complexity, and AWS pricing changes. LLM costs include both image input tokens (~1,200 per receipt) and text output tokens (120 per receipt). Textract Document API cost is highest ($65/1K) due to requiring FORMS and TABLES features with heavy parsing.


Key Design Decisions

Why Four Processing Methods?

This AWS sample compares different AI-powered receipt processing approaches to show the evolution from basic OCR to AI-powered extraction, highlighting trade-offs in cost, accuracy, completeness, and flexibility to help developers choose the right approach.

Why Serverless?

Serverless architecture provides auto-scaling for variable receipt processing loads, cost-effective pay-per-use pricing, reduced operational overhead with managed services, and fast deployment through infrastructure as code with CDK.

Why AWS Step Functions?

AWS Step Functions orchestrate the workflow routing to different OCR methods, provide clear visibility into processing status, offer built-in retry and error handling, and make it easy to add new processing methods.

Security

This sample demonstrates AWS security best practices:

  • Authentication: Amazon Cognito user authentication with JWT tokens, Amazon API Gateway Cognito Authorizer for token validation
  • Encryption: All data encrypted at rest (AWS KMS with automatic rotation) and in transit (TLS 1.2+)
  • Access Control: AWS IAM roles with least privilege for each AWS Lambda function, presigned Amazon S3 URLs with 15-minute expiration
  • Input Validation: File upload validation (size, type), Amazon API Gateway request validators, sanitized user inputs
  • Monitoring: Amazon CloudWatch Logs, AWS X-Ray distributed tracing, AWS CloudTrail audit logs

Known Limitations

This is a demonstration application optimized for cost and ease of deployment. For production use, consider implementing:

  • Security: AWS WAF protection, Amazon Cognito MFA (set to REQUIRED), Amazon VPC isolation for AWS Lambda functions, log sanitization to remove PII
  • Scalability: Caching layer, enhanced error handling, per-user rate limiting
  • Data Management: Automated retention policies, Amazon DynamoDB Point-in-Time Recovery, backup/restore procedures
  • Monitoring: Amazon CloudWatch alarms for errors and cost anomalies, enhanced audit logging

Cleanup

To avoid ongoing AWS charges, destroy the deployed resources:

cdk destroy

Note: This will delete all resources including Amazon S3 buckets, Amazon DynamoDB tables, and stored receipts. Backup any data you want to keep before destroying.


Troubleshooting

CDK Bootstrap Error: Ensure you've run cdk bootstrap in your target region Deployment Fails: Check AWS credentials with aws sts get-caller-identity BDA Not Available: Ensure Amazon Bedrock Data Automation and your selected models are available in your deployment region Frontend Not Loading: Wait 5-10 minutes for Amazon CloudFront distribution to deploy AWS Lambda Timeout: First invocation may be slow due to cold start (normal behavior)


Additional Resources

About

No description, website, or topics provided.

Resources

License

MIT-0, MIT-0 licenses found

Licenses found

MIT-0
LICENSE
MIT-0
LICENSE.txt

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published