Skip to content

An automated document analyzer for Paperless-ngx using OpenAI API and Ollama (Mistral, llama, phi 3, gemma 2) to automatically analyze and tag your documents.

License

Notifications You must be signed in to change notification settings

clusterzx/paperless-ai

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

GitHub commit activity Docker Pulls GitHub User's stars GitHub License

Paperless-AI

An automated document analyzer for Paperless-ngx using OpenAI API and Ollama (Mistral, llama, phi 3, gemma 2) to automatically analyze and tag your documents.
It features: Automode, Manual Mode, Ollama and OpenAI, a Chat function to query your documents with AI, a modern and intuitive Webinterface.

Disclaimer:

paperless-ai makes changes to the documents in your productive paperlessNGX instance that cannot be easily undone. Do the configuration carefully and think twice. Please test the results beforehand in a separate development environment and be sure to back up your documents and metadata beforehand.

We exploded! πŸ’₯

πŸ’š Thank you for all your support, bug submit, feature requests πŸ’š

Star History Chart

Features

Features

Automated Document Management

  • Automatic Scanning: Identifies and processes new documents within Paperless-ngx.
  • AI-Powered Analysis: Leverages OpenAI API and Ollama (Mistral, Llama, Phi 3, Gemma 2) for precise document analysis.
  • Metadata Assignment: Automatically assigns titles, tags, and correspondent details.

Advanced Customization Options

  • Predefined Processing Rules: Specify which documents to process based on existing tags. (Optional) πŸ†•
  • Selective Tag Assignment: Use only selected tags for processing. (Disables the prompt dialog) πŸ†•
  • Custom Tagging: Assign a specific tag (of your choice) to AI-processed documents for easy identification. πŸ†•

Manual Mode

  • AI-Assisted Analysis: Manually analyze documents with AI support in a modern web interface. (Accessible via the /manual endpoint) πŸ†•

Interactive Chat Functionality

  • Document Querying: Ask questions about your documents and receive accurate, AI-generated answers. πŸ†•

Intuitive Web Interface

  • Streamlined Configuration: Easy-to-use setup interface available at /setup.
  • Dashboard Overview: A clean and organized dashboard for monitoring and managing document processing.

Reliability and Security

  • Error Handling: Automatic restarts and health monitoring for improved stability.
  • Health Checks: Ensures system integrity and alerts on potential issues.
  • Docker Integration: Full Docker support, including health checks, resource management, and persistent data storage.

Prerequisites

  • Docker and Docker Compose
  • Access to a Paperless-ngx installation
  • OpenAI API key or your own Ollama instance with your chosen model running and reachable.
  • Basic understanding of cron syntax (for scan interval configuration)

Installation

Visit the Wiki for installation:
Click here for Installation

How it Works

  1. Document Discovery

    • Periodically scans Paperless-ngx for new documents
    • Tracks processed documents in a local SQLite database
  2. AI Analysis

    • Sends document content to OpenAI API or Ollama for analysis
    • Extracts relevant tags and correspondent information
    • Uses GPT-4o-mini or your custom Ollama model for accurate document understanding
  3. Automatic Organization

    • Creates new tags if they don't exist
    • Creates new correspondents if they don't exist
    • Updates documents with analyzed information
    • Marks documents as processed to avoid duplicate analysis

NEW! Manual Mode

You can now manually analyze your files by hand with the help of AI in a beautiful Webinterface. Reachable via the /manual endpoint from the webinterface.

NEW Dashboard:

Dashboard Image

Configuration Options

The application can be configured through the Webinterface on the /setup Route. You dont need/can't set the environment vars through docker.

Setup Image

Docker Support

The application comes with full Docker support:

  • Automatic container restart on failure
  • Health monitoring
  • Volume persistence for database
  • Resource management
  • Graceful shutdown handling

Docker Commands

# Start the container
docker-compose up -d

# View logs
docker-compose logs -f

# Restart container
docker-compose restart

# Stop container
docker-compose down

# Rebuild and start
docker-compose up -d --build

Health Checks

The application provides a health check endpoint at /health that returns:

# Healthy system
{
  "status": "healthy"
}

# System not configured
{
  "status": "not_configured",
  "message": "Application setup not completed"
}

# Database error
{
  "status": "database_error",
  "message": "Database check failed"
}

Debug Interface

The application includes a debug interface accessible via /debug that helps administrators monitor and troubleshoot the system's data:

  • πŸ” View all system tags
  • πŸ“„ Inspect processed documents
  • πŸ‘₯ Review correspondent information

Accessing the Debug Interface

  1. Navigate to:
http://your-instance:3000/debug
  1. The interface provides:
    • Interactive dropdown to select data category
    • Tree view visualization of JSON responses
    • Color-coded data representation
    • Collapsible/expandable data nodes

Available Debug Endpoints

Endpoint Description
/debug/tags Lists all tags in the system
/debug/documents Shows processed document information
/debug/correspondents Displays correspondent data

Health Check Integration

The debug interface also integrates with the health check system, showing a configuration warning if the system is not properly set up.

Development

To run the application locally without Docker:

  1. Install dependencies:
npm install
  1. Start the development server:
npm run test

Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Security Considerations

  • Store API keys securely
  • Restrict container access
  • Monitor API usage
  • Regularly update dependencies
  • Back up your database

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Paperless-ngx for the amazing document management system
  • OpenAI API
  • The Express.js and Node.js communities for their excellent tools

Support

If you encounter any issues or have questions:

  1. Check the Issues section
  2. Create a new issue if yours isn't already listed
  3. Provide detailed information about your setup and the problem

Roadmap

  • Support for custom AI models
  • Support for multiple language analysis
  • Advanced tag matching algorithms
  • Custom rules for document processing
  • Enhanced web interface with statistics