- Extract text and images from documents
- Vision language model support (OpenAI and Ollama)
- Support for PDF, DOCX, PPTX, and HTML files
- Interactive HTML page capture
- Added file extractor API diagram
- Enhanced documentation with Mermaid diagrams
- Added retry mechanism for API calls
- Improved model factory implementation
- Via PyPI: `pip install pyvisionai`
- Via Homebrew: `brew install pyvisionai`
- Python 3.11+
- 1 GB of disk space (more if you use the Llama model)
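
The two requirements above can be checked before installing. The following is a minimal sketch, not part of pyvisionai: `check_requirements` is a hypothetical helper, and treating "1 GB" as 1 GiB of free space in the current directory is an assumption. It uses only the Python standard library.

```python
import shutil
import sys

MIN_PYTHON = (3, 11)           # "Python 3.11+" from the requirements list
MIN_FREE_BYTES = 1 * 1024**3   # "1 GB", interpreted here as 1 GiB (assumption)


def check_requirements(version=None, free_bytes=None, path="."):
    """Return a list of problems; the list is empty if both requirements are met.

    `version` and `free_bytes` default to the running interpreter and the
    free space on the filesystem containing `path`; they can be passed
    explicitly to test the logic without touching the real environment.
    """
    if version is None:
        version = sys.version_info[:2]
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free

    problems = []
    if tuple(version) < MIN_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} found, "
            f"{MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required"
        )
    if free_bytes < MIN_FREE_BYTES:
        problems.append(
            f"{free_bytes / 1024**3:.2f} GiB free, at least 1 GiB required"
        )
    return problems


if __name__ == "__main__":
    issues = check_requirements()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment meets the listed requirements.")
```

The check does not account for the extra space a local Llama model needs, since the list does not state a figure for it.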