mdfy is a FastAPI-based web service that converts various document formats to Markdown.
This repository contains the source code for the PDF to Markdown conversion service, which uses code from the RAG project.
- Convert PDF, DOCX, XLSX, PPTX, HWP, OXPS, EPUB, and MOBI files to Markdown
- Process files from URLs or direct uploads
- Redis caching for improved performance
- OAuth2 authentication for secure access
- Python 3.11+
- Poetry
- Redis server
- Docker (optional)
- Clone the repository:
git clone https://github.com/jaigouk/mdfy.git
cd mdfy
- Create a virtual environment and activate it:
conda create -n mdfy python=3.11
conda activate mdfy
- Install the required packages:
poetry install
- Copy the .env.example file to .env and fill in the required values:
cp .env.example .env
poetry run uvicorn mdfy:app --reload
The server will be available at http://127.0.0.1:8000.
- Build the Docker image:
docker build -t mdfy .
- Run the container:
docker run -p 8000:8000 --env-file .env mdfy
The server will be available at http://localhost:8000.
- GET /: Welcome message
- GET /health: Health check endpoint
- POST /process_url/: Convert a document from a URL to Markdown
- POST /process_upload/: Convert an uploaded document to Markdown
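
As a rough illustration of calling the URL conversion endpoint from Python, the sketch below builds a POST request for /process_url/ using only the standard library. The request body shape (a JSON object with a "url" field) and the bearer-token header are assumptions, not taken from this repository; check the Swagger UI for the actual schema. The helper names build_process_url_request and convert_from_url are hypothetical.

```python
import json
import urllib.request


def build_process_url_request(base_url, document_url, token):
    """Build (but do not send) a POST request for the /process_url/ endpoint."""
    # ASSUMPTION: the endpoint accepts a JSON body with a "url" field.
    payload = json.dumps({"url": document_url}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/process_url/",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: the OAuth2 access token is sent as a bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


def convert_from_url(base_url, document_url, token):
    """Send the request and return the raw response body as text."""
    request = build_process_url_request(base_url, document_url, token)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running locally, convert_from_url("http://127.0.0.1:8000", "https://example.com/sample.pdf", "YOUR_ACCESS_TOKEN") would send the request; the URL and token here are placeholders.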
FastAPI automatically generates interactive API documentation:
- Swagger UI: http://127.0.0.1:8000/docs
- ReDoc: http://127.0.0.1:8000/redoc
You can use these interfaces to explore and test the API endpoints.
The OpenAPI (Swagger) specification is available at: http://127.0.0.1:8000/openapi.json
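
The OpenAPI document can also be inspected programmatically. The following is a minimal sketch that lists the method and path of each operation in a parsed specification; list_operations and fetch_spec are hypothetical helper names, and the sketch relies only on the standard "paths" layout of an OpenAPI document.

```python
import json
import urllib.request

# HTTP methods that may appear as keys under a path item in OpenAPI.
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def list_operations(spec):
    """Return (METHOD, path) pairs from an OpenAPI document, sorted by path."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Skip non-method keys such as "parameters" or "summary".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations, key=lambda op: (op[1], op[0]))


def fetch_spec(base_url="http://127.0.0.1:8000"):
    """Download and parse /openapi.json from a running server."""
    with urllib.request.urlopen(base_url.rstrip("/") + "/openapi.json") as response:
        return json.load(response)
```

With the server running, list_operations(fetch_spec()) returns the endpoints the service actually exposes, which is a quick way to confirm request paths before writing a client.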
poetry run pytest
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). See the LICENSE file for details.