Moondream Web Interface

A modern web interface for the Moondream vision language model, built with Next.js and FastAPI. This project provides a user-friendly way to interact with images using Moondream's vision-language capabilities.

Core Features

🌓 Light/Dark Mode: Automatic theme switching with system preference detection
🖼️ Drag-and-Drop Upload: Easy image uploading with drag-and-drop support
💬 Interactive Q&A: Ask questions about uploaded images through a chat interface
🚀 Smooth Animations: Beautiful transitions powered by Framer Motion
🔒 Privacy-First: All processing happens locally on your machine
📱 Responsive Design: Optimized for all devices and screen sizes
⚡ CUDA Support: GPU acceleration for faster inference
🎨 Modern UI: Built with Next.js, Tailwind CSS, and Framer Motion

Architecture

Frontend (Next.js)

Theme System: Light/dark mode with system preference detection
Image Upload Component: Drag-and-drop image handling and preview
Chat Interface: Interactive Q&A about uploaded images
State Management: Maintains image and chat state
API Integration: Communicates with FastAPI backend

Backend (FastAPI)

Model Management: Loads and manages Moondream model
Image Processing: Handles image encoding and caching
Q&A System: Processes questions about encoded images
Memory Management: Cleans up cached encodings

Detailed API Flow

Image Description Flow

User uploads image via frontend
Image sent to /describe endpoint
Backend encodes image and caches encoding
Model generates description
Frontend displays description and enables Q&A

Question-Answer Flow

User types question in chat interface
Question sent to /ask endpoint with image key
Backend retrieves cached encoding
Model generates answer
Frontend displays response in chat
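
The two flows above can be sketched as plain Python functions. This is a minimal illustration, not the project's actual backend code: `StubModel`, its `encode_image` and `answer` methods, and the `describe` and `ask` function names are all hypothetical stand-ins, and the real handlers would be FastAPI routes that call the Moondream model.

```python
import time
from typing import Any, Dict


class StubModel:
    """Hypothetical stand-in for the Moondream model."""

    def encode_image(self, image_bytes: bytes) -> Dict[str, Any]:
        # A real model would return an image embedding; the stub keeps only the size.
        return {"num_bytes": len(image_bytes)}

    def answer(self, encoding: Dict[str, Any], question: str) -> str:
        # A real model would generate text conditioned on the encoding.
        return f"[{encoding['num_bytes']} byte image] {question}"


model = StubModel()
encoding_cache: Dict[str, Dict[str, Any]] = {}  # image_key -> cached encoding


def describe(image_bytes: bytes) -> Dict[str, str]:
    """Image description flow: encode once, cache the encoding, describe."""
    encoding = model.encode_image(image_bytes)
    image_key = str(time.time_ns())  # timestamp-based key (assumed format)
    encoding_cache[image_key] = encoding
    description = model.answer(encoding, "Describe this image.")
    return {"description": description, "image_key": image_key}


def ask(question: str, image_key: str) -> Dict[str, str]:
    """Question-answer flow: reuse the cached encoding instead of re-encoding."""
    encoding = encoding_cache.get(image_key)
    if encoding is None:
        raise KeyError(f"unknown image_key: {image_key}")
    return {"answer": model.answer(encoding, question)}
```

The point of the split is that the expensive step (image encoding) runs once in `describe`, while every later `ask` call only needs the key.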

Installation

Prerequisites

# System Requirements
- Python 3.8+
- Node.js 16+
- CUDA-capable GPU (recommended)
- 8GB+ RAM

# Python Dependencies
pip install transformers einops torch fastapi uvicorn python-multipart pillow

# Node.js Dependencies
npm install axios framer-motion @radix-ui/react-slot formidable

Backend Setup

# Clone repository
git clone [repository-url]
cd moondream-web

# Install Python dependencies
pip install -r requirements.txt

# Start FastAPI server
uvicorn app:app --host 127.0.0.1 --port 8000 --reload

Frontend Setup

# Navigate to frontend directory
cd moondream-web

# Install dependencies
npm install

# Start development server
npm run dev

API Endpoints

`/describe` (POST)

Handles initial image upload and description

Input: Image file (multipart/form-data)

Output:

{
  "description": "Generated description of the image",
  "image_key": "Unique key for cached encoding"
}
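
As a rough sketch of how a client could call this endpoint using only the Python standard library, the helper below builds the multipart request by hand. The function name `build_describe_request`, the form field name `file`, and the default content type are assumptions; check `app.py` for the field name the backend actually expects.

```python
import urllib.request
import uuid


def build_describe_request(
    image_bytes: bytes,
    filename: str,
    base_url: str = "http://127.0.0.1:8000",
    field_name: str = "file",          # assumed form field name
    content_type: str = "image/jpeg",  # assumed image type
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for /describe."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/describe",
        data=head + image_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

With the server running, the request can be sent with `urllib.request.urlopen(req)` and the JSON body parsed with `json.load` to obtain `description` and `image_key`.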

`/ask` (POST)

Handles questions about previously uploaded images

Input:

{
  "question": "User's question about the image",
  "image_key": "Key from previous describe call"
}

Output:

{
  "answer": "Model's answer to the question"
}
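
A follow-up question could be sent along these lines. This sketch assumes the endpoint accepts a JSON body shaped like the input above; `build_ask_request` and `parse_ask_response` are hypothetical helper names, and the backend may instead expect form fields.

```python
import json
import urllib.request


def build_ask_request(
    question: str,
    image_key: str,
    base_url: str = "http://127.0.0.1:8000",
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST for /ask."""
    payload = json.dumps({"question": question, "image_key": image_key})
    return urllib.request.Request(
        f"{base_url}/ask",
        data=payload.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def parse_ask_response(raw: bytes) -> str:
    """Extract the answer text from the /ask response body."""
    return json.loads(raw)["answer"]
```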

`/health` (GET)

System health and status check

Output:

{
  "status": "healthy",
  "model_loaded": true,
  "tokenizer_loaded": true,
  "cuda_available": true,
  "device": "cuda:0"
}
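
One way a client or startup script might interpret this payload is sketched below. The helper name `summarize_health` and the rule that "ready" means a healthy status plus both loaded flags are assumptions made for illustration.

```python
import json
from typing import Any, Dict

REQUIRED_FLAGS = ("model_loaded", "tokenizer_loaded")


def summarize_health(raw: str) -> Dict[str, Any]:
    """Reduce the /health payload to readiness, GPU availability, and device."""
    payload = json.loads(raw)
    ready = payload.get("status") == "healthy" and all(
        payload.get(flag) is True for flag in REQUIRED_FLAGS
    )
    return {
        "ready": ready,
        "gpu": bool(payload.get("cuda_available")),
        "device": payload.get("device", "unknown"),
    }
```

A payload with `"cuda_available": false` can still be ready; it only means inference will run on the CPU.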

Implementation Details

Theme System

Uses next-themes for theme management
Automatically detects system color scheme preference
Smooth transitions between light and dark modes
Persists user theme preference

Image Encoding Cache

Stores encoded images in memory, keyed by unique timestamps
Enables fast follow-up Q&A without re-encoding the image
Lives only in memory, so it is cleared whenever the server restarts
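
A timestamp-keyed in-memory cache of this kind might look like the sketch below. The class name `EncodingCache`, the nanosecond key format, the lock, and the collision guard are all assumptions for illustration; the actual backend may simply use a module-level dictionary.

```python
import threading
import time
from typing import Any, Dict, Optional


class EncodingCache:
    """In-memory store for image encodings, keyed by unique timestamps."""

    def __init__(self) -> None:
        self._entries: Dict[str, Any] = {}
        self._lock = threading.Lock()
        self._last_key = 0

    def put(self, encoding: Any) -> str:
        """Store an encoding and return its key."""
        with self._lock:
            # Bump past the previous key so two uploads in the same clock tick
            # still receive distinct keys.
            key = max(time.time_ns(), self._last_key + 1)
            self._last_key = key
            self._entries[str(key)] = encoding
            return str(key)

    def get(self, image_key: str) -> Optional[Any]:
        """Return the cached encoding, or None if the key is unknown."""
        with self._lock:
            return self._entries.get(image_key)

    def clear(self) -> int:
        """Drop all cached encodings and return how many were removed."""
        with self._lock:
            removed = len(self._entries)
            self._entries.clear()
            return removed
```

Because nothing is written to disk, any `image_key` held by the frontend becomes invalid after a restart, and the image has to be uploaded again.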

Error Handling

Frontend displays user-friendly error messages
Backend provides detailed error logging
Graceful fallbacks for common failure cases

Performance Optimizations

Uses torch.float16 for reduced memory usage
CUDA acceleration when available
Efficient image encoding caching
Streaming responses for large payloads
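
The device and precision choice can be summarized in a small helper. This is a sketch under stated assumptions: `RuntimeConfig` and `select_runtime` are hypothetical names, the CPU fallback to `float32` is an assumption, and in a real backend the flag would typically come from `torch.cuda.is_available()` with the dtype names mapped to `torch.float16` and `torch.float32`.

```python
from typing import NamedTuple


class RuntimeConfig(NamedTuple):
    device: str  # e.g. "cuda:0" or "cpu"
    dtype: str   # name of the torch dtype to load the model with


def select_runtime(cuda_available: bool) -> RuntimeConfig:
    """Pick half precision on the GPU, full precision on the CPU (assumed fallback)."""
    if cuda_available:
        return RuntimeConfig(device="cuda:0", dtype="float16")
    return RuntimeConfig(device="cpu", dtype="float32")
```

Half precision roughly halves the memory needed for model weights compared with `float32`, which is why it is paired with the GPU path here.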

Development

Code Structure

moondream-web/
├── src/
│   ├── components/
│   │   ├── ui/               # Reusable UI components
│   │   ├── ImageUpload.tsx   # Image upload component
│   │   └── Chat.tsx          # Chat interface component
│   ├── pages/
│   │   ├── api/
│   │   │   ├── ask.ts        # Question handling endpoint
│   │   │   └── api.ts        # API utilities
│   │   ├── _app.tsx          # App configuration
│   │   ├── _document.tsx     # Document configuration
│   │   └── index.tsx         # Main page
│   └── styles/
│       └── globals.css       # Global styles
├── public/                   # Static assets
└── app.py                    # FastAPI backend

Development Workflow

Make changes to frontend or backend
Backend auto-reloads with uvicorn
Frontend hot-reloads with Next.js
Test changes in development environment

Troubleshooting

Common Issues

CUDA Memory Errors
- Reduce batch size
- Close other GPU applications
- Monitor memory usage with nvidia-smi

Connection Errors
- Verify that the FastAPI server is running
- Check that the correct ports are open (8000 for the FastAPI server)
- Ensure the frontend points at the correct address (127.0.0.1)

Image Processing Errors
- Verify that the image format is supported
- Check the image file size
- Monitor the server logs

Contributing

Fork the repository
Create feature branch
Implement changes
Add tests if applicable
Submit pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built on Moondream model
UI components from shadcn/ui
Animations by Framer Motion
Theme system by next-themes

Created with ❤️ for the AI community
