VLM UI

VLM UI is a web-based user interface for interacting with various Vision Language Models (VLMs).

It provides a convenient way to upload images, ask questions, and receive responses from the model.

Features

Web-based interface using Gradio
Support for multiple VLM models
Image upload and processing
Real-time streaming responses
Dockerised deployment

Prerequisites

Docker, with the NVIDIA Container Toolkit installed so that --gpus all works
NVIDIA GPU with CUDA support (for running models)

Quick Start

Clone the repository:

git clone --depth=1 https://github.com/sammcj/vlm-ui.git
cd vlm-ui

Build and run the Docker container:

docker build -t vlm-ui .
docker run -d --gpus all -p 7860:7860 -e MODEL_NAME=OpenGVLab/InternVL2-8B vlm-ui

Open your browser and navigate to http://localhost:7860 to access the VLM UI.
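
Large models can take a while to load, so the page may not respond immediately after docker run returns. The following is a minimal sketch of a readiness check, not part of the project itself: the function name wait_for_ui, its parameters, and the timeout values are all hypothetical, and it only assumes that the UI answers plain HTTP on the port mapped above.

```python
import time
import urllib.request


def wait_for_ui(
    url="http://localhost:7860",
    timeout_s=600.0,
    interval_s=5.0,
    opener=urllib.request.urlopen,  # injectable so the check can be tested offline
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Poll `url` until it returns HTTP 200 or `timeout_s` seconds have passed.

    Hypothetical helper: the defaults are illustrative guesses, not values
    taken from the VLM UI project.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            with opener(url, timeout=10) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers connection refused, timeouts, and urllib's URLError/HTTPError,
            # all of which mean the server is not ready yet.
            pass
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Calling wait_for_ui() after starting the container returns True once the page responds and False if the timeout expires first, at which point docker logs on the container is the next place to look.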

Configuration

You can customize the behaviour of VLM UI by setting the following environment variables:

SYSTEM_MESSAGE: The system message to use for the conversation (default: "Carefully follow the users request.")
TEMPERATURE: Controls randomness in the model's output (default: 0.3)
TOP_P: Controls diversity of the model's output (default: 0.7)
MAX_NEW_TOKENS: Maximum number of tokens to generate (default: 2048)
MAX_INPUT_TILES: Maximum number of image tiles to process (default: 12)
REPETITION_PENALTY: Penalizes repetition in the model's output (default: 1.0)
MODEL_NAME: The name of the model to use (default: OpenGVLab/InternVL2-8B)
LOAD_IN_8BIT: Whether to load the model in 8-bit precision (default: 1)

Example:

docker run -d --gpus all -p 7860:7860 \
  -e MODEL_NAME=OpenGVLab/InternVL2-8B \
  -e TEMPERATURE=0.3 \
  -e MAX_NEW_TOKENS=2048 \
  vlm-ui
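
To make the types and defaults in the list above concrete, here is a minimal sketch of how such variables could be read in Python. It is an illustration only and is not taken from the project's source: the Settings class and load_settings function are hypothetical names, and treating LOAD_IN_8BIT as enabled only when it equals "1" is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""
    system_message: str
    temperature: float
    top_p: float
    max_new_tokens: int
    max_input_tiles: int
    repetition_penalty: float
    model_name: str
    load_in_8bit: bool


def load_settings(env=None):
    """Read settings from `env` (defaults to os.environ), using the README defaults."""
    env = os.environ if env is None else env
    return Settings(
        # Default string copied verbatim from the README.
        system_message=env.get("SYSTEM_MESSAGE", "Carefully follow the users request."),
        temperature=float(env.get("TEMPERATURE", "0.3")),
        top_p=float(env.get("TOP_P", "0.7")),
        max_new_tokens=int(env.get("MAX_NEW_TOKENS", "2048")),
        max_input_tiles=int(env.get("MAX_INPUT_TILES", "12")),
        repetition_penalty=float(env.get("REPETITION_PENALTY", "1.0")),
        model_name=env.get("MODEL_NAME", "OpenGVLab/InternVL2-8B"),
        # Assumption: "1" enables 8-bit loading and any other value disables it.
        load_in_8bit=env.get("LOAD_IN_8BIT", "1") == "1",
    )
```

Passing a plain dictionary, for example load_settings({"TEMPERATURE": "0.9"}), overrides a single value and leaves the rest at the documented defaults, which mirrors what the -e flags do in the docker run example.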

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

Copyright Sam McLeod
This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

This app builds on the work of the following projects:
