**Important:** This repository has been deprecated and is only intended for launching Instill Core projects up to version `v0.12.0-beta`, where the corresponding Instill Model version in this deprecated repository is `v0.9.0-alpha`. Check out the latest Instill Core project in the instill-ai/instill-core repository.
⚗️ Instill Model, or simply Model, is an integral component of the Instill Core project. It serves as an advanced ModelOps/LLMOps platform focused on empowering users to seamlessly import, serve, fine-tune, and monitor Machine Learning (ML) models for continuous optimization.
Prerequisites

- **macOS or Linux** - Instill Model works on macOS or Linux; Windows is not supported yet.
- **Docker and Docker Compose** - Instill Model uses Docker Compose (specifically, Compose V2 and the Compose specification) to run all services locally. Please install the latest stable Docker and Docker Compose before using Instill Model.
- **yq** `> v4.x` - Please follow the installation guide.
- **(Optional) NVIDIA Container Toolkit** - To enable GPU support in Instill Model, please refer to the NVIDIA Cloud Native Documentation to install the NVIDIA Container Toolkit. If you'd like to allot specific GPUs to Instill Model, set the environment variable `NVIDIA_VISIBLE_DEVICES`. For example, `NVIDIA_VISIBLE_DEVICES=0,1` makes the `triton-server` consume GPU devices `0` and `1` specifically. By default, `NVIDIA_VISIBLE_DEVICES` is set to `all`, which uses all available GPUs on the machine.
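As a minimal sketch, assuming a machine with at least two GPUs, the environment variable can be exported before launching so that only those devices are visible to the Triton server (device IDs are machine-specific; check `nvidia-smi` for yours):

```shell
# Hypothetical example: expose only GPU devices 0 and 1 to Instill Model.
# Adjust the IDs to match your own `nvidia-smi` output.
export NVIDIA_VISIBLE_DEVICES=0,1

# Then launch all services as usual:
#   make all
```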
Note: The images of model-backend (~2GB) and Triton Inference Server (~23GB) can take a while to pull, but this should be a one-time effort at the first setup.
Use stable release version
Execute the following commands to pull pre-built images with all the dependencies to launch:
$ git clone -b v0.10.0-alpha https://github.com/instill-ai/deprecated-model.git && cd deprecated-model
# Launch all services
$ make all
🚀 That's it! Once all the services are up and healthy, the UI is ready to go at http://localhost:3000. Please find the default login credentials in the documentation.
To shut down all running services:
$ make down
Explore the documentation to discover all available deployment options.
We curate a list of ready-to-use models. These pre-trained models are from different sources and have been trained and deployed by our team. Want to contribute a new model? Please create an issue, we are happy to add it to the list 👐.
Model | Task | Sources | Framework | CPU | GPU |
---|---|---|---|---|---|
MobileNet v2 | Image Classification | GitHub-DVC | ONNX | ✅ | ✅ |
Vision Transformer (ViT) | Image Classification | Hugging Face | ONNX | ✅ | ❌ |
YOLOv4 | Object Detection | GitHub-DVC | ONNX | ✅ | ✅ |
YOLOv7 | Object Detection | GitHub-DVC | ONNX | ✅ | ✅ |
YOLOv7 W6 Pose | Keypoint Detection | GitHub-DVC | ONNX | ✅ | ✅ |
PSNet + EasyOCR | Optical Character Recognition (OCR) | GitHub-DVC | ONNX | ✅ | ✅ |
Mask RCNN | Instance Segmentation | GitHub-DVC | PyTorch | ✅ | ✅ |
Lite R-ASPP based on MobileNetV3 | Semantic Segmentation | GitHub-DVC | ONNX | ✅ | ✅ |
Stable Diffusion | Text to Image | GitHub-DVC, Local-CPU, Local-GPU | ONNX | ✅ | ✅ |
Stable Diffusion XL | Text to Image | GitHub-DVC | PyTorch | ❌ | ✅ |
Control Net - Canny | Image to Image | GitHub-DVC | PyTorch | ❌ | ✅ |
Megatron GPT2 | Text Generation | GitHub-DVC | FasterTransformer | ❌ | ✅ |
Llama2 | Text Generation | GitHub-DVC | vLLM, PyTorch | ✅ | ✅ |
Code Llama | Text Generation | GitHub-DVC | vLLM | ❌ | ✅ |
Llama2 Chat | Text Generation Chat | GitHub-DVC | vLLM | ❌ | ✅ |
MosaicML MPT | Text Generation Chat | GitHub-DVC | vLLM | ❌ | ✅ |
Mistral | Text Generation Chat | GitHub-DVC | vLLM | ❌ | ✅ |
Zephyr-7b | Text Generation Chat | GitHub-DVC | PyTorch | ✅ | ✅ |
Llava | Visual Question Answering | GitHub-DVC | PyTorch | ❌ | ✅ |
Note: The `GitHub-DVC` source in the table means importing a model into Instill Model from a GitHub repository that uses DVC to manage large files.
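For readers unfamiliar with DVC, the idea is that Git tracks only a small pointer file while the large model weights live in remote storage. A rough sketch of such a pointer file (the file name, hash, and size here are illustrative, not from any actual Instill Model repository):

```shell
# Illustrative only: a DVC pointer file that Git tracks in place of the
# large weight file itself. The md5 and path below are made-up examples.
cat > model.onnx.dvc <<'EOF'
outs:
- md5: d41d8cd98f00b204e9800998ecf8427e
  path: model.onnx
EOF

cat model.onnx.dvc
```

Running `dvc pull` in such a repository fetches the actual weight files referenced by the pointers.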
See the LICENSE file for licensing information.