This repository has been archived by the owner on Jul 17, 2024. It is now read-only.
Releases · instill-ai/model-mistral-7b-dvc
for-test
fp16-7b-vllm-a100
fp16-7b-vllm-p80-2gpu
Supports the Mistral-7B text completion task via vLLM in Triton Inference Server's Python operator, running two GPU model instances in parallel, each utilizing 80% of GPU memory.
fp16-7b-vllm-p80-1gpu
Supports the Mistral-7B text completion task via vLLM in Triton Inference Server's Python operator, running a single GPU model instance (no parallelism) utilizing 80% of GPU memory.
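Both p80 releases above wrap vLLM inside a Triton Python-backend model with its GPU memory utilization capped at 80%. Below is a minimal sketch of what such a `model.py` might look like; the tensor names (`prompt`, `text`), the Hugging Face model id, and the sampling settings are illustrative assumptions, not taken from this repository.

```python
# Sketch: Triton Python-backend model serving Mistral-7B text completion via vLLM.
import numpy as np
import triton_python_backend_utils as pb_utils
from vllm import LLM, SamplingParams


class TritonPythonModel:
    def initialize(self, args):
        # Cap vLLM's memory allocation at 80% of the GPU, matching the "p80" releases.
        self.llm = LLM(
            model="mistralai/Mistral-7B-v0.1",  # assumed model id, not from this repo
            dtype="float16",
            gpu_memory_utilization=0.8,
        )
        self.sampling_params = SamplingParams(max_tokens=256)

    def execute(self, requests):
        responses = []
        for request in requests:
            # Assumes a single BYTES "prompt" input per request.
            prompt_tensor = pb_utils.get_input_tensor_by_name(request, "prompt")
            prompt = prompt_tensor.as_numpy().flatten()[0].decode("utf-8")

            # Run completion and return the generated text as a BYTES "text" output.
            outputs = self.llm.generate([prompt], self.sampling_params)
            completion = outputs[0].outputs[0].text
            out_tensor = pb_utils.Tensor(
                "text", np.array([completion.encode("utf-8")], dtype=np.object_)
            )
            responses.append(pb_utils.InferenceResponse(output_tensors=[out_tensor]))
        return responses

    def finalize(self):
        self.llm = None
```

The difference between the -2gpu and -1gpu releases would map to the instance_group count in the model's config.pbtxt (two KIND_GPU instances versus one), assuming Triton's standard instance-group mechanism is used for the parallelism described above.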