v0.1.20

av released this 13 Sep 14:11

988f8b9

v0.1.20 - SGLang integration

SGLang is a fast serving framework for large language models and vision language models.

Starting

# [Optional] Pre-pull the image
harbor pull sglang

# Download the model with the HF CLI
harbor hf download google/gemma-2-2b-it

# Set the model to run using HF specifier
harbor sglang model google/gemma-2-2b-it

# See the original SGLang CLI help for available options
harbor run sglang --help

# Set extra arguments via "harbor sglang args"
harbor sglang args --context-length 2048 --disable-cuda-graph
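
The steps above can be chained into one script. The following is a minimal sketch, not part of Harbor itself: `MODEL`, `CONTEXT_LENGTH`, `DRY_RUN`, `run`, and `setup_sglang` are hypothetical names introduced here for illustration, and only the `harbor` commands already shown above are used. It defaults to a dry run that prints each command rather than executing it.

```shell
#!/usr/bin/env bash
# Sketch: run the SGLang setup steps from this release note in order.
# All variable and function names below are illustrative assumptions.
set -euo pipefail

MODEL="${MODEL:-google/gemma-2-2b-it}"     # HF specifier from the example above
CONTEXT_LENGTH="${CONTEXT_LENGTH:-2048}"   # value from the example above
DRY_RUN="${DRY_RUN:-1}"                    # 1 = only print commands, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

setup_sglang() {
  run harbor pull sglang                   # optional pre-pull of the image
  run harbor hf download "$MODEL"          # download the model with the HF CLI
  run harbor sglang model "$MODEL"         # select the model to serve
  run harbor sglang args --context-length "$CONTEXT_LENGTH" --disable-cuda-graph
}

setup_sglang
```

Setting `DRY_RUN=0` executes the commands for real, which assumes `harbor` is installed and on `PATH`. Keeping the dry run as the default makes it easy to review the exact commands before they touch the local configuration.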

Full Changelog: v0.1.19...v0.1.20
