update: Improve Model Manager Configuration and CI Integration #830
Signed-off-by: JaredforReal <w13431838023@gmail.com>
Pull request overview
This PR reorganizes the Model Manager configuration structure and enhances CI integration by moving configuration files to a dedicated directory, adding comprehensive documentation, and unifying dependency management across all workflows.
- Moved model configuration files from config/ to config/model_manager/ for better organization
- Added comprehensive README.md with usage examples, API reference, and development guidelines
- Updated CI workflows to use centralized dependency management via requirements.txt and added HF_TOKEN support for gated models
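As a rough illustration of what HF_TOKEN support for gated models can look like on the Python side, here is a minimal sketch. The helper names `resolve_hf_token` and `download_model` are hypothetical and are not taken from this PR; only `huggingface_hub.snapshot_download` and its `token` parameter are real. Note that `huggingface_hub` already reads the `HF_TOKEN` environment variable on its own, so passing it explicitly is mainly for clarity.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the Hugging Face token from the environment, or None if unset or blank."""
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def download_model(repo_id: str, local_dir: str) -> str:
    """Download a model snapshot, passing HF_TOKEN when one is configured.

    Hypothetical helper: the Model Manager's real download code may differ.
    """
    # Imported lazily so the token helper stays usable without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        local_dir=local_dir,
        # None lets huggingface_hub fall back to its own default credential lookup.
        token=resolve_hf_token(),
    )
```

In CI, the workflow would expose the secret as the `HF_TOKEN` environment variable before the download step runs; without it, a gated repository such as embeddinggemma-300m would be expected to fail with an authorization error.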
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tools/make/models.mk | Updated comments and commands to reference new config paths in config/model_manager/ |
| src/model_manager/cli.py | Updated default config paths to point to new config/model_manager/ directory |
| src/model_manager/tests/test_cli.py | Updated test assertions to expect new config file paths |
| src/model_manager/README.md | Added comprehensive documentation including quick start guide, API reference, and development instructions |
| config/model_manager/models.yaml | Updated usage comments to reflect new config path |
| config/model_manager/models.minimal.yaml | Updated usage comments and clarified note about gated models being in the full set |
| config/model_manager/models.lora.yaml | Updated usage comments to reflect new config path |
| .github/workflows/test-and-build.yml | Replaced manual huggingface_hub[cli] installation with requirements.txt and added HF_TOKEN environment variable |
| .github/workflows/performance-test.yml | Replaced manual huggingface_hub[cli] installation with requirements.txt and added HF_TOKEN environment variable |
| .github/workflows/performance-nightly.yml | Replaced manual huggingface_hub[cli] installation with requirements.txt, added HF_TOKEN, and changed CI_MINIMAL_MODELS to false for full model set |
| .github/workflows/integration-test-docker.yml | Replaced manual huggingface_hub[cli] installation with requirements.txt and added HF_TOKEN environment variable |
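To make the path change and the CI_MINIMAL_MODELS switch from the table concrete, here is a small sketch of how a default config path could be chosen. This is an assumption for illustration only: the function name `default_config_path`, the accepted truthy values, and the idea that the selection happens in Python (rather than in the Makefile or workflow) are not confirmed by the PR. Only the file names and the config/model_manager/ directory come from the summary above.

```python
import os
from pathlib import Path
from typing import Mapping

# New location of the model configs, as described in the PR summary.
CONFIG_DIR = Path("config/model_manager")


def default_config_path(env: Mapping[str, str] = os.environ) -> Path:
    """Pick the minimal or full model config based on CI_MINIMAL_MODELS (hypothetical logic)."""
    flag = env.get("CI_MINIMAL_MODELS", "false").strip().lower()
    minimal = flag in {"1", "true", "yes"}
    name = "models.minimal.yaml" if minimal else "models.yaml"
    return CONFIG_DIR / name
```

Under this sketch, setting CI_MINIMAL_MODELS to false (as the nightly performance workflow now does) selects the full models.yaml set, which is where the gated models live.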
samzong left a comment:
👍👍 A few minor suggestions about README.md
Signed-off-by: JaredforReal <w13431838023@gmail.com>
Signed-off-by: Jared <w13431838023@gmail.com>
@samzong PTAL, Thanks~!
samzong left a comment:
Thanks for making this better.
LGTM~

Summary
This PR enhances the Model Manager by reorganizing configuration files and improving CI workflow integration.
Changes
1. Reorganize Model Manager Configs
- models.lora.yaml → models.lora.yaml
2. Add Comprehensive Documentation
3. Improve CI Integration
- Replace pip install huggingface_hub[cli] with pip install -r src/model_manager/requirements.txt
- Add HF_TOKEN support for gated models (e.g., embeddinggemma-300m)
Benefits