[Misc] Deprecate semantic_cache.backend_config_path and embed backend config inline #1100

Merged
Xunzhuo merged 10 commits into vllm-project:main from Scanf-s:fix/update-cache-configuration
Jan 19, 2026

@Scanf-s
Contributor

@Scanf-s Scanf-s commented Jan 17, 2026

Overview

Fixes #1022.
This pull request adds support for configuring the Redis and Milvus cache backends inline, in a single configuration file.
As suggested, the previous file-based cache backend configuration (backend_config_path) is now marked as deprecated.
If both are provided, the inline configuration takes priority.

Solution

  • Moved MilvusConfig and RedisConfig into config.go to resolve circular import issues.
  • Updated the cache options and cache factory logic to prioritize the inline config over the file path.
  • Marked backend_config_path as (Deprecated) in the relevant code sections.
  • Updated HybridCache to match the current MilvusCache implementation.
  • Added tests in cache_test.go and config_test.go to validate the changes.
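
For readers skimming the thread, the priority rule described above (inline config first, deprecated file path as a fallback) can be sketched roughly as follows. This is only an illustration, not the code merged in this PR: the struct fields, the resolveRedisConfig helper, and the loadFile callback are hypothetical names chosen for the example, and the real RedisConfig has more fields.

```go
package main

import (
	"errors"
	"fmt"
)

// RedisConfig is a hypothetical, trimmed-down stand-in for the real struct.
type RedisConfig struct {
	Address string
}

// SemanticCache holds only the two settings relevant here (field names assumed).
type SemanticCache struct {
	BackendConfigPath string       // deprecated file-based configuration
	Redis             *RedisConfig // inline configuration
}

// resolveRedisConfig returns the inline config when it is present. Otherwise it
// falls back to the deprecated file path, loaded through loadFile. The second
// return value reports which source was used.
func resolveRedisConfig(
	c SemanticCache,
	loadFile func(path string) (*RedisConfig, error),
) (*RedisConfig, string, error) {
	if c.Redis != nil {
		return c.Redis, "inline", nil
	}
	if c.BackendConfigPath != "" {
		cfg, err := loadFile(c.BackendConfigPath)
		if err != nil {
			return nil, "", fmt.Errorf("load %s: %w", c.BackendConfigPath, err)
		}
		return cfg, "file", nil
	}
	return nil, "", errors.New("no redis configuration provided")
}

func main() {
	// Stub loader standing in for reading and parsing the deprecated file.
	loadFile := func(path string) (*RedisConfig, error) {
		return &RedisConfig{Address: "from-file:6379"}, nil
	}

	// Both sources are set, so the inline config should be chosen.
	both := SemanticCache{
		BackendConfigPath: "redis.yaml",
		Redis:             &RedisConfig{Address: "inline:6379"},
	}
	cfg, source, err := resolveRedisConfig(both, loadFile)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(source, cfg.Address)
}
```

Passing the loader in as a function keeps the sketch free of file I/O and makes the fallback path easy to exercise in a unit test.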

Tests

  • Unit tests pass for inline Milvus configuration
  • Unit tests pass for inline Redis configuration
  • Deprecated file-based configuration still works
  • Config parsing validates all nested fields correctly

  • Make sure the code changes pass the pre-commit checks.
  • Sign off your commits by using -s when running git commit.
  • Try to classify PRs for easy understanding of the type of changes, such as [Bugfix], [Feat], and [CI].

@netlify

netlify Bot commented Jan 17, 2026

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit e458004
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/696ce027e09b4600085c1bc0
😎 Deploy Preview https://deploy-preview-1100--vllm-semantic-router.netlify.app

@Scanf-s Scanf-s force-pushed the fix/update-cache-configuration branch from d213e2c to 11affa5 Compare January 17, 2026 09:11
Scanf-s and others added 9 commits January 18, 2026 22:27
…cheOption initializer to handle the new configuration approach

Signed-off-by: Scanf-s <sullung2yo@gmail.com>
…s not given

Signed-off-by: Scanf-s <sullung2yo@gmail.com>
Signed-off-by: Scanf-s <sullung2yo@gmail.com>
Signed-off-by: Scanf-s <sullung2yo@gmail.com>
Signed-off-by: Scanf-s <sullung2yo@gmail.com>
Signed-off-by: Scanf-s <sullung2yo@gmail.com>
…ject#1089)

Add pluggable model selection algorithms for intelligent routing:
- Elo rating system with Bradley-Terry model for preference-based selection
- RouterDC for query-to-model embedding matching
- AutoMix for POMDP-based cost-quality optimization
- Hybrid selector combining multiple methods with configurable weights
- Static selector for backwards compatibility

Integration:
- OpenAIRouter initializes selection registry on startup
- req_filter_classification uses configured selector instead of hardcoded first model
- Prometheus metrics for selection tracking

Signed-off-by: asaadbalum <asaad.balum@gmail.com>
Signed-off-by: Scanf-s <sullung2yo@gmail.com>
Signed-off-by: Scanf-s <sullung2yo@gmail.com>
Signed-off-by: Scanf-s <sullung2yo@gmail.com>
@Scanf-s Scanf-s force-pushed the fix/update-cache-configuration branch from d48833a to 7e31f04 Compare January 18, 2026 13:27
@Scanf-s Scanf-s marked this pull request as ready for review January 18, 2026 13:29
@github-actions
Contributor

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 config

Owners: @rootfs, @Xunzhuo
Files changed:

  • config/semantic-cache/config.hybrid.yaml
  • config/semantic-cache/config.redis.yaml

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/cache/cache_factory.go
  • src/semantic-router/pkg/cache/cache_interface.go
  • src/semantic-router/pkg/cache/cache_test.go
  • src/semantic-router/pkg/cache/hybrid_cache.go
  • src/semantic-router/pkg/cache/hybrid_cache_stub.go
  • src/semantic-router/pkg/cache/milvus_cache.go
  • src/semantic-router/pkg/cache/redis_cache.go
  • src/semantic-router/pkg/config/config.go
  • src/semantic-router/pkg/config/config_test.go
  • src/semantic-router/pkg/extproc/router.go

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

Member

@Xunzhuo Xunzhuo left a comment

Looks good, thanks! As a follow-up, would you like to double-check that all uses of backend_config_path have been removed?

@Xunzhuo Xunzhuo merged commit 5ad3063 into vllm-project:main Jan 19, 2026
40 checks passed
@Scanf-s
Contributor Author

Scanf-s commented Jan 19, 2026

@Xunzhuo Sure! I can do that for the follow-up task.

henschwartz pushed a commit to henschwartz/semantic-router that referenced this pull request Jan 21, 2026
… config inline (vllm-project#1100)

* fix: Refactor Redis and Milvus Cache Config into config.go, Update CacheOption initializer to handle the new configuration approach

Signed-off-by: Scanf-s <sullung2yo@gmail.com>

* fix: Add fallback logic when proper redis or milvus configuration does not given

Signed-off-by: Scanf-s <sullung2yo@gmail.com>

* docs: Add sample inline redis configuration example

Signed-off-by: Scanf-s <sullung2yo@gmail.com>

* docs: Update cache configuration examples

Signed-off-by: Scanf-s <sullung2yo@gmail.com>

* fix: Update HybridCache Milvus configuration

Signed-off-by: Scanf-s <sullung2yo@gmail.com>

* chore: Apply code linter

Signed-off-by: Scanf-s <sullung2yo@gmail.com>

* Feat(selection): implement advanced model selection methods (vllm-project#1089)

Add pluggable model selection algorithms for intelligent routing:
- Elo rating system with Bradley-Terry model for preference-based selection
- RouterDC for query-to-model embedding matching
- AutoMix for POMDP-based cost-quality optimization
- Hybrid selector combining multiple methods with configurable weights
- Static selector for backwards compatibility

Integration:
- OpenAIRouter initializes selection registry on startup
- req_filter_classification uses configured selector instead of hardcoded first model
- Prometheus metrics for selection tracking

Signed-off-by: asaadbalum <asaad.balum@gmail.com>
Signed-off-by: Scanf-s <sullung2yo@gmail.com>

* feat: Add inline cache configuration unit tests

Signed-off-by: Scanf-s <sullung2yo@gmail.com>

* feat: Add cache unit tests

Signed-off-by: Scanf-s <sullung2yo@gmail.com>

---------

Signed-off-by: Scanf-s <sullung2yo@gmail.com>
Signed-off-by: asaadbalum <asaad.balum@gmail.com>
Co-authored-by: asaadbalum <154635253+asaadbalum@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

[v0.2-Athena]: Deprecate semantic_cache.backend_config_path and embed backend config inline

5 participants