feat: Priority-based scheduling optimization (including default priority, preemption toggle, priority-based metrics, etc.) #17026
Merged: hnyls2002 merged 51 commits into sgl-project:main from zhuxinjie-nz:priority_metrics_dev on Mar 4, 2026
7d66ba9
feat: Add metrics based on priority-based scheduling.
8676199
[fix] Incorrect parameter naming
999d4df
fix lint
huangtingwei9988 e1ef8c3
Merge branch 'main' into priority_metrics_dev
huangtingwei9988 bfad739
Merge branch 'main' into priority_metrics_dev
huangtingwei9988 a582950
[fix] Logic issue in the num_running_reqs_by_priority tracking record.
f9b834a
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
53785e5
Merge branch 'main' into priority_metrics_dev
huangtingwei9988 bfe854a
[fix] Optimize priority scheduling logic.
311721f
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
934758c
[fix] Optimize priority scheduling logic.
e5dc82b
[fix] Code formatting issues
38c8494
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
6922329
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
16eb3cc
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
a5f2c62
Merge branch 'main' into priority_metrics_dev
JustinTong0323 683b39b
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
347ade2
[fix] Optimize the logic of the log_prefill_stats interface
046b3b8
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
168a589
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
eb75cca
Merge branch 'main' into priority_metrics_dev
zhuxinjie-nz 878ddaa
[PD-Disagg] Support query dp rank from bootstrap server. (#19168)
hnyls2002 5e1202e
[CI] fix the teardown output of disaggregation test (#19193)
hnyls2002 c9a5157
add new ci user (#19133)
narutolhy 96516d5
[CI] Tiny enhance the dp attention load blance benchmark (#19194)
hnyls2002 2fada5b
[PD-Disagg] Unify prefill info data transition flow, all with `Prefil…
hnyls2002 3c5de3f
fix: patch docker image fixes (#19100)
dougyster 55c7122
Whisper model support & `/v1/audio/transcriptions` endpoint & benchma…
JustinTong0323 95eb0c7
fix: add missing blank line after docstring in serving_transcription.…
Kangyan-Zhou 1191c05
[PD-Disagg] Deduplicate common KVManager methods into CommonKVManager…
hnyls2002 c30d115
fix(docker): migrate ROCm Dockerfiles from setuptools-rust to maturin…
slin1237 11dc631
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
6078c7c
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
06813e7
[fix] Optimize the preempt logic
5d8dd95
[fix] Optimize the preemption logic
b5c0a8d
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
bf7cdf5
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
cc90291
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
1f8cebe
Merge remote-tracking branch 'origin/main' into priority_metrics_dev
6f3e93d
fix wrong server args naming
hnyls2002 81ccd78
rename confusing name
hnyls2002 8491531
fix wrong comments
hnyls2002 56e9b44
simplify the code
hnyls2002 0724cbe
use QueueCount
hnyls2002 cdbaf59
fix import error
hnyls2002 f9d66ad
add comments
hnyls2002 b14152c
fix dict copy & label type
hnyls2002 75251d9
Add unit test for priority scheduling metrics
hnyls2002 e7257b7
fix future
hnyls2002 6f957cb
fix
hnyls2002 4504bcf
tiny fix
hnyls2002
Would it be possible to add one or two helper functions that calculate per-priority request counts, and use them in both scheduler.py and scheduler_metrics_mixin.py to consolidate the logic?
Perhaps one that a) takes the list of requests and returns an xxx_reqs_by_priority dictionary, or b) additionally takes the dictionary and updates it in place.
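A minimal sketch of what such a helper might look like, covering both variants in one function. The names `Req`, `count_reqs_by_priority`, `default_priority`, and `out` are hypothetical and not taken from the PR. The sketch assumes each request exposes an optional integer `priority` attribute and that the in-place variant accumulates into the existing dictionary rather than resetting it.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Req:
    """Hypothetical stand-in for the scheduler's request object."""

    rid: str
    priority: Optional[int] = None


def count_reqs_by_priority(
    reqs: Iterable[Req],
    default_priority: int = 0,
    out: Optional[Dict[int, int]] = None,
) -> Dict[int, int]:
    """Count requests per priority level.

    Variant (a): call without `out` to get a fresh dictionary.
    Variant (b): pass an existing dictionary as `out` to update it in place;
    counts are added on top of whatever `out` already holds (an assumption).
    Requests with no priority are counted under `default_priority`.
    """
    counts = out if out is not None else {}
    for req in reqs:
        priority = req.priority if req.priority is not None else default_priority
        counts[priority] = counts.get(priority, 0) + 1
    return counts
```

Both call sites could then share the same logic, for example `num_running_reqs_by_priority = count_reqs_by_priority(running_reqs)`, where `running_reqs` is whatever list of requests the caller already has.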
Thank you very much — I’ll work on fixing these issues.