Add vLLM-based runtime statistics for subblock latency measurement #1358

Merged
kevalmorabia97 merged 43 commits into main from gkarch/runtime_opt
Jun 8, 2026

Merge branch 'main' into gkarch/runtime_opt

105c736
Codecov / codecov/project succeeded Jun 8, 2026 in 0s

76.74% (-0.77%) compared to 01415c2

Codecov Report

❌ Patch coverage is 29.11392% with 168 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.74%. Comparing base (01415c2) to head (105c736).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| modelopt/torch/puzzletron/utils/vllm_adapter.py | 10.58% | 76 Missing ⚠️ |
| ...ch/puzzletron/subblock_stats/calc_runtime_stats.py | 29.33% | 53 Missing ⚠️ |
| ...pt/torch/puzzletron/subblock_stats/runtime_vllm.py | 25.92% | 20 Missing ⚠️ |
| ...t/torch/puzzletron/subblock_stats/runtime_utils.py | 62.85% | 13 Missing ⚠️ |
| ...h/puzzletron/subblock_stats/calc_subblock_stats.py | 70.00% | 3 Missing ⚠️ |
| .../subblock_stats/calc_subblock_params_and_memory.py | 33.33% | 2 Missing ⚠️ |
| modelopt/torch/puzzletron/mip/run_puzzle.py | 50.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1358      +/-   ##
==========================================
- Coverage   77.51%   76.74%   -0.77%     
==========================================
  Files         489      493       +4     
  Lines       54498    54687     +189     
==========================================
- Hits        42242    41971     -271     
- Misses      12256    12716     +460     
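
The figures in the coverage diff can be cross-checked with a few lines of arithmetic. The sketch below is not part of the PR or of Codecov; `pct_floor` is a hypothetical helper, and it assumes percentages are truncated (rounded down) to two decimals rather than rounded to nearest, which is what the reported 76.74% implies (41971 / 54687 is about 76.7477%).

```python
# Hypothetical helper, not a Codecov API: coverage percentage truncated
# (rounded down) to `places` decimals, using integer arithmetic.
def pct_floor(hits: int, lines: int, places: int = 2) -> float:
    scale = 10 ** places
    return (hits * 100 * scale // lines) / scale


# Numbers copied from the coverage diff above.
base_hits, base_misses, base_lines = 42242, 12256, 54498
head_hits, head_misses, head_lines = 41971, 12716, 54687

# Hits and misses should account for every tracked line.
assert base_hits + base_misses == base_lines
assert head_hits + head_misses == head_lines

base_cov = pct_floor(base_hits, base_lines)
head_cov = pct_floor(head_hits, head_lines)
delta = round(head_cov - base_cov, 2)

# Per-file "Missing" counts from the table of files with missing lines.
missing_per_file = [76, 53, 20, 13, 3, 2, 1]
total_missing = sum(missing_per_file)

print(base_cov, head_cov, delta, total_missing)
```

Under the truncation assumption, the computed base coverage, head coverage, and their difference agree with the reported 77.51%, 76.74%, and -0.77%, and the per-file missing counts add up to the 168 uncovered lines cited in the patch summary.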
