
Introduce De-dup/Similarity-Check in CI Workflow for PR/Issue#39695

Open
panpan0000 wants to merge 7 commits into vllm-project:main from panpan0000:de-dup-ci

Conversation

Contributor

@panpan0000 commented Apr 13, 2026

Co-Author: Trae + GPT5.3-Codex

Purpose

Example to explain #39694

Example Algorithm:

  • Scoring: 0.75 * text_similarity + 0.25 * file_overlap.
  • Report threshold: 0.75.
  • Uses the GitHub Actions CI cache to temporarily store GitHub API results for the most recent 1000 PRs, 500 issues, etc.
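The weighted formula above can be sketched in a few lines. This is a minimal illustration, not the PR's actual implementation: the helper choices (`difflib.SequenceMatcher` for text similarity, Jaccard overlap for changed-file sets) are assumptions.

```python
# Sketch of the scoring formula: 0.75 * text_similarity + 0.25 * file_overlap.
# Helper implementations are illustrative assumptions, not the workflow's code.
from difflib import SequenceMatcher

WEIGHT_TEXT = 0.75
WEIGHT_FILES = 0.25
REPORT_THRESHOLD = 0.75

def text_similarity(a: str, b: str) -> float:
    # Ratio of matching characters between two PR bodies, in [0.0, 1.0].
    return SequenceMatcher(None, a, b).ratio()

def file_overlap(files_a: set, files_b: set) -> float:
    # Jaccard overlap of the two changed-file sets, in [0.0, 1.0].
    if not files_a and not files_b:
        return 0.0
    return len(files_a & files_b) / len(files_a | files_b)

def similarity_score(body_a, body_b, files_a, files_b) -> float:
    return (WEIGHT_TEXT * text_similarity(body_a, body_b)
            + WEIGHT_FILES * file_overlap(files_a, files_b))

# Identical bodies with identical file sets score 1.0, above the 0.75 threshold.
score = similarity_score("fix KV cache bug", "fix KV cache bug",
                         {"vllm/core.py"}, {"vllm/core.py"})
```

A pair is reported only when `similarity_score(...) >= REPORT_THRESHOLD`.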

Test Plan

Using the 1000 most recent PRs to test the similarity check:

High-similarity pairs (>= 0.75): 26

Test Result

PR Similarity

  • Repo: vllm-project/vllm
  • PR count: 1000
  • Candidate pairs: 17375
  • High-similarity pairs (>= 0.75): 26
| Score | Text | Files | PR A | PR B |
|---|---|---|---|---|
| 100% | 100% | 100% | #39553 Okakarpa shadow clone | #39577 Okakarpa shadow clone |
| 99% | 99% | 100% | #37929 [Core] Use standalone autograd_cache_key for compilation dedup optimization | #39517 [Core] Use standalone autograd_cache_key for compilation dedup optimization |
| 96% | 95% | 100% | #37947 [DRAFT][XPU] Upgrade torch 2.11 for xpu | #39257 [XPU] update triton version for torch 2.11 upgrade |
| 96% | 95% | 100% | #37947 [DRAFT][XPU] Upgrade torch 2.11 for xpu | #39313 [XPU] upgrade to triton-xpu 3.7.0 |
| 95% | 97% | 88% | #38249 [Misc] Organize NixlConnector into own directory | #39354 [KVConnector][NIXL] Organize NIXL connector into its own directory |
| 95% | 93% | 100% | #39410 [XPU] Disable fusion passes on XPU Platform | #39671 use spawn multiproc method on xpu |
| 94% | 92% | 100% | #38856 [LMCache] vLLM Block Allocation Event | #39719 fix(lmcache): correct store for cached requests while enable prefix cache |
| 94% | 91% | 100% | #39606 Pass extra_config to the constructor of LMCacheMPXXXAdapter | #39719 fix(lmcache): correct store for cached requests while enable prefix cache |
| 94% | 91% | 100% | #39257 [XPU] update triton version for torch 2.11 upgrade | #39313 [XPU] upgrade to triton-xpu 3.7.0 |
| 91% | 100% | 67% | #39432 Gfx1250 wip | #39437 Gfx1250 wip rebase test |
| 90% | 92% | 85% | #36823 [vLLM IR] 3/N fused_add_rms_norm and maybe_inplace | #38775 [vLLM IR] 4/N Compile native implementation |
| 90% | 86% | 100% | #39402 [kv_offload+HMA[10/N]: Support load with multiple KV groups | #39403 [kv_offload+HMA][11/N]: Support store with multiple KV groups |
| 86% | 98% | 50% | #23995 Feature/deepseek v31 lora support | #39661 [DOC] Update Gemma 4 |
| 82% | 76% | 100% | #39110 [Core] Disable HMA for eagle/MTP with sliding window models | #39376 [Core] Disable HMA for eagle/MTP with sliding window models |
| 82% | 76% | 100% | #39401 [kv_offload+HMA][9/N]: Support lookup with multiple KV groups | #39402 [kv_offload+HMA[10/N]: Support load with multiple KV groups |
| 82% | 76% | 100% | #39401 [kv_offload+HMA][9/N]: Support lookup with multiple KV groups | #39403 [kv_offload+HMA][11/N]: Support store with multiple KV groups |
| 80% | 96% | 33% | #26583 add log for request trace | #39646 V0.12.0 support n sampling delay split to eliminate redundant prefill computation and memory |
| 79% | 97% | 22% | #35721 [LoRA] Support dual CUDA streams-Linear Layer | #37297 [LoRA] Support FP8 LoRA E2E inference-dense model |
| 79% | 94% | 32% | #39153 [Frontend][4/n] Improve pooling entrypoints pooling. | |
| 79% | 74% | 91% | #38775 [vLLM IR] 4/N Compile native implementation | #39453 Port activations to IR op 1/3 |
| 79% | 88% | 50% | #39312 [Mergify] Update model vendor auto-label rules | #39429 [CI/Build] Update auto-rebase rule |
| 78% | 100% | 13% | #39723 [SimpleCPUOffloadConnector]: Add support for reset_cache() | #39726 [SimpleCPUOffloadConnector]: Add support for reset_cache() |
| 77% | 98% | 14% | #38780 [vLLM IR][RMSNorm] Port GemmaRMSNorm to vLLM IR Ops | #38798 [vLLM IR][RMSNorm] Port RMSNormGated to vLLM IR Ops |
| 77% | 69% | 100% | #39744 [v1] Expose num_prompt_tokens in CommonAttentionMetadata | #39745 [v1] Expose num_prompt_tokens in CommonAttentionMetadata |
| 77% | 81% | 62% | #23133 Split compressed_tensors_moe.py into separate wna16, int8, fp8, nvfp4 | #29427 [Refactor] Split up compressed_tensors_moe.py into separate files per method |
| 76% | 82% | 59% | #39267 [vllm IR] 1/N Port FP8 Quantization to vLLM IR Ops | #39481 [vllm IR] Port FP8 Quantization to vLLM IR Ops |
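One detail worth noting in the stats above: 1000 PRs would yield C(1000, 2) = 499,500 exhaustive pairs, yet only 17,375 candidate pairs were scored, implying some cheap pre-filter prunes the pair set before the expensive similarity computation. The token-blocking criterion below is an illustrative assumption, not the workflow's actual filter.

```python
# Candidate-pair generation via blocking: only PRs sharing at least one
# title token are paired, so most of the ~500k exhaustive pairs are skipped.
# The blocking key (lowercase title tokens) is a hypothetical choice.
from collections import defaultdict
from itertools import combinations
from math import comb

def candidate_pairs(prs: dict) -> set:
    """prs maps PR number -> title; returns the set of (a, b) pairs, a < b."""
    buckets = defaultdict(list)
    for number, title in prs.items():
        for token in set(title.lower().split()):
            buckets[token].append(number)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs

prs = {
    1: "[XPU] upgrade triton",
    2: "[XPU] fix build",
    3: "docs typo",
}
# PRs 1 and 2 share the "[xpu]" token; PR 3 shares no token with either.
pairs = candidate_pairs(prs)
```

Any blocking key with high recall would work here; the point is that only pairs surviving the filter go through the full weighted scoring.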

Similar Issues:

  • Repo: vllm-project/vllm
  • Issue count: 500
  • Candidate pairs: 9909
  • High-similarity pairs (>= 0.75): 12
| Score | Desc Similarity | Title Overlap | Issue A | Issue B |
|---|---|---|---|---|
| 100% | 100% | 100% | #39270 [Bug]: Qwen3.5 crashes when using suffix-decoding | #39271 [Bug]: Qwen3.5 crashes when using suffix-decoding |
| 100% | 100% | 100% | #39372 [Bug]: | #39373 [Bug]: |
| 100% | 100% | 100% | #39372 [Bug]: | #39374 [Bug]: |
| 100% | 100% | 100% | #39373 [Bug]: | #39374 [Bug]: |
| 100% | 100% | 100% | #39433 RFC: Add logit_scale to PoolerConfig for Affine Score Calibration (Platt Scaling) | #39434 [RFC]: Add logit_scale to PoolerConfig for Affine Score Calibration (Platt Scaling) |
| 100% | 100% | 100% | #39299 [Performance] DSV3.2 Indexer: Overlap indexer k+w path | |
| 81% | 95% | 25% | #31888 [Usage]: rollout slow | #38642 [Usage]: model return value reasoning_content |
| 80% | 88% | 50% | #38734 [Transformers v5] SarvamMLAForCausalLM | #38740 [Transformers v5] NemotronParseForConditionalGeneration |
| 79% | 94% | 20% | #29245 [Usage]: qwen3 vl starts extremely slowly while sglang starts quickly; what could be the cause? | #38642 [Usage]: model return value reasoning_content |
| 77% | 92% | 17% | #29245 [Usage]: qwen3 vl starts extremely slowly while sglang starts quickly; what could be the cause? | #31888 [Usage]: rollout slow |
| 77% | 89% | 29% | #38384 [Transformers v5] Distributed shutdown test timeout | #38740 [Transformers v5] NemotronParseForConditionalGeneration |
| 76% | 88% | 31% | #31661 [Bug]: jina-reranker-m0 [image_index] IndexError: list index out of range | #32151 [Bug]: jina-reranker-m0 infer error |

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@gemini-code-assist
Contributor

Note

Gemini is unable to generate a review for this pull request due to the file types involved not being currently supported.

Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
refine

Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
@panpan0000 panpan0000 changed the title Introduce De-dup CI Workflow for PR/Issue Introduce De-dup/Similarity-Check in CI Workflow for PR/Issue Apr 14, 2026
@panpan0000
Contributor Author

Tested in real CI on my fork with multiple cases.

Updated comment with 1 similar PR.
Stats: api_requests=3 file_cache_hits=33 file_cache_misses=0 file_cache_writes=0
  • After the PR content changes, the detection result updates accordingly: the check is re-triggered and the same comment is updated (instead of adding a new one).

Overall, duplicate detection and cache behavior work as expected.
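The update-in-place behavior described above is commonly done by tagging the bot comment with a hidden marker and deciding between "create" and "update" from the existing comment list. A minimal sketch, where the marker string and the comment dict shape are assumptions, not the workflow's actual code:

```python
# Decide whether the similarity report should create a new PR comment or
# update the existing marked one. MARKER is a hypothetical hidden HTML
# comment used to find the bot's previous report.
MARKER = "<!-- dedup-similarity-report -->"

def plan_comment(existing_comments: list, report: str) -> dict:
    """Return an action dict: update the marked comment if one exists,
    otherwise create a new comment carrying the marker."""
    body = f"{MARKER}\n{report}"
    for comment in existing_comments:
        if MARKER in comment.get("body", ""):
            return {"action": "update", "comment_id": comment["id"], "body": body}
    return {"action": "create", "body": body}

# First run: no prior report, so a new comment is created.
first = plan_comment([], "1 similar PR found")
# Later run: the marked comment (id 42) is found and updated in place.
second = plan_comment(
    [{"id": 42, "body": f"{MARKER}\nold report"}],
    "updated report",
)
```

Keeping this decision as a pure function makes it easy to unit-test without any GitHub API calls; the caller would then issue the corresponding create or edit request.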

@panpan0000 panpan0000 marked this pull request as ready for review April 16, 2026 10:39
@mergify
Contributor

mergify bot commented Apr 16, 2026

Hi @panpan0000, the pre-commit checks have failed. Please run:

uv pip install "pre-commit>=4.5.1"
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy failing?
mypy is run differently in CI. If the failure is related to this check, please use the following command to run it locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
