feature(normalize gen_kwargs; add truncation_side to vllm)!#3509

Merged
baberabb merged 23 commits into main from vllm_do_sample
Jan 27, 2026

Contributor

@baberabb baberabb commented Jan 21, 2026

closes #3505

  • Added a utility that normalizes gen_kwargs, so that do_sample and temperature are interpreted consistently across models.
  • Added a truncation_side=right|middle|left arg to vllm, so that truncation is no longer limited to the left side. I will look into adding this to huggingface as well, but it is slightly non-trivial: HF requires sampling params per batch, while vllm can use different params per sample.
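To make the three truncation_side values concrete, here is a minimal sketch over a list of token ids. It is not the code from this PR: the helper name `truncate_tokens` is hypothetical, and the semantics are assumptions. It assumes that `left` drops tokens from the start (the Hugging Face tokenizer convention), `right` drops them from the end, and `middle` keeps the head and tail while dropping the centre, with the extra token going to the tail when the budget is odd.

```python
from typing import List


def truncate_tokens(tokens: List[int], max_len: int, side: str = "left") -> List[int]:
    """Hypothetical helper (not the PR's implementation): trim tokens to max_len."""
    if max_len < 0:
        raise ValueError("max_len must be non-negative")
    if len(tokens) <= max_len:
        return list(tokens)  # already fits, nothing to drop
    if max_len == 0:
        return []  # handled separately because tokens[-0:] would return everything
    if side == "left":
        # Assumed semantics: drop from the start, keep the most recent tokens.
        return tokens[-max_len:]
    if side == "right":
        # Assumed semantics: drop from the end, keep the beginning.
        return tokens[:max_len]
    if side == "middle":
        # Assumed split: keep a head and a tail, drop the centre.
        head = max_len // 2
        tail = max_len - head  # tail gets the extra token when max_len is odd
        return tokens[:head] + tokens[len(tokens) - tail:]
    raise ValueError(f"unknown truncation side: {side!r}")


ids = list(range(10))
print(truncate_tokens(ids, 4, "left"))    # [6, 7, 8, 9]
print(truncate_tokens(ids, 4, "right"))   # [0, 1, 2, 3]
print(truncate_tokens(ids, 4, "middle"))  # [0, 1, 8, 9]
```

Because the function works on a single sequence, each sample in a batch could in principle use a different side, which matches the per-sample flexibility noted for vllm above.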

Closes #3505, where sampling params were inconsistent between HF and vllm when a task config specifies both do_sample: false and a non-zero temperature. The behavior is now normalized to:

| Config | Result |
| --- | --- |
| Nothing specified | Greedy (temp=0.0) |
| temperature: 0.8 (no do_sample) | Sampling (temp=0.8) |
| do_sample: false | Greedy (temp=0.0) |
| do_sample: false, temperature: 0.8 | Greedy (temp forced to 0.0) |
| do_sample: true, temperature: 0.8 | Sampling (temp=0.8) |
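The rules in the table above can be sketched as a small function. This is an illustration only, not the utility added in this PR: the name `normalize_gen_kwargs` and the returned dict shape are assumptions. Cases the table does not list (for example do_sample: true with no temperature) are passed through unchanged here, which is also an assumption.

```python
from typing import Any, Dict, Optional


def normalize_gen_kwargs(gen_kwargs: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Hypothetical sketch: reconcile do_sample and temperature per the table."""
    kwargs = dict(gen_kwargs or {})  # copy so the caller's config is not mutated
    do_sample = kwargs.get("do_sample")
    temperature = kwargs.get("temperature")

    if do_sample is False:
        # An explicit do_sample: false wins, so any temperature is forced to 0.0.
        kwargs["temperature"] = 0.0
    elif do_sample is None:
        if temperature is None or temperature == 0:
            # Nothing specified (or temperature 0): greedy decoding.
            kwargs["do_sample"] = False
            kwargs["temperature"] = 0.0
        else:
            # A non-zero temperature on its own implies sampling.
            kwargs["do_sample"] = True
    # do_sample is True: left as given (not fully specified by the table).
    return kwargs


print(normalize_gen_kwargs({"do_sample": False, "temperature": 0.8}))
# {'do_sample': False, 'temperature': 0.0}
```

Returning both keys explicitly means a backend that reads only temperature (greedy when it is 0) and one that reads only do_sample would reach the same decision from the same task config.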

@baberabb baberabb marked this pull request as draft January 21, 2026 13:38
@baberabb baberabb marked this pull request as ready for review January 26, 2026 20:03
@baberabb baberabb changed the title from "fix(vllm)!: set temp=0 when do_sample=False" to "feature(normalize gen_kwargs; add truncation types to vllm)!" Jan 26, 2026
@baberabb baberabb changed the title from "feature(normalize gen_kwargs; add truncation types to vllm)!" to "feature(normalize gen_kwargs; add truncation_side to vllm)!" Jan 26, 2026
@baberabb baberabb merged commit 30d4e2e into main Jan 27, 2026
6 checks passed
@baberabb baberabb deleted the vllm_do_sample branch January 27, 2026 17:55
Development

Successfully merging this pull request may close these issues.

Set temperature=0 for LongBench match do_sample=False behavior?