fix: suppress tokenizer parallelism warning in oneshot#2183
HDCharles merged 14 commits into vllm-project:main
Conversation
Set TOKENIZERS_PARALLELISM=false in Oneshot.__init__ to prevent the warning that occurs when FastTokenizer's internal threading conflicts with dataset.map's multiprocessing (num_proc). The warning appears as: "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks..." This fix respects any existing user-set TOKENIZERS_PARALLELISM value. Closes vllm-project#2007 Signed-off-by: majiayu000 <1835304752@qq.com>
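The fix described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual llm-compressor implementation; the class body here is a hypothetical stand-in for the real `Oneshot` entrypoint:

```python
import os

class Oneshot:
    """Hypothetical minimal stand-in for llmcompressor's Oneshot entrypoint."""

    def __init__(self):
        # setdefault only writes the variable when the user has not already
        # configured it, so an explicit TOKENIZERS_PARALLELISM=true is never
        # overridden.
        os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")
```

Using `os.environ.setdefault` keeps the "respect any existing user-set value" behavior in a single call instead of an explicit `if "TOKENIZERS_PARALLELISM" not in os.environ` check.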
Summary of Changes (Gemini Code Assist): This pull request aims to improve the user experience during oneshot calibration by eliminating a common tokenizer parallelism warning. It implements a targeted fix that automatically manages the TOKENIZERS_PARALLELISM environment variable.
Code Review
This pull request effectively addresses the tokenizer parallelism warning by setting the TOKENIZERS_PARALLELISM environment variable. The implementation in Oneshot.__init__ is straightforward and correctly avoids overriding existing user settings. The new tests in test_tokenizer_parallelism.py are comprehensive, covering both scenarios where the environment variable is set and not set. I've provided a few suggestions to enhance the new test file's maintainability and robustness by introducing a constant for the environment variable name and strengthening the assertions.
Review comments on tests/llmcompressor/transformers/oneshot/test_tokenizer_parallelism.py (outdated; resolved)
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.
kylesayrs left a comment
Awesome investigation work! I think that now that we have a clear understanding of what's going on (which has now been documented), this change is justified. Please add a warning to make sure that users are aware of the change in environment.
Address review feedback by adding a warning log message when TOKENIZERS_PARALLELISM is automatically set to false. This ensures users are aware of the environment change. Also improved the test file:
- Added _TOKENIZERS_PARALLELISM_ENV constant for maintainability
- Changed os.environ.get() to os.environ[] for explicit assertions
Signed-off-by: majiayu000 <1835304752@qq.com>
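A sketch of what the revised logic and tests might look like. Only `_TOKENIZERS_PARALLELISM_ENV` is taken from the commit message above; the helper function is a hypothetical stand-in for the logic inside `Oneshot.__init__`, and the test names are illustrative:

```python
import logging
import os

logger = logging.getLogger(__name__)

_TOKENIZERS_PARALLELISM_ENV = "TOKENIZERS_PARALLELISM"

def _configure_tokenizers_parallelism():
    """Stand-in for the env-var logic run inside Oneshot.__init__."""
    if _TOKENIZERS_PARALLELISM_ENV not in os.environ:
        # Warn so users are aware their environment was changed.
        logger.warning(
            "Setting %s=false to avoid tokenizer fork warnings",
            _TOKENIZERS_PARALLELISM_ENV,
        )
        os.environ[_TOKENIZERS_PARALLELISM_ENV] = "false"

def test_sets_false_when_unset():
    os.environ.pop(_TOKENIZERS_PARALLELISM_ENV, None)
    _configure_tokenizers_parallelism()
    # os.environ[...] instead of .get(): a missing key raises KeyError,
    # so the assertion cannot silently pass on None
    assert os.environ[_TOKENIZERS_PARALLELISM_ENV] == "false"

def test_respects_user_value():
    os.environ[_TOKENIZERS_PARALLELISM_ENV] = "true"
    _configure_tokenizers_parallelism()
    assert os.environ[_TOKENIZERS_PARALLELISM_ENV] == "true"
```

Using `os.environ[...]` in the assertions is the "explicit assertions" change the commit describes: a missing key fails loudly instead of comparing `None` to a string.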
Co-authored-by: Kyle Sayers <kylesayrs@gmail.com> Signed-off-by: lif <1835304752@qq.com>
Signed-off-by: majiayu000 <1835304752@qq.com>
majiayu000 force-pushed from adf30e4 to 1cc5939
The quality checks have failed. Please run `make style` and `make quality`.
Signed-off-by: lif <1835304752@qq.com>
majiayu000 force-pushed from 0b1b88f to e23d5da
brian-dellabetta left a comment
Hi @majiayu000 , please run formatting as mergify explains and we can work to merge this in. Thanks for the contribution!
Signed-off-by: majiayu000 <1835304752@qq.com>
Done! I've run the formatting.
The quality checks have failed. Please run `make style` and `make quality`.
Still failing — you need to run `make style` and `make quality`.
Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: majiayu000 <1835304752@qq.com>
majiayu000 force-pushed from b5f4165 to 96751e0
@HDCharles All checks passed. Thanks!
The quality checks have failed. Please run `make style` and `make quality`.
It's still failing the quality checks. Usually this happens when you have a different version of ruff. If you do `pip install -e ./[dev]`, or just make sure your ruff version is the same as in https://github.com/vllm-project/llm-compressor/blob/main/setup.py#L172, it should work.
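One way to check for the version mismatch described above (a hypothetical helper, shown for illustration; the authoritative pin lives in the repo's setup.py):

```python
from importlib.metadata import PackageNotFoundError, version

def local_ruff_version():
    """Return the locally installed ruff version string, or None if ruff
    is not installed (in which case `pip install -e ./[dev]` will pull in
    the version pinned by setup.py)."""
    try:
        return version("ruff")
    except PackageNotFoundError:
        return None
```

Comparing the returned string against the pin in setup.py tells you whether `make style` / `make quality` will produce the same output locally as in CI.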
brian-dellabetta left a comment
tests are green -- thanks for updating and for the contribution!
fix: suppress tokenizer parallelism warning in oneshot (#2183)

SUMMARY:
Suppress the tokenizer parallelism warning that appears during oneshot calibration by setting `TOKENIZERS_PARALLELISM=false` in `Oneshot.__init__`. The warning occurs when FastTokenizer's internal threading conflicts with `dataset.map`'s multiprocessing (`num_proc` parameter). This fix sets the environment variable early in the oneshot lifecycle to prevent the conflict, while respecting any existing user-set value.

Closes vllm-project#2007

TEST PLAN:
- Added unit tests in `tests/llmcompressor/transformers/oneshot/test_tokenizer_parallelism.py`
- Tests verify:
  1. `TOKENIZERS_PARALLELISM` is set to `false` when not already set
  2. Existing user-set `TOKENIZERS_PARALLELISM` values are respected
- All tests pass locally with `pytest tests/llmcompressor/transformers/oneshot/test_tokenizer_parallelism.py -v`

Signed-off-by: majiayu000 <1835304752@qq.com>
Signed-off-by: lif <1835304752@qq.com>
Co-authored-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: HDCharles <39544797+HDCharles@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Brian Dellabetta <brian-dellabetta@users.noreply.github.com>