
fix: suppress tokenizer parallelism warning in oneshot (#2183)

Merged
HDCharles merged 14 commits into vllm-project:main from
majiayu000:fix/tokenizer-parallelism-warning
Feb 2, 2026

Conversation

@majiayu000
Contributor

SUMMARY:
Suppress the tokenizer parallelism warning that appears during oneshot calibration by setting `TOKENIZERS_PARALLELISM=false` in `Oneshot.__init__`.

The warning occurs when FastTokenizer's internal threading conflicts with `dataset.map`'s multiprocessing (`num_proc` parameter). This fix sets the environment variable early in the oneshot lifecycle to prevent the conflict, while respecting any existing user-set value.

Closes #2007
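A minimal sketch of the fix described above, assuming the class and logger names; the actual implementation in llm-compressor may differ:

```python
import logging
import os

logger = logging.getLogger(__name__)


# Illustrative stand-in for llm-compressor's Oneshot class; only the
# environment-variable handling from this PR is sketched here.
class Oneshot:
    def __init__(self):
        # Only set the variable when the user has not configured it already,
        # so an explicit user choice is never overridden.
        if "TOKENIZERS_PARALLELISM" not in os.environ:
            os.environ["TOKENIZERS_PARALLELISM"] = "false"
            logger.warning(
                "Setting TOKENIZERS_PARALLELISM=false to avoid conflicts "
                "between FastTokenizer threads and dataset.map workers."
            )
```

Setting the variable this early matters because the warning fires when a process forks (e.g. `dataset.map(..., num_proc=N)`) after tokenizer threads have already been used.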

TEST PLAN:

  • Added unit tests in `tests/llmcompressor/transformers/oneshot/test_tokenizer_parallelism.py`
  • Tests verify:
    1. `TOKENIZERS_PARALLELISM` is set to `false` when not already set
    2. Existing user-set `TOKENIZERS_PARALLELISM` values are respected
  • All tests pass locally with `pytest tests/llmcompressor/transformers/oneshot/test_tokenizer_parallelism.py -v`
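The two cases above can be sketched as standalone pytest-style tests; the helper below stands in for the logic `Oneshot.__init__` runs, and the actual test names in the repository may differ:

```python
import os
from unittest import mock

_TOKENIZERS_PARALLELISM_ENV = "TOKENIZERS_PARALLELISM"


def _configure_tokenizer_parallelism() -> None:
    """Hypothetical stand-in for the env-var logic Oneshot.__init__ runs."""
    os.environ.setdefault(_TOKENIZERS_PARALLELISM_ENV, "false")


def test_sets_false_when_unset():
    # patch.dict snapshots os.environ and restores it on exit.
    with mock.patch.dict(os.environ):
        os.environ.pop(_TOKENIZERS_PARALLELISM_ENV, None)
        _configure_tokenizer_parallelism()
        assert os.environ[_TOKENIZERS_PARALLELISM_ENV] == "false"


def test_respects_existing_value():
    with mock.patch.dict(os.environ, {_TOKENIZERS_PARALLELISM_ENV: "true"}):
        _configure_tokenizer_parallelism()
        assert os.environ[_TOKENIZERS_PARALLELISM_ENV] == "true"
```

Using `mock.patch.dict(os.environ)` keeps each test hermetic, so the tests cannot leak the variable into the rest of the suite.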

Set TOKENIZERS_PARALLELISM=false in Oneshot.__init__ to prevent the
warning that occurs when FastTokenizer's internal threading conflicts
with dataset.map's multiprocessing (num_proc).

The warning appears as:
"huggingface/tokenizers: The current process just got forked, after
parallelism has already been used. Disabling parallelism to avoid
deadlocks..."

This fix respects any existing user-set TOKENIZERS_PARALLELISM value.

Closes vllm-project#2007

Signed-off-by: majiayu000 <1835304752@qq.com>
@gemini-code-assist
Contributor

Summary of Changes

Hello @majiayu000, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request aims to improve the user experience during oneshot calibration by eliminating a common tokenizer parallelism warning. It implements a targeted fix that automatically manages the TOKENIZERS_PARALLELISM environment variable to prevent conflicts arising from multiprocessing, while ensuring that any explicit user configurations are preserved. The change is supported by new unit tests to guarantee its intended functionality.

Highlights

  • Suppress Tokenizer Parallelism Warning: The pull request addresses and suppresses a recurring tokenizer parallelism warning that appears during oneshot calibration. This is achieved by setting the TOKENIZERS_PARALLELISM environment variable to false within the Oneshot.__init__ method, specifically when it's not already set by the user. This prevents conflicts between FastTokenizer's internal threading and dataset.map's multiprocessing.
  • Added Unit Tests: New unit tests have been introduced to verify the correct behavior of the TOKENIZERS_PARALLELISM environment variable handling. These tests confirm that the variable is set to false when not present and that any existing user-defined value for TOKENIZERS_PARALLELISM is respected and not overridden.


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively addresses the tokenizer parallelism warning by setting the TOKENIZERS_PARALLELISM environment variable. The implementation in Oneshot.__init__ is straightforward and correctly avoids overriding existing user settings. The new tests in test_tokenizer_parallelism.py are comprehensive, covering both scenarios where the environment variable is set and not set. I've provided a few suggestions to enhance the new test file's maintainability and robustness by introducing a constant for the environment variable name and strengthening the assertions.

@github-actions

github-actions bot commented Jan 4, 2026

👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review.

Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.

Collaborator

@kylesayrs kylesayrs left a comment


Awesome investigation work! I think that now that we have a clear understanding of what's going on (which has now been documented), this change is justified. Please add a warning to make sure that users are aware of the change in environment.

Address review feedback by adding a warning log message when
TOKENIZERS_PARALLELISM is automatically set to false. This ensures
users are aware of the environment change.

Also improved test file:
- Added _TOKENIZERS_PARALLELISM_ENV constant for maintainability
- Changed os.environ.get() to os.environ[] for explicit assertions

Signed-off-by: majiayu000 <1835304752@qq.com>
kylesayrs
kylesayrs previously approved these changes Jan 10, 2026
Collaborator

@kylesayrs kylesayrs left a comment


Thanks!

Co-authored-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: lif <1835304752@qq.com>
@majiayu000 majiayu000 force-pushed the fix/tokenizer-parallelism-warning branch from adf30e4 to 1cc5939 Compare January 10, 2026 15:55
@mergify
Contributor

mergify bot commented Jan 15, 2026

The quality checks have failed. Please run `make style` and `make quality` under
the root directory to address the lint failures. You will need to install the
dev optional dependencies to get the required linting packages:
https://github.com/vllm-project/llm-compressor/blob/main/CONTRIBUTING.md

Signed-off-by: lif <1835304752@qq.com>
@majiayu000 majiayu000 force-pushed the fix/tokenizer-parallelism-warning branch from 0b1b88f to e23d5da Compare January 15, 2026 08:46
Collaborator

@brian-dellabetta brian-dellabetta left a comment


Hi @majiayu000 , please run formatting as mergify explains and we can work to merge this in. Thanks for the contribution!

@majiayu000
Contributor Author

Done! I've run make style and all formatting checks have passed. The code is now ready for review. ✓

HDCharles
HDCharles previously approved these changes Jan 28, 2026
@mergify
Contributor

mergify bot commented Jan 28, 2026

The quality checks have failed. Please run `make style` and `make quality` under
the root directory to address the lint failures. You will need to install the
dev optional dependencies to get the required linting packages:
https://github.com/vllm-project/llm-compressor/blob/main/CONTRIBUTING.md

@HDCharles
Collaborator

Still failing. You need to run `make style` and `make quality`.

@mergify mergify bot removed the quality-failed label Jan 28, 2026
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: majiayu000 <1835304752@qq.com>
@majiayu000 majiayu000 force-pushed the fix/tokenizer-parallelism-warning branch from b5f4165 to 96751e0 Compare January 28, 2026 16:54
@majiayu000
Contributor Author

@HDCharles All checks passed. Thanks!

@mergify
Contributor

mergify bot commented Jan 28, 2026

The quality checks have failed. Please run `make style` and `make quality` under
the root directory to address the lint failures. You will need to install the
dev optional dependencies to get the required linting packages:
https://github.com/vllm-project/llm-compressor/blob/main/CONTRIBUTING.md

@HDCharles
Collaborator

It's still failing the quality checks.

Usually this happens when you have a different version of ruff.

If you run `pip install -e ./[dev]`, or just make sure your ruff version matches the one pinned in https://github.com/vllm-project/llm-compressor/blob/main/setup.py#L172, it should work.

Collaborator

@brian-dellabetta brian-dellabetta left a comment


tests are green -- thanks for updating and for the contribution!

@HDCharles HDCharles merged commit bd111bc into vllm-project:main Feb 2, 2026
10 of 11 checks passed
@majiayu000 majiayu000 deleted the fix/tokenizer-parallelism-warning branch February 2, 2026 16:26
cajeonrh pushed a commit to cajeonrh/llm-compressor that referenced this pull request Feb 10, 2026
…2183)

SUMMARY:
Suppress the tokenizer parallelism warning that appears during oneshot
calibration by setting `TOKENIZERS_PARALLELISM=false` in
`Oneshot.__init__`.

The warning occurs when FastTokenizer's internal threading conflicts
with `dataset.map`'s multiprocessing (`num_proc` parameter). This fix
sets the environment variable early in the oneshot lifecycle to prevent
the conflict, while respecting any existing user-set value.

Closes vllm-project#2007

TEST PLAN:
- Added unit tests in
`tests/llmcompressor/transformers/oneshot/test_tokenizer_parallelism.py`
- Tests verify:
  1. `TOKENIZERS_PARALLELISM` is set to `false` when not already set
  2. Existing user-set `TOKENIZERS_PARALLELISM` values are respected
- All tests pass locally with `pytest
tests/llmcompressor/transformers/oneshot/test_tokenizer_parallelism.py
-v`

---------

Signed-off-by: majiayu000 <1835304752@qq.com>
Signed-off-by: lif <1835304752@qq.com>
Co-authored-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: HDCharles <39544797+HDCharles@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Brian Dellabetta <brian-dellabetta@users.noreply.github.com>


Development

Successfully merging this pull request may close these issues.

[Help Wanted] Tokenzier warning messages

5 participants