fix: suppress tokenizer parallelism warning in oneshot#2183
HDCharles merged 14 commits into vllm-project:main
Conversation
Set TOKENIZERS_PARALLELISM=false in Oneshot.__init__ to prevent the warning that occurs when FastTokenizer's internal threading conflicts with dataset.map's multiprocessing (num_proc). The warning appears as: "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks..." This fix respects any existing user-set TOKENIZERS_PARALLELISM value. Closes vllm-project#2007 Signed-off-by: majiayu000 <1835304752@qq.com>
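The fix described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual llm-compressor implementation; the class body here is a hypothetical stand-in for the real `Oneshot` entrypoint:

```python
import os

class Oneshot:
    """Hypothetical minimal stand-in for llmcompressor's Oneshot entrypoint."""

    def __init__(self):
        # setdefault only writes the variable when the user has not already
        # configured it, so an explicit TOKENIZERS_PARALLELISM=true is never
        # overridden.
        os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")
```

Using `os.environ.setdefault` keeps the "respect any existing user-set value" behavior in a single call instead of an explicit `if "TOKENIZERS_PARALLELISM" not in os.environ` check.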
Summary of Changes (Gemini Code Assist): This pull request aims to improve the user experience during oneshot calibration by eliminating a common tokenizer parallelism warning. It implements a targeted fix that automatically manages the TOKENIZERS_PARALLELISM environment variable.
Code Review
This pull request effectively addresses the tokenizer parallelism warning by setting the TOKENIZERS_PARALLELISM environment variable. The implementation in Oneshot.__init__ is straightforward and correctly avoids overriding existing user settings. The new tests in test_tokenizer_parallelism.py are comprehensive, covering both scenarios where the environment variable is set and not set. I've provided a few suggestions to enhance the new test file's maintainability and robustness by introducing a constant for the environment variable name and strengthening the assertions.
Review comments on tests/llmcompressor/transformers/oneshot/test_tokenizer_parallelism.py (outdated; resolved)
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: This is required to complete the testing suite, please only add the label once the PR is code complete and local testing has been performed.
kylesayrs left a comment
Awesome investigation work! I think that now that we have a clear understanding of what's going on (which has now been documented), this change is justified. Please add a warning to make sure that users are aware of the change in environment.
Address review feedback by adding a warning log message when TOKENIZERS_PARALLELISM is automatically set to false. This ensures users are aware of the environment change. Also improved the test file:
- Added _TOKENIZERS_PARALLELISM_ENV constant for maintainability
- Changed os.environ.get() to os.environ[] for explicit assertions
Signed-off-by: majiayu000 <1835304752@qq.com>
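A sketch of what the revised logic and tests might look like. Only `_TOKENIZERS_PARALLELISM_ENV` is taken from the commit message above; the helper function is a hypothetical stand-in for the logic inside `Oneshot.__init__`, and the test names are illustrative:

```python
import logging
import os

logger = logging.getLogger(__name__)

_TOKENIZERS_PARALLELISM_ENV = "TOKENIZERS_PARALLELISM"

def _configure_tokenizers_parallelism():
    """Stand-in for the env-var logic run inside Oneshot.__init__."""
    if _TOKENIZERS_PARALLELISM_ENV not in os.environ:
        # Warn so users are aware their environment was changed.
        logger.warning(
            "Setting %s=false to avoid tokenizer fork warnings",
            _TOKENIZERS_PARALLELISM_ENV,
        )
        os.environ[_TOKENIZERS_PARALLELISM_ENV] = "false"

def test_sets_false_when_unset():
    os.environ.pop(_TOKENIZERS_PARALLELISM_ENV, None)
    _configure_tokenizers_parallelism()
    # os.environ[...] instead of .get(): a missing key raises KeyError,
    # so the assertion cannot silently pass on None
    assert os.environ[_TOKENIZERS_PARALLELISM_ENV] == "false"

def test_respects_user_value():
    os.environ[_TOKENIZERS_PARALLELISM_ENV] = "true"
    _configure_tokenizers_parallelism()
    assert os.environ[_TOKENIZERS_PARALLELISM_ENV] == "true"
```

Using `os.environ[...]` in the assertions is the "explicit assertions" change the commit describes: a missing key fails loudly instead of comparing `None` to a string.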
Co-authored-by: Kyle Sayers <kylesayrs@gmail.com> Signed-off-by: lif <1835304752@qq.com>
Signed-off-by: majiayu000 <1835304752@qq.com>
majiayu000 force-pushed from adf30e4 to 1cc5939
The quality checks have failed. Please run `make style` and `make quality`.
Signed-off-by: lif <1835304752@qq.com>
majiayu000 force-pushed from 0b1b88f to e23d5da
brian-dellabetta left a comment
Hi @majiayu000 , please run formatting as mergify explains and we can work to merge this in. Thanks for the contribution!
Signed-off-by: majiayu000 <1835304752@qq.com>
Done! I've run the formatting.
The quality checks have failed. Please run `make style` and `make quality`.
Still failing — you need to run `make style` and `make quality`.
Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: majiayu000 <1835304752@qq.com>
majiayu000 force-pushed from b5f4165 to 96751e0
@HDCharles All checks passed. Thanks!
The quality checks have failed. Please run `make style` and `make quality`.
It's still failing the quality checks. Usually this happens when you have a different version of ruff. If you do `pip install -e ./[dev]`, or just make sure your ruff version is the same as in https://github.com/vllm-project/llm-compressor/blob/main/setup.py#L172, it should work.
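One way to check for the version mismatch described above (a hypothetical helper, shown for illustration; the authoritative pin lives in the repo's setup.py):

```python
from importlib.metadata import PackageNotFoundError, version

def local_ruff_version():
    """Return the locally installed ruff version string, or None if ruff
    is not installed (in which case `pip install -e ./[dev]` will pull in
    the version pinned by setup.py)."""
    try:
        return version("ruff")
    except PackageNotFoundError:
        return None
```

Comparing the returned string against the pin in setup.py tells you whether `make style` / `make quality` will produce the same output locally as in CI.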
brian-dellabetta left a comment
tests are green -- thanks for updating and for the contribution!
fix: suppress tokenizer parallelism warning in oneshot (#2183)

SUMMARY:
Suppress the tokenizer parallelism warning that appears during oneshot calibration by setting `TOKENIZERS_PARALLELISM=false` in `Oneshot.__init__`. The warning occurs when FastTokenizer's internal threading conflicts with `dataset.map`'s multiprocessing (`num_proc` parameter). This fix sets the environment variable early in the oneshot lifecycle to prevent the conflict, while respecting any existing user-set value.

Closes vllm-project#2007

TEST PLAN:
- Added unit tests in `tests/llmcompressor/transformers/oneshot/test_tokenizer_parallelism.py`
- Tests verify:
  1. `TOKENIZERS_PARALLELISM` is set to `false` when not already set
  2. Existing user-set `TOKENIZERS_PARALLELISM` values are respected
- All tests pass locally with `pytest tests/llmcompressor/transformers/oneshot/test_tokenizer_parallelism.py -v`

Signed-off-by: majiayu000 <1835304752@qq.com>
Signed-off-by: lif <1835304752@qq.com>
Co-authored-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
Co-authored-by: HDCharles <39544797+HDCharles@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Brian Dellabetta <brian-dellabetta@users.noreply.github.com>