Integrated NF4 inference tests to text-generation #2058

Merged
vivekgoe merged 1 commit into huggingface:v1.19-release from rsshaik1:nf4_tests on Jun 17, 2025

@rsshaik1
Contributor

This PR integrates inference tests for NF4-quantized Llama-3.1-8B and Llama-3.1-70B (quantized with bitsandbytes) into the text-generation tests.

@rsshaik1 rsshaik1 requested a review from regisss as a code owner June 13, 2025 05:44
Collaborator

@vivekgoe vivekgoe left a comment

LGTM

Copilot AI left a comment

Pull Request Overview

This PR adds support for NF4 quantization via BitsAndBytes (bnb) to the text-generation examples and integrates end-to-end inference tests for Llama-3.1 models quantized with bnb.

  • Introduce --quantize_with_bnb flag in the CLI and handle it in setup_model
  • Extend _test_text_generation and add test_text_generation_bnb in the test suite
  • Update baseline fixtures with expected outputs and throughputs for bnb tests
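
As a rough illustration of the first bullet, the flag and the config it selects might be wired up along these lines. This is a minimal sketch, not the code from the PR: it assumes a standalone argparse parser rather than the real run_generation.py parser, the helper names `build_parser`, `nf4_config_kwargs`, and `make_bnb_config` are hypothetical, and the exact config values used in the PR are not shown in this conversation. `load_in_4bit` and `bnb_4bit_quant_type` are existing `BitsAndBytesConfig` parameters in transformers.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the run_generation.py argument parser."""
    parser = argparse.ArgumentParser(description="text-generation sketch")
    parser.add_argument(
        "--quantize_with_bnb",
        action="store_true",
        help="Load the model NF4-quantized via bitsandbytes.",
    )
    return parser


def nf4_config_kwargs() -> dict:
    """Keyword arguments selecting 4-bit NF4 quantization (assumed values)."""
    return {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"}


def make_bnb_config():
    """Build the quantization config; requires transformers to be installed."""
    # Imported lazily so the argument handling above works without transformers.
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**nf4_config_kwargs())


# Parse a sample command line and pick the quantization kwargs accordingly.
args = build_parser().parse_args(["--quantize_with_bnb"])
quant_kwargs = nf4_config_kwargs() if args.quantize_with_bnb else {}
```

Keeping the keyword arguments in a plain dict separates the flag handling from the transformers import, so the CLI logic can be checked on a machine without the quantization stack installed.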

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| tests/test_text_generation_example.py | Added quantize_with_bnb parameter, new run_model_with_bnb test group, and test_text_generation_bnb case |
| tests/baselines/fixture/tests/test_text_generation_example.json | Added baseline entry for test_text_generation_bnb on gaudi2 |
| examples/text-generation/utils.py | Added elif args.quantize_with_bnb branch with BitsAndBytesConfig |
| examples/text-generation/run_generation.py | Added --quantize_with_bnb CLI argument |

Comments suppressed due to low confidence (1)

examples/text-generation/utils.py:302

  • The new quantize_with_bnb branch in setup_model isn't covered by any unit tests. Consider adding a small test or mocking BitsAndBytesConfig to verify that the correct config is passed when this flag is set.
elif args.quantize_with_bnb:
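
The mocking approach suggested above could look roughly like the following. This is only a sketch under stated assumptions: `build_model_kwargs` is a hypothetical stand-in for the quantization branch of `setup_model` (whose real body is not shown here), the config class is injected as a parameter so it can be replaced by a `MagicMock`, and the NF4 keyword values are assumed rather than taken from the PR.

```python
from types import SimpleNamespace
from unittest import mock


def build_model_kwargs(args, config_cls):
    """Hypothetical stand-in for the quantization branch of setup_model."""
    model_kwargs = {}
    if args.quantize_with_bnb:
        # Assumed NF4 settings; the values used in the PR are not shown here.
        model_kwargs["quantization_config"] = config_cls(
            load_in_4bit=True, bnb_4bit_quant_type="nf4"
        )
    return model_kwargs


def check_bnb_branch():
    """Verify the config class is called only when the flag is set."""
    fake_config_cls = mock.MagicMock(name="BitsAndBytesConfig")

    # Flag set: the config class is called once with the NF4 arguments.
    kwargs = build_model_kwargs(
        SimpleNamespace(quantize_with_bnb=True), fake_config_cls
    )
    fake_config_cls.assert_called_once_with(
        load_in_4bit=True, bnb_4bit_quant_type="nf4"
    )
    assert kwargs["quantization_config"] is fake_config_cls.return_value

    # Flag unset: no config is built and no kwargs are added.
    fake_config_cls.reset_mock()
    unset = build_model_kwargs(
        SimpleNamespace(quantize_with_bnb=False), fake_config_cls
    )
    assert unset == {}
    fake_config_cls.assert_not_called()
    return True


result = check_bnb_branch()
```

Because the mock replaces the config class, a check like this runs without bitsandbytes or Gaudi hardware; it only confirms which arguments are passed, not that quantized loading works.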

Comment thread tests/test_text_generation_example.py
Comment thread examples/text-generation/utils.py Outdated
@rsshaik1 rsshaik1 force-pushed the nf4_tests branch 3 times, most recently from 0763eb0 to 325a54a June 16, 2025 06:58
@vivekgoe vivekgoe merged commit 0ad692c into huggingface:v1.19-release Jun 17, 2025
1 check passed
astachowiczhabana pushed a commit to HabanaAI/optimum-habana-fork that referenced this pull request Jul 3, 2025
astachowiczhabana added a commit to HabanaAI/optimum-habana-fork that referenced this pull request Jul 8, 2025
astachowiczhabana added a commit to HabanaAI/optimum-habana-fork that referenced this pull request Jul 8, 2025
astachowiczhabana added a commit to HabanaAI/optimum-habana-fork that referenced this pull request Jul 10, 2025
astachowiczhabana added a commit to HabanaAI/optimum-habana-fork that referenced this pull request Jul 11, 2025
3 participants