Pad the examples for QLoRa finetuning test by ckvermaAI · Pull Request #1941 · huggingface/optimum-habana

ckvermaAI · 2025-04-21T05:44:49Z

1. Pad the examples up to max_seq_len of 1024.
2. Increase max_steps (from 5 to 50) and eval_steps (from 3 to 10).
3. Set throughput-related arguments (adjust_throughput, throughput_warmup_steps).
4. Update the fine-tuning test name.
5. Also, adjust the reference "eval loss" value for the fine-tuning test and the "output" for the inference test.

Additional updates
6. Enable eager mode for the test (torch.compile mode is disabled for now).
7. Add a new requirement for installing bitsandbytes (from https://github.com/bitsandbytes-foundation/bitsandbytes/tree/multi-backend-refactor).
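
As a rough illustration of the first item, padding one tokenized example up to a fixed max_seq_len of 1024 could look like the sketch below. The helper name pad_example, the pad_token_id of 0, and the -100 label mask are assumptions made for illustration; they are not taken from the test code in this PR.

```python
from typing import Dict, List

MAX_SEQ_LEN = 1024   # value stated in the PR description
IGNORE_INDEX = -100  # assumption: label value commonly ignored by the loss


def pad_example(
    input_ids: List[int],
    max_seq_len: int = MAX_SEQ_LEN,
    pad_token_id: int = 0,  # assumption: real code would use the tokenizer's pad id
) -> Dict[str, List[int]]:
    """Right-pad (or truncate) one tokenized example to exactly max_seq_len."""
    ids = input_ids[:max_seq_len]
    n_pad = max_seq_len - len(ids)
    return {
        "input_ids": ids + [pad_token_id] * n_pad,
        # 1 marks real tokens, 0 marks padding
        "attention_mask": [1] * len(ids) + [0] * n_pad,
        # padded positions are masked out of the loss
        "labels": ids + [IGNORE_INDEX] * n_pad,
    }
```

With this helper, every example has the same length regardless of how many tokens it originally contained.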

* [SW-226132] Pad the examples
* update test name

Co-authored-by: Vivek Goel <vgoel@habana.ai>

HuggingFaceDocBuilderDev · 2025-04-21T05:58:53Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

vivekgoe

LGTM. @regisss can we add these tests to the slow_tests you run for G2/G3? This will help us avoid regressions. We can publish QLoRA support as an experimental/beta feature on G2/G3 (it works for limited configurations and performance is not great). Note that this PR depends on the Synapse 1.21.0 release (not backward compatible).

ckvermaAI · 2025-04-22T06:20:31Z

Support for NF4 quantization/dequantization using Intel Gaudi hardware: bitsandbytes-foundation/bitsandbytes#1592

regisss

LGTM

uartie · 2025-04-22T18:10:52Z

We're starting to see the following error when running some of the tests (e.g. test_diffusers.py):
OSError: /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cpu.so: cannot open shared object file: No such file or directory

We've run python -m pip install .[tests] and it installed bitsandbytes 1.0.0... so not sure why we're seeing this error.
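
One way to narrow down a report like this is to check which version of the package is installed and which shared objects it actually ships. The sketch below is a hypothetical diagnostic helper (describe_native_libs is not part of bitsandbytes or optimum-habana); it relies only on the standard library and does not import the package, so a broken native library cannot raise while it runs.

```python
import importlib.metadata
import importlib.util
from pathlib import Path
from typing import Dict, List, Optional


def describe_native_libs(package: str = "bitsandbytes") -> Dict[str, object]:
    """Report the installed version and the *.so files found in a package directory."""
    try:
        version: Optional[str] = importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        version = None

    # find_spec locates a top-level package on disk without importing it.
    spec = importlib.util.find_spec(package)
    libs: List[str] = []
    if spec is not None and spec.submodule_search_locations:
        for location in spec.submodule_search_locations:
            libs.extend(sorted(p.name for p in Path(location).glob("*.so")))
    return {"version": version, "shared_objects": libs}


if __name__ == "__main__":
    print(describe_native_libs())
```

If libbitsandbytes_cpu.so is absent from the reported list, the OSError above would be expected whenever something tries to load it.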

uartie · 2025-04-22T18:55:31Z

...

pip install -r examples/stable-diffusion/requirements.txt
pip install -r examples/stable-diffusion/training/requirements.txt

... appears to cause the error to show up during test run.

uartie · 2025-04-22T19:54:13Z

Possibly related to old peft version in examples/stable-diffusion/training/requirements.txt... please fix the requirements file.

ckvermaAI · 2025-04-23T03:23:12Z

This error comes from bitsandbytes; it should not be related to peft. I ran the other tests locally (test_bnb_qlora.py, test_bnb_inference.py) and did not hit this issue.

Let me check it again.

ckvermaAI · 2025-04-23T07:30:43Z

On HPU, bitsandbytes tries to load the CPU binaries (https://github.com/bitsandbytes-foundation/bitsandbytes/blob/multi-backend-refactor/bitsandbytes/cextension.py#L73), which are not required.
If the test/topology runs without issues, you will not see the OSError caused by the missing *.so file. If the test fails for any other reason, however, the OSError for the missing *.so file is reported as well.

In short,

I'll move the bitsandbytes installation from setup.py to bitsandbytes tests (test_bnb_qlora.py, test_bnb_inference.py). This should resolve any issue you're seeing.
Later, we'll try to upstream the fix to the bitsandbytes repo.
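
The failure mode described above (an optional native library that is absent, with the resulting OSError showing up next to unrelated failures) can be illustrated with a minimal sketch. This is not the code in bitsandbytes' cextension.py, and it is not the fix proposed here; load_optional_library is a hypothetical helper that only shows how a guarded load keeps a missing, unneeded binary from surfacing as an error.

```python
import ctypes
from typing import Optional


def load_optional_library(path: str) -> Optional[ctypes.CDLL]:
    """Try to load a shared library; return None instead of raising if it cannot be opened."""
    try:
        return ctypes.CDLL(path)
    except OSError:
        # ctypes raises OSError when the shared object cannot be opened,
        # which is the error class seen in the report above.
        return None


if __name__ == "__main__":
    # Hypothetical path used only for demonstration.
    lib = load_optional_library("/nonexistent/libbitsandbytes_cpu.so")
    print("loaded" if lib is not None else "library not available, continuing without it")
```

A caller that does not need the CPU binary can then continue when the helper returns None.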

ckvermaAI · 2025-04-23T08:23:46Z

Fix for the above issue:
Move bitsandbytes requirements from setup.py to bnb tests
#1946

[SW-226132] Pad the examples for QLoRa finetuning test (#252)

9c78421

ckvermaAI requested a review from regisss as a code owner April 21, 2025 05:44

vivekgoe added the synapse 1.21 label Apr 21, 2025

vivekgoe approved these changes Apr 21, 2025

vivekgoe requested a review from libinta April 21, 2025 07:59

libinta added the run-test label (Run CI for PRs from external contributors) Apr 22, 2025

libinta changed the title ~~[SW-226132] Pad the examples for QLoRa finetuning test~~ Pad the examples for QLoRa finetuning test Apr 22, 2025

regisss reviewed Apr 22, 2025

Comment thread setup.py

regisss approved these changes Apr 22, 2025

regisss merged commit f52d8cd into huggingface:main Apr 22, 2025
4 checks passed

ckvermaAI mentioned this pull request Apr 25, 2025

Bitsandbytes installation for qlora tests #1951

Merged

regisss merged 1 commit into huggingface:main from HabanaAI:auto-pr-976a202
