
Support loading hugging face checkpoint #1165

Merged

regisss merged 8 commits into huggingface:main from HabanaAI:dev/ulivne/upstream_1_17_load_cp on Aug 8, 2024

Conversation

@ulivne (Contributor) commented Jul 28, 2024

  • Support loading checkpoint with INC

  • load_cp explanation

  • Add torch_dtype bf16 for the model (see the sketch below)

  • Support getting quantized weight tensor


Co-authored-by: yan tomsinsky <ytomsinsky@habana.ai>
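
A minimal sketch of the "torch_dtype bf16" item from the bullets above, using the standard transformers from_pretrained API; the checkpoint name is a placeholder, not a model from this PR:

```python
# Minimal sketch of loading a model directly in bf16 via torch_dtype.
# "my-org/my-model" is a placeholder, not a checkpoint from this PR.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "my-org/my-model",
    torch_dtype=torch.bfloat16,  # weights are loaded as bfloat16
)
```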
@ulivne ulivne requested review from libinta and mandy-li as code owners July 28, 2024 10:22
@ulivne ulivne requested a review from a user July 28, 2024 10:22
@ulivne ulivne requested a review from regisss as a code owner July 28, 2024 10:22
@libinta added the synapse 1.17_dependency label (PR not backward compatible; can be merged only when Synapse 1.17 is available) on Jul 28, 2024
```python
        **model_kwargs,
    )
elif args.load_cp:
    from neural_compressor.torch.quantization import load
```
A Contributor commented on this diff:

@ulivne sounds like neural_compressor is missing from the requirements; consider adding it!

@ulivne (Contributor Author) replied:

> @ulivne sounds like neural_compressor is missing from the requirements; consider adding it!

neural_compressor is installed automatically as part of the Habana software stack. It replaces habana_quantization_toolkit, which was also not part of the requirements.
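
For orientation, a hedged sketch of how the args.load_cp branch shown above might invoke the INC load API; the format and device keyword values, and the exact call shape, are assumptions for illustration rather than the PR's actual code:

```python
# Hypothetical sketch of the --load_cp path. The keyword values below
# (format="huggingface", device="hpu") are assumptions for illustration,
# not taken verbatim from the diff.
from neural_compressor.torch.quantization import load

model = load(
    model_name_or_path=args.model_name_or_path,  # quantized HF checkpoint
    format="huggingface",
    device="hpu",
    **model_kwargs,
)
```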

vidyasiv pushed a commit to emascarenhas/optimum-habana that referenced this pull request on Aug 1, 2024
vidyasiv added a commit to emascarenhas/optimum-habana that referenced this pull request on Aug 2, 2024: Support loading hugging face checkpoint huggingface#1165
@ulivne (Contributor Author) commented Aug 6, 2024

Removed the test, as it uses a Hugging Face model and we are not sure we can use it in our open-source code (in terms of license and Intel regulations).

@regisss (Collaborator) left a comment

I think load_cp is not explicit enough. Transformers has load_in_4bit, which I think would be better here.

@libinta You'll let me know when this PR can be merged.

@ulivne (Contributor Author) commented Aug 8, 2024

> I think load_cp is not explicit enough. Transformers has load_in_4bit, which I think would be better here.
> @libinta You'll let me know when this PR can be merged.

This flag triggers the Neural Compressor load API, which also supports loading high-precision models and is planned to support loading fp8 models in the future. It is correct that currently only 4-bit is supported for Gaudi; however, I think we should keep the load_cp name to avoid changing it later when we add loading of fp8 models.

@regisss (Collaborator) commented Aug 8, 2024

> > I think load_cp is not explicit enough. Transformers has load_in_4bit, which I think would be better here.
> > @libinta You'll let me know when this PR can be merged.
>
> This flag triggers the Neural Compressor load API, which also supports loading high-precision models and is planned to support loading fp8 models in the future. It is correct that currently only 4-bit is supported for Gaudi; however, I think we should keep the load_cp name to avoid changing it later when we add loading of fp8 models.

load_in_4bit is just a proposal but honestly, if I'm a user and I don't know the arguments, load_cp doesn't tell me anything about what it does. I would prefer something more explicit like load_quantized_model, or, even better, automatically detecting whether this is a 4/8-bit checkpoint.
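
The auto-detection regisss suggests could plausibly key off the quantization_config entry that transformers writes into a quantized checkpoint's config.json; a minimal sketch under that assumption (the helper name is hypothetical):

```python
# Sketch of auto-detecting a quantized checkpoint, assuming the standard
# transformers convention of a "quantization_config" key in config.json.
# is_quantized_checkpoint is a hypothetical helper, not from this PR.
import json
import os

def is_quantized_checkpoint(model_path: str) -> bool:
    config_path = os.path.join(model_path, "config.json")
    if not os.path.isfile(config_path):
        return False
    with open(config_path) as f:
        config = json.load(f)
    return "quantization_config" in config
```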

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@regisss regisss merged commit 3ea3145 into huggingface:main Aug 8, 2024
```python
parser.add_argument(
    "--load_cp",
    action="store_true",
    help="Whether to load model from hugging face checkpoint.",
)
```

A Collaborator commented on this diff:

one line missing here

regisss pushed a commit that referenced this pull request Aug 8, 2024
Co-authored-by: yan tomsinsky <ytomsinsky@habana.ai>
Co-authored-by: Libin Tang <litang@habana.ai>