Mark scale as const and remove --fp8 flag usage by Yantom1 · Pull Request #156 · HabanaAI/optimum-habana-fork

Yantom1 · 2024-04-10T13:23:48Z

What does this PR do?

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

HolyFalafel · 2024-04-10T16:04:24Z

    set_seed(args.seed)
    get_repo_root(args.model_name_or_path, local_rank=args.local_rank, token=args.token)
    use_deepspeed = args.world_size > 0
-    if use_deepspeed or args.bf16 or args.fp8:


This changes the behavior from what we had before. Was it tested on 7B? (That seems to be the only case where the condition could be false.)

We always set --bf16, even in fp8 runs, so it should work.
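The concern in this thread can be checked by enumeration. The following is a minimal sketch, not code from the PR: `old_condition` and `new_condition` are hypothetical helper names that mirror the guard before and after `args.fp8` was dropped. It suggests that the two guards disagree only when DeepSpeed is off, `--bf16` is unset, and `--fp8` is set, a combination that should not occur if `--bf16` is always passed.

```python
from itertools import product


def old_condition(use_deepspeed: bool, bf16: bool, fp8: bool) -> bool:
    # Guard as it appears on the removed diff line.
    return use_deepspeed or bf16 or fp8


def new_condition(use_deepspeed: bool, bf16: bool) -> bool:
    # Assumed guard after the change: the fp8 term is dropped.
    return use_deepspeed or bf16


def differing_cases():
    # Enumerate every flag combination and keep those where the guards disagree.
    return [
        (ds, bf16, fp8)
        for ds, bf16, fp8 in product([False, True], repeat=3)
        if old_condition(ds, bf16, fp8) != new_condition(ds, bf16)
    ]


if __name__ == "__main__":
    # Tuples are (use_deepspeed, bf16, fp8).
    print(differing_cases())  # [(False, False, True)]
```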

HolyFalafel · 2024-04-10T16:05:20Z

        import habana_frameworks.torch.core as htcore

-        if args.fp8:
+        if args.quant_config:


This is also true when performing a measurement. Was it tested?

HolyFalafel · 2024-04-10T16:05:33Z

    if args.const_serialization_path:
        setup_const_serialization(args.const_serialization_path)
-    if args.fp8:
+    if args.quant_config:


Also in measurement

ulivne · 2024-04-11T10:31:48Z

    const_marking = os.getenv("ENABLE_CONST_MARKING", "True")
    if const_marking == "True":
-        htcore.hpu_initialize(model)
+        htcore.hpu_initialize(model, mark_only_scales_as_const=True)


If we mark only scales as const, we may lose some constant-folding optimizations (in theory).
For example, we may lose the transpose on constant weights; I think this happens in SDXL.

So, did you test it for performance?

@ulivne it was tested on Llama 70B and 7B.
The transpose on weights now happens in the HQT patched module init, so it should be fine.

MrGeva · 2024-04-11T06:17:20Z

    set_seed(args.seed)
    get_repo_root(args.model_name_or_path, local_rank=args.local_rank, token=args.token)
    use_deepspeed = args.world_size > 0
-    if use_deepspeed or args.bf16 or args.fp8:


We always set --bf16, even in fp8 runs, so it should work.

MrGeva · 2024-04-14T08:54:46Z

    const_marking = os.getenv("ENABLE_CONST_MARKING", "True")
    if const_marking == "True":
-        htcore.hpu_initialize(model)
+        htcore.hpu_initialize(model, mark_only_scales_as_const=True)


@ulivne it was tested on Llama 70B and 7B.
The transpose on weights now happens in the HQT patched module init, so it should be fine.

MrGeva · 2024-04-14T09:04:57Z

@@ -102,7 +102,7 @@ def setup_inference(args, model):
    print("Initializing inference mode")
    const_marking = os.getenv("ENABLE_CONST_MARKING", "True")


@Yantom1 you should remove it. We should not use this variable anymore; we should always call hpu_initialize in this function. Since QA sets this flag to false, did you test it correctly?

Change-Id: I6dba8691d842fc62d09da5202ea1e61a111f5f18

it was tested
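To make the requested change concrete, here is a hedged sketch of the gating before and after `ENABLE_CONST_MARKING` is removed. It is not the PR's code: `make_recorder` and `fake_hpu_initialize` are hypothetical stand-ins for `htcore.hpu_initialize` so the example runs without Habana software, the environment is passed in as a plain mapping, and the rest of `setup_inference` is not reproduced.

```python
def make_recorder():
    """Return a call log and a stand-in for htcore.hpu_initialize (hypothetical)."""
    calls = []

    def fake_hpu_initialize(model, **kwargs):
        # Record the keyword arguments of each call instead of touching a device.
        calls.append(kwargs)

    return calls, fake_hpu_initialize


def setup_inference_before(model, initialize, environ):
    # Pre-change gating: only the exact string "True" enables the call.
    const_marking = environ.get("ENABLE_CONST_MARKING", "True")
    if const_marking == "True":
        initialize(model)
    return model


def setup_inference_after(model, initialize):
    # Post-change behavior as requested in review: always initialize,
    # marking only scales as const.
    initialize(model, mark_only_scales_as_const=True)
    return model


if __name__ == "__main__":
    calls, init = make_recorder()
    setup_inference_before("model", init, {"ENABLE_CONST_MARKING": "false"})
    print(calls)  # [] because the flag disabled the call
    setup_inference_after("model", init)
    print(calls)  # [{'mark_only_scales_as_const': True}]
```

With the old gating, a test environment that exports the flag as anything other than `"True"` never reaches the initializer, which is why the reviewer asks whether the change was exercised with the flag's effect removed.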

* Mark only scales as const
* remove --fp8 flag usage from llama
* removed usage of ENABLE_CONST_MARKING

Change-Id: I6dba8691d842fc62d09da5202ea1e61a111f5f18

---------

Co-authored-by: Eran Geva <egeva@habana.ai>

astachowiczhabana · 2024-06-12T10:27:40Z

huggingface#962

* Add memory, graph stats
* fix import formatting issues
* sort imports
* sort imports

Yantom1 added 2 commits April 2, 2024 15:01

Mark only scales as const

3f518ed

remove --fp8 flag usage from llama

412a380

Yantom1 requested review from HolyFalafel, MrGeva and bgoldberg-habana April 10, 2024 13:23

Yantom1 changed the title ~~Dev/ytoms111~~ Mark scale as const and remove --fp8 flag usage Apr 10, 2024

Yantom1 requested review from dudilester and ulivne April 10, 2024 13:30

HolyFalafel previously requested changes Apr 10, 2024

ulivne reviewed Apr 11, 2024

Yantom1 requested review from HolyFalafel and ulivne April 11, 2024 15:48

MrGeva reviewed Apr 14, 2024

removed usage of ENABLE_CONST_MARKING

ea6c51e

Change-Id: I6dba8691d842fc62d09da5202ea1e61a111f5f18

MrGeva approved these changes Apr 16, 2024

MrGeva merged commit 8bfd6ef into habana-main Apr 16, 2024

MrGeva deleted the dev/ytoms111 branch April 16, 2024 17:38

astachowiczhabana pushed a commit that referenced this pull request Mar 11, 2025

Adding memory and graph stats (#156)

4acbee4

* Add memory, graph stats
* fix import formatting issues
* sort imports
* sort imports

astachowiczhabana pushed a commit that referenced this pull request Mar 31, 2025

Adding memory and graph stats (#156)

1a05f01

* Add memory, graph stats
* fix import formatting issues
* sort imports
* sort imports

vivekgoe pushed a commit that referenced this pull request Apr 18, 2025

Adding memory and graph stats (#156) (huggingface#1858)

2e30261

MrGeva merged 3 commits into habana-main from dev/ytoms111

5 participants
