
Add GLM4 model#33729

Closed
Cyrilvallez wants to merge 39 commits into huggingface:main from Cyrilvallez:glm

Conversation

@Cyrilvallez
Member

What does this PR do?

Adds GLM model.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@ArthurZucker ArthurZucker left a comment


Super nice, you are missing the test files, integration tests etc! (And readme etc)

Comment threads on src/transformers/models/glm/configuration_glm.py (first thread marked Outdated)
initializer_range=0.02,
rms_norm_eps=0.00000015625,
use_rms_norm=True,
apply_residual_connection_post_layernorm=False,
Collaborator


is this false for all models? If so, to delete!
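If `apply_residual_connection_post_layernorm` is indeed `False` for every released GLM checkpoint, the flag can be dropped from the config entirely, as the reviewer suggests. A minimal sketch of what the trimmed config could look like, using a plain dataclass as a stand-in for the real `GlmConfig` (the class name and defaults here are illustrative, taken from the snippet above, not the PR's final code):

```python
from dataclasses import dataclass


@dataclass
class GlmConfigSketch:
    """Illustrative stand-in for GlmConfig with the dead flags removed."""
    hidden_size: int = 4096
    initializer_range: float = 0.02
    rms_norm_eps: float = 1.5625e-07  # same value as 0.00000015625, easier to read
    # no use_rms_norm / apply_residual_connection_post_layernorm:
    # if every checkpoint uses the same setting, the flag carries no information


cfg = GlmConfigSketch()
```

Writing the epsilon in scientific notation also makes the long decimal literal from the snippet less error-prone.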

self.mlp = GlmMLP(config)
self.input_layernorm = (
GlmRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
if config.use_rms_norm
Collaborator


check what config uses, but we avoid that in general as well! (code path)
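The point about code paths: if every GLM checkpoint sets `use_rms_norm=True`, the layer should instantiate `GlmRMSNorm` unconditionally rather than branch on the flag. For reference, a plain-Python sketch of the computation an RMSNorm layer performs (function name and list-based types are illustrative, not the PR's code):

```python
import math


def rms_norm(values, weights, eps=1.5625e-07):
    """Sketch of RMSNorm: scale each element by the root-mean-square
    of the vector (plus eps for numerical stability), then by a
    learned per-element weight."""
    rms = math.sqrt(sum(v * v for v in values) / len(values) + eps)
    return [w * v / rms for w, v in zip(weights, values)]
```

Hard-wiring this as `self.input_layernorm = GlmRMSNorm(...)` removes the conditional entirely, which is the pattern the review asks for.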

"""

hidden_states_after_norm = self.input_layernorm(hidden_states)
residual = hidden_states_after_norm if self.apply_residual_connection_post_layernorm else hidden_states
Collaborator


same here! check if any released models have both
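To make the two residual placements in the snippet concrete, here is a toy sketch of the branch under discussion, using scalar floats and callables as stand-ins for tensors and submodules (names are illustrative, not the PR's modules):

```python
def decoder_step(hidden, norm, block, post_layernorm_residual=False):
    """Toy contrast of the two residual placements.

    Pre-norm (the usual choice): the residual is the un-normalized input.
    Post-layernorm residual: the residual is the normalized activations,
    which is what apply_residual_connection_post_layernorm=True selects.
    """
    normed = norm(hidden)
    residual = normed if post_layernorm_residual else hidden
    return residual + block(normed)
```

If no released GLM model actually ships with the post-layernorm variant, only the `else` branch is ever taken, and the flag plus the conditional can be deleted.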

self.layers = nn.ModuleList(
[GlmDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
)
if config.post_layer_norm:
Collaborator


same here


Comment thread src/transformers/models/glm/modular_glm.py
ArthurZucker and others added 26 commits September 30, 2024 16:03
* HQQ model serialization attempt

* fix hqq dispatch and unexpected keys

* style

* remove check_old_param

* revert to check HQQLinear in quantizer_hqq.py

* revert to check HQQLinear in quantizer_hqq.py

* update HqqConfig default params

* make ci happy

* make ci happy

* revert to HQQLinear check in quantizer_hqq.py

* check hqq_min version 0.2.0

* set axis=1 as default in quantization_config.py

* validate_env with hqq>=0.2.0 version message

* deprecated hqq kwargs message

* make ci happy

* remove run_expected_keys_check hack + bump to 0.2.1 min hqq version

* fix unexpected_keys hqq update

* add pre_quantized check

* add update_expected_keys to base quantizerr

* ci base.py fix?

* ci base.py fix?

* fix "quantization typo" src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix post merge

---------

Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Collaborator

@ArthurZucker ArthurZucker left a comment


Something went wrong with the rebasing / merging as you have unrelated changes!

}


class GlmDecoderLayer(nn.Module):
Collaborator


this one looks fairly classic, I would have supposed you don't need the forward (unless the issue is with the name of layers?)
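The reviewer's point is that in a modular model file, a layer that matches an existing model's layer can be declared as a bare subclass, inheriting `forward` unchanged. A sketch of that pattern with stand-in classes (the stub below is a placeholder, not the library's real `LlamaDecoderLayer`):

```python
class LlamaDecoderLayerStub:
    """Stand-in for an existing decoder layer from another model."""

    def forward(self, hidden_states):
        # placeholder for the real attention + MLP computation
        return hidden_states


class GlmDecoderLayer(LlamaDecoderLayerStub):
    # No forward override: when the computation is identical and only
    # the naming/config differ, inheritance alone is enough.
    pass
```

If a custom `forward` turns out to be necessary only because of layer naming, renaming the attributes to match the parent class would let the override be dropped.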

@Cyrilvallez
Member Author

> Something went wrong with the rebasing / merging as you have unrelated changes!

Yes, currently looking at it

@ArthurZucker
Collaborator

Superseded by #33823

