
Add GLM4 model#33729

Closed
Cyrilvallez wants to merge 39 commits into huggingface:main from Cyrilvallez:glm

Conversation

@Cyrilvallez
Member

What does this PR do?

Adds GLM model.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@ArthurZucker ArthurZucker left a comment


Super nice, you are missing the test files, integration tests etc! (And readme etc)

Comment threads on src/transformers/models/glm/configuration_glm.py (first thread marked Outdated)
initializer_range=0.02,
rms_norm_eps=0.00000015625,
use_rms_norm=True,
apply_residual_connection_post_layernorm=False,
Collaborator


is this false for all models? If so, to delete!
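If `apply_residual_connection_post_layernorm` is indeed `False` for every released GLM checkpoint, the flag can be dropped from the config entirely, as the reviewer suggests. A minimal sketch of what the trimmed config could look like, using a plain dataclass as a stand-in for the real `GlmConfig` (the class name and defaults here are illustrative, taken from the snippet above, not the PR's final code):

```python
from dataclasses import dataclass


@dataclass
class GlmConfigSketch:
    """Illustrative stand-in for GlmConfig with the dead flags removed."""
    hidden_size: int = 4096
    initializer_range: float = 0.02
    rms_norm_eps: float = 1.5625e-07  # same value as 0.00000015625, easier to read
    # no use_rms_norm / apply_residual_connection_post_layernorm:
    # if every checkpoint uses the same setting, the flag carries no information


cfg = GlmConfigSketch()
```

Writing the epsilon in scientific notation also makes the long decimal literal from the snippet less error-prone.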

self.mlp = GlmMLP(config)
self.input_layernorm = (
GlmRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
if config.use_rms_norm
Collaborator


check what config uses, but we avoid that in general as well! (code path)
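The point about code paths: if every GLM checkpoint sets `use_rms_norm=True`, the layer should instantiate `GlmRMSNorm` unconditionally rather than branch on the flag. For reference, a plain-Python sketch of the computation an RMSNorm layer performs (function name and list-based types are illustrative, not the PR's code):

```python
import math


def rms_norm(values, weights, eps=1.5625e-07):
    """Sketch of RMSNorm: scale each element by the root-mean-square
    of the vector (plus eps for numerical stability), then by a
    learned per-element weight."""
    rms = math.sqrt(sum(v * v for v in values) / len(values) + eps)
    return [w * v / rms for w, v in zip(weights, values)]
```

Hard-wiring this as `self.input_layernorm = GlmRMSNorm(...)` removes the conditional entirely, which is the pattern the review asks for.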

"""

hidden_states_after_norm = self.input_layernorm(hidden_states)
residual = hidden_states_after_norm if self.apply_residual_connection_post_layernorm else hidden_states
Collaborator


same here! check if any released models have both
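To make the two residual placements in the snippet concrete, here is a toy sketch of the branch under discussion, using scalar floats and callables as stand-ins for tensors and submodules (names are illustrative, not the PR's modules):

```python
def decoder_step(hidden, norm, block, post_layernorm_residual=False):
    """Toy contrast of the two residual placements.

    Pre-norm (the usual choice): the residual is the un-normalized input.
    Post-layernorm residual: the residual is the normalized activations,
    which is what apply_residual_connection_post_layernorm=True selects.
    """
    normed = norm(hidden)
    residual = normed if post_layernorm_residual else hidden
    return residual + block(normed)
```

If no released GLM model actually ships with the post-layernorm variant, only the `else` branch is ever taken, and the flag plus the conditional can be deleted.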

self.layers = nn.ModuleList(
[GlmDecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
)
if config.post_layer_norm:
Collaborator


same here


Comment thread src/transformers/models/glm/modular_glm.py
ArthurZucker and others added 26 commits September 30, 2024 16:03
* HQQ model serialization attempt

* fix hqq dispatch and unexpected keys

* style

* remove check_old_param

* revert to check HQQLinear in quantizer_hqq.py

* revert to check HQQLinear in quantizer_hqq.py

* update HqqConfig default params

* make ci happy

* make ci happy

* revert to HQQLinear check in quantizer_hqq.py

* check hqq_min version 0.2.0

* set axis=1 as default in quantization_config.py

* validate_env with hqq>=0.2.0 version message

* deprecated hqq kwargs message

* make ci happy

* remove run_expected_keys_check hack + bump to 0.2.1 min hqq version

* fix unexpected_keys hqq update

* add pre_quantized check

* add update_expected_keys to base quantizerr

* ci base.py fix?

* ci base.py fix?

* fix "quantization typo" src/transformers/utils/quantization_config.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix post merge

---------

Co-authored-by: Marc Sun <marc@huggingface.co>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Collaborator

@ArthurZucker ArthurZucker left a comment


Something went wrong with the rebasing / merging as you have unrelated changes!

}


class GlmDecoderLayer(nn.Module):
Collaborator


this one looks fairly classic, I would have supposed you don't need the forward (unless the issue is with the name of layers?)
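The reviewer's point is that in a modular model file, a layer that matches an existing model's layer can be declared as a bare subclass, inheriting `forward` unchanged. A sketch of that pattern with stand-in classes (the stub below is a placeholder, not the library's real `LlamaDecoderLayer`):

```python
class LlamaDecoderLayerStub:
    """Stand-in for an existing decoder layer from another model."""

    def forward(self, hidden_states):
        # placeholder for the real attention + MLP computation
        return hidden_states


class GlmDecoderLayer(LlamaDecoderLayerStub):
    # No forward override: when the computation is identical and only
    # the naming/config differ, inheritance alone is enough.
    pass
```

If a custom `forward` turns out to be necessary only because of layer naming, renaming the attributes to match the parent class would let the override be dropped.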

@Cyrilvallez
Member Author

> Something went wrong with the rebasing / merging as you have unrelated changes!

Yes, currently looking at it

@ArthurZucker
Collaborator

Superseded by #33823

