|
Hey! 🤗 Thanks for your contribution! Before merging this pull request, slow tests CI should be triggered. To enable this:
(For maintainers) The documentation for slow tests CI on PRs is here. |
207ec14 to 152569e
| @@ -0,0 +1,27 @@ | |||
| # Copyright 2020 The HuggingFace Team. All rights reserved. | |||
Suggested change:
| - # Copyright 2020 The HuggingFace Team. All rights reserved. |
| + # Copyright 2024 The HuggingFace Team. All rights reserved. |
| STATE_DICT_MAPPING = { | ||
| "transformer.output_layer.": "lm_head.", | ||
| "transformer.": "model.", | ||
| ".embedding.word_embeddings.": ".embed_tokens.", | ||
| ".encoder.final_layernorm.": ".norm.", | ||
| ".encoder.layers.": ".layers.", | ||
| "rotary_pos_embed.": "rotary_emb.", | ||
| "self_attention.": "self_attn.", | ||
| "query_key_value.": "qkv_proj.", | ||
| "dense.": "o_proj.", | ||
| "dense_h_to_4h.": "gate_up_proj.", | ||
| "dense_4h_to_h.": "down_proj.", | ||
| } |
Cool! Let's set up good standards here, however: see MLLAMA. Fully explicit regexes are more informative IMO! 🤗
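To illustrate the suggestion, here is a minimal sketch of what a fully explicit regex mapping could look like. The table name `KEY_PATTERNS`, the helper `convert_key`, and the patterns themselves are assumptions written for illustration (in particular the `.mlp.` segment and which tensors carry a bias); they are not the converter's actual table.

```python
import re

# Illustrative only: patterns are assumptions modelled on the substring
# mapping in the diff above, not the real converter table.
KEY_PATTERNS = {
    r"^transformer\.output_layer\.weight$": r"lm_head.weight",
    r"^transformer\.embedding\.word_embeddings\.weight$": r"model.embed_tokens.weight",
    r"^transformer\.encoder\.final_layernorm\.weight$": r"model.norm.weight",
    r"^transformer\.encoder\.layers\.(\d+)\.self_attention\.query_key_value\.(weight|bias)$": r"model.layers.\1.self_attn.qkv_proj.\2",
    r"^transformer\.encoder\.layers\.(\d+)\.self_attention\.dense\.weight$": r"model.layers.\1.self_attn.o_proj.weight",
    r"^transformer\.encoder\.layers\.(\d+)\.mlp\.dense_h_to_4h\.weight$": r"model.layers.\1.mlp.gate_up_proj.weight",
    r"^transformer\.encoder\.layers\.(\d+)\.mlp\.dense_4h_to_h\.weight$": r"model.layers.\1.mlp.down_proj.weight",
}


def convert_key(old_key: str) -> str:
    """Return the converted key, or raise if no explicit pattern matches."""
    for pattern, replacement in KEY_PATTERNS.items():
        new_key, n_subs = re.subn(pattern, replacement, old_key)
        if n_subs:
            return new_key
    # An unmatched key is an error instead of being silently passed through.
    raise KeyError(f"No explicit pattern matches {old_key!r}")
```

Compared with chained substring replacement, each full key is matched exactly once, so the result does not depend on the order of the entries, and an unexpected key fails loudly.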
| vocab_size=original_config.pop("padded_vocab_size"), | ||
| hidden_size=original_config.pop("hidden_size"), | ||
| intermediate_size=original_config.pop("ffn_hidden_size"), | ||
| num_hidden_layers=original_config.pop("num_layers"), | ||
| num_attention_heads=num_attention_heads, | ||
| num_key_value_heads=( | ||
| num_attention_heads | ||
| if not original_config.pop("multi_query_attention") | ||
| else original_config.pop("multi_query_group_num") | ||
| ), | ||
| attention_dropout=original_config.pop("attention_dropout"), | ||
| max_position_embeddings=original_config.pop("seq_length"), | ||
| rms_norm_eps=original_config.pop("layernorm_epsilon"), | ||
| rope_theta=10000.0 * original_config.pop("rope_ratio", 1), | ||
| use_cache=original_config.pop("use_cache"), | ||
| head_dim=original_config.pop("kv_channels"), | ||
| attention_bias=original_config.pop("add_qkv_bias"), | ||
| eos_token_id=original_config.pop("eos_token_id"), | ||
| pad_token_id=original_config.pop("pad_token_id"), | ||
| tie_word_embeddings=original_config.pop("tie_word_embeddings"), |
Let's try to use ** here for attributes that have the same name.
I didn't, to avoid adding unused fields, but I refactored that block to make it nicer to read.
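One way to reconcile both points is to unpack only an explicit whitelist of same-named keys, so `**` is used without pulling in unused fields. The sketch below works on plain dictionaries; `RENAMED_KEYS`, `SAME_NAME_KEYS`, `build_config_kwargs`, and the sample values are hypothetical, and only a subset of the keys from the diff is shown.

```python
# Keys whose name changes between the original config and the new one
# (subset of the diff above, for illustration).
RENAMED_KEYS = {
    "padded_vocab_size": "vocab_size",
    "ffn_hidden_size": "intermediate_size",
    "num_layers": "num_hidden_layers",
}

# Keys that keep the same name and can be forwarded as-is.
SAME_NAME_KEYS = ("hidden_size", "use_cache", "eos_token_id")


def build_config_kwargs(original_config: dict) -> tuple[dict, dict]:
    """Split the original config into (kwargs to forward, leftover fields)."""
    remaining = dict(original_config)  # copy so the caller's dict is untouched
    renamed = {new: remaining.pop(old) for old, new in RENAMED_KEYS.items()}
    same_name = {key: remaining.pop(key) for key in SAME_NAME_KEYS}
    return {**renamed, **same_name}, remaining


# Hypothetical values, for demonstration only.
sample = {
    "padded_vocab_size": 1000,
    "ffn_hidden_size": 256,
    "num_layers": 2,
    "hidden_size": 64,
    "use_cache": True,
    "eos_token_id": 2,
    "some_unused_flag": False,
}
kwargs, leftover = build_config_kwargs(sample)
```

The resulting `kwargs` could then be passed with `**kwargs` to a config constructor, while `leftover` makes it easy to check which original fields were deliberately dropped.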
| pass | ||
| | ||
| class GlmSdpaAttention(GlmAttention, GraniteSdpaAttention): |
Suggested change:
| - class GlmSdpaAttention(GlmAttention, GraniteSdpaAttention): |
| + class GlmSdpaAttention(GraniteSdpaAttention): |
I think this should be enough.
| @require_torch_sdpa | ||
| @slow | ||
| @is_flaky | ||
| def test_eager_matches_sdpa_inference(self, torch_dtype: str): |
Why do we have to override this one?
Unfortunately, depending on the random inputs, one of the cases occasionally fails. I overrode it to add the flaky decorator, which retries the test so that it passes consistently.
Cool! In general, the less we have to override, the better!
Meaning: are there ways to remove some of the tests you added?
Unfortunately no. Depending on the random seed, some fail from time to time, and they need to be marked flaky to pass consistently.
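For readers unfamiliar with the idea, a flaky decorator of this kind reruns a test a few times and only reports failure if every attempt fails. The sketch below is a generic, self-contained version; `retry_flaky` and its `max_attempts` parameter are hypothetical names and this is not the library's own implementation.

```python
import functools


def retry_flaky(max_attempts: int = 5):
    """Rerun a test up to `max_attempts` times; re-raise only the last failure."""

    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return test_func(*args, **kwargs)
                except AssertionError:
                    # Give up only once every attempt has failed.
                    if attempt == max_attempts:
                        raise

        return wrapper

    return decorator
```

The trade-off raised in the review still applies: retries hide intermittent numerical mismatches instead of explaining them, so such a decorator is best limited to tests whose failures are known to come from random inputs.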
|
Ready for last review @ArthurZucker, |
|
Thank you very much for your help. I also saw this huggingface PR. Thank you again for your support! |
|
Of course! |
ArthurZucker left a comment:
LGTM, anything missing before we merge?
No, the only issue is the docstrings in the configuration, but this will be solved with the auto-docstrings. In the meantime, I just moved the config outside modular to please the CIs. |
|
Confirmed that slow tests pass for the model. Merging. |
* Create modular_glm.py
* Update modular_glm.py
* Finalize architecture without all attentions
* Add all attentions modules
* Finalize modular
* Update given last version
* Last update
* Finalize model
* Finalize converter
* Update convert_glm_weights_to_hf.py
* style
* style
* Create __init__.py
* Aff all inits
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Correct the rotary embeddings
* Remove apply_residual_connection_post_layernorm (always false)
* remove use_rms_norm (always true)
* remove past_layer_norm (always true)
* Update __init__.py
* Update config and license
* start adding tests and doc
* Add doc + style
* Update test_modeling_glm.py
* Add dummies
* Apply correct modeling
* Refactor attention to follow llama
* Update __init__.py
* Update convert_glm_weights_to_hf.py
* Correct bias
* remove linear_bias and pdrop (never used)
* apply modular
* Simplify converter
* remove dummies + style
* add model_input_names
* Add pretraining_tp to config for when eager attention is used
* Update modular to remove all pretraining_tp
* Update test_modeling_glm.py
* Update the __all__
* Update __all__
* Update __init__.py
* Update test_modeling_glm.py
* add revisions
* Add the correct repos and revisions
* style
* Update __init__.py
* update exports
* remove import of modular files
* style
* Apply Llama changes + refine converter
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* Update convert_glm_weights_to_hf.py
* style
* Use new modular converter
* add pretrainedmodel to init
* style
* Update test_modeling_glm.py
* Move config outside modular to please CI about docstrings
* Add dummies to please CI
* Update glm.md
* Update glm.md
|
@Cyrilvallez Hi Cyril, your PR for the 1M version of the model produces unexpected generations. Please see https://huggingface.co/THUDM/glm-4-9b-chat-1m/discussions/17 for more information. |
GLM model!