[LLM] Unify pipeline model with PretrainModelPipe #7095
Thanks for your contribution!
Codecov Report
@@             Coverage Diff             @@
##           develop    #7095      +/-   ##
===========================================
- Coverage    59.84%   59.55%   -0.29%
===========================================
  Files          557      563       +6
  Lines        82150    82775     +625
===========================================
+ Hits         49161    49299     +138
- Misses       32989    33476     +487
if not model_args.continue_training:
    config.max_position_embeddings = max(config.max_position_embeddings, data_args.max_seq_length)

if not model_args.continue_training:
What is the purpose of the vocab size change here?
Also, will changing the vocab size affect later warm-starting of the word embedding?
It is to adapt to TP (tensor parallelism); the vocab size is changed only for randomly initialized models.
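For context, a minimal sketch of what padding a vocab size for tensor parallelism could look like, assuming the goal is to round the vocabulary up to a multiple so the embedding table splits evenly across ranks. The helper names and the multiple of 128 are hypothetical and not taken from this PR; the part of the diff that sets the vocab size is not shown here.

```python
def pad_vocab_size(vocab_size: int, multiple: int) -> int:
    """Round vocab_size up to the nearest multiple (hypothetical helper)."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return ((vocab_size + multiple - 1) // multiple) * multiple


def resolve_vocab_size(vocab_size: int, multiple: int, continue_training: bool) -> int:
    """Pad only when training starts from random initialization.

    Assumption: when continuing from a checkpoint, the vocab size is left
    unchanged so the saved word embedding keeps its original shape.
    """
    if continue_training:
        return vocab_size
    return pad_vocab_size(vocab_size, multiple)


if __name__ == "__main__":
    # 128 is an illustrative multiple, not a value confirmed by the PR.
    print(resolve_vocab_size(50257, 128, continue_training=False))
    print(resolve_vocab_size(50257, 128, continue_training=True))
```

Under this assumption, a warm-started model keeps its checkpoint's embedding shape, which matches the reply that only randomly initialized models are changed.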
@@ -268,7 +173,7 @@ def _logits_helper(embedding, output):
                 shared_weight_attr="embedding_weight",
                 config=config,
             ),
-            "gpt",
+            "gpt.embeddings",
This affects accuracy.
LGTM
PR types
New features
PR changes
Others
Description
Unify pipeline model with PretrainModelPipe