Pull requests: NVIDIA/Megatron-LM
Fix: Resolve multimodal model errors and update README usage instructions
#1286
opened Nov 13, 2024 by
singleheart
Fix a bug in optimizer's mix_lr/max_lr when args.override_opt_param_scheduler==True
#1284
opened Nov 12, 2024 by
lyuwen
fix: remove unnecessary trailing comma in statement
#1265
opened Oct 29, 2024 by
singleheart
Enabling LR scaling for a specific layer (ex. down-projection...) during pretraining
#1262
opened Oct 28, 2024 by
dhia680
[ENHANCEMENT] Add support for Apex RMSNorm for use in qk-norm
#1261
opened Oct 28, 2024 by
wdevazelhes
Make it an option to use TransformerEngine activation function in FFN block
#1233
opened Oct 21, 2024 by
guyueh1
support qwen2 and siglip weight conversion script to enable training …
#1221
opened Oct 16, 2024 by
tao-githup
[Functions] Support Packed_seq_params in Megatron-LM
#1215
opened Oct 12, 2024 by
Baibaifan