context parallelism #7739
Commits on Jun 6, 2023
- afce64e: make nemo recognize sequence_parallel_size
  Signed-off-by: xren <[email protected]>
- 3f98473
- e313000: add helper functions to set up SP running in TE
  Signed-off-by: xren <[email protected]>
- 52ff102
Commits on Jun 8, 2023
- 5580955: slice seq length for a specific rank
  Signed-off-by: Xiaowei Ren <[email protected]>
- 887c615
- ebd6323: fix data_parallel_size calculation
  Signed-off-by: Xiaowei Ren <[email protected]>
- 58cca3d
- 87f027a
  Signed-off-by: Xiaowei Ren <[email protected]>
- 9ebfcf7: pass sp_global_ranks to TE transformer layer
  Signed-off-by: Xiaowei Ren <[email protected]>
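The Jun 8 commit titles mention slicing the sequence length for a specific rank and fixing the data_parallel_size calculation. The sketch below illustrates what those two steps commonly look like once a context-parallel (CP) dimension is added. It is not taken from the PR diff: the function names `data_parallel_size` and `slice_sequence_for_rank` are hypothetical, and the contiguous, equal-sized split is an assumption made for illustration (the actual implementation may partition the sequence differently).

```python
from typing import List, Sequence


def data_parallel_size(world_size: int, tp: int, pp: int, cp: int = 1) -> int:
    """Number of data-parallel replicas when tensor (tp), pipeline (pp)
    and context (cp) parallelism share the same pool of ranks."""
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by tp*pp*cp={model_parallel}"
        )
    return world_size // model_parallel


def slice_sequence_for_rank(tokens: Sequence[int], cp_size: int, cp_rank: int) -> List[int]:
    """Return the contiguous chunk of the sequence owned by one CP rank.

    Assumption: a plain contiguous split into cp_size equal chunks.
    """
    if not 0 <= cp_rank < cp_size:
        raise ValueError(f"cp_rank={cp_rank} is out of range for cp_size={cp_size}")
    if len(tokens) % cp_size != 0:
        raise ValueError("sequence length must be divisible by cp_size")
    chunk = len(tokens) // cp_size
    return list(tokens[cp_rank * chunk:(cp_rank + 1) * chunk])


if __name__ == "__main__":
    # 16 GPUs with tp=2, pp=2, cp=2 leave 2 data-parallel replicas.
    print(data_parallel_size(16, tp=2, pp=2, cp=2))
    # An 8-token sequence split across 4 CP ranks; rank 1 owns tokens 2 and 3.
    print(slice_sequence_for_rank(list(range(8)), cp_size=4, cp_rank=1))
```

The point of the first function is that CP ranks are carved out of the same world size as TP and PP ranks, so omitting `cp` from the divisor would overcount data-parallel replicas.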
Commits on Jun 9, 2023
- 728fd43
Commits on Jun 13, 2023
- 66615e8
Commits on Jun 17, 2023
- e1f5eb7: fix attn_mask split across seq-length dim
  Signed-off-by: xren <[email protected]>
- cf0c75c
  Signed-off-by: xren <[email protected]>
Commits on Jun 21, 2023
- b57e218
- 69f4ae8
  Signed-off-by: xren <[email protected]>
Commits on Jun 22, 2023
- a38dd9a
- b31e31f
- 8ac42f1: rename sequence_parallelism to context_parallelism
  Signed-off-by: xren <[email protected]>
Commits on Jun 24, 2023
- f7c9b5b
- 49b1052
  Signed-off-by: xren <[email protected]>
Commits on Aug 1, 2023
- ae889fc
Commits on Aug 3, 2023
- 2c43687: make sure do not call megatron-core parallel_state while cp_size is 1
  Signed-off-by: xren <[email protected]>
- 25bf369
- 61af551: slice position embedding for different CP rank
  Signed-off-by: xren <[email protected]>
- dc8a540
  Signed-off-by: xren <[email protected]>
- 46479c6
Commits on Aug 4, 2023
- b64b563
Commits on Aug 6, 2023
- 0362de6
- e1654fb
Commits on Aug 8, 2023
- 4f0a3be
  Signed-off-by: xren <[email protected]>
- c46b42e
Commits on Aug 21, 2023
- b6db8f3
Commits on Aug 22, 2023
- 4076d06: do not load attention mask if it's not needed
  Signed-off-by: Xiaowei Ren <[email protected]>
- 3353e13: cherry pick attention mask data loader skip
  Signed-off-by: Xiaowei Ren <[email protected]>
Commits on Aug 23, 2023
- 433f6a7
Commits on Aug 25, 2023
- c4592e8
Commits on Sep 6, 2023
- 5efaa76
  Signed-off-by: Xiaowei Ren <[email protected]>
Commits on Sep 14, 2023
- 006677d: address naming confusion of mixed dp and cp
  Signed-off-by: xren <[email protected]>
- d64b85d
Commits on Sep 25, 2023
- 499f0d6
Commits on Sep 30, 2023
- 693b8b7
Commits on Oct 3, 2023
- 0f7d079: rewrite cp code by assuming with_context_parallel=False
  Signed-off-by: xren <[email protected]>
- 4dcfdb6
- 3351953: pop context_parallel from dist opt kwargs
  Signed-off-by: xren <[email protected]>
Commits on Oct 5, 2023
- 08f785b: make sure amax reduction group is aware of context parallelism
  Signed-off-by: xren <[email protected]>
- e277b3d: remove use_fp8 from initialize_model_parallel
  Signed-off-by: xren <[email protected]>
- a27155c
Commits on Oct 6, 2023
- dc65d34: make implementations of setup_transformer_engine_tp_groups and setup_transformer_engine_cp_running consistent
  Signed-off-by: xren <[email protected]>
Commits on Oct 10, 2023
- 42a6b83
Commits on Oct 11, 2023
- 5013189
Commits on Oct 13, 2023
- 52dd50b: make loss logging broadcast aware of cp
  Signed-off-by: xren <[email protected]>
- b61fa4e
- 52381e8
- fb9cc3d: Merge branch 'xren/context_parallelism' of github.com:xrennvidia/NeMo into xren/context_parallelism
Commits on Oct 14, 2023
- 1b92952
- e394392
Commits on Oct 16, 2023
- f9bf0d8: import transformer layer specs from MCore
  Signed-off-by: Xiaowei Ren <[email protected]>
Commits on Oct 17, 2023
- 1f8815f
  Signed-off-by: Xiaowei Ren <[email protected]>
- f1bc1a7
- a40b183
- d15ae17: add context_parallel into the kwargs of dist opt
  Signed-off-by: Xiaowei Ren <[email protected]>
Commits on Oct 19, 2023
- 8ae9061
- 6be25b9: Merge branch 'xren/context_parallelism' of github.com:NVIDIA/NeMo into xren/context_parallelism
Commits on Oct 25, 2023
- 4cbdb0e
- 55b7e13
  Signed-off-by: Xiaowei Ren <[email protected]>
- 840103e: [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
- 03b2922
- 14a589e: Merge branch 'xren/context_parallelism' of github.com:NVIDIA/NeMo into xren/context_parallelism
- 7c5b9c1
  Signed-off-by: Xiaowei Ren <[email protected]>
Commits on Nov 1, 2023
- 50d0385
Commits on Nov 9, 2023
- 45002d4
Commits on Nov 17, 2023
- 319e659
Commits on Nov 18, 2023
- 071d234
Commits on Nov 22, 2023
- baafb02
Commits on Nov 23, 2023
- bf100fc
Commits on Nov 27, 2023
- 2da819e
  Signed-off-by: Xiaowei Ren <[email protected]>
Commits on Dec 4, 2023
- cd7021a
- 22eeaf9: recover seq-length which has been fixed in mcore
  Signed-off-by: Xiaowei Ren <[email protected]>
Commits on Dec 16, 2023
- b56ce02
Commits on Dec 18, 2023
- 3a29733
Commits on Dec 19, 2023
- 5d25e67
Commits on Dec 21, 2023
- ead55a0
Commits on Jan 2, 2024
- 3a36003
Commits on Jan 3, 2024
- 2d42b1c
Commits on Jan 4, 2024
- f66a5aa
Commits on Jan 5, 2024
- 5d464c9
Commits on Jan 9, 2024
- 2c9c95e