cp dataloader #3626

Closed
SunMarc wants to merge 3 commits into main from cp-dataloader

@SunMarc
Member

@SunMarc SunMarc commented Jun 12, 2025

What does this PR do?

This PR tries out context parallelism (CP) support for the dataloader. Make sure to set both dispatch_batches and split_batches to False in the Accelerate config.

cc @qgallouedec

@qgallouedec qgallouedec marked this pull request as draft June 12, 2025 17:23
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions Bot closed this Jul 21, 2025
@SunMarc SunMarc reopened this Jul 22, 2025
Comment on lines 1138 to +1144
process_index = process_index // submesh_tp_size
num_processes = submesh_fsdp_size * submesh_dp_size

if cp:
    process_index = 0
    num_processes = 1

Collaborator

Doesn't using only 1 process break n-d parallelism? Maybe something like this?

Suggested change
- process_index = process_index // submesh_tp_size
- num_processes = submesh_fsdp_size * submesh_dp_size
- if cp:
-     process_index = 0
-     num_processes = 1
+ process_index = process_index // (submesh_tp_size * submesh_cp_size)
+ num_processes = submesh_fsdp_size * submesh_dp_size // (submesh_tp_size * submesh_cp_size)
+ if cp:
+     process_index = 0
+     num_processes = 1
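As a toy illustration of the idea behind this suggestion (ranks that belong to the same TP x CP group should read the same data shard), here is a small sketch. It assumes a row-major mesh with TP as the innermost dimension, then CP, then data parallel; the helper name `dataloader_shard` and the sizes are hypothetical and are not taken from Accelerate or from this PR.

```python
def dataloader_shard(rank: int, tp_size: int, cp_size: int) -> int:
    """Map a global rank to a data shard index (hypothetical helper).

    Assumes TP is the innermost mesh dimension and CP the next one, so
    consecutive blocks of tp_size * cp_size ranks form one TP x CP group.
    """
    return rank // (tp_size * cp_size)


# Example mesh: 2 data-parallel groups, each with 2 CP x 2 TP ranks.
dp_size, cp_size, tp_size = 2, 2, 2
world_size = dp_size * cp_size * tp_size

# Every rank in the same TP x CP group gets the same shard index.
shards = {rank: dataloader_shard(rank, tp_size, cp_size) for rank in range(world_size)}

# Number of distinct shards the dataloader would be split into.
num_shards = world_size // (tp_size * cp_size)

print(shards)
print(num_shards)
```

By contrast, the `if cp:` override in the diff gives every rank shard 0 of 1, so all ranks see the full batch regardless of the data-parallel dimension, which is the behaviour the question above is about.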

Member Author

Indeed, we will have something like that. I just opened this PR so we don't forget about this, but we will upstream the changes to main in another PR once the n-d parallelism PR is finished.

@github-actions github-actions Bot closed this Jul 30, 2025
@SunMarc
Member Author

SunMarc commented Jul 30, 2025

#3682 should cover this PR.
