Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend memory freeing to other PipelinedForwards #1971

Closed
wants to merge 3 commits into from

Conversation

dstaay-fb
Copy link
Contributor

Summary:
Biggest win in semi-sync pipeline.

Post diff

TrainPipelineBase | Runtime (P90): 10.098 s | Memory (P90): 8.418 GB
TrainPipelineSparseDist | Runtime (P90): 10.050 s | Memory (P90): 8.655 GB
TrainPipelineSemiSync | Runtime (P90): 9.541 s | Memory (P90): 10.332 GB
PrefetchTrainPipelineSparseDist | Runtime (P90): 10.063 s | Memory (P90): 8.918 GB

Pre diff
TrainPipelineBase | Runtime (P90): 10.125 s | Memory (P90): 8.418 GB
TrainPipelineSparseDist | Runtime (P90): 10.033 s | Memory (P90): 8.654 GB
TrainPipelineSemiSync | Runtime (P90): 9.529 s | Memory (P90): 11.932 GB
PrefetchTrainPipelineSparseDist | Runtime (P90): 10.109 s | Memory (P90): 8.910 GB

Differential Revision: D57169568

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 9, 2024
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57169568

dstaay-fb added 3 commits May 9, 2024 14:05
Summary:
X-link: facebookresearch/recipes#43


As users highlighted, TrainPipeline refactoring introduced memory regression ~2% due to more context management for code readability.  This results in higher peak memory (takes longer for a context to drop out of refcount)

relatively easy to get a lot more aggressive about releasing memory stored in TrainPipelineContext.

broader internal discusion:
https://fb.workplace.com/groups/970281557043698/permalink/1664528510952329/

Reviewed By: joshuadeng, henrylhtsang

Differential Revision:
D57123339

Privacy Context Container: 1203980333745195
Summary:

PrefetchTrainPipelineSparseDist - use legacy TrainPipeline API and will refactor newer internals assuming memory neutral / or better.

Differential Revision: D57143337
Summary:

Biggest win in semi-sync pipeline.

Post diff

  TrainPipelineBase                   | Runtime (P90): 10.098 s | Memory (P90): 8.418 GB
  TrainPipelineSparseDist             | Runtime (P90): 10.050 s | Memory (P90): 8.655 GB
  TrainPipelineSemiSync               | Runtime (P90): 9.541 s | Memory (P90): 10.332 GB
  PrefetchTrainPipelineSparseDist     | Runtime (P90): 10.063 s | Memory (P90): 8.918 GB

Pre diff
  TrainPipelineBase                   | Runtime (P90): 10.125 s | Memory (P90): 8.418 GB
  TrainPipelineSparseDist             | Runtime (P90): 10.033 s | Memory (P90): 8.654 GB
  TrainPipelineSemiSync               | Runtime (P90): 9.529 s | Memory (P90): 11.932 GB
  PrefetchTrainPipelineSparseDist     | Runtime (P90): 10.109 s | Memory (P90): 8.910 GB

Reviewed By: henrylhtsang

Differential Revision: D57169568
@dstaay-fb dstaay-fb force-pushed the export-D57169568 branch from 4e8a19a to a3f26da Compare May 9, 2024 21:05
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57169568

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants