Fix multinode cloud component #15965
⚡ Required checks status: All passing 🟢
Groups summary
🟢 lightning_app: Tests workflow
🟢 lightning_app: Examples
🟢 lightning_app: Azure
🟢 lightning_app: Docs
🟢 mypy
🟢 install
Thank you for your contribution! 💜
@justusschock This should probably apply to more components.
* fix multinode cloud component
* add tests
(cherry picked from commit d21b899)
* Apply dynamo to training_step, validation_step, test_step, predict_step (#15957)
  * Apply dynamo to training_step, validation_step, test_step, predict_step
  * Add entry to CHANGELOG.md
  (cherry picked from commit edc9986)
* [App] Resolve run installation (#15974)
  (cherry picked from commit dd83587)
* App: Move AutoScaler dependency to extra requirements (#15971)
  * Make autoscaler dependency optional
  * update chglog
  * dont directly import aiohttp
  (cherry picked from commit 346e936)
  # Conflicts:
  # requirements/app/base.txt
  # src/lightning_app/CHANGELOG.md
* Avoid using the same port number for autoscaler works (#15966)
  * dont hardcode port in python server
  * add another chglog
  (cherry picked from commit a72d268)
* Fix `action_name` usage in `XLAProfiler` (#15886)
  * Fix `action_name` usage in `XLAProfiler`
  * add changelog
  * Update src/pytorch_ligh
  * Update xla.py
  Co-authored-by: awaelchli <[email protected]>
  Co-authored-by: Jirka Borovec <[email protected]>
  (cherry picked from commit c748f82)
* Fix multinode cloud component (#15965)
  * fix multinode cloud component
  * add tests
  (cherry picked from commit d21b899)
* ci: update signaling (#15981)
  * ci: update signaling
  * config
  (cherry picked from commit e56e7f1)
* Fix cloudcomputes registration for structures (#15964)
  * fix cloudcomputes
  * updates cloudcompute registration
  * changelog
  (cherry picked from commit 90a4c02)
* Document running dev lightning on the cloud (#15962)
  * document running dev lightning on the cloud
  * document running dev lightning on the cloud
  * Update .github/CONTRIBUTING.md
  Co-authored-by: Noha Alon <[email protected]>
  * document running dev lightning on the cloud
  * git clone & pip install -e
  * Update .github/CONTRIBUTING.md
  Co-authored-by: Jirka Borovec <[email protected]>
  Co-authored-by: Noha Alon <[email protected]>
  Co-authored-by: Jirka Borovec <[email protected]>
  (cherry picked from commit cfd00d3)
* [App] Install exact version whn upgrading and not when testing (#15984)
  * [App] Install exact version whn upgrading and not when testing
  * Update CHANGELOG.md
  Co-authored-by: Jirka Borovec <[email protected]>
  (cherry picked from commit 1657ea8)
* releasing 1.8.4.post0

Co-authored-by: Luca Antiga <[email protected]>
Co-authored-by: thomas chaton <[email protected]>
Co-authored-by: Akihiro Nitta <[email protected]>
Co-authored-by: Liyang90 <[email protected]>
Co-authored-by: Justus Schock <[email protected]>
Co-authored-by: Ethan Harris <[email protected]>
What does this PR do?
This PR ensures that the multinode component uses a separate cloud compute for each spawned work.
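For reviewers, the intended behaviour can be illustrated with a minimal, self-contained sketch. It does not use the `lightning_app` API: `FakeCloudCompute`, `FakeWork`, `spawn_works`, and the `uid` field are hypothetical stand-ins invented for this example, and the actual change in this PR may be implemented differently.

```python
from dataclasses import dataclass, field, replace
from typing import List
from uuid import uuid4


@dataclass
class FakeCloudCompute:
    """Hypothetical stand-in for a cloud compute config (not the lightning_app class)."""

    name: str = "default"
    # Assumption for illustration: every instance carries its own identifier.
    uid: str = field(default_factory=lambda: uuid4().hex)

    def clone(self) -> "FakeCloudCompute":
        # Copy the settings but generate a fresh identifier.
        return replace(self, uid=uuid4().hex)


@dataclass
class FakeWork:
    """Hypothetical stand-in for a work that runs on one node."""

    cloud_compute: FakeCloudCompute


def spawn_works(cloud_compute: FakeCloudCompute, num_nodes: int) -> List[FakeWork]:
    # Give each work its own clone instead of sharing the single instance passed in.
    return [FakeWork(cloud_compute=cloud_compute.clone()) for _ in range(num_nodes)]


works = spawn_works(FakeCloudCompute(name="gpu"), num_nodes=3)
uids = {w.cloud_compute.uid for w in works}
```

In this sketch every spawned work ends up with a distinct compute object and identifier while keeping the same settings, which is the property the description above asks for.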
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:
Did you have fun?
Make sure you had fun coding 🙃
cc @Borda