GCP Batch Integration: launch jobs directly on GCP Batch #621
@priyaramani has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Codecov Report
@@            Coverage Diff             @@
##             main     #621      +/-   ##
==========================================
- Coverage   94.60%   94.50%   -0.10%
==========================================
  Files          69       71       +2
  Lines        4908     5024     +116
==========================================
+ Hits         4643     4748     +105
- Misses        265      276      +11
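The coverage figures in the report can be re-derived from the Hits and Lines rows. The sketch below is only an arithmetic check: the function name is hypothetical, and the truncation to two decimals is an assumption inferred from the numbers (rounding would give 94.51% for #621), not a statement about how Codecov computes its report.

```python
def coverage_hundredths(hits: int, lines: int) -> int:
    """Line coverage in hundredths of a percent, truncated (assumed, not rounded)."""
    return hits * 10_000 // lines


# Values copied from the report: (lines, hits, misses) for main and for #621.
main_lines, main_hits, main_misses = 4908, 4643, 265
pr_lines, pr_hits, pr_misses = 5024, 4748, 276

# Sanity check: every line is either hit or missed.
assert main_hits + main_misses == main_lines
assert pr_hits + pr_misses == pr_lines

main_cov = coverage_hundredths(main_hits, main_lines)
pr_cov = coverage_hundredths(pr_hits, pr_lines)
delta = pr_cov - main_cov

print(f"main: {main_cov / 100:.2f}%")
print(f"#621: {pr_cov / 100:.2f}%")
print(f"diff: {delta / 100:+.2f}%")
```

Working in integer hundredths avoids floating-point edge cases when comparing the two percentages.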
This pull request was exported from Phabricator. Differential Revision: D40486955
Summary:
Support directly scheduling jobs on GCP Batch.

- Native support for launching PyTorch jobs on GCP: Currently, you can use TorchX to launch training jobs on Kubernetes on GCP, which requires setting up Kubernetes clusters and related infrastructure, or you can use GCP managed services such as Vertex AI. With this integration, the overhead of setting up other services goes away, and customers can launch their training jobs on GCP schedulers directly from TorchX.
- Cloud-agnostic interface: Beyond serving current PyTorch customers on GCP, this gives customers on one cloud provider the flexibility to explore others, because it makes it easier to migrate their PyTorch jobs from one platform to another.

Pull Request resolved: #621

Test Plan: Unit tests

![Screen Shot 2022-10-18 at 12 30 38 PM](https://user-images.githubusercontent.com/87679608/196532219-8da3df5c-3053-4800-9cc3-8b2f4c52acea.png)

Differential Revision: D40486955

Pulled By: priyaramani

fbshipit-source-id: 82fe7d50668871a31805116eb77b2318bb823abf
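To illustrate the cloud-agnostic idea in the summary, here is a minimal, self-contained sketch of one job description being submitted through interchangeable scheduler backends. Everything in it is hypothetical: `JobSpec`, `Scheduler`, the two fake backends, the handle strings, and the `my-project` / `us-central1` values are stand-ins chosen for illustration. It is not TorchX's API and makes no calls to GCP or Kubernetes.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class JobSpec:
    """Hypothetical scheduler-independent description of a training job."""
    name: str
    image: str
    command: List[str]
    num_replicas: int = 1


class Scheduler(Protocol):
    """Hypothetical interface: every backend accepts the same JobSpec."""
    def submit(self, job: JobSpec) -> str: ...


class FakeKubernetesScheduler:
    """Stand-in backend; returns a made-up handle instead of calling Kubernetes."""
    def submit(self, job: JobSpec) -> str:
        return f"kubernetes://default/{job.name}"


class FakeGcpBatchScheduler:
    """Stand-in backend; returns a made-up handle instead of calling GCP Batch."""
    def __init__(self, project: str, location: str) -> None:
        self.project = project
        self.location = location

    def submit(self, job: JobSpec) -> str:
        return f"gcp_batch://{self.project}/{self.location}/{job.name}"


# Registry keyed by scheduler name; only this lookup key changes per platform.
SCHEDULERS: Dict[str, Scheduler] = {
    "kubernetes": FakeKubernetesScheduler(),
    "gcp_batch": FakeGcpBatchScheduler(project="my-project", location="us-central1"),
}


def launch(scheduler_name: str, job: JobSpec) -> str:
    """Submit the same job spec to whichever backend is named."""
    return SCHEDULERS[scheduler_name].submit(job)


job = JobSpec(name="trainer", image="example/train:latest",
              command=["python", "train.py"], num_replicas=2)
for backend in ("kubernetes", "gcp_batch"):
    print(launch(backend, job))
```

The point of the design is that the job definition stays fixed while the backend name varies, which is what makes moving a job between platforms cheap.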
Force-pushed from 78e1ab7 to 947e614.
Force-pushed from 947e614 to 1830ea0.
Force-pushed from 1830ea0 to 1aa6bf3.
Force-pushed from 1aa6bf3 to 758f421.
Force-pushed from 758f421 to 4fcb495.
Force-pushed from 4fcb495 to abfbaf7.
Force-pushed from abfbaf7 to 5bde6e5.
Reviewed By: d4l3k
Differential Revision: D40486955
Pulled By: priyaramani
fbshipit-source-id: 742222936a97767891a03eae9ccd7c488665da70
Force-pushed from 5bde6e5 to 1c4fedb.
Support directly scheduling jobs on GCP Batch
Addresses #410
Test plan:
Unit tests