Lightning Lite core and tests #10175
Awesome !
… Remove tpu spawn
…Lightning/pytorch-lightning into lightning-lite/lite-core
(Requesting changes to allow myself to fully review)
If someone calls run twice, does the full accelerator setup happen again? If so, is that always desirable?
🚀🚀🚀🚀
Why do we need a run() method and an abstract class? It looks completely out of place to me, and misleading. I really like the simplicity of LightningLite as a concrete class. I believe it would be great if we could remove the run method from here and make LightningLite a concrete class.
The problem is that the current POC doesn't exercise this use case. That's bad because it's not clear, and I don't think it can be done in a clean way. If we ask the user to implement run() and then ask the user to call run() directly, there's no opportunity to intercept anything in between. Does that mean we will have to add yet another wrapper around LightningLite as well? I'd be inclined to ask you folks to exercise these use cases before moving ahead with this prototype and closing on a public API, because I'm almost certain it would have to change in order to accommodate these very use cases.
Why would we have to intercept anything there? All our logic lives either before or after the user-defined run-method.
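One way to see how logic can live before and after a user-defined run() even when the user calls run() directly: the base class can rebind the method at construction time. The following is a minimal sketch under stated assumptions, not the PR's actual implementation; ToyLite, _run_wrapper and the events list are hypothetical names used only for illustration.

```python
from functools import partial


class ToyLite:
    """Hypothetical stand-in for a Lite-style base class (illustration only)."""

    def __init__(self):
        self.events = []
        # Rebind `run` on the instance so that a direct call to `run()`
        # goes through the wrapper. At this point `self.run` is still
        # the bound method defined by the subclass.
        self.run = partial(self._run_wrapper, self.run)

    def _run_wrapper(self, user_run, *args, **kwargs):
        self.events.append("setup")  # logic before the user's code
        try:
            return user_run(*args, **kwargs)
        finally:
            self.events.append("teardown")  # logic after the user's code

    def run(self, *args, **kwargs):
        raise NotImplementedError


class MyLite(ToyLite):
    def run(self, x):
        self.events.append("user code")
        return x * 2


lite = MyLite()
result = lite.run(21)
```

In this sketch the wrapper runs on every call, so calling run() a second time would repeat the "setup" step as well.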
Definitely not. This is meant as an entry point to Lightning for people not willing to fully convert. It will provide some features like hardware acceleration, but it won't at any point offer all the features Lightning does. That was never the plan.
@aazzolini The reason we have a run method under a class is to spare the user from manually changing their code when they switch from (for example) Lite(gpus=8) to Lite(tpu_cores=8). What would happen if we didn't have a run() method? Let's see:

    # python script
    # training on GPUs
    def main():
        lite = LightningLite(gpus=2)  # no run method
        lite.setup()
        # train loop

    if __name__ == "__main__":
        main()

Everything works here. It runs as a script or even in a Jupyter notebook. The user wants to try whether TPUs perform better for their use case. They think all they have to do is change the gpus argument to tpu_cores.
Will that work? No. The TPU code here would require a function to spawn processes from. This is annoying, because we would need to rewrite the code into this:

    def main():
        lite = LightningLite(tpu_cores=?)  # no run method
        lite.setup()
        # train loop

    if __name__ == "__main__":
        xmp.spawn(main, ..., nprocs=8, start_method="fork")

Lightning does all that for you, and can do so because the user's code is not just "anywhere" but in the LightningModule. We wanted the same convenience with the run() method in Lite. In the above case, Lite performs this spawn under the hood.
By forcing the user to put their code into the run method, Lite can consistently choose the right operation based on the user's accelerator choices. How much effort is it to convert the above code to a class + run method?

    class Lite(LightningLite):
        # rename main to run and indent your code. Done!
        def run(self):
            ...

    if __name__ == "__main__":
        # main()
        lite = Lite(gpus=...)
        lite.run()

Plus (big plus imo): no launcher utilities. The user just runs python train.py as they normally would, or runs their cell in the notebook as usual. Beyond that, there are benefits to converting to Lightning from here, as there is already a class + method that mimics what the LightningModule would look like.
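The idea of choosing the launch operation from the accelerator arguments can be sketched without any accelerator libraries. This is a toy illustration under stated assumptions: ToyLite, launch_inline and launch_spawned are hypothetical names, and the "spawned" launcher simply calls the function once per rank in the same process, where a real TPU backend would start separate processes.

```python
from functools import partial


def launch_inline(fn):
    """Run the user's function once in the current process."""
    return [fn()]


def launch_spawned(fn, nprocs):
    """Stand-in for a process spawner: call fn once per rank, sequentially."""
    return [fn() for _ in range(nprocs)]


class ToyLite:
    """Hypothetical base class that picks a launcher from its arguments."""

    def __init__(self, gpus=0, tpu_cores=0):
        self.gpus = gpus
        self.tpu_cores = tpu_cores
        # Route direct calls to `run()` through the launcher selection.
        self.run = partial(self._launch, self.run)

    def _launch(self, user_run):
        if self.tpu_cores > 0:
            return launch_spawned(user_run, nprocs=self.tpu_cores)
        return launch_inline(user_run)

    def run(self):
        raise NotImplementedError


class Lite(ToyLite):
    def run(self):
        return "trained"


# The user's script is identical in both cases; only the arguments differ.
gpu_results = Lite(gpus=2).run()
tpu_results = Lite(tpu_cores=8).run()
```

The user's run() body does not change between the two calls, which is the convenience the comment above argues for.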
What does this PR do?
Part of #9987
This is the V1 for the new Lightning Lite package. This PR adds the main Lite components and tests. Planned to be released as part of 1.5.
For the docs, see the follow up #10176.
Demo
Before submitting
PR review
Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines.
Did you have fun?
I made sure I had fun coding 🙃
Part of #1 (it's a lie, this is just here to avoid noisy GitHub bot)