Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(wandb): log models as artifacts #6231

Merged
merged 60 commits into from
May 27, 2021

Conversation

borisdayma
Copy link
Contributor

@borisdayma borisdayma commented Feb 26, 2021

What does this PR do?

Fixes #4903

should be merged after #5931

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

@pep8speaks
Copy link

pep8speaks commented Feb 26, 2021

Hello @borisdayma! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-05-27 09:40:00 UTC

@codecov
Copy link

codecov bot commented Feb 26, 2021

Codecov Report

Merging #6231 (e0f302f) into master (04dcb17) will decrease coverage by 5%.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master   #6231    +/-   ##
=======================================
- Coverage      92%     88%    -5%     
=======================================
  Files         199     199            
  Lines       12962   12995    +33     
=======================================
- Hits        11967   11409   -558     
- Misses        995    1586   +591     

@borisdayma
Copy link
Contributor Author

I started from my other ongoing PR so you can see the changes here

@awaelchli awaelchli added logger Related to the Loggers feature Is an improvement or enhancement labels Feb 27, 2021
@awaelchli awaelchli added this to the 1.3 milestone Feb 27, 2021
@borisdayma
Copy link
Contributor Author

The history looks a bit messed up so I'll try to fix it or force push from a clean master state (since there's only one real commit so far).

@borisdayma
Copy link
Contributor Author

Here are examples:

The idea is to avoid putting too much metadata that would hide more meaningful data.

@borisdayma
Copy link
Contributor Author

borisdayma commented Mar 4, 2021

Here is current behavior:

  • default: log only last model as artifact → example artifact
  • custom ModelCheckpoint: according to options (looks through last_model_path, best_model_path and best_k_models) → example artifact

We can see in the 2nd example that 3 model were logged:

  • versions are ordered by timestamp
  • latest and best aliases are automatically set
  • file within artifact is called model.ckpt

This would let users access the best model from a specific run with a command such as:
artifact = run.use_artifact('USER/PROJECT/RUN_NAME:best', type='model')

Ideally in a future PR, it would be nice to have an option log_model='all' (in particular for long training that may crash) which probably means allowing loggers to have functionalities similar to other callbacks.

Copy link
Member

@justusschock justusschock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it. Just some more comments :)

Will the artifacts be uploaded and updated continuously? E.g. will I at every point of the training have a best model artifact at wandb and if I get a new one it will replace the old one?

pytorch_lightning/loggers/base.py Outdated Show resolved Hide resolved
pytorch_lightning/loggers/base.py Outdated Show resolved Hide resolved
pytorch_lightning/loggers/wandb.py Outdated Show resolved Hide resolved
pytorch_lightning/loggers/wandb.py Outdated Show resolved Hide resolved
Copy link
Contributor

@tchaton tchaton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM !

self.logger.experiment.log({
"generated_images": [wandb.Image(some_img, caption="...")]
self.log({
"generated_images": [wandb.Image(some_img, caption="...")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.log_images could be the API used to log every image related artefacts.

@borisdayma
Copy link
Contributor Author

Hi guys, just checking if anything else is needed from me.
I know quite a few people already who are interested in accessing this feature.

@edenlightning
Copy link
Contributor

Apologize for the late reply Boris!
We want to spend some time thinking about how to make this a robust API that would work well for all supported loggers. We have 1.3 release coming soon which is too late to add this in, so we will prioritize this right after the release.

Thanks again for your contribution, and sorry we haven't been able to address quicker.

@edenlightning edenlightning modified the milestones: 1.3, 1.4 Apr 16, 2021
@borisdayma
Copy link
Contributor Author

borisdayma commented Apr 28, 2021

Hi, I'm just checking if there's any update or if I can do anything from my side.
It would be great to implement this feature as it will close a lot of current open issues (and make the integration so much better!)

@borisdayma
Copy link
Contributor Author

borisdayma commented May 11, 2021

Hey guys, just wondering if you would be ok using my current version?

Right now I just need a callback for after_save_checkpoint (which is those 2 lines + associated tests) so whenever you have a more robust one, it will be easy to switch it out.

@Borda
Copy link
Member

Borda commented May 11, 2021

@awaelchli how is it going here, I saw you had most comments so mind giving your decision? 🐰
@borisdayma mind resolve conflicts?

@borisdayma borisdayma requested a review from kaushikb11 as a code owner May 14, 2021 00:24
@mergify mergify bot removed the has conflicts label May 14, 2021
@borisdayma
Copy link
Contributor Author

Ok I resolved conflicts

@borisdayma
Copy link
Contributor Author

Hi, just checking if i need to do anything else on my side?

@borisdayma
Copy link
Contributor Author

Hey guys, just checking on the progress here.
There's a few issues linked to this PR so I'd love to know what is still left to do.

@mergify mergify bot removed the has conflicts label May 26, 2021
@awaelchli awaelchli merged commit 9097347 into Lightning-AI:master May 27, 2021
@borisdayma
Copy link
Contributor Author

Thanks guys!!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design Includes a design discussion feature Is an improvement or enhancement logger Related to the Loggers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WandbLogger does not mark uploaded model as 'artifact'
8 participants