This repository has been archived by the owner on Oct 9, 2023. It is now read-only.

Adds a template task and docs #306

Merged
merged 60 commits into from
May 19, 2021

Conversation

@ethanwharris (Collaborator) commented May 17, 2021

What does this PR do?

Fixes # (issue)

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests? [not needed for typos/docs]
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@pep8speaks commented May 17, 2021

Hello @ethanwharris! Thanks for updating this PR.

Line 228:13: W503 line break before binary operator

Comment last updated at 2021-05-19 18:39:05 UTC

@codecov (bot) commented May 17, 2021

Codecov Report

Merging #306 (c21e816) into master (65c658b) will increase coverage by 0.10%.
The diff coverage is 90.82%.

@@            Coverage Diff             @@
##           master     #306      +/-   ##
==========================================
+ Coverage   86.91%   87.02%   +0.10%     
==========================================
  Files          78       83       +5     
  Lines        4021     4130     +109     
==========================================
+ Hits         3495     3594      +99     
- Misses        526      536      +10     
Flag       Coverage Δ
unittests  87.02% <90.82%> (+0.10%) ⬆️

Impacted Files                                Coverage Δ
flash/template/classification/data.py         86.88% <86.88%> (ø)
flash/template/classification/backbones.py    88.88% <88.88%> (ø)
flash/template/classification/model.py        97.22% <97.22%> (ø)
flash/template/__init__.py                    100.00% <100.00%> (ø)
flash/template/classification/__init__.py     100.00% <100.00%> (ø)
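
As a sanity check on the report, the headline percentages can be reproduced from the hit and line counts in the coverage diff. The sketch below is not Codecov's implementation. It assumes that percentages are truncated (not rounded) to two decimals, which is inferred only from the fact that the figures above match under truncation, and it approximates the diff coverage as new hits over new lines, which happens to agree here but need not in general. The helper name `pct_truncated` is made up for illustration.

```python
def pct_truncated(numerator: int, denominator: int) -> str:
    """Return numerator/denominator as a percentage truncated to two decimals."""
    # Integer arithmetic avoids floating-point rounding surprises.
    hundredths = numerator * 10_000 // denominator
    return f"{hundredths // 100}.{hundredths % 100:02d}"


# Counts taken from the coverage diff above.
base_hits, base_lines = 3495, 4021  # master (65c658b)
head_hits, head_lines = 3594, 4130  # PR head (c21e816)

base = pct_truncated(base_hits, base_lines)
head = pct_truncated(head_hits, head_lines)

# Assumption: diff coverage approximated as (new hits) / (new lines).
diff = pct_truncated(head_hits - base_hits, head_lines - base_lines)

# Coverage change computed from the exact fractions, then truncated.
delta = pct_truncated(
    head_hits * base_lines - base_hits * head_lines,
    head_lines * base_lines,
)

print(base, head, diff, delta)  # 86.91 87.02 90.82 0.10
```

Note that subtracting the two truncated percentages would give 0.11, so the reported +0.10% is only reproduced when the change is computed from the exact fractions.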

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 65c658b...c21e816.

@ethanwharris ethanwharris marked this pull request as ready for review May 17, 2021 18:10
docs/source/template/data.rst (outdated, resolved)
docs/source/template/examples.rst (outdated, resolved)
In this section, we briefly describe the data, and then ``literalinclude`` our finetuning example.

Now we'll train on Fisher's classic iris data.
It contains 150 records with four features (sepal length, sepal width, petal length, and petal width) in three classes (species of Iris: setosa, virginica and versicolor).

Contributor

Include link to images to make your description better.

Collaborator Author

This is just tabular data, so I'm not sure what images we would show here

docs/source/template/data.rst (outdated, resolved)
@ethanwharris (Collaborator, Author)

@edenlightning thanks for the review! - summary of main changes made:

  • Stopped using auto-docs in the tutorials, now just has code snippets where needed
  • Made backbones no longer optional, just its own page now with a full example for the template task
  • Added links to the files on GitHub (these will only work once it's merged)

:dedent: 4
:pyobject: TemplateSKLearnDataSource.predict_load_data

DataSource vs Dataset

Contributor

Let me rephrase it to see if I understand it correctly:
A DataSource serves a similar purpose to a Dataset, except that it includes preprocessing methods, generates a Dataset when we call load_data, and can generate (possibly different) Datasets for training, validation, etc.

Contributor

It may also be useful to explain how it differs from torch.utils.data.DataLoader, since a Dataset only requires __getitem__, whereas a DataLoader also does some preprocessing, although I think it does not distinguish between training, validation, etc.
Is it also similar to https://docs.fast.ai/data.load.html?
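
To make the Dataset versus DataLoader split in this question concrete, here is a minimal plain-Python sketch. `ToyDataset` and `ToyLoader` are hypothetical stand-ins written for this note, not the torch classes: the dataset only answers "give me item i" and "how many items are there", while the loader decides how items are grouped into batches. A real `torch.utils.data.DataLoader` additionally handles shuffling, collation, and worker processes.

```python
from typing import Iterator, List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (features, label)


class ToyDataset:
    """Map-style dataset stand-in: it only supports indexing and len()."""

    def __init__(self, samples: Sequence[Sample]) -> None:
        self._samples = list(samples)

    def __getitem__(self, index: int) -> Sample:
        return self._samples[index]

    def __len__(self) -> int:
        return len(self._samples)


class ToyLoader:
    """Loader stand-in: walks a dataset in order and groups samples into batches."""

    def __init__(self, dataset: ToyDataset, batch_size: int) -> None:
        self.dataset = dataset
        self.batch_size = batch_size

    def __iter__(self) -> Iterator[List[Sample]]:
        for start in range(0, len(self.dataset), self.batch_size):
            stop = min(start + self.batch_size, len(self.dataset))
            yield [self.dataset[i] for i in range(start, stop)]


# Five made-up samples with two features each.
dataset = ToyDataset([([float(i), float(i) + 0.5], i % 3) for i in range(5)])
batch_sizes = [len(batch) for batch in ToyLoader(dataset, batch_size=2)]
print(batch_sizes)  # [2, 2, 1]
```

Neither class knows whether it is being used for training or validation; in this sketch that distinction would have to live one level up.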

Collaborator Author

The high-level view is this:

  • DataSource is used to generate multiple datasets (e.g. train, test, val, predict)
  • The preprocessing methods are stored in Preprocess
  • When the dataloader is created, the preprocess transforms are injected into the workers and the model so that they are all called in the right place

So DataSource, Preprocess, and DataPipeline are really just a different way of creating a Dataset and DataLoader (not a replacement). I can't speak to the similarity with fastai as I'm not very familiar with it. Hope that helps!
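
The division of labour described in the list above can be sketched in plain Python. Everything here is hypothetical and written only for illustration: `ToyDataSource`, `ToyPreprocess`, and `make_loader` are not the Flash classes and do not reproduce their signatures. The sketch only mirrors the three points made above: one source produces a dataset per stage, the transforms live in a separate object, and the transforms are attached when the loader is built.

```python
from typing import Any, Callable, Dict, Iterator, List, Sequence, Tuple

Record = Tuple[List[float], int]  # raw (features, label) pair
Sample = Dict[str, Any]
Transform = Callable[[Sample], Sample]


class ToyDataSource:
    """Hypothetical stand-in: turns raw records into one dataset per stage."""

    def load_data(self, records: Sequence[Record], stage: str) -> List[Sample]:
        if stage == "predict":
            # Prediction data carries no targets.
            return [{"input": features} for features, _ in records]
        return [{"input": features, "target": label} for features, label in records]

    def generate_datasets(self, **records_per_stage: Sequence[Record]) -> Dict[str, List[Sample]]:
        return {
            stage: self.load_data(records, stage)
            for stage, records in records_per_stage.items()
        }


class ToyPreprocess:
    """Hypothetical stand-in: stores the per-stage transforms."""

    def __init__(self, transforms: Dict[str, Transform]) -> None:
        self.transforms = transforms

    def transform_for(self, stage: str) -> Transform:
        # Stages without a registered transform pass samples through unchanged.
        return self.transforms.get(stage, lambda sample: sample)


def make_loader(
    dataset: List[Sample], preprocess: ToyPreprocess, stage: str, batch_size: int
) -> Iterator[List[Sample]]:
    """Attach the stage's transform at loader-creation time, then batch."""
    transform = preprocess.transform_for(stage)
    for start in range(0, len(dataset), batch_size):
        yield [transform(sample) for sample in dataset[start:start + batch_size]]


def mark_augmented(sample: Sample) -> Sample:
    return {**sample, "augmented": True}


records = [([5.1, 3.5], 0), ([7.0, 3.2], 1), ([6.3, 3.3], 2)]  # made-up values
datasets = ToyDataSource().generate_datasets(
    train=records[:2], val=records[2:], predict=records
)
preprocess = ToyPreprocess({"train": mark_augmented})

train_batches = list(make_loader(datasets["train"], preprocess, "train", batch_size=2))
val_batches = list(make_loader(datasets["val"], preprocess, "val", batch_size=2))
```

Keeping the transforms out of the datasets is what allows the same preprocessing object to be reused in another place (the comment above mentions the model as well as the workers), although this sketch only shows the loader side.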

If the library that your :class:`~flash.core.data.model.Task` is based on provides a custom dataset, you don't need to re-write it as a :class:`~flash.core.data.data_source.DataSource`.
For example, the :meth:`~flash.core.data.data_source.DataSource.load_data` of the ``VideoClassificationPathsDataSource`` just creates an :class:`~pytorchvideo.data.EncodedVideoDataset` from the given folder.
Here's how it looks (from `video/classification/data.py <https://github.com/PyTorchLightning/lightning-flash/blob/master/flash/video/classification/data.py>`_):

Contributor

Perhaps we could give a simpler example, for something like
https://archive.ics.uci.edu/ml/datasets/iris?
I find the above example to have more code than needed.

@edenlightning (Contributor) left a comment

Thanks for all the changes! I think it looks great :)
Just a couple more nits

docs/source/template/backbones.rst (outdated, resolved)
docs/source/template/optional.rst (outdated, resolved)
docs/source/template/optional.rst (outdated, resolved)
docs/source/template/optional.rst (outdated, resolved)
docs/source/template/task.rst (outdated, resolved)
docs/source/template/task.rst (outdated, resolved)
docs/source/template/task.rst (resolved)
docs/source/template/task.rst (resolved)
docs/source/template/task.rst (outdated, resolved)
docs/source/template/task.rst (outdated, resolved)
@tchaton tchaton merged commit cb7f906 into master May 19, 2021
@tchaton tchaton deleted the feature/template branch May 19, 2021 18:52
Labels
documentation Improvements or additions to documentation
6 participants