Feature - ArrayDataset #872

Merged (10 commits) on Aug 26, 2022

@Ce11an Ce11an commented Aug 24, 2022

What does this PR do?

Part of #839

  • pl_bolts.datamodules.sklearn_datamodule.TensorDataset

Summary

  • Replaced TensorDataset with ArrayDataset (breaking change ❗).
  • Moved ArrayDataset to datasets module.

The ArrayDataset accepts any number of arrays, each wrapped in a DataModel together with an optional callable transform. It supports the following input types:

  • Lists / Nested Lists
  • Numpy Arrays
  • Torch Tensors

This eliminates the need for the TensorDataset.
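As a rough illustration of the idea (not the code in this PR), the sketch below uses plain-Python stand-ins. The class bodies, the `process` method, and the equal-length check are assumptions made for the example; only the names ArrayDataset and DataModel come from the description above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence, Tuple


@dataclass
class DataModel:
    """Hypothetical stand-in: one array plus an optional per-sample transform."""

    data: Sequence[Any]
    transform: Optional[Callable[[Any], Any]] = None

    def process(self, sample: Any) -> Any:
        # Apply the transform only if one was supplied.
        return self.transform(sample) if self.transform is not None else sample


class ArrayDataset:
    """Hypothetical stand-in: indexes several equally long arrays in lockstep."""

    def __init__(self, *data_models: DataModel) -> None:
        lengths = {len(dm.data) for dm in data_models}
        if len(lengths) != 1:
            raise ValueError("All arrays must have the same length.")
        self.data_models = data_models

    def __len__(self) -> int:
        return len(self.data_models[0].data)

    def __getitem__(self, idx: int) -> Tuple[Any, ...]:
        # One entry per array; each transform runs on every access.
        return tuple(dm.process(dm.data[idx]) for dm in self.data_models)


# Nested lists as features (doubled by the transform), a flat list as targets.
features = DataModel(data=[[1, 2], [3, 4], [5, 6]], transform=lambda row: [2 * v for v in row])
targets = DataModel(data=[0, 1, 0])
dataset = ArrayDataset(features, targets)
sample = dataset[1]
```

Here `sample` is a tuple with one entry per array, so a features/targets pair no longer needs a dedicated TensorDataset.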

Transforms are called during each __getitem__. @otaj and I have discussed the potential inefficiencies of this. However, we have decided to stay consistent with the rest of bolts and vision. Happily open to having that discussion again, though!
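To make the trade-off concrete, here is a minimal sketch that counts transform calls for a per-access (lazy) dataset versus one that transforms everything once up front (eager). All class names here are hypothetical and exist only for this comparison.

```python
from typing import Callable, List, Sequence


class CountingSquare:
    """Hypothetical transform that records how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, x: float) -> float:
        self.calls += 1
        return x * x


class LazyDataset:
    """Applies the transform inside __getitem__, i.e. on every access."""

    def __init__(self, data: Sequence[float], transform: Callable[[float], float]) -> None:
        self.data = data
        self.transform = transform

    def __getitem__(self, idx: int) -> float:
        return self.transform(self.data[idx])


class EagerDataset:
    """Applies the transform once per element at construction time."""

    def __init__(self, data: Sequence[float], transform: Callable[[float], float]) -> None:
        self.data: List[float] = [transform(x) for x in data]

    def __getitem__(self, idx: int) -> float:
        return self.data[idx]


raw = [1.0, 2.0, 3.0]
epochs = 2

lazy_transform = CountingSquare()
lazy = LazyDataset(raw, lazy_transform)
eager_transform = CountingSquare()
eager = EagerDataset(raw, eager_transform)

# Simulate two passes over the data.
for _ in range(epochs):
    for i in range(len(raw)):
        lazy[i]
        eager[i]

lazy_calls = lazy_transform.calls
eager_calls = eager_transform.calls
```

The lazy variant repeats the work every epoch, but it is the only one of the two that can apply a fresh random augmentation on each access, which is one reason to keep the per-`__getitem__` convention.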

Also, after discussing with @otaj, we thought it would be best to separate #846.

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests? [not needed for typos/docs]
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Of course 🥳

@Ce11an Ce11an mentioned this pull request Aug 24, 2022
@Ce11an Ce11an marked this pull request as ready for review August 24, 2022 18:14
Resolved review threads (outdated):
  • pl_bolts/datasets/array_dataset.py
  • pl_bolts/datasets/base_dataset.py
  • pl_bolts/datasets/utils.py
  • pl_bolts/utils/types.py
  • pl_bolts/datasets/base_dataset.py
  • tests/datasets/test_array_dataset.py

Ce11an commented Aug 25, 2022

Hey 👋🏻

Sorry, I think someone needs to approve the workflows again. I have resolved @otaj's comments. Thanks 😃

@Ce11an Ce11an requested a review from otaj August 25, 2022 21:16
@otaj otaj mentioned this pull request Aug 26, 2022

Ce11an commented Aug 26, 2022

Thanks, @otaj!

mypy does not support recursive types. Currently, there is an experimental feature: python/mypy#731

What would you recommend?

otaj commented Aug 26, 2022

> Thanks, @otaj!
>
> mypy does not support recursive types. Currently, there is an experimental feature: python/mypy#731
>
> What would you recommend?

Yeah, sadly I know about it; mypy can be a bit of a pain sometimes. I don't know whether mypy has a problem only with the definition line or with every line where the recursive type is used. If it's the latter, we probably need to abandon the recursive type, since it doesn't make sense to ignore everything. But if it's only the former, I'm perfectly fine with just a type: ignore on that line.
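For reference, the "former case" could look roughly like the sketch below: a recursive alias with the ignore comment on its definition line only. The alias name `TNested` and the `flatten` helper are hypothetical, and whether mypy is satisfied with a single ignore depends on the mypy version and configuration.

```python
from typing import List, Sequence, Union

# Hypothetical recursive alias: a number, or a sequence of further TNested values.
# The string "TNested" is a forward reference, so this line runs fine at runtime;
# the ignore comment targets only mypy's complaint about the recursive definition.
TNested = Union[float, Sequence["TNested"]]  # type: ignore


def flatten(values: TNested) -> List[float]:
    """Flatten arbitrarily nested sequences of numbers into a flat list."""
    if isinstance(values, (int, float)):
        return [float(values)]
    flat: List[float] = []
    for item in values:
        flat.extend(flatten(item))
    return flat


flat = flatten([1.0, [2.0, [3.0, 4.0]]])
```

If usages such as the `flatten` signature also needed ignores, that would be the "latter case", where dropping the recursive alias is the cleaner option.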

Ce11an commented Aug 26, 2022

There were two issues. I had to add the float type and ignore TArrays in mypy.

Are you happy to use float to represent int, as described in PEP 484?
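For context, PEP 484 states that when an argument is annotated as float, an argument of type int is also acceptable to type checkers. A small sketch of what that means in practice (the `mean` function is a hypothetical example, not part of this PR):

```python
from typing import Sequence


def mean(values: Sequence[float]) -> float:
    """Average a sequence annotated as float; ints are accepted by type checkers too."""
    return sum(values) / len(values)


# Both calls satisfy the float annotation under the PEP 484 promotion rule.
int_mean = mean([1, 2, 3])
mixed_mean = mean([1, 2.5])

# The promotion is a typing shortcut only: at runtime an int is still not a float.
int_is_float_at_runtime = isinstance(1, float)
```

So annotating with float alone covers integer inputs for static checking, while any runtime `isinstance` checks still need to handle int explicitly.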

@otaj otaj enabled auto-merge (squash) August 26, 2022 12:28
@otaj otaj merged commit bcbbf6a into Lightning-Universe:master Aug 26, 2022
@mergify mergify bot added the ready label Aug 26, 2022
@Ce11an Ce11an deleted the feature/839-tensor-dataset branch November 27, 2022 11:57