[Docs revamp 2/N] New doc for managing data #8034

Merged
merged 87 commits into from
Jul 22, 2021
Changes from 74 commits
Commits
0e4f237
amp
edenlightning May 11, 2021
a871f88
amp
edenlightning May 11, 2021
2ef419a
docs
edenlightning May 20, 2021
b9f6d8a
add guides
edenlightning May 20, 2021
6576e0a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 20, 2021
2275fbd
amp
edenlightning May 11, 2021
7f22a03
amp
edenlightning May 11, 2021
f69fe0b
docs
edenlightning May 20, 2021
8ad3dee
add guides
edenlightning May 20, 2021
6421312
speed guides
edenlightning May 23, 2021
4b8f800
speed guides
edenlightning May 23, 2021
f9c2f78
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 23, 2021
fe538ca
Delete ds.txt
edenlightning May 23, 2021
8c816d1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 23, 2021
9ddb797
Update conf.py
edenlightning May 23, 2021
6e6de80
Update docs.txt
edenlightning May 23, 2021
038994e
remove 16 bit
edenlightning May 23, 2021
1ecd72a
remove 16 bit
edenlightning May 23, 2021
65c2e51
Merge branch 'docs/new-speed' of https://github.com/PyTorchLightning/…
edenlightning May 23, 2021
beb9ff5
remove finetune from speed guide
edenlightning May 27, 2021
649bc1b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] May 23, 2021
1d423d9
speed
edenlightning Jun 7, 2021
41f16bf
speed
edenlightning Jun 7, 2021
84c87d4
Merge branch 'master' of https://github.com/PyTorchLightning/pytorch-…
edenlightning Jun 7, 2021
a6480d1
speed
edenlightning Jun 7, 2021
4a66166
speed
edenlightning Jun 7, 2021
6acdde7
speed
edenlightning Jun 7, 2021
d4a8151
speed
edenlightning Jun 7, 2021
6acbe13
speed
edenlightning Jun 7, 2021
c44ffb7
speed
edenlightning Jun 9, 2021
71adc47
speed
edenlightning Jun 9, 2021
46530f5
speed
edenlightning Jun 10, 2021
3f65a18
speed
edenlightning Jun 10, 2021
5c078d9
speed
edenlightning Jun 10, 2021
a45d0e6
remove early stopping from speed guide
edenlightning Jun 10, 2021
bb023e7
remove early stopping from speed guide
edenlightning Jun 10, 2021
6d7dfc1
remove early stopping from speed guide
edenlightning Jun 10, 2021
4df8873
fix label
edenlightning Jun 10, 2021
72ea29c
fix sync
edenlightning Jun 10, 2021
2eb26f2
reviews
edenlightning Jun 14, 2021
5bcc732
Update trainer.rst
edenlightning Jun 16, 2021
7b43c51
Update trainer.rst
edenlightning Jun 16, 2021
bd7d3c9
Update speed.rst
edenlightning Jun 16, 2021
f649bf1
Apply suggestions from code review
edenlightning Jun 16, 2021
0457a5a
managing data
edenlightning Jun 18, 2021
cbd396f
managing data
edenlightning Jun 18, 2021
dae8d92
amp
edenlightning May 11, 2021
a623911
amp
edenlightning May 11, 2021
5cdf884
docs
edenlightning May 20, 2021
10192e3
sync
edenlightning Jun 23, 2021
b96492b
sync
edenlightning Jun 23, 2021
3610eea
amp
edenlightning May 11, 2021
9101939
amp
edenlightning May 11, 2021
1e46d1e
add data guide
edenlightning May 26, 2021
707da9f
from review
edenlightning Jun 23, 2021
7601280
Apply suggestions from code review
edenlightning Jun 23, 2021
3cef25b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 23, 2021
bd66088
Apply suggestions from code review
edenlightning Jun 25, 2021
eb408a1
from review
edenlightning Jun 25, 2021
af0465c
from review
edenlightning Jun 25, 2021
87f2206
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 25, 2021
42bc6d0
add data guide
edenlightning Jun 25, 2021
1a8daa3
add data guide
edenlightning Jun 25, 2021
825e0f5
add data guide
edenlightning Jun 25, 2021
6af131e
Merge branch 'docs/new-data' of https://github.com/edenlightning/pyto…
edenlightning Jun 25, 2021
1f9b017
add data guide
edenlightning Jun 25, 2021
ff1c338
sync issues
edenlightning Jun 25, 2021
47f3243
from reviw
edenlightning Jun 25, 2021
2c0172f
Update docs/source/guides/data.rst
edenlightning Jul 8, 2021
ca7215a
Merge branch 'master' into docs/new-data
awaelchli Jul 8, 2021
298d45c
add info if import fails
awaelchli Jul 8, 2021
28fcb32
fix cross referencing
awaelchli Jul 8, 2021
577bb6f
Add Datamodule motivation
edenlightning Jul 9, 2021
87804a8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jul 9, 2021
5747f0f
resolve comments
tchaton Jul 21, 2021
12d6a85
improve description
tchaton Jul 21, 2021
5b99426
apply comments
tchaton Jul 21, 2021
f679f80
Update docs/source/guides/data.rst
ethanwharris Jul 21, 2021
bf7d16e
update
tchaton Jul 21, 2021
055e718
Merge commit 'refs/pull/8034/head' of https://github.com/PyTorchLight…
tchaton Jul 21, 2021
25bd7ca
Update docs/source/guides/data.rst
ethanwharris Jul 21, 2021
a92ca56
Apply suggestions from code review
ethanwharris Jul 21, 2021
44506b2
Update docs/source/guides/data.rst
ethanwharris Jul 21, 2021
d809746
Apply suggestions from code review
ethanwharris Jul 21, 2021
decd667
Fix doctest
ethanwharris Jul 21, 2021
a3c3581
_notebooks
awaelchli Jul 22, 2021
8aa2880
reset _notebooks
awaelchli Jul 22, 2021
162 changes: 0 additions & 162 deletions docs/source/advanced/multiple_loaders.rst

This file was deleted.

76 changes: 0 additions & 76 deletions docs/source/advanced/sequences.rst
@@ -1,37 +1,6 @@
.. testsetup:: *

    from torch.utils.data import IterableDataset
    from pytorch_lightning.trainer.trainer import Trainer

.. _sequences:

Sequential Data
================
Lightning has built-in support for dealing with sequential data.

----------

Packed sequences as inputs
--------------------------
When using PackedSequence, do 2 things:

1. Return either a padded tensor in dataset or a list of variable length tensors in the dataloader collate_fn (example shows the list implementation).
2. Pack the sequence in forward or training and validation steps depending on use case.

.. testcode::

    from torch.nn.utils import rnn

    # For use in dataloader
    def collate_fn(batch):
        x = [item[0] for item in batch]
        y = [item[1] for item in batch]
        return x, y

    # In module
    def training_step(self, batch, batch_nb):
        x = rnn.pack_sequence(batch[0], enforce_sorted=False)
        y = rnn.pack_sequence(batch[1], enforce_sorted=False)
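
As a fuller illustration of the two steps in the removed section, here is a minimal standalone sketch outside of a LightningModule. The toy tensors, the ``LSTM`` sizes, and the variable names are assumptions made for the example and do not come from the docs. It shows a list-returning ``collate_fn``, packing with ``enforce_sorted=False``, and unpacking the output again.

```python
import torch
from torch.nn.utils import rnn


def collate_fn(batch):
    # Keep variable-length sequences as plain lists; packing happens later.
    x = [item[0] for item in batch]
    y = [item[1] for item in batch]
    return x, y


# Toy (input, target) pairs with sequence lengths 3, 1 and 2 (illustrative only).
samples = [(torch.ones(n, 4), torch.zeros(n)) for n in (3, 1, 2)]
x_list, y_list = collate_fn(samples)

# enforce_sorted=False lets PyTorch sort by length internally and remember the order.
packed_x = rnn.pack_sequence(x_list, enforce_sorted=False)

# Recurrent layers accept a PackedSequence directly.
lstm = torch.nn.LSTM(input_size=4, hidden_size=8)
packed_out, _ = lstm(packed_x)

# Unpack to a padded tensor; lengths come back in the original batch order.
padded_out, lengths = rnn.pad_packed_sequence(packed_out, batch_first=True)
```

Packing inside the step rather than in the dataset keeps the dataloader output as ordinary tensors, which is one reason the list-based ``collate_fn`` is convenient.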

----------

Truncated Backpropagation Through Time
--------------------------------------
@@ -64,48 +33,3 @@ Lightning can handle TBTT automatically via this flag.

.. note:: If you need to modify how the batch is split,
    override :meth:`pytorch_lightning.core.LightningModule.tbptt_split_batch`.

----------

Iterable Datasets
-----------------
Lightning supports using IterableDatasets as well as map-style Datasets. IterableDatasets provide a more natural
option when using sequential data.

.. note:: When using an IterableDataset, you must set ``val_check_interval`` to 1.0 (the default) or to an int
    (the number of training batches to run before validation) when initializing the Trainer. An
    IterableDataset has no ``__len__``, and Lightning needs the length to compute the validation
    interval when ``val_check_interval`` is less than one. Similarly, you can set ``limit_{mode}_batches`` to a float or
    an int, where ``{mode}`` is train, val, or test. A value of 0.0 or 0 sets ``num_{mode}_batches`` to 0, an int sets
    ``num_{mode}_batches`` to ``limit_{mode}_batches``, and 1.0 runs the whole dataset. Any other value raises an exception.

.. testcode::

    from torch.utils.data import DataLoader, IterableDataset

    # IterableDataset
    class CustomDataset(IterableDataset):

        def __init__(self, data):
            # Store the underlying iterable so __iter__ can read from it
            self.data_source = data

        def __iter__(self):
            return iter(self.data_source)

    # Setup DataLoader
    def train_dataloader(self):
        seq_data = ['A', 'long', 'time', 'ago', 'in', 'a', 'galaxy', 'far', 'far', 'away']
        iterable_dataset = CustomDataset(seq_data)

        dataloader = DataLoader(dataset=iterable_dataset, batch_size=5)
        return dataloader

.. testcode::

    # Set val_check_interval
    trainer = Trainer(val_check_interval=100)

    # Set limit_val_batches to 0.0 or 0
    trainer = Trainer(limit_val_batches=0.0)

    # Set limit_val_batches as an int
    trainer = Trainer(limit_val_batches=100)
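
The ``limit_{mode}_batches`` rule from the removed note can be sketched in plain Python for a dataset that has no ``__len__``. The helper name ``resolve_num_batches`` is hypothetical, and returning ``float("inf")`` for "the whole dataset" is an illustrative choice. This is not Lightning's implementation, only a restatement of the rule as the note describes it.

```python
from typing import Union


def resolve_num_batches(limit: Union[int, float]) -> Union[int, float]:
    """Hypothetical helper: batch count for a dataset without __len__."""
    # 0 or 0.0 disables the loop entirely.
    if limit == 0:
        return 0
    # An int is taken as an absolute number of batches.
    if isinstance(limit, int):
        return limit
    # 1.0 means "run the whole dataset"; the length is unknown, so use infinity here.
    if limit == 1.0:
        return float("inf")
    # Any other float is a fraction of an unknown length and cannot be resolved.
    raise ValueError(
        f"limit={limit} needs a dataset length; use 0.0, 1.0 or an int instead"
    )


print(resolve_num_batches(100))
```

A fractional value such as ``0.5`` raises here for the same reason the note gives for ``val_check_interval``: without a length there is nothing to take a fraction of.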
2 changes: 2 additions & 0 deletions docs/source/conf.py
@@ -28,6 +28,8 @@
sys.path.insert(0, os.path.abspath(PATH_ROOT))
sys.path.append(os.path.join(PATH_RAW_NB, '.actions'))

# if you run into an import error here, try to execute this command:
# git submodule update --init --recursive
from helpers import HelperCLI # noqa: E401 E402

FOLDER_GENERATED = 'generated'