Commit

Merge branch 'awslabs:dev' into split-example
npnv authored and Chen committed Aug 5, 2022
2 parents 117db36 + fb6293e commit 2cddc85
Showing 8 changed files with 12 additions and 7 deletions.
3 changes: 3 additions & 0 deletions docs/md2ipynb.py
Original file line number Diff line number Diff line change
@@ -36,6 +36,9 @@ def check_github_event(default):


def run_notebook(text, kernel_name, timeout) -> str:
    # We add two blank lines at the end to ensure
    # that the final cell also runs.
    text += "\n" * 2
    notebook = notedown.MarkdownReader().reads(text)

    kwargs = {}
Original file line number Diff line number Diff line change
@@ -234,4 +234,4 @@ trainer = Trainer(epochs=5, callbacks=[es_callback])
estimator.trainer = trainer

pred = estimator.train(dataset.train)
```
```
Original file line number Diff line number Diff line change
@@ -8,12 +8,13 @@ from gluonts.dataset.split.splitter import split
```

This needs to be given:

- the `dataset` that we want to split;
- an `offset` or a `date`, but not both simultaneously. These two arguments are provided for the function to know how to slice training and test data, based on a fixed integer offset or a ``pandas.Period``, respectively.

As a result, the `split` function returns the split dataset, consisting of the training data `training_dataset` and a "test template" that knows how to generate input/output test pairs.

## Data loading and processing
## Loading a dataset


```python
@@ -24,8 +25,6 @@ plt.rcParams["axes.grid"] = True
plt.rcParams["figure.figsize"] = (20,3)
```

### Get some datasets

For our examples, we will use data from the following `csv` file, which is originally sampled every 5 minutes, but we resample at hourly frequency. Note that this makes for a dataset consisting of a single time series, but everything we show here applies to any dataset, regardless of how many series it contains.


@@ -51,7 +50,7 @@ from gluonts.dataset.pandas import PandasDataset
dataset = PandasDataset(df, target="value")
```

## Specific splitting examples
## Train/test splits
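
The rule stated earlier, that `split` takes either an `offset` or a `date` but not both, can be illustrated without GluonTS. The following is a minimal sketch in plain Python, not the library's implementation: `split_series` and all of its parameters are hypothetical names, and it assumes that the cut-off `date` itself belongs to the training part.

```python
import math
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


def split_series(
    start: datetime,
    values: List[float],
    step: timedelta,
    offset: Optional[int] = None,
    date: Optional[datetime] = None,
) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: cut one regularly sampled series in two."""
    # Exactly one of `offset` / `date` must be provided.
    if (offset is None) == (date is None):
        raise ValueError("provide exactly one of `offset` or `date`")
    if date is not None:
        # Translate the date into an integer position; the observation
        # at `date` is kept in the training part (an assumption here).
        offset = max(0, math.floor((date - start) / step) + 1)
    return values[:offset], values[offset:]


start = datetime(2022, 1, 1)
hourly = [float(i) for i in range(6)]  # six hourly observations

# Split by integer offset: hold out the last two observations.
train, rest = split_series(start, hourly, timedelta(hours=1), offset=-2)

# Split by date: train on everything up to and including 03:00.
train_by_date, rest_by_date = split_series(
    start, hourly, timedelta(hours=1), date=datetime(2022, 1, 1, 3)
)
```

In this sketch a negative `offset` counts from the end of the series, following ordinary Python slicing, so both calls above produce the same cut.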

Let's define a few helper functions to visualize data splits.

1 change: 1 addition & 0 deletions docs/tutorials/data_manipulation/index.md
@@ -3,5 +3,6 @@
```{toctree}
:maxdepth: 1
pandasdataframes
dataset_splitting_example
synthetic_data_generation
```
1 change: 1 addition & 0 deletions requirements/requirements-articial-dataset.py
@@ -0,0 +1 @@
holidays >= 0.9
2 changes: 2 additions & 0 deletions requirements/requirements-docs.txt
@@ -14,3 +14,5 @@ myst-parser
click
orjson
black
holidays~=0.9
matplotlib
1 change: 1 addition & 0 deletions requirements/requirements-test.txt
@@ -6,3 +6,4 @@ pytest~=5.0
ujson
orjson
requests
holidays~=0.9
2 changes: 0 additions & 2 deletions requirements/requirements.txt
@@ -1,5 +1,3 @@
holidays>=0.9
matplotlib~=3.0
numpy~=1.16
pandas~=1.0
pydantic~=1.7
