Merged
Changes from 4 commits
@@ -43,8 +43,9 @@ class TimeSeriesTransformerConfig(PretrainedConfig):
documentation from [`PretrainedConfig`] for more information.

Args:
- prediction_length (`int`):
- The prediction length for the decoder. In other words, the prediction horizon of the model.
+ prediction_length (`int`, defaults to 24):
+ The prediction length for the decoder. In other words, the prediction horizon of the model. This value is
+ typically dictated by the dataset and we recommend to change it appropriately.
context_length (`int`, *optional*, defaults to `prediction_length`):
The context length for the encoder. If `None`, the context length will be the same as the
`prediction_length`.
@@ -60,8 +61,8 @@ class TimeSeriesTransformerConfig(PretrainedConfig):
Whether to scale the input targets via "mean" scaler, "std" scaler or no scaler if `None`. If `True`, the
scaler is set to "mean".
lags_sequence (`list[int]`, *optional*, defaults to `[1, 2, 3, 4, 5, 6, 7]`):
- The lags of the input time series as covariates often dictated by the frequency. Default is `[1, 2, 3, 4,
- 5, 6, 7]`.
+ The lags of the input time series as covariates often dictated by the frequency of the data. Default is
+ `[1, 2, 3, 4, 5, 6, 7]` but we recommend to change it based on the dataset appropriately.
num_time_features (`int`, *optional*, defaults to 0):
The number of time features in the input time series.
num_dynamic_real_features (`int`, *optional*, defaults to 0):
@@ -117,8 +118,8 @@ class TimeSeriesTransformerConfig(PretrainedConfig):
```python
>>> from transformers import TimeSeriesTransformerConfig, TimeSeriesTransformerModel

- >>> # Initializing a default Time Series Transformer configuration
- >>> configuration = TimeSeriesTransformerConfig()
+ >>> # Initializing a Time Series Transformer configuration with 12 time steps for prediction
+ >>> configuration = TimeSeriesTransformerConfig(prediction_length=12)

>>> # Randomly initializing a model (with random weights) from the configuration
>>> model = TimeSeriesTransformerModel(configuration)
@@ -135,7 +136,7 @@ class TimeSeriesTransformerConfig(PretrainedConfig):

def __init__(
self,
- prediction_length: Optional[int] = None,
+ prediction_length: int = 24,

@ydshieh ydshieh Feb 24, 2023

Collaborator

@sgugger: @kashif originally thought a few arguments of this config's __init__ should be positional arguments.

  • It seems we technically can't make them positional arguments, since `to_json_string` calls `to_diff_dict`, which in turn calls `self.__class__().to_dict()` (so `config_class()` always has to work without any arguments)
  • And we also want it to work this way, so that it is easier for users to get started (?)
  • We use the values from a pretrained model as the default values for arguments of a config class
    • (TimeSeriesTransformerModel has no real, well-known checkpoint, see below)

Is my understanding above correct?

As for `TimeSeriesTransformerModel`, there is no real checkpoint: @kashif just quickly trained a model on a tourism dataset (3-5 minutes) and uploaded it. We simply set values such as `prediction_length` (`int`, defaults to 24) according to that one.
He says 24 has no real meaning, and the plan is to train models on all the time series datasets.

But I think it is fine to (and we have to) provide some default values anyway, and users can change them for their own datasets.
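
To make the first bullet concrete, here is a minimal sketch of why a required positional argument breaks serialization. `ToyConfig` and `PositionalConfig` are hypothetical stand-ins written for this illustration; they only imitate the `to_json_string` → `to_diff_dict` → `self.__class__()` call chain described above and are not the real `PretrainedConfig` code.

```python
import json


class ToyConfig:
    """Hypothetical stand-in for a config class; not the real PretrainedConfig."""

    def __init__(self, prediction_length=None, context_length=None):
        self.prediction_length = prediction_length
        # Mirrors the documented fallback: context_length defaults to prediction_length.
        self.context_length = context_length or prediction_length

    def to_dict(self):
        return dict(vars(self))

    def to_diff_dict(self):
        # Computing the defaults requires calling the class with no arguments.
        defaults = self.__class__().to_dict()
        return {k: v for k, v in self.to_dict().items() if v != defaults[k]}

    def to_json_string(self):
        return json.dumps(self.to_diff_dict(), sort_keys=True)


class PositionalConfig(ToyConfig):
    """Same config, but prediction_length is a required positional argument."""

    def __init__(self, prediction_length, context_length=None):
        super().__init__(prediction_length, context_length)


# Works: every argument has a default, so the no-argument call succeeds.
print(ToyConfig(prediction_length=12).to_json_string())

# Fails: to_diff_dict() cannot build the default instance.
try:
    PositionalConfig(12).to_json_string()
except TypeError as err:
    print(f"TypeError: {err}")
```

In this sketch the failure only surfaces when the config is serialized, not when it is created, which is one reason to keep every argument optional.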

Collaborator

Why not leave it as None then?

Yes, we can't have config arguments that are positional, since the Transformers library is a library of pretrained models. So every argument of the config has a default corresponding to a canonical checkpoint associated with that model. This is quite an exceptional situation here for time series, as normally a model with no canonical checkpoint is not even accepted in the library ;-)

Leaving the argument as None is probably the way to go, and you can then assert in the config code if you want to force the user to always pass it.
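
One possible shape for this, as a rough sketch only: `ToyForecastConfig` and `require_prediction_length` are hypothetical names, and deferring the check to a helper (rather than asserting inside `__init__`) is an assumption made here so that the no-argument call `config_class()` mentioned earlier in the thread keeps working.

```python
from typing import Optional


class ToyForecastConfig:
    """Hypothetical config sketch; not the real TimeSeriesTransformerConfig."""

    def __init__(
        self,
        prediction_length: Optional[int] = None,
        context_length: Optional[int] = None,
    ):
        self.prediction_length = prediction_length
        # Documented behaviour: context_length falls back to prediction_length.
        self.context_length = context_length or prediction_length

    def require_prediction_length(self) -> int:
        """Hypothetical helper: fail loudly if the horizon was never set."""
        if self.prediction_length is None:
            raise ValueError(
                "`prediction_length` must be set, "
                "e.g. ToyForecastConfig(prediction_length=12)"
            )
        return self.prediction_length


# No-argument construction still works, as serialization needs it to.
default_config = ToyForecastConfig()

# The check fires only where the value is actually required.
try:
    default_config.require_prediction_length()
except ValueError as err:
    print(f"ValueError: {err}")

print(ToyForecastConfig(prediction_length=12).require_prediction_length())
```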

@ydshieh ydshieh Feb 24, 2023

Collaborator

OK, thank you for taking the time to explain this to me (us) ❤️

Contributor Author

Not a problem!

context_length: Optional[int] = None,
distribution_output: str = "student_t",
loss: str = "nll",
@@ -401,7 +401,7 @@ def test_inference_no_head(self):
self.assertEqual(output.shape, expected_shape)

expected_slice = torch.tensor(
- [[-0.6322, -1.5771, -0.9340], [-0.1011, -1.0263, -0.7208], [0.4979, -0.6487, -0.7189]], device=torch_device
+ [[0.8196, -1.5131, 1.4620], [1.1268, -1.3238, 1.5997], [1.5098, -1.0715, 1.7359]], device=torch_device
)
self.assertTrue(torch.allclose(output[0, :3, :3], expected_slice, atol=TOLERANCE))

@@ -423,7 +423,7 @@ def test_inference_head(self):
self.assertEqual(output.shape, expected_shape)

expected_slice = torch.tensor(
- [[0.8177, -1.7989, -0.3127], [1.6964, -1.0607, -0.1749], [1.8395, 0.1110, 0.0263]], device=torch_device
+ [[-1.2957, -1.0280, -0.6045], [-0.7017, -0.8193, -0.3717], [-1.0449, -0.8149, 0.1405]], device=torch_device
)
self.assertTrue(torch.allclose(output[0, :3, :3], expected_slice, atol=TOLERANCE))

@@ -444,6 +444,6 @@ def test_seq_to_seq_generation(self):
expected_shape = torch.Size((64, model.config.num_parallel_samples, model.config.prediction_length))
self.assertEqual(outputs.sequences.shape, expected_shape)

- expected_slice = torch.tensor([3883.5037, 4630.2251, 7562.1338], device=torch_device)
+ expected_slice = torch.tensor([2825.2749, 3584.9207, 6763.9951], device=torch_device)
mean_prediction = outputs.sequences.mean(dim=1)
self.assertTrue(torch.allclose(mean_prediction[0, -3:], expected_slice, rtol=1e-1))
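
As a side note on the tolerance in the last assertion: `torch.allclose` checks `|input - other| <= atol + rtol * |other|` elementwise. The sketch below re-implements that criterion in plain Python (the `is_close` helper is hypothetical and only for illustration) and applies it to the old and new expected values from this hunk, showing that even `rtol=1e-1` would not let the old values pass against the new ones.

```python
def is_close(actual, expected, rtol=1e-5, atol=1e-8):
    """Plain-Python version of the torch.allclose criterion:
    |a - e| <= atol + rtol * |e| for every pair of elements."""
    return all(abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected))


# Values copied from the removed and added lines of the hunk above.
old_expected = [3883.5037, 4630.2251, 7562.1338]
new_expected = [2825.2749, 3584.9207, 6763.9951]

# The old values differ from the new ones by far more than 10%.
print(is_close(old_expected, new_expected, rtol=1e-1))

# Illustrative only: values 5% above the new expectation stay within rtol=1e-1.
five_percent_off = [x * 1.05 for x in new_expected]
print(is_close(five_percent_off, new_expected, rtol=1e-1))
```

The first call prints `False` and the second prints `True`: the relative tolerance absorbs sampling noise in the mean prediction, but not a change of this size in the underlying outputs.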