DeepAR: Calculate loss only using observed values #3205

Serendipity31 · 2024-07-25T11:03:33Z

In the loss() function, what is currently line 585 seems to zero out the loss values associated with missing observations.

However, these observations still seem to affect the calculation of average loss.

The value returned by this function is torch.mean() applied to loss_values, and that operation appears to include the resulting zeros. That is, it seems that any zeros present in loss_values because of missing observations still affect the average loss returned by loss().

Since the goal of training is to minimise the loss across time, given known data, wouldn't it be conceptually better to exclude the loss values associated with missing observations from the average loss calculation entirely?
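To make the concern concrete, here is a minimal sketch of the two averaging choices. The tensors are made-up values, not output from DeepAR, and the names `loss_values` and `observed` are only meant to mirror the description above; this is not the library's actual code.

```python
import torch

# Hypothetical per-timestep losses for a single series of length 5.
loss_values = torch.tensor([2.0, 4.0, 6.0, 8.0, 10.0])
# Hypothetical observed-value mask: 1.0 = observed, 0.0 = missing.
observed = torch.tensor([1.0, 0.0, 1.0, 0.0, 1.0])

# Zero out the losses at missing positions -> [2, 0, 6, 0, 10].
masked = loss_values * observed

# Averaging over every position, zeros included: 18 / 5.
mean_over_all = masked.mean()

# Averaging over observed positions only: 18 / 3.
# clamp() guards against division by zero if nothing is observed.
mean_over_observed = masked.sum() / observed.sum().clamp(min=1.0)

print(mean_over_all.item(), mean_over_observed.item())
```

With these numbers the plain mean is 3.6, while the observed-only mean is 6.0, so the more values are missing, the more the plain mean understates the per-observation loss.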

Serendipity31 · 2024-07-27T16:56:53Z

To expand on this, here is a post where someone tested the impact that a binary mask had on gradients and reported loss. The post pertains to Keras and Tensorflow (which I know might be different than Lightning), but the outcome was:

The reported average loss values were influenced by the masked values (i.e. the zeros introduced by the mask affected the loss shown in the loss curve).
The gradients were not influenced by the masked values (which the author inferred from the observation that the final model weights were the same whether the masked value was 0 or another number).

It would be useful for the documentation to establish unambiguously whether this also happens with the ObservedValueIndicator mask used in DeepAR. My understanding is that the desired outcome is that the process of updating model parameters 'skips' the missing values, but it's very unclear if the way the mask is used within loss() actually accomplishes this.
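One way to check this kind of behaviour in PyTorch is a small autograd experiment. The sketch below uses a toy one-parameter model with made-up data and a hypothetical helper, `grad_of_mean_loss`; it does not call DeepAR's loss() and only illustrates what multiplying by a 0/1 mask does to gradients in general.

```python
import torch

def grad_of_mean_loss(target_at_missing: float, observed_only: bool) -> torch.Tensor:
    """Gradient of a masked squared-error loss for a toy model y_hat = w * x."""
    w = torch.tensor([0.5], requires_grad=True)
    x = torch.tensor([1.0, 2.0, 3.0, 4.0])
    # The second target is treated as missing; its stored value is arbitrary.
    y = torch.tensor([1.0, target_at_missing, 3.0, 4.0])
    observed = torch.tensor([1.0, 0.0, 1.0, 1.0])

    loss_values = (w * x - y) ** 2 * observed
    if observed_only:
        loss = loss_values.sum() / observed.sum()  # divide by 3
    else:
        loss = loss_values.mean()                  # divide by 4
    loss.backward()
    return w.grad.clone()

# Changing the placeholder at the masked position leaves the gradient unchanged.
g_zero = grad_of_mean_loss(0.0, observed_only=False)
g_other = grad_of_mean_loss(999.0, observed_only=False)
print(torch.allclose(g_zero, g_other))

# The two normalisations differ only by the constant factor n_observed / n_total.
g_observed = grad_of_mean_loss(0.0, observed_only=True)
print(torch.allclose(g_zero, g_observed * 3 / 4))
```

In this toy setting the masked position contributes nothing to the gradient, and the choice of denominator only rescales the gradient by the observed fraction. Because that fraction can differ from batch to batch, the effective step size can vary with the amount of missing data. Note also that the placeholder has to be finite: a NaN multiplied by zero is still NaN.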

Serendipity31 · 2024-07-27T16:56:56Z

It would also be really useful for the documentation to clarify how the gradients are dealt with each iteration. Are they averaged across all the series in the mini-batch?
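As a generic point of reference (not a statement about how DeepAR's training loop is written), when the scalar passed to backward() is a mean over the series in a mini-batch, the resulting gradient equals the average of the per-series gradients. A minimal sketch with made-up data and a toy one-parameter model:

```python
import torch

w = torch.tensor([0.5], requires_grad=True)
# Hypothetical mini-batch: 3 series, 2 time steps each.
x = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = torch.tensor([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])

# One loss value per series, shape (3,).
per_series_loss = ((w * x - y) ** 2).mean(dim=1)

# Gradient of the batch-mean loss.
(batch_grad,) = torch.autograd.grad(per_series_loss.mean(), w, retain_graph=True)

# Gradient of each series' loss on its own, then averaged.
per_series_grads = [
    torch.autograd.grad(series_loss, w, retain_graph=True)[0]
    for series_loss in per_series_loss
]
avg_grad = torch.stack(per_series_grads).mean(dim=0)

print(torch.allclose(batch_grad, avg_grad))
```

This follows from the linearity of differentiation, so it holds for any model. Whether every series is weighted equally in practice depends on how the loss is reduced over time steps and over the batch.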
