DeepAR: Calculate loss only using observed values #3205
Replies: 2 comments
-
To expand on this, here is a post where someone tested the impact that a binary mask had on gradients and on the reported loss. The post pertains to Keras and TensorFlow (which I know might behave differently from Lightning), but the outcome was:
It would be useful for the documentation to establish unambiguously whether this also happens with the |
-
It would also be really useful for the documentation to clarify how the gradients are handled at each iteration. Are they averaged across all the series in the mini-batch?
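One way to probe this in plain PyTorch is to compare the gradients produced by the two reductions. This is only a sketch under stated assumptions: the tensors, the squared-error loss, the mask convention (1.0 = observed) and the helper `grad_of` are all made up for illustration and are not taken from the library's DeepAR code.

```python
import torch


def grad_of(reduction: str) -> torch.Tensor:
    """Return d(loss)/d(pred) for a toy batch of 2 series x 2 time steps.

    Hypothetical helper for illustration only; not part of any library.
    """
    pred = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
    target = torch.zeros(2, 2)
    # Assumed mask convention: 1.0 = observed, 0.0 = missing.
    # The second series is entirely missing here.
    observed = torch.tensor([[1.0, 1.0], [0.0, 0.0]])

    # Per-element squared error, zeroed where the value is missing.
    loss_values = (pred - target) ** 2 * observed

    if reduction == "mean_all":
        # Divides by every element in the batch (4), zeros included.
        loss = loss_values.mean()
    else:
        # Divides only by the number of observed elements (2).
        loss = loss_values.sum() / observed.sum()

    loss.backward()
    return pred.grad


g_all = grad_of("mean_all")
g_obs = grad_of("mean_observed")
print(g_all)
print(g_obs)
```

In this toy case the masked positions receive zero gradient under both reductions, so missing values never pull on the parameters directly. What changes is the scale: averaging over all elements shrinks every gradient by the fraction of observed values in the batch, which acts like a learning rate that varies from batch to batch.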
-
In the `loss()` function, what is currently line 585 seems to zero out the loss values associated with missing observations. However, these observations still seem to affect the calculation of the average loss.

The return from this function applies `torch.mean()` to the `loss_values`. It seems like this function includes any resulting zero values within the operation. That is, it seems like any zeroes that exist in `loss_values` because of missing observations affect the average loss that is returned by `loss()`.

Since the goal of training is to minimise the loss across time, given known data, wouldn't it be conceptually better to exclude the loss values associated with missing observations from the average loss calculation entirely?
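To make the difference concrete, here is a minimal sketch of the two averages. The numbers, the variable names other than `loss_values`, and the mask convention (1.0 = observed, 0.0 = missing) are assumptions for illustration; this is not the library's actual `loss()` code.

```python
import torch

# Made-up per-time-step loss values for a single series.
loss_values = torch.tensor([2.0, 4.0, 6.0, 8.0])
# Assumed mask convention: 1.0 = observed, 0.0 = missing.
observed = torch.tensor([1.0, 1.0, 0.0, 0.0])

# Zero out the loss at missing positions, as the zeroing step appears to do.
masked = loss_values * observed

# Plain mean: the zeros still count in the denominator (divides by 4).
mean_all = masked.mean()

# Masked mean: divide only by the number of observed values (divides by 2).
# clamp() guards against division by zero when nothing is observed.
mean_observed = masked.sum() / observed.sum().clamp(min=1.0)

print(mean_all.item())       # 1.5
print(mean_observed.item())  # 3.0
```

With half of the values missing, the plain mean reports half the loss of the masked mean, so the reported loss depends on how many observations happen to be missing in a batch rather than only on how well the model fits the known data.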