Description
The idea is to improve the performance of `vec_td_lambda_return_estimate` in the case where lambda/gamma are scalars (or tensors with a single unique value).

The vectorized version of TD lambda works by building a cumprod of the gamma decay and applying conv1d to it, which results in
In principle, the same idea used in #1142 should be applicable.
Given consecutive trajectories in the form of:
the traces are split into:
Apply the filter to `r_transformed` to calculate the TD lambda return.

Finally, `vec_td_lambda_return_estimate` should use this case if applicable.
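To make the scalar case concrete, here is a minimal NumPy sketch of one way the filter formulation could look. The definition of `r_transformed` is not given above, so the one used here is an assumption: `r_t + gamma * (1 - lmbda) * V(s_{t+1})`, with full bootstrapping on the last step. The function names `td_lambda_reference` and `td_lambda_filter` are hypothetical, and the sketch covers a single trajectory without done/terminated handling.

```python
import numpy as np


def td_lambda_reference(rewards, next_values, gamma, lmbda):
    """Backward recursion: G_t = r_t + gamma * ((1 - lmbda) * V_{t+1} + lmbda * G_{t+1})."""
    rewards = np.asarray(rewards, dtype=float)
    next_values = np.asarray(next_values, dtype=float)
    returns = np.zeros(len(rewards))
    g = next_values[-1]  # bootstrap from the value after the last step
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * ((1 - lmbda) * next_values[t] + lmbda * g)
        returns[t] = g
    return returns


def td_lambda_filter(rewards, next_values, gamma, lmbda):
    """Same return, written as a single geometric filter over r_transformed."""
    rewards = np.asarray(rewards, dtype=float)
    next_values = np.asarray(next_values, dtype=float)
    T = len(rewards)
    # Assumed transformation: fold the (1 - lmbda) bootstrap term into the reward.
    r_transformed = rewards + gamma * (1 - lmbda) * next_values
    # The last step bootstraps fully from the final next-state value.
    r_transformed[-1] = rewards[-1] + gamma * next_values[-1]
    # With scalar gamma/lmbda the decay is a single geometric kernel.
    decay = (gamma * lmbda) ** np.arange(T)
    # G_t = sum_{k >= t} decay[k - t] * r_transformed[k]: a causal convolution
    # applied to the time-reversed signal, then reversed back.
    filtered = np.convolve(r_transformed[::-1], decay)[:T]
    return filtered[::-1]


rng = np.random.default_rng(0)
rewards = rng.normal(size=10)
next_values = rng.normal(size=10)
ref = td_lambda_reference(rewards, next_values, gamma=0.99, lmbda=0.95)
fast = td_lambda_filter(rewards, next_values, gamma=0.99, lmbda=0.95)
print(np.allclose(ref, fast))
```

Because the kernel depends only on the product `gamma * lmbda`, it can be built once and shared across all time steps, which is what makes the scalar case cheaper than per-step decay tensors. A first-order recursive filter would avoid materializing the length-`T` kernel altogether; the convolution is used here only to keep the sketch dependency-free.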