
[Performance] Accelerate GAE #1142

Merged

18 commits merged into pytorch:main on May 10, 2023
Conversation

@Blonck (Contributor) commented May 9, 2023

Description

An optimized vectorized version of the generalized advantage estimate is used when gamma and lambda are scalars.

Motivation and Context

When handling consecutive trajectories of the form

```
reward = [r00, r01, r02, r03, r10, r11]
done = [False, False, False, True, False, False]
```

`vec_generalized_advantage_estimate` needs to build a giant gamma tensor of size `[B, T, T]`, holding a decayed gamma sequence that suits each trajectory. It therefore has to allocate a big `[B, T, T]` tensor and perform a heavy matrix multiplication. When gamma and lambda are scalars, this can be optimized by building a single tensor of the form

```
r_transformed = [[r00, r01, r02, r03],
                 [r10, r11,   0,   0]]
```

and applying the gamma filter `[r00 + gamma * r01 + gamma ** 2 * r02 + ..., r01 + gamma * r02 + gamma ** 2 * r03 + ...]` to compute the GAE.
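
A minimal sketch of the trick, with hypothetical helper names (`split_and_pad`, `discounted_filter`); the actual torchrl implementation is fully vectorized and also folds in the lambda term:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def split_and_pad(values: torch.Tensor, done: torch.Tensor) -> torch.Tensor:
    # Split the flat sequence at trajectory ends (done=True marks the last
    # step of a trajectory), then right-pad each trajectory with zeros.
    ends = torch.where(done)[0] + 1
    bounds = torch.cat([ends, torch.tensor([len(values)])]).unique()
    lengths = torch.diff(torch.cat([torch.tensor([0]), bounds])).tolist()
    return pad_sequence(list(torch.split(values, lengths)), batch_first=True)

def discounted_filter(x: torch.Tensor, gamma: float) -> torch.Tensor:
    # Reference (loop-based) version of the gamma filter:
    # out[t] = x[t] + gamma * out[t + 1]. The zero padding contributes
    # nothing, so shorter trajectories remain correct.
    out = torch.zeros_like(x)
    acc = torch.zeros(x.shape[0])
    for t in range(x.shape[1] - 1, -1, -1):
        acc = x[:, t] + gamma * acc
        out[:, t] = acc
    return out

reward = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
done = torch.tensor([False, False, False, True, False, False])
r_transformed = split_and_pad(reward, done)
# tensor([[1., 2., 3., 4.],
#         [5., 6., 0., 0.]])
advantage_like = discounted_filter(r_transformed, gamma=0.99)
```

The Python loop above only defines what the filter computes; the PR applies it in vectorized form.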

Closes #1052

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

  • [x] Bug fix (non-breaking change which fixes an issue)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • [x] I have read the CONTRIBUTION guide (required)
  • [ ] My change requires a change to the documentation.
  • [x] I have updated the tests accordingly (required for a bug fix or a new feature).
  • [ ] I have updated the documentation accordingly.

vmoens and others added 8 commits April 6, 2023 15:56
* move helper methods to util
* reuse existing helper methods
* remove wip file
@facebook-github-bot added the CLA Signed label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) May 9, 2023
@vmoens added the performance label (performance issue or suggestion for improvement) May 9, 2023
@vmoens changed the title from Accelerate GAE to [Performance] Accelerate GAE May 9, 2023
Blonck added 7 commits May 9, 2023 13:50
If gamma and lmbda are scalars, `fast_vec_gae` should always be faster than
`vec_generalized_advantage_estimate` once the time dimension T is large enough.
In case there is only one split, `_inv_pad_sequence` can skip its calculation.
```
gamma = torch.full(size, gamma)
lmbda = 0.95

benchmark(
```
Contributor:

Why not use `benchmark.pedantic` to get some extra options?
I'm open to using plain `benchmark` if you think it's a better fit.

Contributor Author:

My thinking was that automatic calibration is better here, since I don't need the fine-grained control of `benchmark.pedantic`. The pytest-benchmark docs say, roughly: "don't use pedantic if you don't need it."
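
For context on the two styles being discussed, a sketch (the inputs and the stand-in function are placeholders, not the PR's actual benchmark code):

```python
import torch

reward = torch.randn(8, 512, 1)  # placeholder input

def fast_vec_gae_stub(r):
    # Stand-in for the function under test.
    return r.cumsum(-2)

def test_gae_calibrated(benchmark):
    # Plain benchmark: pytest-benchmark calibrates rounds/iterations itself.
    benchmark(fast_vec_gae_stub, reward)

def test_gae_pedantic(benchmark):
    # benchmark.pedantic pins rounds/iterations for fine-grained control.
    benchmark.pedantic(fast_vec_gae_stub, args=(reward,), rounds=50, iterations=10)
```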

Comment on lines 149 to 153:

```
if reward.ndim > 2:
    done = done.transpose(-2, -1)
    reward = reward.transpose(-2, -1)
    state_value = state_value.transpose(-2, -1)
    next_state_value = next_state_value.transpose(-2, -1)
```
Contributor:

What happens if the reward has 2 dimensions? Don't we want to swap them?

Contributor Author:

You are right, that does not make sense. In particular, `vec_generalized_advantage_estimate` contains the following line:

```
*batch_size, time_steps, lastdim = not_done.shape
```

which ensures that reward and the other tensors have at least 3 dimensions, so the checks are never executed.

I will remove the checks, although I need to remember why I introduced them in the first place.
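
For reference, a minimal illustration of what that starred unpacking does (the shape here is arbitrary):

```python
import torch

not_done = torch.ones(4, 100, 1)  # [*B, T, 1]
*batch_size, time_steps, lastdim = not_done.shape
print(batch_size, time_steps, lastdim)  # [4] 100 1
```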

@vmoens (Contributor) left a comment:

LGTM thanks so much for this contribution!


@vmoens merged commit 9402664 into pytorch:main May 10, 2023