[Feature] PPOTrainer #2550

Status: Open. 1 commit to be merged into base branch gh/vmoens/35/base.

vmoens (Contributor) commented Nov 11, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 11, 2024
ghstack-source-id: ddd00c7ffb309d9fb845cdf8392c46774cb12b01
Pull Request resolved: #2550

pytorch-bot bot commented Nov 11, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2550

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 9 Unrelated Failures

As of commit 01c4f88 with merge base 19dbeeb:

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the "CLA Signed" label Nov 11, 2024 (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed)

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}4$.

Detailed results:
Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.4240s 0.4225s 2.3671 Ops/s 2.2653 Ops/s $\color{#35bf28}+4.49\%$
test_transformed 0.6819s 0.6070s 1.6474 Ops/s 1.6861 Ops/s $\color{#d91a1a}-2.29\%$
test_serial 1.3137s 1.3111s 0.7627 Ops/s 0.7427 Ops/s $\color{#35bf28}+2.70\%$
test_parallel 1.2621s 1.2565s 0.7958 Ops/s 0.7785 Ops/s $\color{#35bf28}+2.23\%$
test_step_mdp_speed[True-True-True-True-True] 0.1839ms 27.3443μs 36.5708 KOps/s 36.6420 KOps/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[True-True-True-True-False] 49.8240μs 16.0554μs 62.2844 KOps/s 63.2159 KOps/s $\color{#d91a1a}-1.47\%$
test_step_mdp_speed[True-True-True-False-True] 40.6260μs 15.5581μs 64.2751 KOps/s 64.0255 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[True-True-True-False-False] 43.2010μs 9.1450μs 109.3495 KOps/s 110.2737 KOps/s $\color{#d91a1a}-0.84\%$
test_step_mdp_speed[True-True-False-True-True] 76.6230μs 28.8726μs 34.6349 KOps/s 35.3817 KOps/s $\color{#d91a1a}-2.11\%$
test_step_mdp_speed[True-True-False-True-False] 56.1950μs 17.9164μs 55.8149 KOps/s 57.2928 KOps/s $\color{#d91a1a}-2.58\%$
test_step_mdp_speed[True-True-False-False-True] 48.5300μs 17.3677μs 57.5780 KOps/s 58.3394 KOps/s $\color{#d91a1a}-1.31\%$
test_step_mdp_speed[True-True-False-False-False] 34.4740μs 10.8272μs 92.3599 KOps/s 93.1372 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[True-False-True-True-True] 96.7710μs 30.8877μs 32.3753 KOps/s 32.5254 KOps/s $\color{#d91a1a}-0.46\%$
test_step_mdp_speed[True-False-True-True-False] 50.0740μs 19.4631μs 51.3792 KOps/s 52.2902 KOps/s $\color{#d91a1a}-1.74\%$
test_step_mdp_speed[True-False-True-False-True] 57.4170μs 17.2669μs 57.9142 KOps/s 58.4894 KOps/s $\color{#d91a1a}-0.98\%$
test_step_mdp_speed[True-False-True-False-False] 0.5846ms 10.7594μs 92.9424 KOps/s 94.7122 KOps/s $\color{#d91a1a}-1.87\%$
test_step_mdp_speed[True-False-False-True-True] 77.2150μs 32.5725μs 30.7008 KOps/s 31.5144 KOps/s $\color{#d91a1a}-2.58\%$
test_step_mdp_speed[True-False-False-True-False] 59.9620μs 21.3492μs 46.8401 KOps/s 47.8787 KOps/s $\color{#d91a1a}-2.17\%$
test_step_mdp_speed[True-False-False-False-True] 59.3040μs 18.7740μs 53.2652 KOps/s 54.2879 KOps/s $\color{#d91a1a}-1.88\%$
test_step_mdp_speed[True-False-False-False-False] 37.3800μs 12.4682μs 80.2042 KOps/s 82.1007 KOps/s $\color{#d91a1a}-2.31\%$
test_step_mdp_speed[False-True-True-True-True] 72.5760μs 31.1102μs 32.1438 KOps/s 32.9177 KOps/s $\color{#d91a1a}-2.35\%$
test_step_mdp_speed[False-True-True-True-False] 47.9790μs 19.5320μs 51.1979 KOps/s 52.2153 KOps/s $\color{#d91a1a}-1.95\%$
test_step_mdp_speed[False-True-True-False-True] 63.3080μs 19.8179μs 50.4593 KOps/s 51.2217 KOps/s $\color{#d91a1a}-1.49\%$
test_step_mdp_speed[False-True-True-False-False] 35.3060μs 12.1613μs 82.2284 KOps/s 83.1630 KOps/s $\color{#d91a1a}-1.12\%$
test_step_mdp_speed[False-True-False-True-True] 73.7770μs 32.6175μs 30.6584 KOps/s 30.7730 KOps/s $\color{#d91a1a}-0.37\%$
test_step_mdp_speed[False-True-False-True-False] 61.4550μs 21.1788μs 47.2171 KOps/s 48.0161 KOps/s $\color{#d91a1a}-1.66\%$
test_step_mdp_speed[False-True-False-False-True] 3.0127ms 21.3547μs 46.8282 KOps/s 47.5423 KOps/s $\color{#d91a1a}-1.50\%$
test_step_mdp_speed[False-True-False-False-False] 37.3900μs 13.7963μs 72.4832 KOps/s 74.0882 KOps/s $\color{#d91a1a}-2.17\%$
test_step_mdp_speed[False-False-True-True-True] 76.9640μs 34.1600μs 29.2740 KOps/s 29.5738 KOps/s $\color{#d91a1a}-1.01\%$
test_step_mdp_speed[False-False-True-True-False] 53.8810μs 22.6934μs 44.0658 KOps/s 44.7939 KOps/s $\color{#d91a1a}-1.63\%$
test_step_mdp_speed[False-False-True-False-True] 95.0370μs 21.1526μs 47.2756 KOps/s 47.5554 KOps/s $\color{#d91a1a}-0.59\%$
test_step_mdp_speed[False-False-True-False-False] 44.5530μs 13.7882μs 72.5259 KOps/s 73.5573 KOps/s $\color{#d91a1a}-1.40\%$
test_step_mdp_speed[False-False-False-True-True] 0.6088ms 35.0750μs 28.5103 KOps/s 28.5295 KOps/s $\color{#d91a1a}-0.07\%$
test_step_mdp_speed[False-False-False-True-False] 61.3350μs 23.8046μs 42.0087 KOps/s 41.8723 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[False-False-False-False-True] 57.4080μs 22.3602μs 44.7224 KOps/s 44.0265 KOps/s $\color{#35bf28}+1.58\%$
test_step_mdp_speed[False-False-False-False-False] 46.6070μs 14.9531μs 66.8755 KOps/s 66.5351 KOps/s $\color{#35bf28}+0.51\%$
test_values[generalized_advantage_estimate-True-True] 13.8782ms 9.7433ms 102.6345 Ops/s 103.5362 Ops/s $\color{#d91a1a}-0.87\%$
test_values[vec_generalized_advantage_estimate-True-True] 37.2371ms 35.5255ms 28.1488 Ops/s 28.0728 Ops/s $\color{#35bf28}+0.27\%$
test_values[td0_return_estimate-False-False] 0.2288ms 0.1671ms 5.9854 KOps/s 5.9640 KOps/s $\color{#35bf28}+0.36\%$
test_values[td1_return_estimate-False-False] 27.0943ms 23.9147ms 41.8153 Ops/s 42.2610 Ops/s $\color{#d91a1a}-1.05\%$
test_values[vec_td1_return_estimate-False-False] 37.5518ms 35.6389ms 28.0592 Ops/s 28.0572 Ops/s $+0.01\%$
test_values[td_lambda_return_estimate-True-False] 39.6772ms 34.4627ms 29.0168 Ops/s 29.2271 Ops/s $\color{#d91a1a}-0.72\%$
test_values[vec_td_lambda_return_estimate-True-False] 37.5893ms 35.6351ms 28.0622 Ops/s 28.0399 Ops/s $\color{#35bf28}+0.08\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.9376ms 8.4189ms 118.7810 Ops/s 119.0850 Ops/s $\color{#d91a1a}-0.26\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.1181ms 1.8272ms 547.2707 Ops/s 559.5163 Ops/s $\color{#d91a1a}-2.19\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5017ms 0.3533ms 2.8303 KOps/s 2.7761 KOps/s $\color{#35bf28}+1.95\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 48.2055ms 45.6621ms 21.9000 Ops/s 21.2007 Ops/s $\color{#35bf28}+3.30\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.8345ms 3.0337ms 329.6326 Ops/s 304.3972 Ops/s $\textbf{\color{#35bf28}+8.29\%}$
test_dqn_speed[False-None] 6.1107ms 1.3059ms 765.7270 Ops/s 742.3003 Ops/s $\color{#35bf28}+3.16\%$
test_dqn_speed[False-backward] 1.9230ms 1.7836ms 560.6738 Ops/s 553.7313 Ops/s $\color{#35bf28}+1.25\%$
test_dqn_speed[True-None] 0.7070ms 0.4614ms 2.1675 KOps/s 2.1370 KOps/s $\color{#35bf28}+1.43\%$
test_dqn_speed[True-backward] 0.9572ms 0.8806ms 1.1356 KOps/s 1.1269 KOps/s $\color{#35bf28}+0.78\%$
test_dqn_speed[reduce-overhead-None] 0.7697ms 0.4670ms 2.1415 KOps/s 2.1454 KOps/s $\color{#d91a1a}-0.18\%$
test_dqn_speed[reduce-overhead-backward] 0.9497ms 0.8802ms 1.1361 KOps/s 1.1253 KOps/s $\color{#35bf28}+0.95\%$
test_ddpg_speed[False-None] 3.5961ms 2.7348ms 365.6508 Ops/s 361.1955 Ops/s $\color{#35bf28}+1.23\%$
test_ddpg_speed[False-backward] 4.8199ms 4.2186ms 237.0475 Ops/s 256.6051 Ops/s $\textbf{\color{#d91a1a}-7.62\%}$
test_ddpg_speed[True-None] 1.1901ms 1.0023ms 997.6774 Ops/s 996.6380 Ops/s $\color{#35bf28}+0.10\%$
test_ddpg_speed[True-backward] 1.9211ms 1.8870ms 529.9353 Ops/s 525.9675 Ops/s $\color{#35bf28}+0.75\%$
test_ddpg_speed[reduce-overhead-None] 1.6857ms 1.0074ms 992.6174 Ops/s 995.0243 Ops/s $\color{#d91a1a}-0.24\%$
test_ddpg_speed[reduce-overhead-backward] 2.0586ms 1.9073ms 524.2991 Ops/s 527.6762 Ops/s $\color{#d91a1a}-0.64\%$
test_sac_speed[False-None] 8.0381ms 7.6997ms 129.8758 Ops/s 128.6308 Ops/s $\color{#35bf28}+0.97\%$
test_sac_speed[False-backward] 10.9245ms 10.3954ms 96.1962 Ops/s 95.5240 Ops/s $\color{#35bf28}+0.70\%$
test_sac_speed[True-None] 2.4039ms 1.8335ms 545.4122 Ops/s 546.2246 Ops/s $\color{#d91a1a}-0.15\%$
test_sac_speed[True-backward] 3.6005ms 3.5155ms 284.4553 Ops/s 280.5310 Ops/s $\color{#35bf28}+1.40\%$
test_sac_speed[reduce-overhead-None] 2.4020ms 1.8397ms 543.5558 Ops/s 541.5856 Ops/s $\color{#35bf28}+0.36\%$
test_sac_speed[reduce-overhead-backward] 4.1701ms 3.5589ms 280.9875 Ops/s 281.2166 Ops/s $\color{#d91a1a}-0.08\%$
test_redq_speed[False-None] 15.2429ms 12.6203ms 79.2377 Ops/s 76.7288 Ops/s $\color{#35bf28}+3.27\%$
test_redq_speed[False-backward] 23.7788ms 22.1630ms 45.1202 Ops/s 44.8308 Ops/s $\color{#35bf28}+0.65\%$
test_redq_speed[True-None] 5.2605ms 4.5169ms 221.3890 Ops/s 216.8304 Ops/s $\color{#35bf28}+2.10\%$
test_redq_speed[True-backward] 12.5461ms 11.8634ms 84.2932 Ops/s 83.8778 Ops/s $\color{#35bf28}+0.50\%$
test_redq_speed[reduce-overhead-None] 5.6404ms 4.5016ms 222.1438 Ops/s 222.3713 Ops/s $\color{#d91a1a}-0.10\%$
test_redq_speed[reduce-overhead-backward] 12.5815ms 11.8895ms 84.1076 Ops/s 83.3062 Ops/s $\color{#35bf28}+0.96\%$
test_redq_deprec_speed[False-None] 14.1871ms 12.3770ms 80.7951 Ops/s 79.6050 Ops/s $\color{#35bf28}+1.50\%$
test_redq_deprec_speed[False-backward] 19.3877ms 18.2027ms 54.9369 Ops/s 53.6399 Ops/s $\color{#35bf28}+2.42\%$
test_redq_deprec_speed[True-None] 4.2746ms 3.5349ms 282.8952 Ops/s 278.2836 Ops/s $\color{#35bf28}+1.66\%$
test_redq_deprec_speed[True-backward] 8.9182ms 8.2174ms 121.6926 Ops/s 120.7661 Ops/s $\color{#35bf28}+0.77\%$
test_redq_deprec_speed[reduce-overhead-None] 4.3789ms 3.5371ms 282.7196 Ops/s 279.6787 Ops/s $\color{#35bf28}+1.09\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.0650ms 7.8695ms 127.0736 Ops/s 121.7448 Ops/s $\color{#35bf28}+4.38\%$
test_td3_speed[False-None] 7.8299ms 7.5252ms 132.8873 Ops/s 128.4111 Ops/s $\color{#35bf28}+3.49\%$
test_td3_speed[False-backward] 11.3532ms 9.9643ms 100.3586 Ops/s 98.5565 Ops/s $\color{#35bf28}+1.83\%$
test_td3_speed[True-None] 1.8412ms 1.7039ms 586.9021 Ops/s 567.4030 Ops/s $\color{#35bf28}+3.44\%$
test_td3_speed[True-backward] 3.3506ms 3.3006ms 302.9757 Ops/s 295.2702 Ops/s $\color{#35bf28}+2.61\%$
test_td3_speed[reduce-overhead-None] 1.8798ms 1.6991ms 588.5526 Ops/s 568.4988 Ops/s $\color{#35bf28}+3.53\%$
test_td3_speed[reduce-overhead-backward] 3.3713ms 3.2999ms 303.0417 Ops/s 293.3654 Ops/s $\color{#35bf28}+3.30\%$
test_cql_speed[False-None] 37.6068ms 35.3794ms 28.2650 Ops/s 27.9736 Ops/s $\color{#35bf28}+1.04\%$
test_cql_speed[False-backward] 48.8777ms 44.9715ms 22.2363 Ops/s 21.7130 Ops/s $\color{#35bf28}+2.41\%$
test_cql_speed[True-None] 16.5365ms 15.4652ms 64.6612 Ops/s 64.5565 Ops/s $\color{#35bf28}+0.16\%$
test_cql_speed[True-backward] 22.6763ms 21.5344ms 46.4374 Ops/s 44.9057 Ops/s $\color{#35bf28}+3.41\%$
test_cql_speed[reduce-overhead-None] 16.8158ms 15.5540ms 64.2920 Ops/s 64.1603 Ops/s $\color{#35bf28}+0.21\%$
test_cql_speed[reduce-overhead-backward] 22.7110ms 21.7466ms 45.9842 Ops/s 44.9706 Ops/s $\color{#35bf28}+2.25\%$
test_a2c_speed[False-None] 9.2382ms 6.9678ms 143.5171 Ops/s 139.6826 Ops/s $\color{#35bf28}+2.75\%$
test_a2c_speed[False-backward] 14.9978ms 13.9533ms 71.6676 Ops/s 68.9869 Ops/s $\color{#35bf28}+3.89\%$
test_a2c_speed[True-None] 3.7594ms 3.2886ms 304.0817 Ops/s 299.8216 Ops/s $\color{#35bf28}+1.42\%$
test_a2c_speed[True-backward] 10.0485ms 9.5961ms 104.2092 Ops/s 102.5603 Ops/s $\color{#35bf28}+1.61\%$
test_a2c_speed[reduce-overhead-None] 3.8228ms 3.3091ms 302.1949 Ops/s 299.5499 Ops/s $\color{#35bf28}+0.88\%$
test_a2c_speed[reduce-overhead-backward] 10.4006ms 9.5818ms 104.3649 Ops/s 103.3750 Ops/s $\color{#35bf28}+0.96\%$
test_ppo_speed[False-None] 8.7075ms 7.2425ms 138.0747 Ops/s 134.8730 Ops/s $\color{#35bf28}+2.37\%$
test_ppo_speed[False-backward] 16.1837ms 14.4427ms 69.2393 Ops/s 67.3695 Ops/s $\color{#35bf28}+2.78\%$
test_ppo_speed[True-None] 4.1339ms 3.6842ms 271.4276 Ops/s 266.8592 Ops/s $\color{#35bf28}+1.71\%$
test_ppo_speed[True-backward] 10.3992ms 9.4786ms 105.5003 Ops/s 105.1123 Ops/s $\color{#35bf28}+0.37\%$
test_ppo_speed[reduce-overhead-None] 4.3436ms 3.6733ms 272.2357 Ops/s 265.6972 Ops/s $\color{#35bf28}+2.46\%$
test_ppo_speed[reduce-overhead-backward] 9.9317ms 9.5533ms 104.6756 Ops/s 104.9281 Ops/s $\color{#d91a1a}-0.24\%$
test_reinforce_speed[False-None] 8.4819ms 6.5560ms 152.5321 Ops/s 154.5708 Ops/s $\color{#d91a1a}-1.32\%$
test_reinforce_speed[False-backward] 10.4577ms 9.5629ms 104.5709 Ops/s 102.5512 Ops/s $\color{#35bf28}+1.97\%$
test_reinforce_speed[True-None] 3.2427ms 2.6292ms 380.3503 Ops/s 375.9727 Ops/s $\color{#35bf28}+1.16\%$
test_reinforce_speed[True-backward] 8.9270ms 8.4352ms 118.5505 Ops/s 116.7734 Ops/s $\color{#35bf28}+1.52\%$
test_reinforce_speed[reduce-overhead-None] 3.0027ms 2.6138ms 382.5802 Ops/s 371.1413 Ops/s $\color{#35bf28}+3.08\%$
test_reinforce_speed[reduce-overhead-backward] 8.9219ms 8.5218ms 117.3467 Ops/s 117.1854 Ops/s $\color{#35bf28}+0.14\%$
test_iql_speed[False-None] 32.9489ms 31.6473ms 31.5983 Ops/s 31.0286 Ops/s $\color{#35bf28}+1.84\%$
test_iql_speed[False-backward] 46.2511ms 44.5160ms 22.4638 Ops/s 22.0205 Ops/s $\color{#35bf28}+2.01\%$
test_iql_speed[True-None] 11.3772ms 10.5481ms 94.8035 Ops/s 93.7922 Ops/s $\color{#35bf28}+1.08\%$
test_iql_speed[True-backward] 22.0876ms 21.1226ms 47.3427 Ops/s 46.7032 Ops/s $\color{#35bf28}+1.37\%$
test_iql_speed[reduce-overhead-None] 11.5437ms 10.5664ms 94.6398 Ops/s 94.2024 Ops/s $\color{#35bf28}+0.46\%$
test_iql_speed[reduce-overhead-backward] 22.3145ms 21.1813ms 47.2116 Ops/s 46.6374 Ops/s $\color{#35bf28}+1.23\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.9507ms 4.7441ms 210.7867 Ops/s 202.2846 Ops/s $\color{#35bf28}+4.20\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.6361ms 0.5026ms 1.9898 KOps/s 1.9587 KOps/s $\color{#35bf28}+1.59\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6946ms 0.4804ms 2.0817 KOps/s 2.0722 KOps/s $\color{#35bf28}+0.46\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.9050ms 4.5073ms 221.8628 Ops/s 218.2287 Ops/s $\color{#35bf28}+1.67\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9958ms 0.4891ms 2.0446 KOps/s 2.0388 KOps/s $\color{#35bf28}+0.29\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6468ms 0.4653ms 2.1492 KOps/s 2.1349 KOps/s $\color{#35bf28}+0.67\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.2013ms 1.6171ms 618.3943 Ops/s 613.1592 Ops/s $\color{#35bf28}+0.85\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.2298ms 1.5665ms 638.3825 Ops/s 638.1371 Ops/s $\color{#35bf28}+0.04\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 4.9354ms 4.6801ms 213.6707 Ops/s 208.7103 Ops/s $\color{#35bf28}+2.38\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.7675ms 0.6358ms 1.5728 KOps/s 1.5736 KOps/s $\color{#d91a1a}-0.05\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8241ms 0.6055ms 1.6515 KOps/s 1.6192 KOps/s $\color{#35bf28}+1.99\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.8625ms 4.5635ms 219.1292 Ops/s 213.5777 Ops/s $\color{#35bf28}+2.60\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9550ms 0.5077ms 1.9697 KOps/s 1.9680 KOps/s $\color{#35bf28}+0.09\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6502ms 0.4783ms 2.0907 KOps/s 2.0448 KOps/s $\color{#35bf28}+2.25\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.7803ms 4.4471ms 224.8637 Ops/s 215.9625 Ops/s $\color{#35bf28}+4.12\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7483ms 0.4906ms 2.0385 KOps/s 2.0243 KOps/s $\color{#35bf28}+0.70\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 7.8892ms 0.4766ms 2.0983 KOps/s 2.1288 KOps/s $\color{#d91a1a}-1.43\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.1429ms 4.7272ms 211.5420 Ops/s 211.5431 Ops/s $-0.00\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.4177ms 0.6448ms 1.5508 KOps/s 1.5520 KOps/s $\color{#d91a1a}-0.08\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7416ms 0.6107ms 1.6374 KOps/s 1.6407 KOps/s $\color{#d91a1a}-0.20\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.4937ms 4.1466ms 241.1603 Ops/s 40.1778 Ops/s $\textbf{\color{#35bf28}+500.23\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 4.7054ms 2.2912ms 436.4496 Ops/s 380.2575 Ops/s $\textbf{\color{#35bf28}+14.78\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.6640ms 1.3322ms 750.6305 Ops/s 755.8930 Ops/s $\color{#d91a1a}-0.70\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.3899s 11.9542ms 83.6525 Ops/s 236.4211 Ops/s $\textbf{\color{#d91a1a}-64.62\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.2463ms 2.2956ms 435.6249 Ops/s 433.4114 Ops/s $\color{#35bf28}+0.51\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.5263ms 1.3740ms 727.8161 Ops/s 821.3944 Ops/s $\textbf{\color{#d91a1a}-11.39\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.8705ms 4.3844ms 228.0789 Ops/s 240.6607 Ops/s $\textbf{\color{#d91a1a}-5.23\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.1435ms 2.4543ms 407.4454 Ops/s 419.6695 Ops/s $\color{#d91a1a}-2.91\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.8747ms 1.5064ms 663.8366 Ops/s 639.1976 Ops/s $\color{#35bf28}+3.85\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.3923ms 10.9245ms 91.5373 Ops/s 87.4243 Ops/s $\color{#35bf28}+4.70\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.5712ms 14.3290ms 69.7887 Ops/s 70.8486 Ops/s $\color{#d91a1a}-1.50\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.2339ms 19.6701ms 50.8385 Ops/s 49.9504 Ops/s $\color{#35bf28}+1.78\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 14.5635ms 14.3392ms 69.7388 Ops/s 69.9209 Ops/s $\color{#d91a1a}-0.26\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.3995ms 19.8996ms 50.2522 Ops/s 50.0253 Ops/s $\color{#35bf28}+0.45\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 15.6990ms 15.4417ms 64.7598 Ops/s 65.0810 Ops/s $\color{#d91a1a}-0.49\%$
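
For readers checking the table above: the Change column appears to be the relative difference between Ops and Ops on Repo HEAD, and the Improved/Worsened totals seem to count only the bolded rows, whose change exceeds 5% in magnitude. Both the formula and the 5% cutoff are inferred from the numbers in this comment, not taken from the benchmark tooling, and the helper names below are hypothetical. A minimal Python sketch under those assumptions:

```python
# Sketch: recompute the "Change" column from the two throughput columns.
# Assumptions (inferred from the table, not from the benchmark tooling):
#   change = (ops - ops_head) / ops_head
#   a row counts as improved/worsened only if |change| exceeds 5%.

THRESHOLD_PCT = 5.0  # assumed cutoff, inferred from which rows are bolded


def relative_change_pct(ops: float, ops_head: float) -> float:
    """Percentage change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = THRESHOLD_PCT) -> str:
    """Label a row by comparing its change to the assumed threshold."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (name, Ops, Ops on Repo HEAD), copied from three rows of the CPU table.
rows = [
    ("test_simple", 2.3671, 2.2653),
    ("test_ddpg_speed[False-backward]", 237.0475, 256.6051),
    (
        "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]",
        241.1603,
        40.1778,
    ),
]

for name, ops, head in rows:
    pct = relative_change_pct(ops, head)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Under these assumptions the three sample rows reproduce the reported +4.49%, -7.62% and +500.23%, and applying the same cutoff to the full CPU table gives the 3 improved and 4 worsened benchmarks stated in the summary line.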


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}10$.

Detailed results:
Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.7472s 0.7413s 1.3490 Ops/s 1.3330 Ops/s $\color{#35bf28}+1.20\%$
test_transformed 1.0941s 1.0133s 0.9868 Ops/s 1.0219 Ops/s $\color{#d91a1a}-3.43\%$
test_serial 2.2365s 2.1480s 0.4656 Ops/s 0.4736 Ops/s $\color{#d91a1a}-1.71\%$
test_parallel 2.0806s 2.0108s 0.4973 Ops/s 0.5045 Ops/s $\color{#d91a1a}-1.43\%$
test_step_mdp_speed[True-True-True-True-True] 0.1583ms 34.6711μs 28.8425 KOps/s 27.6834 KOps/s $\color{#35bf28}+4.19\%$
test_step_mdp_speed[True-True-True-True-False] 83.3440μs 20.2358μs 49.4173 KOps/s 48.8527 KOps/s $\color{#35bf28}+1.16\%$
test_step_mdp_speed[True-True-True-False-True] 82.0040μs 20.1641μs 49.5932 KOps/s 51.0157 KOps/s $\color{#d91a1a}-2.79\%$
test_step_mdp_speed[True-True-True-False-False] 81.2650μs 11.4981μs 86.9710 KOps/s 87.5645 KOps/s $\color{#d91a1a}-0.68\%$
test_step_mdp_speed[True-True-False-True-True] 0.1156ms 38.0072μs 26.3108 KOps/s 26.4678 KOps/s $\color{#d91a1a}-0.59\%$
test_step_mdp_speed[True-True-False-True-False] 46.2020μs 22.2697μs 44.9042 KOps/s 45.0179 KOps/s $\color{#d91a1a}-0.25\%$
test_step_mdp_speed[True-True-False-False-True] 54.0930μs 21.5900μs 46.3177 KOps/s 46.3176 KOps/s $+0.00\%$
test_step_mdp_speed[True-True-False-False-False] 0.1335ms 13.6253μs 73.3930 KOps/s 73.9345 KOps/s $\color{#d91a1a}-0.73\%$
test_step_mdp_speed[True-False-True-True-True] 76.9540μs 40.4293μs 24.7346 KOps/s 25.1805 KOps/s $\color{#d91a1a}-1.77\%$
test_step_mdp_speed[True-False-True-True-False] 69.2330μs 23.2696μs 42.9746 KOps/s 41.7206 KOps/s $\color{#35bf28}+3.01\%$
test_step_mdp_speed[True-False-True-False-True] 51.7020μs 21.5287μs 46.4497 KOps/s 46.2529 KOps/s $\color{#35bf28}+0.43\%$
test_step_mdp_speed[True-False-True-False-False] 38.4320μs 13.3396μs 74.9648 KOps/s 73.5520 KOps/s $\color{#35bf28}+1.92\%$
test_step_mdp_speed[True-False-False-True-True] 77.1440μs 40.8013μs 24.5090 KOps/s 24.2928 KOps/s $\color{#35bf28}+0.89\%$
test_step_mdp_speed[True-False-False-True-False] 50.7130μs 26.0125μs 38.4431 KOps/s 37.8561 KOps/s $\color{#35bf28}+1.55\%$
test_step_mdp_speed[True-False-False-False-True] 48.1220μs 23.4825μs 42.5848 KOps/s 42.6560 KOps/s $\color{#d91a1a}-0.17\%$
test_step_mdp_speed[True-False-False-False-False] 37.7920μs 15.2856μs 65.4212 KOps/s 65.7534 KOps/s $\color{#d91a1a}-0.51\%$
test_step_mdp_speed[False-True-True-True-True] 65.7430μs 39.4304μs 25.3612 KOps/s 25.0329 KOps/s $\color{#35bf28}+1.31\%$
test_step_mdp_speed[False-True-True-True-False] 57.9030μs 24.3245μs 41.1108 KOps/s 41.5675 KOps/s $\color{#d91a1a}-1.10\%$
test_step_mdp_speed[False-True-True-False-True] 71.9940μs 25.0607μs 39.9031 KOps/s 38.6931 KOps/s $\color{#35bf28}+3.13\%$
test_step_mdp_speed[False-True-True-False-False] 0.1272ms 14.8116μs 67.5145 KOps/s 66.3440 KOps/s $\color{#35bf28}+1.76\%$
test_step_mdp_speed[False-True-False-True-True] 67.8630μs 41.6900μs 23.9866 KOps/s 23.7937 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[False-True-False-True-False] 54.5030μs 26.0136μs 38.4415 KOps/s 39.5998 KOps/s $\color{#d91a1a}-2.92\%$
test_step_mdp_speed[False-True-False-False-True] 3.5408ms 27.3170μs 36.6073 KOps/s 37.0546 KOps/s $\color{#d91a1a}-1.21\%$
test_step_mdp_speed[False-True-False-False-False] 0.1588ms 16.9499μs 58.9973 KOps/s 59.2576 KOps/s $\color{#d91a1a}-0.44\%$
test_step_mdp_speed[False-False-True-True-True] 72.5830μs 43.9978μs 22.7284 KOps/s 22.9220 KOps/s $\color{#d91a1a}-0.84\%$
test_step_mdp_speed[False-False-True-True-False] 55.1730μs 28.0944μs 35.5942 KOps/s 35.5625 KOps/s $\color{#35bf28}+0.09\%$
test_step_mdp_speed[False-False-True-False-True] 54.3820μs 27.4204μs 36.4692 KOps/s 36.5505 KOps/s $\color{#d91a1a}-0.22\%$
test_step_mdp_speed[False-False-True-False-False] 41.8620μs 16.7753μs 59.6114 KOps/s 58.8550 KOps/s $\color{#35bf28}+1.29\%$
test_step_mdp_speed[False-False-False-True-True] 71.7130μs 44.7473μs 22.3477 KOps/s 22.4462 KOps/s $\color{#d91a1a}-0.44\%$
test_step_mdp_speed[False-False-False-True-False] 59.7330μs 30.1673μs 33.1485 KOps/s 33.4827 KOps/s $\color{#d91a1a}-1.00\%$
test_step_mdp_speed[False-False-False-False-True] 51.3720μs 28.6068μs 34.9568 KOps/s 35.2111 KOps/s $\color{#d91a1a}-0.72\%$
test_step_mdp_speed[False-False-False-False-False] 48.0520μs 18.6296μs 53.6781 KOps/s 53.5799 KOps/s $\color{#35bf28}+0.18\%$
test_values[generalized_advantage_estimate-True-True] 25.8925ms 24.9616ms 40.0615 Ops/s 39.9398 Ops/s $\color{#35bf28}+0.30\%$
test_values[vec_generalized_advantage_estimate-True-True] 93.3248ms 2.7603ms 362.2794 Ops/s 342.3076 Ops/s $\textbf{\color{#35bf28}+5.83\%}$
test_values[td0_return_estimate-False-False] 86.7640μs 64.8231μs 15.4266 KOps/s 14.9455 KOps/s $\color{#35bf28}+3.22\%$
test_values[td1_return_estimate-False-False] 55.4780ms 55.0157ms 18.1766 Ops/s 17.9181 Ops/s $\color{#35bf28}+1.44\%$
test_values[vec_td1_return_estimate-False-False] 1.2474ms 1.0723ms 932.5674 Ops/s 926.6287 Ops/s $\color{#35bf28}+0.64\%$
test_values[td_lambda_return_estimate-True-False] 89.7303ms 87.1593ms 11.4732 Ops/s 11.2723 Ops/s $\color{#35bf28}+1.78\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2896ms 1.0598ms 943.5714 Ops/s 927.4627 Ops/s $\color{#35bf28}+1.74\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.3719ms 24.4221ms 40.9465 Ops/s 40.8526 Ops/s $\color{#35bf28}+0.23\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0505ms 0.7606ms 1.3147 KOps/s 1.3416 KOps/s $\color{#d91a1a}-2.01\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8410ms 0.6585ms 1.5186 KOps/s 1.5062 KOps/s $\color{#35bf28}+0.82\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6686ms 1.4722ms 679.2553 Ops/s 680.3352 Ops/s $\color{#d91a1a}-0.16\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8370ms 0.6736ms 1.4845 KOps/s 1.4770 KOps/s $\color{#35bf28}+0.51\%$
test_dqn_speed[False-None] 6.9359ms 1.3282ms 752.8817 Ops/s 773.2815 Ops/s $\color{#d91a1a}-2.64\%$
test_dqn_speed[False-backward] 2.0563ms 1.8945ms 527.8414 Ops/s 552.5942 Ops/s $\color{#d91a1a}-4.48\%$
test_dqn_speed[True-None] 1.1849ms 0.5752ms 1.7385 KOps/s 1.8099 KOps/s $\color{#d91a1a}-3.95\%$
test_dqn_speed[True-backward] 1.0707ms 1.0189ms 981.4756 Ops/s 981.1779 Ops/s $\color{#35bf28}+0.03\%$
test_dqn_speed[reduce-overhead-None] 0.7510ms 0.5645ms 1.7716 KOps/s 1.7948 KOps/s $\color{#d91a1a}-1.29\%$
test_dqn_speed[reduce-overhead-backward] 1.0538ms 1.0197ms 980.6919 Ops/s 979.5334 Ops/s $\color{#35bf28}+0.12\%$
test_ddpg_speed[False-None] 3.2097ms 2.6987ms 370.5432 Ops/s 377.0791 Ops/s $\color{#d91a1a}-1.73\%$
test_ddpg_speed[False-backward] 4.1636ms 3.9371ms 253.9929 Ops/s 254.2206 Ops/s $\color{#d91a1a}-0.09\%$
test_ddpg_speed[True-None] 1.8619ms 1.2710ms 786.8026 Ops/s 807.1560 Ops/s $\color{#d91a1a}-2.52\%$
test_ddpg_speed[True-backward] 2.6846ms 2.2620ms 442.0959 Ops/s 443.8408 Ops/s $\color{#d91a1a}-0.39\%$
test_ddpg_speed[reduce-overhead-None] 1.6380ms 1.2647ms 790.6702 Ops/s 803.3838 Ops/s $\color{#d91a1a}-1.58\%$
test_ddpg_speed[reduce-overhead-backward] 2.2885ms 2.2488ms 444.6778 Ops/s 447.3082 Ops/s $\color{#d91a1a}-0.59\%$
test_sac_speed[False-None] 8.7793ms 7.5882ms 131.7838 Ops/s 133.0482 Ops/s $\color{#d91a1a}-0.95\%$
test_sac_speed[False-backward] 11.3129ms 10.8931ms 91.8012 Ops/s 92.5444 Ops/s $\color{#d91a1a}-0.80\%$
test_sac_speed[True-None] 2.3353ms 2.0204ms 494.9430 Ops/s 489.2524 Ops/s $\color{#35bf28}+1.16\%$
test_sac_speed[True-backward] 4.2301ms 3.9855ms 250.9076 Ops/s 248.2999 Ops/s $\color{#35bf28}+1.05\%$
test_sac_speed[reduce-overhead-None] 2.3847ms 2.0361ms 491.1251 Ops/s 492.8969 Ops/s $\color{#d91a1a}-0.36\%$
test_sac_speed[reduce-overhead-backward] 4.1780ms 4.0092ms 249.4290 Ops/s 251.7427 Ops/s $\color{#d91a1a}-0.92\%$
test_redq_speed[False-None] 16.0734ms 11.4088ms 87.6517 Ops/s 96.9226 Ops/s $\textbf{\color{#d91a1a}-9.57\%}$
test_redq_speed[False-backward] 18.6710ms 17.7722ms 56.2677 Ops/s 55.8515 Ops/s $\color{#35bf28}+0.75\%$
test_redq_speed[True-None] 3.8642ms 3.4947ms 286.1445 Ops/s 282.5908 Ops/s $\color{#35bf28}+1.26\%$
test_redq_speed[True-backward] 9.1678ms 8.8396ms 113.1277 Ops/s 112.4905 Ops/s $\color{#35bf28}+0.57\%$
test_redq_speed[reduce-overhead-None] 4.0791ms 3.5626ms 280.6934 Ops/s 277.6618 Ops/s $\color{#35bf28}+1.09\%$
test_redq_speed[reduce-overhead-backward] 8.9588ms 8.6991ms 114.9542 Ops/s 112.5810 Ops/s $\color{#35bf28}+2.11\%$
test_redq_deprec_speed[False-None] 11.3508ms 10.7716ms 92.8368 Ops/s 95.3987 Ops/s $\color{#d91a1a}-2.69\%$
test_redq_deprec_speed[False-backward] 16.3879ms 15.7284ms 63.5792 Ops/s 64.8644 Ops/s $\color{#d91a1a}-1.98\%$
test_redq_deprec_speed[True-None] 4.5811ms 3.3995ms 294.1589 Ops/s 299.8872 Ops/s $\color{#d91a1a}-1.91\%$
test_redq_deprec_speed[True-backward] 7.5273ms 7.2003ms 138.8827 Ops/s 133.9306 Ops/s $\color{#35bf28}+3.70\%$
test_redq_deprec_speed[reduce-overhead-None] 3.4817ms 3.2536ms 307.3478 Ops/s 306.6101 Ops/s $\color{#35bf28}+0.24\%$
test_redq_deprec_speed[reduce-overhead-backward] 7.4658ms 7.2315ms 138.2845 Ops/s 135.7056 Ops/s $\color{#35bf28}+1.90\%$
test_td3_speed[False-None] 7.8314ms 7.5918ms 131.7207 Ops/s 132.4561 Ops/s $\color{#d91a1a}-0.56\%$
test_td3_speed[False-backward] 11.2177ms 10.6236ms 94.1300 Ops/s 95.3921 Ops/s $\color{#d91a1a}-1.32\%$
test_td3_speed[True-None] 1.9703ms 1.9270ms 518.9413 Ops/s 524.8193 Ops/s $\color{#d91a1a}-1.12\%$
test_td3_speed[True-backward] 3.9222ms 3.7558ms 266.2565 Ops/s 266.1101 Ops/s $\color{#35bf28}+0.06\%$
test_td3_speed[reduce-overhead-None] 1.9456ms 1.9068ms 524.4467 Ops/s 521.8435 Ops/s $\color{#35bf28}+0.50\%$
test_td3_speed[reduce-overhead-backward] 3.9781ms 3.7571ms 266.1659 Ops/s 269.9751 Ops/s $\color{#d91a1a}-1.41\%$
test_cql_speed[False-None] 28.1893ms 25.1806ms 39.7131 Ops/s 39.6162 Ops/s $\color{#35bf28}+0.24\%$
test_cql_speed[False-backward] 39.1271ms 34.7748ms 28.7564 Ops/s 21.6838 Ops/s $\textbf{\color{#35bf28}+32.62\%}$
test_cql_speed[True-None] 11.5266ms 10.9626ms 91.2193 Ops/s 91.5408 Ops/s $\color{#d91a1a}-0.35\%$
test_cql_speed[True-backward] 17.6736ms 17.0697ms 58.5835 Ops/s 58.4713 Ops/s $\color{#35bf28}+0.19\%$
test_cql_speed[reduce-overhead-None] 11.3689ms 10.9890ms 90.9997 Ops/s 91.3074 Ops/s $\color{#d91a1a}-0.34\%$
test_cql_speed[reduce-overhead-backward] 17.5472ms 16.9828ms 58.8832 Ops/s 58.1820 Ops/s $\color{#35bf28}+1.21\%$
test_a2c_speed[False-None] 5.6270ms 5.3466ms 187.0357 Ops/s 186.3450 Ops/s $\color{#35bf28}+0.37\%$
test_a2c_speed[False-backward] 12.3772ms 11.9704ms 83.5394 Ops/s 83.1366 Ops/s $\color{#35bf28}+0.48\%$
test_a2c_speed[True-None] 3.3947ms 3.0400ms 328.9444 Ops/s 326.7511 Ops/s $\color{#35bf28}+0.67\%$
test_a2c_speed[True-backward] 8.8661ms 8.5069ms 117.5516 Ops/s 108.2247 Ops/s $\textbf{\color{#35bf28}+8.62\%}$
test_a2c_speed[reduce-overhead-None] 3.3352ms 3.0129ms 331.9071 Ops/s 330.6269 Ops/s $\color{#35bf28}+0.39\%$
test_a2c_speed[reduce-overhead-backward] 8.6220ms 8.4707ms 118.0537 Ops/s 119.2207 Ops/s $\color{#d91a1a}-0.98\%$
test_ppo_speed[False-None] 6.0375ms 5.7071ms 175.2217 Ops/s 177.8718 Ops/s $\color{#d91a1a}-1.49\%$
test_ppo_speed[False-backward] 12.9057ms 12.5316ms 79.7984 Ops/s 81.0305 Ops/s $\color{#d91a1a}-1.52\%$
test_ppo_speed[True-None] 3.7609ms 3.4158ms 292.7611 Ops/s 291.0665 Ops/s $\color{#35bf28}+0.58\%$
test_ppo_speed[True-backward] 8.6339ms 8.2414ms 121.3389 Ops/s 112.5147 Ops/s $\textbf{\color{#35bf28}+7.84\%}$
test_ppo_speed[reduce-overhead-None] 3.6210ms 3.4005ms 294.0748 Ops/s 293.0771 Ops/s $\color{#35bf28}+0.34\%$
test_ppo_speed[reduce-overhead-backward] 8.4447ms 8.2478ms 121.2451 Ops/s 121.4515 Ops/s $\color{#d91a1a}-0.17\%$
test_reinforce_speed[False-None] 4.7625ms 4.5012ms 222.1616 Ops/s 225.3284 Ops/s $\color{#d91a1a}-1.41\%$
test_reinforce_speed[False-backward] 7.7126ms 7.3965ms 135.1987 Ops/s 126.7234 Ops/s $\textbf{\color{#35bf28}+6.69\%}$
test_reinforce_speed[True-None] 2.4066ms 2.2139ms 451.6895 Ops/s 442.8922 Ops/s $\color{#35bf28}+1.99\%$
test_reinforce_speed[True-backward] 7.5515ms 7.1673ms 139.5219 Ops/s 133.8487 Ops/s $\color{#35bf28}+4.24\%$
test_reinforce_speed[reduce-overhead-None] 2.4136ms 2.2105ms 452.3943 Ops/s 453.8652 Ops/s $\color{#d91a1a}-0.32\%$
test_reinforce_speed[reduce-overhead-backward] 7.6338ms 7.2016ms 138.8576 Ops/s 139.9682 Ops/s $\color{#d91a1a}-0.79\%$
test_iql_speed[False-None] 21.2175ms 19.6726ms 50.8321 Ops/s 50.4449 Ops/s $\color{#35bf28}+0.77\%$
test_iql_speed[False-backward] 31.6909ms 30.6695ms 32.6057 Ops/s 32.3288 Ops/s $\color{#35bf28}+0.86\%$
test_iql_speed[True-None] 7.3789ms 6.7733ms 147.6393 Ops/s 142.2646 Ops/s $\color{#35bf28}+3.78\%$
test_iql_speed[True-backward] 15.9319ms 15.5073ms 64.4858 Ops/s 62.0462 Ops/s $\color{#35bf28}+3.93\%$
test_iql_speed[reduce-overhead-None] 7.1508ms 6.7746ms 147.6092 Ops/s 146.6295 Ops/s $\color{#35bf28}+0.67\%$
test_iql_speed[reduce-overhead-backward] 16.0154ms 15.4213ms 64.8454 Ops/s 63.7317 Ops/s $\color{#35bf28}+1.75\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.8926ms 6.3273ms 158.0444 Ops/s 157.4583 Ops/s $\color{#35bf28}+0.37\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5352ms 0.2767ms 3.6144 KOps/s 3.6719 KOps/s $\color{#d91a1a}-1.57\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4687ms 0.2505ms 3.9923 KOps/s 3.5490 KOps/s $\textbf{\color{#35bf28}+12.49\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4690ms 6.1096ms 163.6768 Ops/s 163.8945 Ops/s $\color{#d91a1a}-0.13\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.9512ms 0.3057ms 3.2709 KOps/s 3.0404 KOps/s $\textbf{\color{#35bf28}+7.58\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5986ms 0.3452ms 2.8965 KOps/s 2.9636 KOps/s $\color{#d91a1a}-2.26\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.7508ms 1.4081ms 710.1851 Ops/s 787.0073 Ops/s $\textbf{\color{#d91a1a}-9.76\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6457ms 1.2166ms 821.9532 Ops/s 815.3430 Ops/s $\color{#35bf28}+0.81\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5128ms 6.3026ms 158.6638 Ops/s 157.8203 Ops/s $\color{#35bf28}+0.53\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9570ms 0.4231ms 2.3636 KOps/s 2.1758 KOps/s $\textbf{\color{#35bf28}+8.63\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7514ms 0.4480ms 2.2320 KOps/s 2.2792 KOps/s $\color{#d91a1a}-2.07\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3504ms 6.1556ms 162.4529 Ops/s 163.0432 Ops/s $\color{#d91a1a}-0.36\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9857ms 0.3413ms 2.9302 KOps/s 2.7047 KOps/s $\textbf{\color{#35bf28}+8.34\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6092ms 0.3744ms 2.6706 KOps/s 4.0090 KOps/s $\textbf{\color{#d91a1a}-33.38\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5084ms 6.0867ms 164.2939 Ops/s 164.5674 Ops/s $\color{#d91a1a}-0.17\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7593ms 0.2924ms 3.4201 KOps/s 3.4757 KOps/s $\color{#d91a1a}-1.60\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5781ms 0.2959ms 3.3794 KOps/s 4.2374 KOps/s $\textbf{\color{#d91a1a}-20.25\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4814ms 6.2715ms 159.4509 Ops/s 161.1917 Ops/s $\color{#d91a1a}-1.08\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.9165ms 0.4666ms 2.1430 KOps/s 2.4315 KOps/s $\textbf{\color{#d91a1a}-11.87\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6838ms 0.4313ms 2.3187 KOps/s 2.5554 KOps/s $\textbf{\color{#d91a1a}-9.26\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4404s 14.0778ms 71.0337 Ops/s 192.2060 Ops/s $\textbf{\color{#d91a1a}-63.04\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.8885ms 2.0412ms 489.9066 Ops/s 437.1767 Ops/s $\textbf{\color{#35bf28}+12.06\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.8763ms 1.1895ms 840.6543 Ops/s 954.9698 Ops/s $\textbf{\color{#d91a1a}-11.97\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.9807ms 5.3399ms 187.2705 Ops/s 33.2281 Ops/s $\textbf{\color{#35bf28}+463.59\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 10.0000ms 2.0483ms 488.2190 Ops/s 508.7335 Ops/s $\color{#d91a1a}-4.03\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 3.3798ms 1.1500ms 869.5751 Ops/s 971.3744 Ops/s $\textbf{\color{#d91a1a}-10.48\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3945s 13.4044ms 74.6025 Ops/s 176.7723 Ops/s $\textbf{\color{#d91a1a}-57.80\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.0471ms 1.8715ms 534.3251 Ops/s 453.5216 Ops/s $\textbf{\color{#35bf28}+17.82\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.3179ms 1.2045ms 830.2228 Ops/s 702.9298 Ops/s $\textbf{\color{#35bf28}+18.11\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.4503ms 12.9357ms 77.3055 Ops/s 78.3326 Ops/s $\color{#d91a1a}-1.31\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.5981ms 17.0792ms 58.5506 Ops/s 58.8671 Ops/s $\color{#d91a1a}-0.54\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.2838ms 17.5907ms 56.8483 Ops/s 56.8440 Ops/s $+0.01\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 18.0285ms 17.2454ms 57.9865 Ops/s 59.4343 Ops/s $\color{#d91a1a}-2.44\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 18.0394ms 17.4560ms 57.2868 Ops/s 57.0947 Ops/s $\color{#35bf28}+0.34\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.0514ms 18.5057ms 54.0375 Ops/s 54.3754 Ops/s $\color{#d91a1a}-0.62\%$
