fix: Improve LunarLander-v2 step performance by >1.5x (#170) #235

Merged (2 commits) on Jan 3, 2023

@PaulMest (Contributor) commented Jan 2, 2023

Description

Improve step performance, and with it model training time, by computing particles only when a render mode is set. In initial runs, training was between 1.5x and 1.75x faster.
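
As a rough illustration of the idea (not the actual diff in this PR), the pattern is to guard purely cosmetic work behind a render-mode check. The names below (`ToyLander`, `_create_particle`) are hypothetical stand-ins, and the sketch assumes the particles never feed back into observations or rewards. The random draw is kept outside the guard so that both modes consume the same random stream.

```python
import random


class ToyLander:
    """Hypothetical stand-in for an environment with cosmetic particles."""

    def __init__(self, render_mode=None, seed=0):
        self.render_mode = render_mode
        self.rng = random.Random(seed)
        self.velocity = 0.0
        self.particles = []

    def _create_particle(self, dispersion):
        # Purely visual: nothing in the dynamics reads self.particles.
        self.particles.append({"dispersion": dispersion, "ttl": 1.0})

    def step(self, fire_engine):
        if fire_engine:
            # Draw the random value unconditionally so the RNG stream,
            # and therefore the dynamics, is identical in both modes.
            dispersion = self.rng.uniform(-1.0, 1.0)
            self.velocity += 0.1 + 0.01 * dispersion
            # Only pay for particle bookkeeping when something will draw it.
            if self.render_mode is not None:
                self._create_particle(dispersion)
        self.velocity -= 0.05  # gravity
        return self.velocity


headless = ToyLander(render_mode=None, seed=44)
rendered = ToyLander(render_mode="human", seed=44)
for _ in range(100):
    # The trajectory must not depend on whether particles are created.
    assert headless.step(True) == rendered.step(True)
print(len(headless.particles), len(rendered.particles))  # prints "0 100"
```

Matching trajectories in both modes is the property the identical MD5 checksums in the results are meant to demonstrate for the real environment.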

Fixes #170

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Results

$ python -m ml-stuff.lunar_lander_turbo --seed 44
Training took 67.50 seconds (baseline)
Training took 44.39 seconds (with_change)
1.52x faster

$ python -m ml-stuff.lunar_lander_turbo --seed 88
Training took 73.72 seconds (baseline)
Training took 42.09 seconds (with_change)
1.75x faster

$ md5 ppo_lunar_lander_with_change-44/policy.pth ppo_lunar_lander_baseline-44/policy.pth
MD5 (ppo_lunar_lander_with_change-44/policy.pth) = 46266d319de7912428268cccff20d966
MD5 (ppo_lunar_lander_baseline-44/policy.pth) = 46266d319de7912428268cccff20d966

$ md5 ppo_lunar_lander_with_change-88/policy.pth ppo_lunar_lander_baseline-88/policy.pth
MD5 (ppo_lunar_lander_with_change-88/policy.pth) = 859a945a9472447e32b0f180d9ca558e
MD5 (ppo_lunar_lander_baseline-88/policy.pth) = 859a945a9472447e32b0f180d9ca558e
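
For anyone re-checking these numbers, here is a minimal sketch of the two calculations involved: the speedup ratio and a file checksum comparison. The helper names `speedup` and `md5_of` are made up for illustration, and the checksum demo runs on throwaway files rather than the real `policy.pth` checkpoints.

```python
import hashlib
import tempfile
from pathlib import Path


def speedup(baseline_seconds, changed_seconds):
    """Return how many times faster the changed run is than the baseline."""
    return baseline_seconds / changed_seconds


def md5_of(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Timings copied from the runs reported above.
print(f"{speedup(67.50, 44.39):.2f}x faster")  # seed 44, prints "1.52x faster"
print(f"{speedup(73.72, 42.09):.2f}x faster")  # seed 88, prints "1.75x faster"

# Checksum comparison on two throwaway files with identical contents.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "a.bin"
    second = Path(tmp) / "b.bin"
    first.write_bytes(b"same weights")
    second.write_bytes(b"same weights")
    same = md5_of(first) == md5_of(second)
print(same)  # prints "True"
```

Equal checksums for the same seed indicate that the saved policies are byte-identical, so the change did not alter training behaviour in these runs.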

Simple test harness

# lunar_lander_turbo.py
import gymnasium as gym
import sys

sys.modules['gym'] = gym
# Using a special version of stable_baselines3 that works with gymnasium (not gym)
# https://github.com/DLR-RM/stable-baselines3/pull/780
# pip install git+https://github.com/carlosluis/stable-baselines3@fix_tests
from stable_baselines3 import PPO
import time


def run_model(seed=None, description=None):
    print(f'Running with seed {seed} ({description})')
    env = gym.make("LunarLander-v2")
    env.reset(seed=seed)
    model = PPO("MlpPolicy", env, verbose=0, seed=seed)

    # Train the model and measure wall-clock time
    start_time = time.perf_counter()
    model.learn(total_timesteps=100000)
    end_time = time.perf_counter()
    print(f'Training took {end_time - start_time:.2f} seconds ({description})')

    # Save the model; the description keeps baseline and with_change runs apart
    model.save(f"misc2/ppo_lunar_lander_{description}-{seed}")


if __name__ == '__main__':
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--seed', type=int, default=42)
    parser.add_argument('--description', type=str)
    args = parser.parse_args()
    seed = args.seed
    description = args.description

    run_model(seed=seed, description=description)
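
The harness above times a full PPO training run. To isolate the cost of `step` itself, a micro-benchmark along the following lines could be used. This is a sketch under assumptions: `DummyEnv` and `time_steps` are hypothetical names, the dummy environment only mimics the Gymnasium `reset`/`step` return shapes, and time spent in `reset` between episodes is included in the measurement.

```python
import time


class DummyEnv:
    """Hypothetical stand-in exposing a Gymnasium-style reset/step API."""

    def __init__(self, episode_length=200):
        self.episode_length = episode_length
        self.t = 0

    def reset(self, seed=None):
        self.t = 0
        return 0.0, {}  # observation, info

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.episode_length
        # observation, reward, terminated, truncated, info
        return float(self.t), 0.0, terminated, False, {}


def time_steps(env, n_steps, action=0, seed=None):
    """Return the wall-clock seconds spent on n_steps calls to env.step."""
    env.reset(seed=seed)
    start = time.perf_counter()
    for _ in range(n_steps):
        _, _, terminated, truncated, _ = env.step(action)
        if terminated or truncated:
            # Start a new episode; this reset time is part of the total.
            env.reset()
    return time.perf_counter() - start


elapsed = time_steps(DummyEnv(), n_steps=10_000, seed=44)
print(f"{10_000 / elapsed:.0f} steps per second")
```

With Gymnasium and Box2D installed, passing `gym.make("LunarLander-v2")` in place of `DummyEnv()` should time the real environment, assuming a fixed integer action such as `0` is acceptable for the comparison.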

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@pseudo-rnd-thoughts changed the title from "fix: Improve LunarLander-v2 model training performance by >1.5x (#170)" to "fix: Improve LunarLander-v2 step performance by >1.5x (#170)" on Jan 3, 2023

@pseudo-rnd-thoughts (Member) left a comment:

Thanks for the PR, this looks great.

Development

Successfully merging this pull request may close these issues.

[Bug Report] Lunar Lander Runs Slowly
2 participants