Skip to content

learnables/metaschool

Repository files navigation


Test Status

metaschool provides a simple multi-task interface on top of OpenAI Gym environments. Our hope is that it will simplify research in mutli-task, lifelong, and meta-reinforcement learning.

Resources

Supported Environments

Mini Tutorial

At its essence, metaschool builds on 3 basic classes:

  • EnvFactory: A class to generate base Gym environments, given a configuration.
  • WrapperFactory: An (optional) class to generate wrappers, given a configuration.
  • TaskConfig: A simple dict-like object to configure tasks.

Now, say we have a base Gym environment DrivingEnv-v0 for training self-driving system in different conditions (locations, weather, car maker). We can turn this environment into a set of multiple tasks as follows.

Note: we also use a GymTaskset, which lets us automatically sample and keep track of tasks.

import metaschool as ms

class DrivingFactory(ms.EnvFactory):  # defines how to sample new base environments

    def make(self, config):
        env = gym.make('DrivingEnv-v0', **config)
        env = gym.wrappers.RecordEpisodeStatistics(env)
        return env

    def sample(self):
        config = ms.TaskConfig(location='Town2', weather='Sunny')  # TaskConfig is a dict-like configuration
        config.color = random.choice(['Volvo', 'Mercedes', 'Audi', 'Hummer'])
        return config

class ChangingHorizonFactory(ms.WrapperFactory):  # let's us randomize base envs with wrappers

    def wrap(self, env, config):
        return gym.wrappers.TimeLimit(env, max_episode_steps=config.max_steps)

    def sample(self, env=None):
        return ms.TaskConfig(max_steps=random.randint(20, 200))

taskset = GymTaskset(  # helps us create, replicate, and track tasks
    env_factory=DrivingFactory(),
    wrapper_factories=[ChangingHorizonFactory(), ],
)

for iteration in range(num_iterations):  # learning over multiple tasks
    train_task = taskset.sample()  # train_task is a TimeLimit(RecordEpisodeStatistics(DrivingEnv)) with randomized configurations
    learner.learn(train_task)

    for seen_task in iter(taskset):  # loops over all previously seen tasks
        loss = learner.eval(test_task)

Changelog

A human-readable changelog is available in the CHANGELOG.md file.

Citation

To cite this code in your academic publications, please use the following reference.

Arnold, Sebastien M. R. “metaschool: A Gym Interface for Multi-Task Reinforcement Learning”. 2022.

You can also use the following Bibtex entry.

@software{Arnold20222022,
  author = {Arnold, Sebastien M. R.},
  doi = {10.5281/zenodo.1234},
  month = {12},
  title = {{metaschool: A Gym Interface for Multi-Task Reinforcement Learning}},
  url = {https://github.com/learnables/metaschool},
  version = {0.0.1},
  year = {2022}
}

About

A Gym Interface for Multi-Task Reinforcement Learning.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published