This repository contains code released for the paper NoRML: No-Reward Meta Learning.
First, install all dependencies:

```bash
pip install -r norml/requirements.txt
```
The HalfCheetah environment requires MuJoCo, so make sure you have also followed the corresponding instructions to install MuJoCo and mujoco-py.
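As a rough sketch of one possible setup (the exact MuJoCo version, binary layout, and license requirements depend on your mujoco-py release, so follow the mujoco-py documentation for details):

```bash
# Sketch only: mujoco-py typically looks for the MuJoCo binaries (and, for older
# releases, a license key) under ~/.mujoco; the exact layout depends on the version.
mkdir -p ~/.mujoco
# ... unpack the MuJoCo binaries (and copy the license key, if required) into ~/.mujoco ...
pip install mujoco-py
```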
You can start training from scratch with:

```bash
python -m norml.train_maml --config MOVE_POINT_ROTATE_MAML --logs maml_checkpoints
```
Here `--config` should be one of the configs defined in `config_maml.py`. The config name has the form `{ENV_NAME}_{ALG_NAME}`, where `ENV_NAME` is one of `MOVE_POINT_ROTATE`, `MOVE_POINT_ROTATE_SPARSE`, `CARTPOLE_SENSOR`, or `HALFCHEETAH_MOTOR`, and `ALG_NAME` is one of `DR`, `MAML`, `MAML_OFFSET`, `MAML_LAF`, or `NORML`, as described in the paper.
The `MOVE_POINT_ROTATE` configurations are fast to train and can converge within minutes. Training `MOVE_POINT_ROTATE_SPARSE` and `CARTPOLE_SENSOR` can take up to a day. The HalfCheetah training was done with parallelized workers on a cloud server and can take a long time on a single machine.
We also provide a convenient script to evaluate the training performance:
```bash
python -m norml.eval_maml \
  --model_dir norml/example_checkpoints/move_point_rotate_sparse/norml/all_weights.ckpt-991 \
  --output_dir maml_eval_results \
  --render=True \
  --num_finetune_steps 1 \
  --test_task_index 0 \
  --eval_finetune=True
```
You should see state/action logs and, if rendering is enabled, a rendered video in the `maml_eval_results` folder.
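The same script can also be pointed at checkpoints from your own training runs; the checkpoint path below is hypothetical and depends on where `--logs` wrote the weights:

```bash
# Hypothetical checkpoint path: substitute the actual checkpoint written under --logs.
python -m norml.eval_maml \
  --model_dir maml_checkpoints/all_weights.ckpt-1000 \
  --output_dir maml_eval_results \
  --num_finetune_steps 1 \
  --test_task_index 0 \
  --eval_finetune=True
```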
If you use this code in your research, please cite the following paper:
Yang, Y., Caluwaerts, K., Iscen, A., Tan, J. & Finn, C. (2019). NoRML: No-Reward Meta Learning.
```
@article{yang2019norml,
  title={NoRML: No-Reward Meta Learning},
  author={Yang, Yuxiang and Caluwaerts, Ken and Iscen, Atil and Tan, Jie and Finn, Chelsea},
  journal={arXiv preprint arXiv:1903.01063},
  year={2019}
}
```
Disclaimer: This is not an official Google product.