LightZero's Logging and Monitoring System

LightZero is an MCTS and reinforcement learning framework that writes log files and model checkpoints throughout training. This article walks through LightZero's logging and monitoring system: the directory structure a run produces and the contents of each log file.

File Directory Structure

When we conduct an experiment using LightZero, such as training a MuZero agent in the CartPole environment, the framework organizes the output files as follows:

cartpole_muzero
├── ckpt
│   ├── ckpt_best.pth.tar
│   ├── iteration_0.pth.tar
│   └── iteration_10000.pth.tar
├── log  
│   ├── buffer
│   │   └── buffer_logger.txt
│   ├── collector
│   │   └── collector_logger.txt
│   ├── evaluator
│   │   └── evaluator_logger.txt
│   ├── learner
│   │   └── learner_logger.txt
│   └── serial
│       └── events.out.tfevents.1626453528.CN0014009700M.local
├── formatted_total_config.py
└── total_config.py

As we can see, the main body of the output files consists of two folders: log and ckpt, which store detailed log information and model checkpoints, respectively. The total_config.py and formatted_total_config.py files record the configuration information for this experiment. For more details on their specific meanings, please refer to the Configuration System Documentation.
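To check that a run produced the layout above, the experiment directory can be walked with the standard library. The following is a minimal sketch, not part of LightZero: the helper name summarize_experiment is hypothetical, and it only assumes the folder layout shown in the tree.

```python
from pathlib import Path


def summarize_experiment(exp_dir):
    """Map each top-level entry of an experiment directory to the files under it.

    Hypothetical helper for illustration; it relies only on the directory
    layout shown above, not on any LightZero API.
    """
    root = Path(exp_dir)
    summary = {}
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            # Collect every file below this folder, as paths relative to the root.
            files = sorted(
                p.relative_to(root).as_posix()
                for p in entry.rglob("*")
                if p.is_file()
            )
        else:
            # Top-level files such as total_config.py map to themselves.
            files = [entry.name]
        summary[entry.name] = files
    return summary
```

Calling summarize_experiment("cartpole_muzero") on a finished run should return keys such as ckpt, log, total_config.py, and formatted_total_config.py, each with the relative paths of its files.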

Log File Analysis

Collector Logs

The log/collector/collector_logger.txt file records various metrics of the collector's interaction with the environment during the current collection stage, including:

episode_count: The number of episodes collected in this stage
envstep_count: The number of environment interaction steps collected in this stage
train_sample_count: The number of training samples collected in this stage
avg_envstep_per_episode: The average number of environment interaction steps per episode
avg_sample_per_episode: The average number of samples per episode
avg_envstep_per_sec: The average number of environment interaction steps collected per second
avg_train_sample_per_sec: The average number of training samples collected per second
avg_episode_per_sec: The average number of episodes collected per second
collect_time: The total time spent on data collection in this stage
reward_mean: The mean episode reward (the total reward accumulated over an episode) in this stage
reward_std: The standard deviation of episode rewards in this stage
each_reward: The reward of each individual episode collected in this stage
reward_max: The highest episode reward in this stage
reward_min: The lowest episode reward in this stage
total_envstep_count: The cumulative total number of environment interaction steps collected by the collector
total_train_sample_count: The cumulative total number of training samples collected by the collector
total_episode_count: The cumulative total number of episodes collected by the collector
total_duration: The total running time of the collector
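The per-stage metrics above are related by simple arithmetic. The sketch below recomputes them from raw per-episode data to make those relationships explicit. It is an illustration under stated assumptions, not LightZero's implementation: the function name collection_stats and its arguments are hypothetical, the reward statistics are assumed to be taken over per-episode rewards, and the standard deviation is assumed to be the population form.

```python
import statistics


def collection_stats(episode_rewards, episode_lengths, train_sample_count, collect_time):
    """Recompute collector-style metrics for one collection stage.

    Hypothetical helper. Assumes reward statistics are computed over
    per-episode rewards and that reward_std is a population standard deviation.
    """
    episode_count = len(episode_rewards)
    envstep_count = sum(episode_lengths)
    return {
        "episode_count": episode_count,
        "envstep_count": envstep_count,
        "train_sample_count": train_sample_count,
        # Per-episode averages divide stage totals by the number of episodes.
        "avg_envstep_per_episode": envstep_count / episode_count,
        "avg_sample_per_episode": train_sample_count / episode_count,
        # Throughput metrics divide stage totals by the stage's wall-clock time.
        "avg_envstep_per_sec": envstep_count / collect_time,
        "avg_train_sample_per_sec": train_sample_count / collect_time,
        "avg_episode_per_sec": episode_count / collect_time,
        "collect_time": collect_time,
        "reward_mean": statistics.fmean(episode_rewards),
        "reward_std": statistics.pstdev(episode_rewards),
        "each_reward": list(episode_rewards),
        "reward_max": max(episode_rewards),
        "reward_min": min(episode_rewards),
    }
```

For example, three episodes of 10, 20, and 30 steps collected in 2 seconds give an envstep_count of 60, an avg_envstep_per_episode of 20, and an avg_envstep_per_sec of 30.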

Evaluator Logs

The log/evaluator/evaluator_logger.txt file records various metrics of the evaluator's interaction with the environment during the current evaluation stage, including:

[INFO]: A message logged each time the evaluator completes an episode, including that episode's final reward and the current episode count
train_iter: The number of completed training iterations of the model
ckpt_name: The path of the model checkpoint used in this evaluation
episode_count: The number of episodes in this evaluation
envstep_count: The total number of environment interaction steps in this evaluation
evaluate_time: The total time spent on this evaluation
avg_envstep_per_episode: The average number of environment interaction steps per evaluation episode
avg_envstep_per_sec: The average number of environment interaction steps per second in this evaluation
avg_time_per_episode: The average time per episode in this evaluation
reward_mean: The mean episode reward across the episodes of this evaluation
reward_std: The standard deviation of episode rewards in this evaluation
each_reward: The reward of each individual episode run by the evaluator
reward_max: The highest episode reward in this evaluation
reward_min: The lowest episode reward in this evaluation

Learner Logs

The log/learner/learner_logger.txt file records various information about the learner during the model training process, including:

Neural network structure: Describes the overall architecture of the MuZero model, including the representation network, dynamics network, prediction network, etc.
Learner status: Displays the current learning rate, loss function values, optimizer monitoring metrics, etc., in a tabular format

TensorBoard Log Files

To simplify experiment management, LightZero consolidates the metrics from the separate log files into a single TensorBoard event file in the log/serial folder, named in the format events.out.tfevents.<timestamp>.<hostname>. With TensorBoard, users can follow the trends of these metrics in real time during training.
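When an experiment directory holds event files from several runs, the timestamp in the file name tells them apart. The sketch below locates event files under log/serial and splits their names according to the events.out.tfevents.<timestamp>.<hostname> format described above. The helper names are hypothetical, and the parsing assumes the name carries nothing beyond that format; any extra suffix would end up in the hostname part.

```python
from pathlib import Path

EVENT_PREFIX = "events.out.tfevents."


def parse_event_filename(name):
    """Split an event file name into (timestamp, hostname).

    Hypothetical helper. Assumes the name follows
    events.out.tfevents.<timestamp>.<hostname>; anything after the
    timestamp is returned unchanged as the hostname part.
    """
    if not name.startswith(EVENT_PREFIX):
        raise ValueError(f"not a TensorBoard event file name: {name!r}")
    timestamp, _, hostname = name[len(EVENT_PREFIX):].partition(".")
    return int(timestamp), hostname


def find_event_files(exp_dir):
    """Return the event files under <exp_dir>/log/serial, oldest first."""
    serial_dir = Path(exp_dir) / "log" / "serial"
    return sorted(
        serial_dir.glob(EVENT_PREFIX + "*"),
        key=lambda p: parse_event_filename(p.name)[0],
    )
```

To view the curves themselves, point the TensorBoard command-line tool at the folder, for example: tensorboard --logdir cartpole_muzero/log/serial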

Checkpoint Files

The ckpt folder stores the checkpoint files of the model parameters:

ckpt_best.pth.tar: The model parameters that achieved the best performance during evaluation
iteration_<iteration_number>.pth.tar: The model parameters periodically saved during the training process

To load a saved checkpoint, read the file with torch.load, for example torch.load('ckpt_best.pth.tar') when run from inside the ckpt folder.
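To resume from the most recent periodic checkpoint rather than the best one, the iteration number can be read from the file name. The following is a minimal sketch based only on the iteration_<iteration_number>.pth.tar naming shown above; the helper name latest_iteration_checkpoint is hypothetical and not part of LightZero.

```python
import re
from pathlib import Path

# Matches names such as iteration_10000.pth.tar and captures the iteration number.
ITERATION_PATTERN = re.compile(r"^iteration_(\d+)\.pth\.tar$")


def latest_iteration_checkpoint(ckpt_dir):
    """Return (iteration, path) for the highest-numbered periodic checkpoint.

    Hypothetical helper. Returns None if the folder holds no file named
    iteration_<iteration_number>.pth.tar; ckpt_best.pth.tar is ignored.
    """
    latest = None
    for path in Path(ckpt_dir).iterdir():
        match = ITERATION_PATTERN.match(path.name)
        if match is None:
            continue
        iteration = int(match.group(1))
        # Compare numerically so that 10000 sorts after 2000.
        if latest is None or iteration > latest[0]:
            latest = (iteration, path)
    return latest
```

The returned path can then be passed to torch.load, optionally with map_location="cpu" to read a checkpoint on a machine without a GPU. The layout of the loaded object is not described here, so it is worth inspecting its keys before relying on any of them.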

Conclusion

LightZero's logging and monitoring system gives researchers and developers a detailed view of the entire training process of a reinforcement learning agent. By analyzing the collector, evaluator, and learner metrics, we can track the progress and effectiveness of the algorithm in real time and adjust the training strategy accordingly. The standardized organization of checkpoint files, together with the saved configuration files, also supports the reproducibility of experiments.
