
[RLlib] Duplicate custom metrics #24731

Open
simonsays1980 opened this issue May 12, 2022 · 1 comment
Labels
enhancement Request for new feature and/or capability P2 Important issue, but not time-critical rllib RLlib related issues rllib-logging This problem is related to logging metrics

Comments

@simonsays1980
Collaborator

What happened + What you expected to happen

What happened?

I wrote custom callbacks to collect some metrics, ran my experiment with Tune, and found the metrics twice in TensorBoard: once under tune/custom_metrics and a second time under tune/sampler_results/custom_metrics.

The reason for this behavior is that in the Trainer class the results dictionary is populated with the custom metrics twice in _compile_step_results():

  1. Directly to results["custom_metrics"]
  2. Indirectly, via summarize_episodes(), which collects them as well.

This bloats the TensorBoard metrics (many pages) and may also use more disk space than needed.
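The two paths above can be illustrated with a minimal sketch. This is hypothetical, simplified code (not RLlib's actual implementation) showing how writing the metrics both directly and via the summarized episode data leaves two identical copies in the result dict:

```python
import copy

def compile_step_results_sketch(custom_metrics):
    """Hypothetical stand-in for _compile_step_results() that mimics
    the duplication: one direct copy, one via the episode summary."""
    # Stand-in for summarize_episodes(), which also gathers the
    # episodes' custom metrics.
    sampler_results = {"custom_metrics": copy.deepcopy(custom_metrics)}
    return {
        "custom_metrics": copy.deepcopy(custom_metrics),  # 1. direct copy
        "sampler_results": sampler_results,               # 2. indirect copy
    }

results = compile_step_results_sketch({"mymetric_max": 0.9})
# Both paths now carry identical values, so downstream loggers
# (e.g. TensorBoard via Tune) report every custom metric twice.
assert results["custom_metrics"] == results["sampler_results"]["custom_metrics"]
```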

What did you expect to happen?

I expected to find my custom metrics only once, either in tune/custom_metrics or in tune/sampler_results/custom_metrics.

Easy solution

An easy solution would probably be to remove the line that copies the custom metrics directly into results["custom_metrics"], since summarize_episodes() already collects them.
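Until something like that lands upstream, a user-side workaround could strip the redundant top-level copy from each result dict before it is logged. A minimal sketch (the function name and approach are my own, not an RLlib API):

```python
def drop_duplicate_custom_metrics(result: dict) -> dict:
    """Return a shallow copy of a Tune result dict without the
    top-level custom_metrics, if it merely duplicates the copy
    nested under sampler_results."""
    nested = result.get("sampler_results", {}).get("custom_metrics")
    if nested is not None and result.get("custom_metrics") == nested:
        result = dict(result)          # shallow copy, keep caller's dict intact
        result.pop("custom_metrics")   # drop the redundant top-level copy
    return result

result = {
    "custom_metrics": {"mymetric": 0.5},
    "sampler_results": {"custom_metrics": {"mymetric": 0.5}},
}
cleaned = drop_duplicate_custom_metrics(result)
assert "custom_metrics" not in cleaned
assert cleaned["sampler_results"]["custom_metrics"] == {"mymetric": 0.5}
```

This could be applied, for example, in a custom Tune logger or callback that post-processes results before they reach TensorBoard.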

Versions / Dependencies

Linux Fedora 35
Python 3.9.0
ray dev2.0.0

Reproduction script

from typing import Dict

import gym
import gym_minigrid

import numpy as np

import ray
from ray.rllib.agents.callbacks import DefaultCallbacks
from ray.rllib.agents.ppo import ppo
from ray.rllib.env.base_env import BaseEnv
from ray.rllib.evaluation.episode import Episode
from ray.rllib.evaluation.rollout_worker import RolloutWorker
from ray.rllib.policy.policy import Policy
from ray.rllib.utils.framework import try_import_tf
from ray import tune
from ray.tune.registry import register_env
tf1, tf, tfv = try_import_tf()

class CustomMetricsCallbacks(DefaultCallbacks):
    
    def __init__(self):
        super().__init__()
        
    def on_episode_start(
        self,
        *,
        worker: RolloutWorker,
        base_env: BaseEnv,
        policies: Dict[str, Policy],
        episode: Episode,
        env_index: int,
        **kwargs,
    ):
        assert episode.length == 0, (
            "ERROR: `on_episode_start()` callback should be called right "
            "after env.reset()."
        )
        
        episode.user_data["mymetric"] = []
        
    def on_episode_step(
        self,
        *,
        worker: RolloutWorker,
        base_env: BaseEnv,
        policies: Dict[str, Policy],
        episode: Episode,
        env_index: int,
        **kwargs,
    ):
        # Make sure this episode is ongoing.
        assert episode.length > 0, (
            "ERROR: `on_episode_step()` callback should not be called right "
            "after env.reset()"
        )
        
        mymetric = np.random.random(policies["default_policy"].config["train_batch_size"])
        
        episode.user_data["mymetric"].append(np.max(mymetric))
    
    def on_episode_end(
        self,
        *,
        worker: RolloutWorker,
        base_env: BaseEnv,
        policies: Dict[str, Policy],
        episode: Episode,
        env_index: int,
        **kwargs,
    ):
        episode.custom_metrics["mymetric"] = np.max(episode.user_data["mymetric"])

def env_creator(config=None):    
    name = config.get("name", "MiniGrid-ObstructedMaze-2Dlhb-v0")
    env = gym.make(name)
    env = gym_minigrid.wrappers.ImgObsWrapper(env)
    
    return env

register_env("mini-grid", env_creator)

CONV_FILTERS = [[32, [5, 5], 1], [64, [3, 3], 2], [64, [4, 4], 1]]
config = ppo.DEFAULT_CONFIG.copy()
config["env"] = "mini-grid"
config["num_envs_per_worker"] = 4
config["model"]["post_fcnet_hiddens"] = [256, 256]
config["model"]["post_fcnet_activation"] = "relu"
config["model"]["conv_filters"] = CONV_FILTERS
#config["log_level"] = "INFO"

ray.init(local_mode=True, ignore_reinit_error=True)
tune.run(
    "PPO",
    config=config,
    stop={
        "timesteps_total": 10000,
    }, 
    verbose=1,
    num_samples=1,
    checkpoint_freq=10,
)
ray.shutdown()

Issue Severity

Low: It annoys or frustrates me.

@simonsays1980 simonsays1980 added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels May 12, 2022
@krfricke krfricke added the rllib RLlib related issues label May 19, 2022
@kouroshHakha kouroshHakha added P2 Important issue, but not time-critical enhancement Request for new feature and/or capability rllib-logging This problem is related to logging metrics and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) bug Something that is supposed to be working; but isn't labels May 24, 2022
@cwfparsonson

I am also seeing this unnecessary duplicate logging of custom_metrics.
