
[RLlib] Duplicate custom metrics #24731

Open
simonsays1980 opened this issue May 12, 2022 · 1 comment
Labels
enhancement Request for new feature and/or capability P2 Important issue, but not time-critical rllib RLlib related issues rllib-logging This problem is related to logging metrics

Comments

@simonsays1980
Collaborator

What happened + What you expected to happen

What happened?

I wrote custom callbacks to collect some metrics, ran my experiment with Tune, and found the metrics twice in TensorBoard: once under tune/custom_metrics and a second time under tune/sampler_results/custom_metrics.

The reason for this behavior is that in the Trainer class the results dictionary is populated with the custom metrics twice in _compile_step_results():

  1. Directly to results["custom_metrics"]
  2. Indirectly, via summarize_episodes(), which collects them as well.

This bloats the TensorBoard metrics (many pages) and may also use more disk space than needed.
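The two paths above can be illustrated with a minimal sketch. This is hypothetical, simplified code (not RLlib's actual implementation) showing how writing the metrics both directly and via the summarized episode data leaves two identical copies in the result dict:

```python
import copy

def compile_step_results_sketch(custom_metrics):
    """Hypothetical stand-in for _compile_step_results() that mimics
    the duplication: one direct copy, one via the episode summary."""
    # Stand-in for summarize_episodes(), which also gathers the
    # episodes' custom metrics.
    sampler_results = {"custom_metrics": copy.deepcopy(custom_metrics)}
    return {
        "custom_metrics": copy.deepcopy(custom_metrics),  # 1. direct copy
        "sampler_results": sampler_results,               # 2. indirect copy
    }

results = compile_step_results_sketch({"mymetric_max": 0.9})
# Both paths now carry identical values, so downstream loggers
# (e.g. TensorBoard via Tune) report every custom metric twice.
assert results["custom_metrics"] == results["sampler_results"]["custom_metrics"]
```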

What did you expect to happen?

I expected to find my custom metrics only once, either in tune/custom_metrics or in tune/sampler_results/custom_metrics.

Easy solution

An easy solution would probably be to remove the line that copies the custom metrics directly into results["custom_metrics"], since summarize_episodes() already collects them.
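Until something like that lands upstream, a user-side workaround could strip the redundant top-level copy from each result dict before it is logged. A minimal sketch (the function name and approach are my own, not an RLlib API):

```python
def drop_duplicate_custom_metrics(result: dict) -> dict:
    """Return a shallow copy of a Tune result dict without the
    top-level custom_metrics, if it merely duplicates the copy
    nested under sampler_results."""
    nested = result.get("sampler_results", {}).get("custom_metrics")
    if nested is not None and result.get("custom_metrics") == nested:
        result = dict(result)          # shallow copy, keep caller's dict intact
        result.pop("custom_metrics")   # drop the redundant top-level copy
    return result

result = {
    "custom_metrics": {"mymetric": 0.5},
    "sampler_results": {"custom_metrics": {"mymetric": 0.5}},
}
cleaned = drop_duplicate_custom_metrics(result)
assert "custom_metrics" not in cleaned
assert cleaned["sampler_results"]["custom_metrics"] == {"mymetric": 0.5}
```

This could be applied, for example, in a custom Tune logger or callback that post-processes results before they reach TensorBoard.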

Versions / Dependencies

Linux Fedora 35
Python 3.9.0
ray dev2.0.0

Reproduction script

from typing import Dict

import gym
import gym_minigrid

import numpy as np

import ray
from ray.rllib.agents.callbacks import DefaultCallbacks
from ray.rllib.agents.ppo import ppo
from ray.rllib.env.base_env import BaseEnv
from ray.rllib.evaluation.episode import Episode
from ray.rllib.evaluation.rollout_worker import RolloutWorker
from ray.rllib.policy.policy import Policy
from ray.rllib.utils.framework import try_import_tf
from ray import tune
from ray.tune.registry import register_env
tf1, tf, tfv = try_import_tf()

class CustomMetricsCallbacks(DefaultCallbacks):
    
    def __init__(self):
        super().__init__()
        
    def on_episode_start(
        self,
        *,
        worker: RolloutWorker,
        base_env: BaseEnv,
        policies: Dict[str, Policy],
        episode: Episode,
        env_index: int,
        **kwargs,
    ):
        assert episode.length == 0, (
            "ERROR: `on_episode_start()` callback should be called right "
            "after env.reset()."
        )
        
        episode.user_data["mymetric"] = []
        
    def on_episode_step(
        self,
        *,
        worker: RolloutWorker,
        base_env: BaseEnv,
        policies: Dict[str, Policy],
        episode: Episode,
        env_index: int,
        **kwargs,
    ):
        # Make sure this episode is ongoing.
        assert episode.length > 0, (
            "ERROR: `on_episode_step()` callback should not be called right "
            "after env.reset()"
        )
        
        mymetric = np.random.random(policies["default_policy"].config["train_batch_size"])
        
        episode.user_data["mymetric"].append(np.max(mymetric))
    
    def on_episode_end(
        self,
        *,
        worker: RolloutWorker,
        base_env: BaseEnv,
        policies: Dict[str, Policy],
        episode: Episode,
        env_index: int,
        **kwargs,
    ):
        episode.custom_metrics["mymetric"] = np.max(episode.user_data["mymetric"])

def env_creator(config=None):    
    name = config.get("name", "MiniGrid-ObstructedMaze-2Dlhb-v0")
    env = gym.make(name)
    env = gym_minigrid.wrappers.ImgObsWrapper(env)
    
    return env

register_env("mini-grid", env_creator)

CONV_FILTERS = [[32, [5, 5], 1], [64, [3, 3], 2], [64, [4, 4], 1]]
config = ppo.DEFAULT_CONFIG.copy()
config["env"] = "mini-grid"
config["num_envs_per_worker"] = 4
config["model"]["post_fcnet_hiddens"] = [256, 256]
config["model"]["post_fcnet_activation"] = "relu"
config["model"]["conv_filters"] = CONV_FILTERS
#config["log_level"] = "INFO"

ray.init(local_mode=True, ignore_reinit_error=True)
tune.run(
    "PPO",
    config=config,
    stop={
        "timesteps_total": 10000,
    }, 
    verbose=1,
    num_samples=1,
    checkpoint_freq=10,
)
ray.shutdown()

Issue Severity

Low: It annoys or frustrates me.

@simonsays1980 simonsays1980 added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels May 12, 2022
@krfricke krfricke added the rllib RLlib related issues label May 19, 2022
@kouroshHakha kouroshHakha added P2 Important issue, but not time-critical enhancement Request for new feature and/or capability rllib-logging This problem is related to logging metrics and removed triage Needs triage (eg: priority, bug/not-bug, and owning component) bug Something that is supposed to be working; but isn't labels May 24, 2022
@cwfparsonson

I am also seeing this unnecessary duplicate logging of custom_metrics.
