feature(rjy): add crowd md env new, and multi-head policy #230

Open
wants to merge 21 commits into
base: main

Conversation

@nighood (Collaborator) commented Jun 7, 2024

  1. New Environment: CrowdSim

    • Description: The CrowdSim environment is a grid world simulation where robots navigate through an environment populated with humans. The primary task for the robots is to minimize the average age of information (AoI) of the humans by moving to their locations and collecting data. Key features of the environment include:
      • Dynamic Interaction: Humans generate data at a constant rate, and robots must manage their limited energy supply while moving to collect this data.
      • Modes:
        • Easy Mode: Robots can only collect data from humans within a certain range, and collecting data resets the AoI of a human to zero.
        • Hard Mode: Robots can collect data from humans even when not within range, and collecting data does not reset the total AoI.
      • Initialization: The environment starts with a dataset of human locations and timestamps. Robots aim to minimize the average AoI by efficiently collecting data.
      • Completion Criteria: An episode ends when the average AoI falls below a given threshold or the time limit is reached.
      • Additional Features: The environment provides methods for resetting, closing, and stepping, seeding for reproducibility, saving replay videos, and generating random actions, plus properties that expose the observation space, action space, and reward space.
  2. Multi-Head Policy Version for MuZero, EfficientZero, and Sampled EfficientZero

    • Modification: Introduced multi-head policy versions for the MuZero, EfficientZero, and Sampled EfficientZero algorithms.
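To make the Easy Mode rule from item 1 concrete, here is a minimal sketch of the AoI bookkeeping: every human's AoI grows by one per step and is reset to zero when a robot is within range. This is not the PR's implementation. The names `Human`, `step_aoi`, and `sensing_range`, and the use of Chebyshev distance on the grid, are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Human:
    x: int
    y: int
    aoi: int = 0  # steps since this human's data was last collected


def step_aoi(humans: List[Human], robots: List[Tuple[int, int]], sensing_range: int) -> float:
    """Advance one time step and return the mean AoI (toy Easy Mode rule)."""
    for h in humans:
        h.aoi += 1  # data ages at a constant rate
        # Assumption: Chebyshev distance on the grid decides whether a robot is in range.
        in_range = any(
            max(abs(h.x - rx), abs(h.y - ry)) <= sensing_range for rx, ry in robots
        )
        if in_range:
            h.aoi = 0  # Easy Mode: collecting data resets the AoI to zero
    return sum(h.aoi for h in humans) / len(humans)


humans = [Human(0, 0), Human(5, 5)]
mean_after_one = step_aoi(humans, robots=[(0, 1)], sensing_range=1)
mean_after_two = step_aoi(humans, robots=[(0, 1)], sensing_range=1)
```

In this toy run the first human stays at AoI 0 because a robot sits next to it, while the second human's AoI keeps growing. That gap is the quantity the robots are meant to reduce.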

@puyuan1996 added the environment (New or improved environment) and config (New or improved configuration) labels on Jun 7, 2024
@@ -93,7 +93,12 @@ def __init__(
cfg.main_config.exp_name = exp_name
self.origin_cfg = cfg
self.cfg = compile_config(
cfg.main_config, seed=seed, env=None, auto=True, policy=SampledEfficientZeroPolicy, create_cfg=cfg.create_config
cfg.main_config,
seed=seed,
Collaborator commented:

Is there an extra blank line here?

if observation_array.ndim == 3:
# Flatten the last two dimensions
observation_array = observation_array.reshape(batch_size, -1)
else:
raise ValueError("For 'mlp' model_type, the observation must have 3 dimensions [B, S, O]")

elif model_type == 'rgcn':
if observation_array.ndim == 4:
# TODO(rjy): strage process
Collaborator commented:

What does "strage process" mean?

activation: Optional[nn.Module] = nn.ReLU(inplace=True),
last_linear_layer_init_zero: bool = True,
norm_type: Optional[str] = 'BN',
self,
Collaborator commented:

Please run bash format.sh on this.

output_support_size: int = 601,
last_linear_layer_init_zero: bool = True,
activation: Optional[nn.Module] = nn.ReLU(inplace=True),
norm_type: Optional[str] = 'BN',
Collaborator commented:

Please restore the original indentation format here.

"""
Overview:
Relational graph convolutional network layer.
"""
Collaborator commented:

Please provide a reference link for the implementation here.

@@ -0,0 +1,113 @@
from typing import Union, Optional
Collaborator commented:

Rename the file to crowdsim_env.

@ENV_REGISTRY.register('crowdsim_lightzero')
class CrowdSimEnv(BaseEnv):

def __init__(self, cfg: dict = {}) -> None:
Collaborator commented:

Add an Overview comment.

return len(self.queue)


# # Example of using the InformationQueue class
Collaborator commented:

Please confirm that the example here runs correctly.

@ENV_REGISTRY.register('crowdsim_lightzero')
class CrowdSimEnv(BaseEnv):

def __init__(self, cfg: dict = {}) -> None:
Collaborator commented:

Add an Overview comment, and put the Chinese and English versions of the earlier documentation under the envs/ path here.
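As a rough illustration of the requested Overview comment, the sketch below uses the same "Overview:" docstring layout as the RGCN layer hunk earlier in this review. The class name `CrowdSimEnvSketch` and the docstring wording are hypothetical and are not taken from the PR. Replacing the `{}` default with `None` is a separate choice made in this sketch, not something the reviewer asked for.

```python
from typing import Optional


class CrowdSimEnvSketch:
    """
    Overview:
        Hypothetical stand-in for the CrowdSim environment class, showing where
        a class-level Overview comment would go.
    Interfaces:
        ``__init__``
    """

    def __init__(self, cfg: Optional[dict] = None) -> None:
        """
        Overview:
            Store a private copy of the environment configuration.
        Arguments:
            - cfg (:obj:`Optional[dict]`): Environment settings; ``None`` means no overrides.
        """
        # Copy the dict so instances never share one mutable configuration object.
        self.cfg = dict(cfg) if cfg is not None else {}


first = CrowdSimEnvSketch()
second = CrowdSimEnvSketch()
first.cfg["obs_mode"] = "1-dim-array"
```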

mcfg['obs_mode'] = '1-dim-array'
env = CrowdSimEnv(mcfg)
env.seed(314)
env.enable_save_replay('/home/nighoodRen/LightZero/result/test_replay')
Collaborator commented:

Replace the personal information with a template.
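One way to act on this comment is to build the replay directory from a relative default, with an optional override, instead of a user-specific home path. The helper `replay_dir`, the environment variable `CROWDSIM_REPLAY_DIR`, and the default `./result/test_replay` are all hypothetical names chosen for this sketch. Only `enable_save_replay` appears in the hunk above.

```python
import os
from pathlib import Path


def replay_dir(default: str = "./result/test_replay") -> Path:
    """Return a replay directory that does not embed a personal home path."""
    # CROWDSIM_REPLAY_DIR is a hypothetical override variable, not part of the PR.
    return Path(os.environ.get("CROWDSIM_REPLAY_DIR", default))


# Intended use with the call from the hunk above (left commented out because
# the environment class is not defined in this sketch):
# env.enable_save_replay(str(replay_dir()))
```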



@MODEL_REGISTRY.register('MuZeroModelMD')
class MuZeroModelMD(nn.Module):
Collaborator commented:

All newly added files should inherit from the existing ones to avoid redundant code; override only the methods that were modified. For example, this class should inherit from MuZeroModel. The corresponding comments also need to be updated.
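The pattern the reviewer describes can be sketched with plain Python classes: the subclass reuses everything from the base class and overrides only the method whose behaviour changes. `BaseModelSketch` and `MultiHeadModelSketch` are hypothetical stand-ins and do not reproduce the real `MuZeroModel` or `MuZeroModelMD` APIs. The per-head duplication of logits is a placeholder for whatever the multi-head policy actually computes.

```python
from typing import List, Sequence


class BaseModelSketch:
    """Hypothetical stand-in for an existing model class such as MuZeroModel."""

    def __init__(self, action_space_size: int) -> None:
        self.action_space_size = action_space_size

    def encode(self, obs: Sequence[float]) -> List[float]:
        # Shared logic that the subclass inherits unchanged.
        return [float(v) for v in obs]

    def policy_logits(self, latent: List[float]) -> List[float]:
        # Single head: one logit per action.
        return [sum(latent)] * self.action_space_size


class MultiHeadModelSketch(BaseModelSketch):
    """Overrides only the policy output; everything else comes from the base class."""

    def __init__(self, action_space_size: int, num_heads: int) -> None:
        super().__init__(action_space_size)
        self.num_heads = num_heads

    def policy_logits(self, latent: List[float]) -> List[List[float]]:
        # Placeholder multi-head output: one list of logits per head.
        single_head = super().policy_logits(latent)
        return [list(single_head) for _ in range(self.num_heads)]


model = MultiHeadModelSketch(action_space_size=3, num_heads=2)
logits = model.policy_logits(model.encode([1, 2]))
```

Because `encode` is not redefined in the subclass, a later fix to the base implementation reaches the multi-head variant automatically, which is the redundancy argument made in the comment above.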

Labels: config (New or improved configuration), environment (New or improved environment)

2 participants