(Idea) feature: update configuration for turn-based #241

YuriCat · 2022-01-22T00:47:22Z

Turn-based batch creation and zero-sum averaging are separate, independent features.
Both should also default to False for safety.
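
For readers following along, here is a minimal sketch of what keeping the two settings independent might look like. It is an illustration under assumptions, not the code in this PR: the key `zero_sum_averaging` appears in the commit messages, but `turn_based_training`, the helper names, and the mask-weighted formula are hypothetical stand-ins.

```python
# Hypothetical defaults: both switches are off unless the user opts in.
# 'turn_based_training' is an assumed key name; 'zero_sum_averaging' is
# taken from the commit messages in this PR.
DEFAULT_TRAIN_ARGS = {
    'turn_based_training': False,
    'zero_sum_averaging': False,
}


def zero_sum_average(values, masks, eps=1e-8):
    """Average each player's value with the negated opponent value.

    values: [v0, v1], value estimates for the two players.
    masks:  [m0, m1], 1.0 where an estimate is available, else 0.0.
    The result is antisymmetric: the two outputs sum to zero.
    """
    v0, v1 = values
    m0, m1 = masks
    denom = m0 + m1 + eps  # eps avoids division by zero when both are masked
    a0 = (v0 * m0 - v1 * m1) / denom
    a1 = (v1 * m1 - v0 * m0) / denom
    return [a0, a1]


def postprocess_values(values, masks, args):
    """Apply zero-sum averaging only when explicitly enabled.

    The check does not look at 'turn_based_training', so the two
    settings can be toggled independently of each other.
    """
    if args.get('zero_sum_averaging', False) and len(values) == 2:
        return zero_sum_average(values, masks)
    return list(values)


if __name__ == '__main__':
    values, masks = [0.6, -0.2], [1.0, 1.0]
    # With the defaults, values pass through unchanged.
    print(postprocess_values(values, masks, DEFAULT_TRAIN_ARGS))
    # Opting in to averaging only; turn-based batching stays off.
    opted_in = dict(DEFAULT_TRAIN_ARGS, zero_sum_averaging=True)
    print(postprocess_values(values, masks, opted_in))
```

Because both flags default to False, a game that is neither turn-based nor two-player zero-sum gets no special handling unless the user opts in.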

ikki407 · 2022-01-24T03:30:09Z

handyrl/train.py

@@ -165,7 +165,10 @@ def forward_prediction(model, hidden, batch, args):
        o = o.view(*batch['turn_mask'].size()[:2], -1, o.size(-1))
        if k == 'policy':
            # gather turn player's policies


This comment duplicates the comment on line 170.

ikki407 · 2022-01-24T03:37:09Z

Thank you. With this change, the two features are clearly separated, so users can understand and use each one easily.

ikki407 · 2022-01-24T08:33:37Z

Could you also update parameters.md?
https://github.com/DeNA/HandyRL/blob/master/docs/parameters.md

YuriCat added 4 commits January 22, 2022 09:13

feature: update configuration for turn-based

439be2b

fix: use zero_sum_averaging

1c38ca6

feature: 2p zero-sum setting should be False in default config

0c66a5c

fix: create dual-player batch when zero_sum_averaging is True

203c63b

ikki407 reviewed Jan 24, 2022

ikki407 mentioned this pull request Jan 24, 2022

Merge develop branch into master (2022/1) #244

Merged

YuriCat added 2 commits February 1, 2022 14:02

Merge develop

27fff3c

fix: debug comment

c246d3b
