Can you share the training logs for Figure 2? We cannot get results similar to those shown in this figure. #3

shuyuandeqipa · 2020-12-13T02:28:31Z

No description provided.

shuyuandeqipa · 2020-12-13T02:29:38Z

-----------------------------------------------

Methods pred_prey_punish [test_return_mean in the log files]

OW-QMIX (w=0.1) 36.8333

OW-QMIX (w=0.5) 36.5000

CW-QMIX (w=0.1) 37.5417 (37.6667)

CW-QMIX (w=0.5) 37.5417 (36.9583)

QTRAN 38.0833

QPLEX 36.1667

QMIX 33.6250

COMA 0.0000

VDN 36.7083 (35.7500)

-----------------------------------------------

VDN 37.0833 run 1

VDN 36.1250 run 2

VDN 36.4167 run 3

QMIX 38.0417 run 1

QMIX 30.2083 run 2

QMIX 36.0000 run 3

QPLEX 36.7083 run 1

QPLEX 30.3333 run 2

QPLEX 24.5417 run 3

MADDPG 0

MASAC 0

-----------------------------------------------
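To make the run-to-run spread in the per-run list above easier to see, here is a minimal sketch that computes the mean and sample standard deviation for the three methods with three runs each. The numbers are copied from the list; the `runs` dictionary and the `summarize` helper are illustrative names, not part of the repository.

```python
from statistics import mean, stdev

# Final test_return_mean per run, copied from the list above.
runs = {
    "VDN": [37.0833, 36.1250, 36.4167],
    "QMIX": [38.0417, 30.2083, 36.0000],
    "QPLEX": [36.7083, 30.3333, 24.5417],
}


def summarize(values):
    """Return (mean, sample standard deviation) for one method's runs."""
    return mean(values), stdev(values)


for method, values in runs.items():
    m, s = summarize(values)
    print(f"{method:6s} mean={m:.4f} std={s:.4f}")
```

With only three runs per method the standard deviation is a rough indicator at best, but it does show that QMIX and QPLEX vary much more between runs than VDN.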

tabzraz · 2020-12-14T23:58:48Z

Are you annealing epsilon over 50k or over 1mil timesteps? For the results in Figure 2 of the paper, epsilon is annealed over 50k timesteps.

shuyuandeqipa · 2020-12-15T00:03:51Z

CUDA_VISIBLE_DEVICES=3 nohup python3 -u src/main.py --config=vdn_smac --env-config=pred_prey_punish with epsilon_anneal_time=1000000 use_tensorboard=True > ./wjx_logs_1211/vdn_smac_pred_prey_punish_tensorboard_V2.log 2>&1 &

I am annealing over 1 million timesteps (epsilon_anneal_time=1000000). We cannot reproduce the VDN, QMIX, and QPLEX results shown in Figure 2.

shuyuandeqipa · 2020-12-15T00:05:24Z

Is my parameter setting wrong?

tabzraz · 2020-12-15T00:18:58Z

Yeah, for Figure 2 in the paper set epsilon_anneal_time=50000 (or remove it altogether since 50k is the default).

It seems that setting it to 1mil helps performance (Appendix K.2 of https://openreview.net/forum?id=Rcmk0xxIQV shows results similar to yours, I think).
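For readers comparing the two settings, here is a rough sketch of how a linear epsilon schedule behaves with an anneal time of 50k versus 1mil timesteps. It is not the repository's code: the function name `linear_epsilon` and the endpoints `eps_start=1.0` and `eps_finish=0.05` are assumptions for illustration, so check the config files for the values actually used.

```python
def linear_epsilon(t, anneal_time, eps_start=1.0, eps_finish=0.05):
    """Decay epsilon linearly from eps_start to eps_finish over anneal_time
    timesteps, then hold it at eps_finish.

    eps_start and eps_finish are assumed values for illustration only.
    """
    delta = (eps_start - eps_finish) / anneal_time
    return max(eps_finish, eps_start - delta * t)


# Compare exploration at a few points in training for both settings.
for t in (0, 50_000, 500_000, 1_000_000):
    fast = linear_epsilon(t, anneal_time=50_000)
    slow = linear_epsilon(t, anneal_time=1_000_000)
    print(f"t={t:>9,d}  eps(50k)={fast:.3f}  eps(1mil)={slow:.3f}")
```

Under these assumptions, the 50k schedule reaches its floor almost immediately, while the 1mil schedule keeps the agents exploring for much longer, which may be why the two settings give such different returns on pred_prey_punish.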

shuyuandeqipa · 2020-12-15T00:24:59Z

Thanks for your help!
