Fix multi_tensor adam/momentum bug when the parameter is list of dict #47352

sneaxiy · 2022-10-26T05:31:35Z

PR types

Bug fixes

PR changes

APIs

Describe

When use_multi_tensor=True in the Adam or Momentum optimizer and the parameters argument is a list of dicts (one dict per parameter group), the original code was wrong because self._param_dict could only handle the first parameter group. The other parameter groups reused the self._param_dict built for the first group, which caused the bug.

This PR fixes the bug by adding param_group_idx to some methods in Optimizer.
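
To illustrate the failure mode described above, here is a minimal toy sketch in plain Python. It is not Paddle's actual implementation: the class names, the `_init_cache` helper, and the use of strings in place of tensors are all hypothetical. Only `_param_dict` and `param_group_idx` are names taken from this PR. The sketch assumes the bug amounts to a single cache being filled by the first parameter group and then reused by every later group.

```python
class BuggyOptimizerSketch:
    """Hypothetical toy: one shared cache for all parameter groups."""

    def __init__(self, param_groups):
        self.param_groups = param_groups
        self._param_dict = None  # filled once, by whichever group runs first

    def _init_cache(self, group):
        self._param_dict = list(group["params"])

    def step(self):
        touched = []
        for group in self.param_groups:
            if self._param_dict is None:
                self._init_cache(group)
            # Later groups silently reuse the first group's cache.
            touched.append(list(self._param_dict))
        return touched


class FixedOptimizerSketch:
    """Hypothetical toy: one cache slot per parameter group."""

    def __init__(self, param_groups):
        self.param_groups = param_groups
        self._param_dict = [None] * len(param_groups)

    def _init_cache(self, group, param_group_idx):
        self._param_dict[param_group_idx] = list(group["params"])

    def step(self):
        touched = []
        for param_group_idx, group in enumerate(self.param_groups):
            if self._param_dict[param_group_idx] is None:
                self._init_cache(group, param_group_idx)
            touched.append(list(self._param_dict[param_group_idx]))
        return touched


# Parameters given as a list of dicts, one dict per parameter group.
groups = [{"params": ["w1", "b1"]}, {"params": ["w2"]}]

buggy = BuggyOptimizerSketch(groups).step()
fixed = FixedOptimizerSketch(groups).step()

print(buggy)  # second group wrongly sees the first group's parameters
print(fixed)  # each group sees its own parameters
```

In the buggy sketch the second group never gets its own cache, so its parameters would never be updated. Indexing the cache by param_group_idx keeps the groups separate.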

zhangbo9674

LGTM

fix multi_tensor adam/momentum bug

3f46ac1

sneaxiy changed the title ~~Fix multi_tensor adam/momentum bug~~ Fix multi_tensor adam/momentum bug when the parameter is list of dict Oct 26, 2022

sneaxiy closed this Oct 26, 2022

sneaxiy reopened this Oct 26, 2022

sneaxiy requested a review from zhangbo9674 October 26, 2022 08:19

zhangbo9674 approved these changes Oct 26, 2022

sneaxiy merged commit 4137c46 into PaddlePaddle:develop Oct 26, 2022

sneaxiy deleted the fix_multi_tensor_adam_and_momentum branch October 26, 2022 08:30

sneaxiy mentioned this pull request Oct 26, 2022

[Cherry-pick Release/2.4] Fix multi_tensor adam and momentum bug when the parameter is list of dict #47372

Merged
