
[fix] Avoid problem during model averaging when there is parameter-tying. #2113

Merged
merged 1 commit into from
Nov 6, 2023

Conversation

tzyll
Contributor

@tzyll tzyll commented Nov 5, 2023

No description provided.

@xingchensong
Member

汤老师 (Teacher Tang), could you give a concrete example? In which cases does it go wrong?

@Mddct
Collaborator

Mddct commented Nov 6, 2023

If shared parameters in a PyTorch state dict are stored as references to the same value, that is indeed a problem.

@xingchensong
Member

If shared parameters in a PyTorch state dict are stored as references to the same value, that is indeed a problem.

Something like the parameter sharing between emb and lm_head?

@Mddct
Collaborator

Mddct commented Nov 6, 2023

If shared parameters in a PyTorch state dict are stored as references to the same value, that is indeed a problem.

Something like the parameter sharing between emb and lm_head?

Yes, and also between emb and output.hidden. If they are references, the same parameter may be averaged multiple times.
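A minimal sketch of the failure mode described above, in plain Python (no torch; the key names and the list-of-floats "buffers" are hypothetical stand-ins for tied tensors). Two keys alias the same underlying buffer, so per-key in-place averaging hits it twice:

```python
# Simulate a state_dict where "emb.weight" and "lm_head.weight" are tied:
# both keys reference the SAME underlying buffer (here, a list of floats).
tied = [2.0]
state = {"emb.weight": tied, "lm_head.weight": tied, "bias": [4.0]}

# A second checkpoint to average with, applying avg[k] = (avg[k] + ckpt[k]) / 2
# in place for every key.
ckpt = {"emb.weight": [4.0], "lm_head.weight": [4.0], "bias": [6.0]}

for k, v in state.items():
    for i in range(len(v)):
        v[i] = (v[i] + ckpt[k][i]) / 2

# The shared buffer is averaged once per aliasing key:
# 2.0 -> 3.0 via "emb.weight", then 3.0 -> 3.5 via "lm_head.weight".
print(state["emb.weight"])  # [3.5], not the expected [3.0]
print(state["bias"])        # [5.0], the untied parameter is correct
```

With real tensors the aliasing is the same phenomenon: both state-dict entries point at one storage, so any per-key in-place update is applied once per alias.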

@xingchensong
Member

OK, got it, thanks!

@xingchensong xingchensong merged commit 6afceb5 into wenet-e2e:main Nov 6, 2023
6 checks passed
@tzyll
Contributor Author

tzyll commented Nov 6, 2023

Another example: if two encoders in one network share a single GlobalCMVN, then encoder1.global_cmvn and encoder2.global_cmvn have different keys but point to the same memory, so the parameter would be processed twice.
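One way to sketch the fix for the GlobalCMVN case above, again in plain Python (the key names are hypothetical, and `id()` stands in for comparing tensor storage pointers): deduplicate by object identity so each underlying buffer is averaged exactly once, no matter how many keys alias it.

```python
# Two encoder keys alias one shared buffer (a stand-in for a tied tensor).
shared = [2.0]
state = {"encoder1.global_cmvn.mean": shared, "encoder2.global_cmvn.mean": shared}
ckpt = {"encoder1.global_cmvn.mean": [4.0], "encoder2.global_cmvn.mean": [4.0]}

seen = set()
for k, v in state.items():
    if id(v) in seen:      # skip aliases of an already-averaged buffer
        continue
    seen.add(id(v))
    for i in range(len(v)):
        v[i] = (v[i] + ckpt[k][i]) / 2

print(state["encoder1.global_cmvn.mean"])  # [3.0], averaged exactly once
```

With PyTorch tensors one would key the `seen` set on the tensor's storage (e.g. `data_ptr()`) rather than `id()` of the Python wrapper, since distinct tensor objects can view the same storage.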

