How can I run a trained model? Can't run test_huggingface_import.py #119

linlong1314 · 2023-07-19T23:51:08Z

How can I run a trained model (Include/Projects/add/model.pt)? Running test_huggingface_import.py directly fails with:

  File ".\minGPT\master\mingpt\model.py", line 202, in from_pretrained
    assert len(keys) == len(sd)
(act): NewGELUActivation(

MkuuWaUjinga · 2023-07-21T19:55:42Z

This is caused by minGPT's custom implementation of multi-head attention. The CausalSelfAttention module registers the causal mask as a buffer so that each token attends only to tokens to its left in the input sequence. By default, state_dict() returns buffers as part of the model's state. PyTorch's native MultiheadAttention module handles this differently: the mask is not part of the model's state dict. The two state dicts therefore end up with a different number of keys, which is why the assertion fails.

You can fix that by passing persistent=False to register_buffer where the mask is registered in the __init__ method of the CausalSelfAttention module.
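To illustrate the effect of the flag, here is a minimal sketch rather than the actual minGPT code. TinyCausalMask is a hypothetical stand-in for CausalSelfAttention, and the buffer name "bias" and the mask shape are assumptions made for the example. It shows that a buffer registered with persistent=False stays available on the module but no longer appears in state_dict():

```python
import torch
import torch.nn as nn


class TinyCausalMask(nn.Module):
    """Hypothetical stand-in for the mask handling in CausalSelfAttention."""

    def __init__(self, block_size: int, persistent: bool):
        super().__init__()
        # One real parameterized layer so the state dict is not empty.
        self.proj = nn.Linear(4, 4)
        # Lower-triangular causal mask; the name "bias" is an assumption.
        mask = torch.tril(torch.ones(block_size, block_size))
        mask = mask.view(1, 1, block_size, block_size)
        # persistent=False keeps the buffer out of state_dict().
        self.register_buffer("bias", mask, persistent=persistent)


with_mask = TinyCausalMask(block_size=8, persistent=True)
without_mask = TinyCausalMask(block_size=8, persistent=False)

persistent_keys = sorted(with_mask.state_dict().keys())
non_persistent_keys = sorted(without_mask.state_dict().keys())

print(persistent_keys)      # includes the mask buffer
print(non_persistent_keys)  # only the nn.Linear parameters

# The mask is still usable in forward() even though it is not saved.
print(without_mask.bias.shape)
```

One thing to keep in mind: a checkpoint saved before this change still contains the mask entries, so loading it into the modified model with load_state_dict may report unexpected keys unless strict=False is passed or those entries are dropped first.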
