[HFUT Student] Problem when using the CNCD_F model [I have solved the problem] #24

Closed
senker216 opened this issue Dec 21, 2024 · 1 comment

Comments

@senker216

Code that triggers the error:
from edustudio.quickstart import run_edustudio

run_edustudio(
    dataset='AAAI_2023',
    cfg_file_name=None,
    traintpl_cfg_dict={
        'cls': 'GeneralTrainTPL',
    },
    datatpl_cfg_dict={
        'cls': 'CNCDFDataTPL',
    },
    modeltpl_cfg_dict={
        'cls': 'CNCD_F'
    },
    evaltpl_cfg_dict={
        'clses': ['PredictionEvalTPL'],
    }
)
The error is:
2024-12-21 11:49:40[ERROR]: Traceback (most recent call last):
  File "E:\EduStudio-main\edustudio\quickstart\quickstart.py", line 58, in run_edustudio
    traintpl.start()
  File "E:\EduStudio-main\edustudio\traintpl\gd_traintpl.py", line 79, in start
    metrics = self.one_fold_start(fold_id)
  File "E:\EduStudio-main\edustudio\traintpl\general_traintpl.py", line 53, in one_fold_start
    self.fit(train_loader=self.train_loader, valid_loader=self.valid_loader)
  File "E:\EduStudio-main\edustudio\traintpl\general_traintpl.py", line 96, in fit
    loss_dict = self.model.get_loss_dict(**batch_dict)
  File "E:\EduStudio-main\edustudio\model\CD\cncd_f.py", line 177, in get_loss_dict
    return self.get_main_loss(**kwargs)
  File "E:\EduStudio-main\edustudio\model\CD\cncd_f.py", line 170, in get_main_loss
    pd = self(stu_id, exer_id).flatten()
  File "D:\anac\envs\wl\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\EduStudio-main\edustudio\model\CD\cncd_f.py", line 144, in forward
    items_Q_mat = self.Q_mat[exer_id].to(self.traintpl_cfg['device'])
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
This error means that during a PyTorch indexing operation, the indexed tensor and the indices are not on the same device. After adding debug output, I found:
stu_id device: cuda:0
exer_id device: cuda:0
self.word_ids device: cuda:0
self.Q_mat device: cpu
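Debug output like the above can be produced with a small helper. This is only a sketch: `report_devices` is a hypothetical name that is not part of EduStudio, and the tensors below are CPU stand-ins for `exer_id` and `self.Q_mat`.

```python
import torch

def report_devices(**tensors):
    """Return a {name: device string} mapping for the given tensors."""
    return {name: str(t.device) for name, t in tensors.items()}

# Stand-in tensors (assumption); in the issue the real ones were
# stu_id, exer_id, self.word_ids and self.Q_mat inside the model.
exer_id = torch.tensor([0, 2])
Q_mat = torch.zeros(3, 4)

for name, device in report_devices(exer_id=exer_id, Q_mat=Q_mat).items():
    print(f"{name} device: {device}")
```

If the printed devices differ between the indexed tensor and its indices, indexing raises the RuntimeError shown in the traceback.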
The root cause is that self.Q_mat was never moved to the GPU. I modified cncd_f.py as follows:

def build_model(self):
    # prediction sub-net
    self.textcnn = TextCNN(self.traintpl_cfg["batch_size"], self.n_cpt, self.text_dim)
    self.word2id = self.textcnn.prepare_embedding(self.content_list)
    self.word_ids = self.get_from_text().to(self.traintpl_cfg['device'])  # make sure it is on the configured device

    # make sure Q_mat is on the correct device
    self.Q_mat = self.Q_mat.to(self.traintpl_cfg['device'])

    # print device info (debug only)
    print(f"self.word_ids device: {self.word_ids.device}")
    print(f"self.Q_mat device: {self.Q_mat.device}")

    self.out_text_factor = nn.Linear(self.text_dim, 1)
    self.student_emb = nn.Embedding(self.n_user, self.n_cpt+1)
    self.k_difficulty = nn.Embedding(self.n_item, self.n_cpt)
    self.e_difficulty = nn.Embedding(self.n_item, 1)
    self.pd_net = PosMLP(
        input_dim=self.n_cpt+1, output_dim=1, activation=self.modeltpl_cfg['activation'],
        dnn_units=self.modeltpl_cfg['dnn_units'], dropout_rate=self.modeltpl_cfg['dropout_rate']
    )

def forward(self, stu_id, exer_id):
    # make sure the indices and the tensors are on the same device
    stu_id = stu_id.to(self.traintpl_cfg['device'])
    exer_id = exer_id.to(self.traintpl_cfg['device'])

    # before prednet
    items_Q_mat = self.Q_mat[exer_id].to(self.traintpl_cfg['device'])  # move to GPU
    items_content = self.word_ids[exer_id]
    text_embedding = self.textcnn(items_content)
    text_factor = torch.sigmoid(self.out_text_factor(text_embedding))
    stu_emb = self.student_emb(stu_id)
    stat_emb = torch.sigmoid(stu_emb)
    k_difficulty = torch.sigmoid(self.k_difficulty(exer_id))
    k_difficulty = torch.cat((k_difficulty, text_factor), dim=1)
    e_difficulty = torch.sigmoid(self.e_difficulty(exer_id)) * self.modeltpl_cfg['disc_scale']
    # prednet
    text_factor_q = torch.ones((items_Q_mat.shape[0], 1), device=self.traintpl_cfg['device'])
    input_knowledge_point = torch.cat((items_Q_mat, text_factor_q), dim=1)
    input_x = e_difficulty * (stat_emb - k_difficulty) * input_knowledge_point
    pd = self.pd_net(input_x).sigmoid()
    return pd

Replacing the corresponding code in cncd_f.py with the version above resolves the error.
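An alternative to moving the Q matrix by hand, shown here only as a sketch and not as the change made in this issue: in PyTorch, a tensor registered with `register_buffer` follows the module when `.to(device)` is called, so it ends up on the same device as the parameters. `TinyQModel` and its toy Q matrix are hypothetical and much simpler than CNCD_F.

```python
import torch
import torch.nn as nn

class TinyQModel(nn.Module):
    """Hypothetical toy model: a Q matrix plus one embedding per exercise."""

    def __init__(self, Q_mat):
        super().__init__()
        # A registered buffer is moved together with the module by .to(device).
        self.register_buffer("Q_mat", Q_mat.float())
        self.k_difficulty = nn.Embedding(Q_mat.shape[0], Q_mat.shape[1])

    def forward(self, exer_id):
        # Put the indices on the same device as the indexed tensor.
        exer_id = exer_id.to(self.Q_mat.device)
        items_Q_mat = self.Q_mat[exer_id]
        return torch.sigmoid(self.k_difficulty(exer_id)) * items_Q_mat

device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinyQModel(torch.tensor([[1, 0], [0, 1], [1, 1]])).to(device)
out = model(torch.tensor([0, 2]))  # indices created on CPU on purpose
print(model.Q_mat.device, out.device)
```

With this pattern the model never needs to read a device name from a config dictionary inside `forward`; whether it fits EduStudio's own model template is something the maintainers would have to confirm.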

@jsxie9
Contributor

jsxie9 commented Dec 24, 2024

Hello, I ran run_cncd_f_demo.py and could not reproduce the problem you described.

@jsxie9 jsxie9 closed this as completed Dec 26, 2024