Traceback (most recent call last):
  File "trainv3.py", line 356, in <module>
    main(args)
  File "trainv3.py", line 293, in main
    flag = callback_verification(global_step, backbone)
  File "/home/ctv.thanhly/FaceNet/ArcFacePytorch/utils/utils_callbacks.py", line 79, in __call__
    flag = self.ver_test(backbone, num_update)
  File "/home/ctv.thanhly/FaceNet/ArcFacePytorch/utils/utils_callbacks.py", line 35, in ver_test
    self.ver_list[i], backbone, 10, 10)
  File "/home/ctv.thanhly/FaceNet/ArcFacePytorch/eval/verification.py", line 431, in test
    net_out: torch.Tensor = backbone(img)
  File "/home/ctv.thanhly/miniconda3/envs/arcface2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ctv.thanhly/miniconda3/envs/arcface2/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 799, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/ctv.thanhly/miniconda3/envs/arcface2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ctv.thanhly/FaceNet/ArcFacePytorch/backbones/efficientformerv2Custome.py", line 650, in forward
    x = self.feature(self.head_dropout(x))
  File "/home/ctv.thanhly/miniconda3/envs/arcface2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ctv.thanhly/miniconda3/envs/arcface2/lib/python3.7/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/home/ctv.thanhly/miniconda3/envs/arcface2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ctv.thanhly/miniconda3/envs/arcface2/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 178, in forward
    self.eps,
  File "/home/ctv.thanhly/miniconda3/envs/arcface2/lib/python3.7/site-packages/torch/nn/functional.py", line 2282, in batch_norm
    input, weight, bias, running_mean, running_var, training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: CUDA error: an illegal memory access was encountered
terminate called after throwing an instance of 'c10::CUDAError'
what(): CUDA error: an illegal memory access was encountered
Exception raised from query at /pytorch/aten/src/ATen/cuda/CUDAEvent.h:95 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f6259e24a22 in /home/ctv.thanhly/miniconda3/envs/arcface2/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: c10d::ProcessGroupNCCL::WorkNCCL::finishedGPUExecutionInternal() const + 0x132 (0x7f62ff23d342 in /home/ctv.thanhly/miniconda3/envs/arcface2/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #2: c10d::ProcessGroupNCCL::WorkNCCL::isCompleted() + 0x50 (0x7f62ff23efa0 in /home/ctv.thanhly/miniconda3/envs/arcface2/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #3: c10d::ProcessGroupNCCL::workCleanupLoop() + 0x11c (0x7f62ff23f9bc in /home/ctv.thanhly/miniconda3/envs/arcface2/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #4: <unknown function> + 0xd6de4 (0x7f62ffe06de4 in /lib/x86_64-linux-gnu/libstdc++.so.6)
frame #5: <unknown function> + 0x9609 (0x7f63061c0609 in /lib/x86_64-linux-gnu/libpthread.so.0)
frame #6: clone + 0x43 (0x7f6305f8d293 in /lib/x86_64-linux-gnu/libc.so.6)
Hey bro,
I designed an "ArcFace Dynamic Margin" module for my task, based on your module, but I ran into a CUDA error. If you know what causes it, can you help me?
ArcFace Dynamic Margin
CUDA error: