EOFError #441
Comments
Please delete all the .npy files in the nnUNet_preprocessed folder of your task and try again.
(Do not delete the .npz files!) Also make sure your SSD is not full.
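The cleanup described above could be scripted roughly as follows. This is only a sketch: the helper names `delete_unpacked_npy` and `free_gib` are made up for illustration and are not part of nnU-Net, and the folder to clean is whatever task folder you pass in. By default it only lists what it would delete.

```python
from pathlib import Path
import shutil
import sys


def delete_unpacked_npy(task_dir, dry_run=True):
    """Return all .npy files below task_dir and delete them unless dry_run.

    Hypothetical helper, not an nnU-Net function. The glob pattern only
    matches names ending in .npy, so .npz files are never touched.
    """
    task_dir = Path(task_dir)
    matched = []
    for path in sorted(task_dir.rglob("*.npy")):
        if not path.is_file():
            continue
        if not dry_run:
            path.unlink()
        matched.append(path)
    return matched


def free_gib(path):
    """Free space in GiB on the filesystem that holds path."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python cleanup_npy.py <task folder inside nnUNet_preprocessed> [--delete]
    folder = sys.argv[1]
    really_delete = "--delete" in sys.argv[2:]
    files = delete_unpacked_npy(folder, dry_run=not really_delete)
    action = "Deleted" if really_delete else "Would delete"
    print(f"{action} {len(files)} .npy files; {free_gib(folder):.1f} GiB free")
```

Running it once without `--delete` shows how many files are affected and how much disk space is left before anything is removed.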
Thanks. It does work.
It still stops from time to time. Although continuing with -c works now, I have to delete the .npy files again and again.
Are you training in a Docker container?
No, I trained it locally on Linux.
Mhm, it appears that something is wrong with the location where you store the data. Is it a local SSD?
I have a similar problem: my training stops from time to time, so I have to resume it manually with -c. (I am training in a Docker container, though.)
Yes.
Hm, that is strange. Have you checked your RAM? Maybe the RAM was full and the system killed a background worker.
Thanks. It may be a RAM problem, because I run four folds at the same time.
Hm, you should be able to train multiple folds simultaneously. I do it all the time. The only thing to consider is that only one training can do the extraction of the files (npz -> npy) at a time, so if you train multiple folds you need to start just one fold first. Only once this fold is using the GPU can you start the others. If you already have the files extracted from a previous training, you can start all folds simultaneously.
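One way to tell whether the files are already extracted is to check that every .npz file has a matching .npy file. The sketch below assumes that the extracted file for case.npz is written next to it as case.npy; that naming is an assumption here, as are the helper names `missing_npy` and `safe_to_start_all_folds`, which are not nnU-Net functions. It only checks that the files exist, so a .npy file that was cut off while being written would still pass.

```python
from pathlib import Path


def missing_npy(data_dir):
    """List the .npz files below data_dir that have no .npy sibling.

    Assumes (not confirmed by the thread) that case.npz is extracted
    to case.npy in the same folder.
    """
    data_dir = Path(data_dir)
    return [
        npz
        for npz in sorted(data_dir.rglob("*.npz"))
        if not npz.with_suffix(".npy").is_file()
    ]


def safe_to_start_all_folds(data_dir):
    """True only if at least one .npz exists and none lacks its .npy."""
    has_npz = any(Path(data_dir).rglob("*.npz"))
    return has_npz and not missing_npy(data_dir)
```

If `safe_to_start_all_folds` returns False, start a single fold, wait until it is using the GPU, and only then start the others.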
Hi, Fabian.
I don't know how to solve this problem. What's more, when I use -c, training still continues from epoch 100, even though I have trained for over 300 epochs. Could you give me some advice? Thanks a lot.