Update cpp_fmriprep.slurm #26

Merged: 2 commits, Jul 2, 2024

Conversation

marcobarilari
Contributor

Do these numbers make sense?

@marcobarilari
Contributor Author

They seem to be the resource limits for lemaitre3 (deduced from https://www.ceci-hpc.be/scriptgen.html), and # SBATCH --ntasks-per-node=2, from Filippo's tests, does not seem to make any difference.

@yyang1234

Hi, my problem isn't related to memory, but I tried running with these numbers and they work. So they make sense.

@marcobarilari
Contributor Author

marcobarilari commented Nov 8, 2023

If, with these numbers, you were able to run the problematic subjects to the end without error, THEN (1) that is very good news and (2) it is very likely that your "could not find file X" problem was indeed related to memory.

Possible explanation, in a nutshell: we provided too little RAM, so fmriprep was forced to free some up during certain processes. It therefore deleted cached files that were supposed to be used later, and because they were no longer there, it dumbly threw the error you saw.

LMK if it makes sense to you

@yyang1234

Hi, what you said is an important point. But the error I ran into appeared even with these new numbers. It was solved by clearing the contents of both the output folder and the scratch folder.
Now, in the slurm file, I define the work directory as --work-dir /scratch_dir/work-fmriprep/"${subjID}" \ instead of --work-dir /scratch_dir/work-fmriprep \ so that the scratch folders of different subjects are less likely to get mixed up. Does that make sense?
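
A minimal sketch of the per-subject work directory idea described above, with the clearing step folded in. The function name `subject_work_dir`, the demo scratch location, and the subject label are assumptions made for illustration; they are not taken from the repo's slurm file. Only the `work-fmriprep/"${subjID}"` layout and the `--work-dir` option come from the comment.

```shell
#!/bin/bash
# Hypothetical helper: build a per-subject fMRIPrep work directory path.
# Both the function name and its arguments are assumptions for this sketch.
subject_work_dir() {
    local scratch_root="$1"
    local subjID="$2"
    printf '%s/work-fmriprep/%s\n' "${scratch_root}" "${subjID}"
}

# Example values (assumed; replace with the cluster's real scratch path).
scratch_dir="${TMPDIR:-/tmp}/scratch_demo"
subjID="sub-01"

work_dir="$(subject_work_dir "${scratch_dir}" "${subjID}")"

# Start from an empty work directory so that files cached by an earlier run
# of the same subject cannot be reused. ${work_dir:?} aborts if it is empty.
rm -rf "${work_dir:?}"
mkdir -p "${work_dir}"

# The resulting path would then be passed to fmriprep as:
#   --work-dir "${work_dir}"
echo "${work_dir}"
```

Clearing the directory discards any cached intermediate results, so this trades rerun speed for a clean state; whether that trade is worth it depends on how often the error recurs.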

@yyang1234

I also found that when, for instance, a single subject has three tasks, running the first task works without any issues. However, when I subsequently run the second task for the same subject, an error occurs. I now run all tasks for a subject in one go to avoid the issue.

@marcobarilari
Contributor Author

Ah, interesting. This could be important to change in this repo as well. I will run some trials.

@marcobarilari
Contributor Author

> I also found that when, for instance, a single subject has three tasks, running the first task works without any issues. However, when I subsequently run the second task for the same subject, an error occurs. I now run all tasks for a subject in one go to avoid the issue.

What was the error? Let's share it in the troubleshooting section of this repo.

@yyang1234

The same error I posted before. I think the error is related to reusing either the output folder or the scratch folder for the same participant.

@marcobarilari marcobarilari merged commit c5fa574 into main Jul 2, 2024
0 of 2 checks passed
@marcobarilari marcobarilari deleted the marco_update-fmriprep-slurm-param branch July 2, 2024 14:52
2 participants