canu/2.0 failed to find the number of jobs in 'correction/0-mercounts/meryl-count.sh' #1740
Comments
I'm pretty sure this changed between 1.9 and 2.0 because Meryl had some large-scale changes between those releases. I'd guess either the Meryl binary is failing (this is what gives the configuration) or it's something with PBSPro since that cluster manager seems to always be unstable and breaking. Can you post the output from the configure steps ( |
Hi skoren, Thank you for the quick reply. Here is what I saw in the output dir.
And as far as I can tell, the working directory for the PBS job is just the output directory passed through -d in the command I used to start canu, and rdna.seqStore is right in it, but somehow meryl thinks it is two levels above the output directory.
Cheers |
So I wanted to see the contents of Meryl's run, rdna.ms16.config.01.out |
Here you go
|
Hmm, that looks like you have a bad Canu installation. How did you install Canu? If it was through a package manager like conda, we do not recommend this; instead, I suggest you download the package hosted under the releases page. |
Here is how I installed canu/2.0.
|
If that is how you installed it, I don't think that is the version you're running, because then you wouldn't end up with a path like canu/2.0/bin/meryl. Have you tried giving the full path to the version you installed as you said above? |
I copied the folder to where all our applications are installed. Is it a show stopper?
|
Ah OK, that's why I was surprised by the path change. No, as long as it's finding all the binaries it shouldn't matter that the path is different. I just wanted to make sure it wasn't picking up some system-wide Canu installation instead of yours or mixing the two up. Everything in the failing Meryl command looks correct. It should be looking for the seqStore two levels up because it first changes directories; you can see this in the log:
So Meryl and all its output should be running under the correction/0-mercounts folder. It's weird that the output log (rdna.ms16.config.01.out) is in the top-level folder, unless you tried to run the command by hand after it failed. Is there an rdna.ms16.config.01.out in the correction/0-mercounts folder? Is there a correction/0-mercounts/meryl-configure.err file? If so, post both of those, assuming they're not identical to what you already posted. The other thing to check is the sequence store. What are the contents of your seqStore folder? Can you post the seqStore.err log? |
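To illustrate the relative-path point above, here is a minimal sketch in Python (not Canu's own Perl or shell). It assumes the store is referenced as `../../rdna.seqStore`, which is inferred from the description rather than copied from the actual command, and it uses a made-up directory layout (`scratch/asm` stands in for the `-d` directory). The same relative path finds the store from `correction/0-mercounts` but points two levels above the output directory when resolved from the top-level folder.

```python
import tempfile
from pathlib import Path


def resolve_store(cwd: Path, relative: str = "../../rdna.seqStore") -> Path:
    """Resolve a relative store path against a given working directory."""
    return (cwd / relative).resolve()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    out_dir = root / "scratch" / "asm"  # hypothetical -d output directory
    job_dir = out_dir / "correction" / "0-mercounts"
    job_dir.mkdir(parents=True)
    (out_dir / "rdna.seqStore").mkdir()

    # Resolved from the job folder, the path lands inside the output directory.
    found_from_job_dir = resolve_store(job_dir).exists()
    # Resolved from the output directory itself, it lands two levels above it.
    found_from_out_dir = resolve_store(out_dir).exists()

    print("found from correction/0-mercounts:", found_from_job_dir)
    print("found from output directory:", found_from_out_dir)
```
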
I didn't run any part of the job manually.
This is what the seqStore looks like
The seqStore.err is in the top-level folder and is 583 KB, so I just put the meaningful part here
I know the failure is quite mysterious; I don't understand it either. My initial guess is that a wrong correction/0-mercounts/meryl-count.sh script causes a failure during the meryl configuration, which in turn breaks some assumption about what the working directory should be at each step and leads to the failure. |
No, the error can't have anything to do with meryl-count.sh; the configure step comes before that, and it is the step that isn't running properly. I don't see how it can be ending up in the wrong folder; it is as if your system is ignoring the chdir command in Perl. That code hasn't changed in a long time. Are you able to run on a single node with useGrid=false without this error? |
Hi @skoren , |
@einzigsue after carefully reading your logs I think I see what is happening and this is a PBS-specific bug. Since PBS doesn't maintain working directories for submitted jobs, Canu generates the scripts to correctly change into their folder on start but then this wipes out the initial cd to the right place. So essentially the flow is:
and then configuration fails. It should be enough to remove the work directory shell code from the configure script. Commenting out line 399 in Meryl.pm should do it:
(if you don't want to build it you can change the file in |
Idle, resolved by commit above or the code change listed. |
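The failure mode described in this thread can be mimicked with a short Python simulation. This is only a sketch under stated assumptions: the function `run_configure` and its `submit_dir` parameter are hypothetical, the directory layout is made up, and the "jump back" step stands in for whatever work-directory shell code the configure script contained (on PBS this kind of code typically changes to the submission directory, for example the one recorded in `PBS_O_WORKDIR`, though the thread does not show the actual lines).

```python
import os
import tempfile
from pathlib import Path
from typing import Optional

STORE = "../../rdna.seqStore"  # assumed relative reference to the store


def run_configure(out_dir: Path, submit_dir: Optional[Path]) -> bool:
    """Mimic the described flow; return True if the store is found."""
    start = Path.cwd()
    try:
        # Step 1: the caller changes into the job folder.
        os.chdir(out_dir / "correction" / "0-mercounts")
        # Step 2 (the suspected bug): the script jumps back to the
        # submission directory, undoing step 1.
        if submit_dir is not None:
            os.chdir(submit_dir)
        # Step 3: the store is looked up relative to the current directory.
        return Path(STORE).exists()
    finally:
        os.chdir(start)  # restore so the temp directory can be removed


with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp).resolve() / "scratch" / "asm"  # hypothetical -d directory
    (out_dir / "correction" / "0-mercounts").mkdir(parents=True)
    (out_dir / "rdna.seqStore").mkdir()

    broken = run_configure(out_dir, submit_dir=out_dir)  # with the extra cd
    fixed = run_configure(out_dir, submit_dir=None)      # extra cd removed

    print("store found with extra cd:", broken)
    print("store found without it:", fixed)
```

Dropping step 2 corresponds to the suggested fix of removing the work directory shell code from the configure script.
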
I installed canu/2.0 under CentOS 8 on our cluster and we tested the installation with two cases and both of them failed with the following error in the file canu.out.
Is it because the file correction/0-mercounts/meryl-count.sh is not correctly generated in the following lines?
Does canu use some sort of template to generate the meryl-count script? Have there been any recent changes since version 1.9? We have not had any issues with the earlier version 1.9.
Here is the command we used to start canu. The jobwrapper.sh passed to gridEngineSubmitCommand, together with the other grid-engine-related options, is how we adapt canu to our cluster; it should have very little impact on how the file correction/0-mercounts/meryl-count.sh is generated.
Let us know your diagnosis
Yue