flo failed on Large genome #33
Comments
Not sure if the error is coming from GNU parallel or blat. The contents of If it's GNU parallel, you could try using a newer version. The version that the install script installs is quite old. If it's blat, it is possible that 256 GB is not sufficient memory for the task. Did you monitor the memory usage using
Here is the blat joblog:
I was wondering what the best way to update parallel would be: do I install a new version or update the one in /ext/parallel-20150722?
Best to install it in a new folder. You can tell flo about the new folder using the :add_to_path: key in the config file.
I changed to an r5.16xlarge instance, used a newer parallel, and lowered the number of parallel jobs from 21 to 10, but I still get the same error. Any suggestions? Thanks!
Sorry, I am not quite sure what is happening here. I have not encountered this error before. From the information we have in this thread, it might well be a bug in blat. It might be worth running the blat commands listed in joblst.blat one by one to check whether all the chunks fail with the above error, or only one in particular. With an isolated example it might then be worth asking on blat's mailing list.

Just to be sure, is it possible that the ooc file you constructed uses a different tileSize than the one you are using for running blat? I guess not, because you have

Did you compile blat yourself, or did you download a pre-compiled executable (e.g., using the install script)? It is possible that there is a difference in glibc between your instance and the host on which blat was compiled, in which case compiling blat yourself can help. But this is the kind of issue where you would be better off getting help on blat's mailing list.

I used flo on a ~400 Mb genome, split into 40 chunks, so 10 Mb per chunk. I wonder if increasing the number of processes, so that each chunk is smaller, helps.

Lastly, I would quickly check the FASTA and PSL files for each chunk, just to make sure we are not missing something too obvious.
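The suggestion above to run the commands in joblst.blat one by one could be sketched roughly as follows. This is only an illustration, not part of flo: the function name `run_serially` is hypothetical, and it assumes the job list holds one complete shell command per line, which is how `parallel -a` consumes it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of flo): run each line of a job list
# one at a time and report the exit status of every command.
run_serially() {
  local joblist=$1 n=0 failed=0 cmd status
  while IFS= read -r cmd || [ -n "$cmd" ]; do
    [ -z "$cmd" ] && continue            # skip blank lines
    n=$((n + 1))
    # Run each command in its own shell so a crash does not stop the loop.
    bash -c "$cmd" </dev/null
    status=$?
    # One report line per job: job number, exit status, command.
    printf '%d\t%d\t%s\n' "$n" "$status" "$cmd"
    if [ "$status" -ne 0 ]; then
      failed=$((failed + 1))
    fi
  done < "$joblist"
  printf 'failed: %d of %d\n' "$failed" "$n"
  [ "$failed" -eq 0 ]                    # non-zero return if any job failed
}
```

Calling `run_serially run/joblst.blat` (the path shown in the log) would then show whether every chunk fails or only some. A process aborted by glibc's heap checks usually reports status 134 (128 + SIGABRT), so that value in the second column would point at the same memory corruption rather than a different failure.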
Hi
Here is what happened when I ran blat on one chunk:
Hello!
flo failed on a 14 Gb genome with a "corrupted double-linked list (not small)" error. It runs normally on genomes smaller than 4 Gb. I am running it on an AWS m5.16xlarge EC2 instance.
rake -f /home/ubuntu/flo/Rakefile &
mkdir run
cp /home/ubuntu/s.fa run/source.fa
cp /home/ubuntu/t.fa run/target.fa
faToTwoBit run/source.fa run/source.2bit
faToTwoBit run/target.fa run/target.2bit
twoBitInfo run/source.2bit stdout | sort -k2nr > run/source.sizes
twoBitInfo run/target.2bit stdout | sort -k2nr > run/target.sizes
faSplit sequence run/target.fa 21 run/chunk_
parallel --joblog run/joblog.faSplit -j 21 -a run/joblst.faSplit
Academic tradition requires you to cite works you base your article on.
When using programs that use GNU Parallel to process data for publication
please cite:
O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
;login: The USENIX Magazine, February 2011:42-47.
This helps funding further development; and it won't cost you a cent.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.
To silence the citation notice: run 'parallel --bibtex'.
123322 pieces of 123923 written
133957 pieces of 134763 written
150983 pieces of 152743 written
156478 pieces of 157558 written
98419 pieces of 99073 written
99082 pieces of 99724 written
103154 pieces of 103663 written
113555 pieces of 113991 written
118767 pieces of 119728 written
123551 pieces of 124526 written
141741 pieces of 142672 written
144495 pieces of 146237 written
130388 pieces of 131310 written
147572 pieces of 148896 written
138549 pieces of 140111 written
141907 pieces of 142961 written
149246 pieces of 150844 written
149613 pieces of 150822 written
197774 pieces of 198899 written
160747 pieces of 162550 written
167525 pieces of 170389 written
parallel --joblog run/joblog.blat -j 21 -a run/joblst.blat
Academic tradition requires you to cite works you base your article on.
When using programs that use GNU Parallel to process data for publication
please cite:
O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
;login: The USENIX Magazine, February 2011:42-47.
This helps funding further development; and it won't cost you a cent.
If you pay 10000 EUR you should feel free to use GNU Parallel without citing.
To silence the citation notice: run 'parallel --bibtex'.
corrupted double-linked list (not small)
free(): invalid next size (normal)
free(): invalid next size (normal)
double free or corruption (!prev)
double free or corruption (!prev)
malloc(): smallbin double linked list corrupted
free(): invalid next size (normal)
malloc(): memory corruption
free(): invalid next size (normal)
double free or corruption (!prev)
free(): invalid next size (normal)
double free or corruption (!prev)
double free or corruption (!prev)
rake aborted!
Command failed with status (21): [parallel --joblog run/joblog.blat -j 21 -a...]
/home/ubuntu/flo/Rakefile:153:in `parallel'
/home/ubuntu/flo/Rakefile:99:in `block in <top (required)>'
/home/ubuntu/flo/Rakefile:37:in `block in <top (required)>'
Tasks: TOP => run/liftover.chn
(See full trace by running task with --trace)
[1]+ Exit 1 rake -f /home/ubuntu/flo/Rakefile
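To see which chunks failed in a run like the one above, the joblog written by `parallel --joblog` can be filtered for non-zero exit values. The sketch below is not part of flo and the function name `failed_jobs` is hypothetical. It relies on GNU parallel's tab-separated joblog layout (Seq, Host, Starttime, JobRuntime, Send, Receive, Exitval, Signal, Command). GNU parallel's own exit status counts the failed jobs, so "status (21)" with 21 chunks suggests that every blat job failed, though the joblog is the place to confirm that.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of flo): list jobs from a GNU parallel
# joblog whose Exitval (column 7) or Signal (column 8) is non-zero.
failed_jobs() {
  # Skip the header line, then print Seq, Exitval, Signal and Command.
  awk -F '\t' 'NR > 1 && ($7 != 0 || $8 != 0) {
    print $1 "\t" $7 "\t" $8 "\t" $9
  }' "$1"
}
```

For this run the call would be `failed_jobs run/joblog.blat`. An empty result would mean no job recorded a failure in the joblog. Either the Exitval or the Signal column may be non-zero for an aborted blat process.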