Optimal parallelisation: -p N / --multicore N with bowtie2 #96
I think it's still the rule that each multicore will use 5 CPUs close to 100%. Is that right, @FelixKrueger?
I usually only change the --multicore N value according to the number of cores and memory available on the instance.
On Mar 15, 2017 07:56, sklages wrote:
Hi,
I am playing around with bismark using a few large human WGBS datasets.
Setup:
- a few 80-core servers, 1 TB RAM
- WGBS human datasets, Illumina PE125
- Bismark Version: v0.17.1_dev
- bowtie2 2.2.9
I am wondering which is the "optimal" combination of -p N / --multicore N.
No matter which value for -p I use, bowtie2 always runs on a single core. Using --multicore N spawns N (bowtie2) processes, just like it should.
When I use --multicore 8 -p 4 I would expect 8 bismark/bowtie2 processes, each running on 4 cores (using 32 cores in total). But it runs 8 bowtie2 processes, each running on one single core.
So obviously there is something wrong with my understanding of -p N / --multicore N.
What about using --multicore 70 together with -p 1? I could imagine that this way I/O becomes annoying with splitting/joining the data.
Any ideas?
best,
Sven
In my case, no: 8 multicores run on 8 CPUs, each at 100% -> a single thread/core each. For me it is the same either way (my top output is not included in this transcript).
As a general rule: I just ran a quick test over here (the commands and top screenshots are omitted in this transcript). You can see that the first run spawns 2 Bowtie2 threads, each using 300% CPU (you might have to wait until all instances have been spawned and are running fully). The second test spawns 6 Bowtie2 threads ((OT+OB) * 3), each using 300% CPU in top. I would personally go with what @avilella said and leave -p alone, only adjusting --multicore. I hope this clears things up a little?
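To put that rule of thumb into numbers, here is a small Python sketch. It is my own back-of-the-envelope illustration, not anything that ships with Bismark; the function name and the assumption that each bowtie2 process sits at roughly -p x 100% CPU are taken from the top output described in this thread.

```python
# Back-of-the-envelope helper based on the numbers quoted above.
# Not part of Bismark itself - just an illustration of the rule of thumb.

def bowtie2_footprint(parallel, p=1, directional=True):
    """Rough count of bowtie2 alignment processes and busy cores.

    parallel    -- value of --multicore / --parallel (number of Bismark instances)
    p           -- value of -p handed through to each bowtie2 process
    directional -- directional libraries align OT + OB only (2 bowtie2
                   processes per instance); non-directional libraries use 4
    """
    per_instance = 2 if directional else 4
    processes = parallel * per_instance
    # each bowtie2 process shows up at roughly p * 100% CPU in top,
    # plus ~1 core per instance for the Bismark/methylation-calling thread
    cores = processes * p + parallel
    return processes, cores

print(bowtie2_footprint(parallel=3, p=3))  # (6, 21): 6 bowtie2 processes ((OT+OB) * 3), each ~300%
print(bowtie2_footprint(parallel=1, p=3))  # (2, 7):  a single instance with -p 3
```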
@FelixKrueger - a suggestion / request: I personally find the term --multicore a little confusing - would you consider renaming it to something more descriptive?
@ewels - I have added/changed the name of --multicore to --parallel.
@sklages - The screenshots above seem to indicate that both --parallel and -p are working as intended, so it might either be something related to your system or, probably more likely, a matter of how your top is presenting the threads. It appears that top is presenting threads in a hierarchical manner, so maybe something is 'being swallowed' somewhere? If you take a small test data set, is there any difference in time at all if you run -p as 1, 2, 3 or 4? Ultimately I don't think this is actually anything I could fix, but it would rather be a Bowtie2 issue...
Hi all,
I created a small spreadsheet to dynamically do the maths between Bismark mem/CPU requirements and machine specs:
http://tinyurl.com/bismarkparallel
If my understanding of the CPU usage is correct, an instance with, say, 36 CPUs but only 60GB of RAM would only be able to run --parallel 5, even though with more RAM the maximum to keep all CPUs at 100% would be 12.
The genome size used is that of GRCh38Decoy (contains decoy sequences).
@FelixKrueger, let me know if my calculations are correct...
A.
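For reference, a minimal Python version of the same calculation (my own sketch, not the actual spreadsheet). It assumes a directional run with the default -p 1, i.e. roughly 3 busy cores per instance and ~12 GB of RAM per instance including one genome copy, the figures discussed later in this thread.

```python
# Re-implementation sketch of the spreadsheet logic. The per-instance
# figures (3 busy cores, ~12 GB RAM for a GRCh38Decoy-sized genome) are
# approximations from this thread, not official requirements.

def max_parallel(cpus, ram_gb, cores_per_instance=3, gb_per_instance=12):
    """Largest --parallel value the machine supports, plus the two limits."""
    cpu_limit = cpus // cores_per_instance
    ram_limit = int(ram_gb // gb_per_instance)
    return min(cpu_limit, ram_limit), cpu_limit, ram_limit

# The example above: 36 CPUs but only 60 GB of RAM
best, by_cpu, by_ram = max_parallel(cpus=36, ram_gb=60)
print(best, by_cpu, by_ram)  # 5 12 5 -> RAM-limited to --parallel 5
```

With more RAM (say 144 GB) the same machine would instead be CPU-limited at --parallel 12, matching the number quoted above.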
The calculations in your spreadsheet look fine to me!
So, if I got it right: roughly 10G per hg38 Bismark instance, running at least 2 (directional) or 4 (non-directional) bowtie2 threads, plus some decompression (gzip) at the beginning and samtools threads.
Indeed, and then the Bismark thread itself, which operates all the threads and does the methylation calling etc., will add 1 core (at 100%) and 1 copy of the reference sequence in memory (so roughly 12G in total for a human genome). Glad this helped reduce the confusion.
Yes, thanks, this helped a lot 👍.
Brilliant, thanks for renaming it @FelixKrueger! 😁 The table would still be nice too 😉 |
This implements the suggestions for speeding up bismark here: FelixKrueger/Bismark#96. They suggest leaving -p (the number of threads for bowtie2) at the default and spinning up more instances of bismark with --parallel instead. This should be a more efficient use of resources.