--split-memory-limit not reducing memory requirement of mmseqs search job
#338
Comments
So what's going on is that you created an index with mmseqs createindex. These split indices work but have unexpected performance pitfalls: you need to have the index on a fast IO system so you can import each split into memory fast enough. Since the index is larger than the input sequences, it can be faster to just recompute the index on the fly instead of reading in an existing one.
Thanks for the clarification! So the use of …
Currently, I think it should crash no matter what, since there is an index present that doesn't fit into RAM. The error message for that is not very helpful. You have to either recreate the index with a certain memory limit in mind or remove it (actually, renaming just the index file is enough).
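As a rough illustration of rebuilding the index with a memory limit in mind (uniref90DB and tmp are placeholder names, and 80G is only an example value to be matched to the job's actual allocation):

```bash
# Recreate the precomputed index so its splits fit a given memory budget
# (uniref90DB and tmp are placeholders; 80G is an example limit)
mmseqs createindex uniref90DB tmp --split-memory-limit 80G
```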
I created an index for UniRef90 using mmseqs createindex. According to the error message, the database size is < 100G, so why am I getting a memory error? I also get this error when using … I'm using MMseqs2 12.113e3.
It turns out that I just need ~300G of memory for the job in order to not get the memory error.
It appears that no matter how many splits I use for the index, using --split-memory-limit doesn't reduce the memory that mmseqs search needs. I'm using 8 threads. Is the memory requirement determined by the size of the largest split?
Could you run a quick check on the sizes of the individual index split files?
Yeah, the first split is ~2x larger than the rest. I tried using 16 splits and 8 mmseqs search threads, and that cut down the memory requirement from 296G to just 232G.
When I split the database into 16 parts, the first split is 70G, while all of the rest are 16G. Is there a way to get equal-sized splits (assuming that would require less memory for running mmseqs search)?
The first split also stores other data that is not needed for the prefiltering. I still suspect that there is something wrong with memory overcommit on your system. Try checking the overcommit settings on one of the nodes.
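For illustration, a couple of standard ways to inspect the kernel's overcommit settings on a node; this is a generic sketch, not necessarily the exact check being suggested:

```bash
# Show the kernel's memory overcommit policy and ratio
# (overcommit_memory: 0 = heuristic, 1 = always overcommit, 2 = strict accounting)
cat /proc/sys/vm/overcommit_memory
sysctl vm.overcommit_memory vm.overcommit_ratio
```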
Our SGE cluster is running Ubuntu 18.04.5 with NFSv4.
SGE seems to have a couple of knobs to limit that (i.e. h_vmem / s_vmem).
I normally just request h_vmem, sized so that the job gets ~300G of vmem in total. I haven't played around with s_vmem.
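For illustration, a request along those lines might look like the following; the parallel environment name, script name, and per-slot value are placeholders (with h_vmem enforced per slot, 8 slots at 38G come to roughly 300G):

```bash
# Hypothetical SGE submission: 8 slots, h_vmem enforced per slot (~300G total)
qsub -pe smp 8 -l h_vmem=38G run_mmseqs_search.sh
```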
Does the issue also happen if you don't set that? What Linux kernel version are your nodes running?
If I don't set h_vmem, … The nodes are running Ubuntu 18.04.5.
18.04.5 seems to use kernel 5.4, so later than 4.7. I think we found the reason. You should set … I am still not sure that it will be very valuable to precompute an index in your case. Transferring the large index over NFS might be slower than recomputing it on the fly.
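A minimal sketch of the on-the-fly alternative, assuming the target database has no precomputed index next to it (all names and values are placeholders):

```bash
# Search against a target database without a precomputed index; the prefilter
# index is then computed on the fly, split by split, within the requested limit
mmseqs search queryDB uniref90DB resultDB tmp --split-memory-limit 80G --threads 8
```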
Computing the idx with 8 threads takes ~1 hour. Transferring the large index is much faster: my previous jobs that created the idx on the fly took ~2 hours, but with the pre-computed idx, the jobs take ~30 minutes. Is there any way to homogenize the splits so that they are all approximately the same size? To be clear, ~29G of h_vmem per thread (using 8 threads) is needed to run the mmseqs search job.
Maybe I should note that I'm splitting the query FASTA into subsets, creating mmseqs DBs for each, and searching against UniRef90 (with the pre-generated idx). I know that I could use OpenMPI for scaling on a cluster, but splitting and running all of the queries in parallel with Snakemake is more fault-tolerant. Having to request ~300G per cluster job greatly limits the number of parallel jobs that will run on the cluster at the same time, so I'd prefer to reduce the memory requirements, if possible. It seems that the first split stays fairly large regardless of the total number of splits. I'd try ~30 splits, but I'm guessing that I will still be stuck with a split file that's ~70G.
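A minimal sketch of one such per-chunk job, as described above; chunk and database names are placeholders, and this is only an illustration of the workflow, not the actual Snakemake rule:

```bash
# One cluster job: build a DB for a query chunk, search it against UniRef90
# (with its precomputed index), and export a tabular (BLAST-like) result
mmseqs createdb query_chunk_001.fasta queryDB_001
mmseqs search queryDB_001 uniref90DB resultDB_001 tmp_001 \
    --split-memory-limit 80G --threads 8
mmseqs convertalis queryDB_001 uniref90DB resultDB_001 result_chunk_001.m8
```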
I split the non-prefilter-index parts into separate files with this commit: 553a670
Great! Thanks for the quick edit! I'll give it a try later today.
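Trying an unreleased commit generally means building MMseqs2 from source; a rough sketch (build options kept minimal):

```bash
# Build MMseqs2 from source to test the commit above
git clone https://github.com/soedinglab/MMseqs2.git
cd MMseqs2 && git checkout 553a670
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=. ..
make -j 8 && make install   # the mmseqs binary ends up under build/bin/
```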
The good news is that the updated code splits the idx rather evenly. The bad news is that the overall memory requirement hasn't really changed. Note: I still require ~240G of memory even if I use just 1 thread (parallel=1, h_vmem=240G).
I am not sure what the fix is. I think the issue is now that the job's memory limit is applied to the memory-mapped index as a whole. I would suggest talking to your SGE admin about setting up a separate queue that doesn't enforce memory limits. Reengineering MMseqs2 to page in splits on demand is, I think, quite a big effort. We can keep it in mind for the future.
Thanks for your help with this! Yeah, no matter how many splits I create, the memory requirement stays roughly the same. Given the speed/accuracy of mmseqs, I'll probably still use it for my needs, if at all possible. Otherwise, I'll have to switch back to DIAMOND in order to reduce the memory required for each job.
Expected Behavior
According to the mmseqs docs:
...so I'm using --split-memory-limit with 80% of the RAM provided for the qsub job. However, the job always dies with a memory error. Even if I reduce --split-memory-limit to 50% or just 20% of the total memory provided for the qsub job, the job still dies with the same error. Maybe I'm not understanding or using --split-memory-limit correctly?

I'm using UniRef90 as the db. If I use 336G for the job mem limit, then the mmseqs search job runs without an error.

Steps to Reproduce (for bugs)
Run mmseqs search on UniRef90 to provide a large RAM requirement for the job (a sketch of such a job is given at the end of this report).

Your Environment
Ubuntu 18.04.4
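Putting the report together, a sketch of the kind of job being described (all names and sizes are placeholders; --split-memory-limit is set to roughly 80% of the job's memory limit):

```bash
# Submit a memory-limited SGE job (example: 8 slots x 16G = 128G for the job)
qsub -pe smp 8 -l h_vmem=16G run_search.sh
# run_search.sh then calls mmseqs with ~80% of that limit:
mmseqs search queryDB uniref90DB resultDB tmp --split-memory-limit 100G --threads 8
```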