
Memory exceeded when running single cell data alignment #189

Closed
Qirongmao97 opened this issue May 16, 2024 · 29 comments
Labels
fixed in release: Issue resolved and the fix is released, waiting for approval
performance: Issues related to computational performance

Comments

@Qirongmao97

Qirongmao97 commented May 16, 2024

[Attached image: plot_zoom_png (1)]

Hi,

I tried running a task with a memory limit of 250GB, but it kept going over the limit.

I'm thinking the problem might be related to how I'm using the --read_group input. Right now, I'm using a file called putative_bc.csv from BLAZE to sort out barcodes. Do you know a better way to align reads using BLAZE results with IsoQuant?

Originally posted by @Qirongmao97 in #165 (comment)

@Qirongmao97 Qirongmao97 changed the title to Memory exceeded when running single cell data alignment May 16, 2024
@lianov

lianov commented May 17, 2024

Agreed, we are also seeing this with the latest version (v3.4.1), using the same approach for single-cell data with --read_group. With version 3.3.1 the same sample used ~194 GB; now it uses 2.08 TB, and it does not respect the memory limit, which we also set to 250 GB.

@andrewprzh
Collaborator

@Qirongmao97 @lianov

How do you set the memory limit?
I think exceeding the memory limit might be related to Python multiprocessing.

It seems like the RAM peak occurs right at the end, when results are being merged.
I will run IsoQuant on some single-cell data I have and check its memory consumption.

Best
Andrey
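The multiprocessing hunch is plausible: each worker runs in its own process, so a per-process cap (e.g. `resource.RLIMIT_AS`) does not bound the workers' combined footprint, and a scheduler that only samples the parent can under-report usage. A minimal, hypothetical sketch (not IsoQuant code) that measures the parent's and the children's peak RSS separately:

```python
import resource
from multiprocessing import Pool

def allocate(mb):
    # Each worker allocates in its own address space, so a per-process
    # limit on the parent does not bound the combined usage.
    buf = bytearray(mb * 1024 * 1024)
    return len(buf)

if __name__ == "__main__":
    with Pool(4) as pool:
        pool.map(allocate, [64] * 4)  # ~256 MB spread across workers
    # ru_maxrss: peak resident set size (KiB on Linux) of this process
    # vs. its waited-for children
    print("parent peak:", resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    print("children peak:", resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss)
```

Comparing the two numbers shows how much of the footprint lives outside the parent process, which is exactly what a naive per-process limit misses.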

@andrewprzh andrewprzh added the performance Issues related to computational performance label May 20, 2024
@lianov

lianov commented May 20, 2024

@andrewprzh : in my case it was set as part of the Nextflow job from our nf-core/scnanoseq pipeline. Checking the Nextflow logs, they reported much higher usage without the job failing in SLURM (which is itself an issue, since it should have failed). If you need more details, please let me know. Thanks for looking into this!

@andrewprzh
Collaborator

I tested IsoQuant on a very large dataset with 70K barcodes, and it does take a lot of RAM. I'll start investigating the issue.
I think it might be partially caused by Python's multiprocessing mechanisms.

I'll keep you updated.

Best
Andrey

@Qirongmao97
Author

Hi Andrew @andrewprzh ,

I'm running IsoQuant on a Visium dataset with only 5K barcodes. Technically, it should not require much RAM, right? I was wondering whether there might be an issue with the input from the BLAZE demultiplexing step. I am also trying the nf-core/scnanoseq pipeline, but it would be great if you could share your pipeline for processing single-cell data with IsoQuant.

Thanks!

@ljwharbers

ljwharbers commented Jun 3, 2024

Hi @andrewprzh ,

Just commenting to let you know that I was running into the same issue. I have data from a custom spatial transcriptomics assay with closer to a few million 'barcodes'. While I have some nodes with multiple TB of memory available, after reading these comments I'm afraid that won't be enough for my dataset. I have a run scheduled with 2 TB of RAM this evening, so I will update when I know more.

Do you have any potential fix in mind, or could you point me at the (most likely) chunk of code where this occurs so I can have a look as well?

Edit: as expected, it sadly also runs out of memory even with 2 TB allocated.

Thanks,
Luuk

@andrewprzh
Collaborator

@Qirongmao97

Currently I use a barcode calling tool of my own, which will become a part of IsoQuant at some point.
In fact, I don't think it matters how the barcodes are called; it's the number of distinct barcodes that matters.
How many barcodes do you have in total?

Best
Andrey

@andrewprzh
Collaborator

@ljwharbers

A few million barcodes really is a lot, and it's somewhat expected to consume a lot of RAM.
Do all of these barcodes represent real cells, or is there a chance to apply some filtering?

Best
Andrey

@Qirongmao97
Author

@andrewprzh

Hi, in this Visium dataset we have 3700 cells (spots).

@ljwharbers

@andrewprzh

I realize it's a bit of an extreme scenario :')
These are barcodes representing real spatial coordinates (so not really cells, but for analysis purposes it doesn't matter). Each barcode has only very few unique reads (and thus genes/transcripts) associated with it.

I think the main problem here is that, if I understand it correctly, a 'cell' x gene/transcript matrix is always generated and this consumes a huge amount of memory.

Could a solution be to have an option to not generate the output in this 'wide' format, but in a 'long' format instead? I can imagine this would save a lot of memory.
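The wide-vs-long point can be made concrete with a toy example (hypothetical counts, not IsoQuant's actual output): a wide matrix stores cells x features entries no matter how many are non-zero, while a long format stores one row per observation, so memory scales with the data rather than the grid.

```python
# Toy counts: 4 barcodes x 5 features, but only 3 non-zero observations
barcodes = ["bc1", "bc2", "bc3", "bc4"]
features = ["g1", "g2", "g3", "g4", "g5"]
long_rows = [("bc1", "g1", 3), ("bc2", "g4", 1), ("bc4", "g2", 2)]

# "Wide" matrix: one entry for every (barcode, feature) pair, mostly zeros
wide = {bc: {f: 0 for f in features} for bc in barcodes}
for bc, f, c in long_rows:
    wide[bc][f] = c

n_wide = sum(len(row) for row in wide.values())  # 4 * 5 = 20 entries
n_long = len(long_rows)                          # 3 entries
print(n_wide, n_long)  # 20 3
```

With millions of barcodes and tens of thousands of transcripts the wide grid explodes even when each barcode carries only a handful of reads, which matches the behaviour described above.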

@andrewprzh
Collaborator

@ljwharbers

Yes, the matrix is always stored in some way. Previously, IsoQuant output the "long" format, but then we decided to use the "wide" format for everything.
I'll see what I can do to make a workaround.

@andrewprzh
Collaborator

@Qirongmao97

3700 is not really a lot... I also see that the RAM peak occurs at the end, probably when the counts are merged into a single table.

@ljwharbers

> @ljwharbers
>
> Yes, the matrix is always stored in some way. Previously, IsoQuant was outputting the "long" format, but then we decided to use "wide" format for everything. I'll see what I can do to make a workaround.

Thanks, that would be amazing!

@lianov

lianov commented Jul 1, 2024

@andrewprzh : thank you for your work on this once again. Do you foresee a fix for this issue in the near future? We are getting close to final review with nf-core on scnanoseq, and for now we have chosen to downgrade IsoQuant to 3.3.1 as a temporary workaround. If you think there might be a fix in the near future, could you let us know so we can update the pipeline before the first release with a new version of IsoQuant?

If not, no problem - we will aim to release a patch as soon as it is available. Thank you again.

@andrewprzh
Collaborator

@lianov

Unfortunately, I'm quite busy with other projects and am trying to work on IsoQuant in between. I think using 3.3.1 for now is a good solution, since I cannot predict the timeline at the moment...
I will keep you updated anyway.

Best
Andrey

@lianov

lianov commented Jul 1, 2024

@andrewprzh : No problem, totally get it and thank you for the quick reply. We will move forward with this plan in the meantime.

@andrewprzh
Collaborator

Makes sense, good luck and stay tuned :)

@andrewprzh
Collaborator

andrewprzh commented Jul 13, 2024

@lianov

New 3.4.2 consumes significantly less memory compared to 3.4.1.

However, there still might be issues with single-cell data, which I'm still working on.

Best
Andrey

@lianov

lianov commented Jul 15, 2024

@andrewprzh : thank you for the update. @atrull314 and I will be looking into this new release for sure for performance in single-cell data. Thank you for your continued updates etc!

@andrewprzh
Collaborator

@Qirongmao97 @lianov @ljwharbers

The new IsoQuant 3.5 should consume far less RAM when using read groups, for gene, transcript, and exon counts too.

It also outputs grouped counts in both matrix and linear formats.

Best
Andrey

@andrewprzh andrewprzh added the fixed in release Issue resolved and the fix is released, waiting for approval label Aug 3, 2024
@lianov

lianov commented Aug 5, 2024

@andrewprzh : Great, we will be trying this out asap. Thank you again for your updates.

@ljwharbers

@andrewprzh this is amazing, thanks! I'm testing it now and it runs smoothly so far, no memory issues (and this is with ~50 million barcodes!). Amazing work!

@lianov

lianov commented Aug 7, 2024

@andrewprzh : just to follow up on our end. We are also seeing memory improvements with this latest version after some preliminary tests (~80 GB with a PromethION dataset). We will continue to test on other datasets and upgrade the pipeline ASAP to be released with IsoQuant 3.5.

@lianov

lianov commented Aug 23, 2024

Following up here to close the loop on our end at least: we fully tested this version across our datasets and can confirm better performance. Quantification sensitivity in this latest version is also much better than before! Thanks for all the improvements! This latest version is implemented in the scnanoseq pipeline, and we are very close to releasing it on our end.

@andrewprzh
Collaborator

Thanks a lot for getting back, and happy to hear about the positive results!
And thank you for embedding IsoQuant into your pipeline!

@ljwharbers

Also a follow-up from my side.
I've run the latest version with >50 million barcodes and there are no memory issues anymore. The run time is (very) long due to outputting in dense matrix format, typically days for my dataset. After simply commenting out the lines that write in matrix format, everything processed in a couple of hours.

Super impressed with the speed and sensitivity. I will also be including isoquant in my nf-core pipeline (which is still a bit away from being released).

Thanks for your continued work and your quick responses!

@lianov

lianov commented Aug 26, 2024

@ljwharbers : that's good info on tracking down the source of the run time. On most of our datasets with default threads it takes about ~8 hr, but this is helpful to us and maybe an area where we can also contribute in the future.

@ljwharbers

I think that ultimately the best option would be to keep the intermediate files in the 'linear' format during processing, and only in the final merging step transform them into a (sparse) matrix or linear format, depending on the user's requirement. This would save a lot of time even if the user wants the output in matrix format.

@lianov I simply have a small script to change the linear format into a sparse mtx, which is compatible with (almost) all downstream single-cell processing tools.

While writing this, I see that @andrewprzh just released v3.5.1 already with the ability for the user to specify the output format. Amazing work once again!
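The conversion script mentioned above isn't shown, but something like it can be sketched with the stdlib alone. The `(group, feature, count)` row layout here is an assumption, not IsoQuant's documented output; the sketch builds 1-based indices and emits a MatrixMarket coordinate file, which `scipy.io.mmread` and most single-cell tools can load.

```python
# Hypothetical long-format rows (group, feature, count); adapt the
# parsing to the real file's column layout.
rows = [("bc1", "g1", 3), ("bc2", "g2", 1), ("bc1", "g2", 2)]

groups = sorted({r[0] for r in rows})
features = sorted({r[1] for r in rows})
g_idx = {g: i + 1 for i, g in enumerate(groups)}    # MatrixMarket is 1-based
f_idx = {f: i + 1 for i, f in enumerate(features)}

# Minimal MatrixMarket coordinate file: header, dimensions line
# (rows cols nonzeros), then one "row col value" triple per entry
lines = ["%%MatrixMarket matrix coordinate integer general",
         f"{len(groups)} {len(features)} {len(rows)}"]
lines += [f"{g_idx[g]} {f_idx[f]} {c}" for g, f, c in rows]
mtx_text = "\n".join(lines) + "\n"
print(mtx_text)
```

Paired with plain-text barcode and feature lists, this gives the 10x-style triplet that downstream single-cell tools expect, without ever materialising the dense matrix.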

@andrewprzh
Collaborator

@ljwharbers

Thanks for the feedback! For now I implemented a simple option --counts_format, but I'll rework the counts output in a more optimal way to avoid merging large files. Interestingly, the linear format was previously the default for grouped counts with a large number of groups, but somehow we decided to switch to the matrix format.

I'll close this issue for now, feel free to reopen or start a discussion if needed.
