memory: core dumped at haplotype_scaffold #20

lukemn · 2021-02-02T17:35:25Z

Hi Shilpa,

I'm running hifiasm/pstools as in #16 on a ~100 Mb genome that is expected to be mostly haploid. I'm assuming this shouldn't be a major issue? I don't really trust the base-level results of short-read Hi-C assembly/scaffolding on HiFi tigs, and I'm hoping DipAsm will do a better job of it.

I get through with some minor (I assume) complaints (a few ERROR: key not in position table during hic_mapping_haplo, and various rm errors during resolve_haplotypes), but then a core dump during the haplotype_scaffold stage. There are 56 utgs for each of hap1 and hap2 in pred_haplotypes.fa, each ~250Mb. Any thoughts?

Here's that log:
start main
All above 5M: 13
All above 1.5M: 44
Update best buddy score.
Get potential connections 4.
Insert connections.
Save graphs and scores.
Nodes in graph: 2.
Left edges: 376.
Update best buddy score.
Get potential connections 4.
Insert connections.
Save graphs and scores.
Nodes in graph: 4.
Left edges: 184.
Update best buddy score.
Update best buddy score.
Get potential connections 4.
Insert connections.
Save graphs and scores.
Nodes in graph: 5.
Left edges: 304.
Update best buddy score.
Update best buddy score.
Finish get first scaffolds.
free(): invalid pointer

shilpagarg · 2021-02-02T17:43:25Z

Thanks for pointing this out. Yes, I have seen the invalid pointer error in non-human assemblies. I am working on this and will provide an update soon.

shilpagarg · 2021-02-02T21:47:16Z

Please try https://pstools.s3.us-east-2.amazonaws.com/pstools_1.

lukemn · 2021-02-03T10:00:40Z

Works, thanks!

I get 242 Mb of hap1 and 32 Mb of hap2, and 62 Mb in broken_nodes. hap1 is much bigger than expected; there may be some bacterial contamination in there. I have genetic map-based pseudochromosomes from other assemblies, so I'll go through these files to see what looks sensible.
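For anyone wanting to make the same size check on their own output, here is a minimal sketch that totals sequence lengths in plain-text FASTA files. It is not part of pstools; the function name `fasta_lengths` is made up for illustration, and the script assumes uncompressed FASTA passed on the command line.

```python
import sys


def fasta_lengths(path):
    """Return {record_name: sequence_length} for an uncompressed FASTA file."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the name.
                parts = line[1:].split()
                name = parts[0] if parts else ""
                lengths[name] = 0
            elif name is not None:
                # Sequence may be wrapped over several lines; accumulate them.
                lengths[name] += len(line)
    return lengths


if __name__ == "__main__":
    # Usage (file names are whatever your run produced):
    #   python fasta_lengths.py first.fa second.fa
    for path in sys.argv[1:]:
        lengths = fasta_lengths(path)
        total = sum(lengths.values())
        longest = max(lengths.values(), default=0)
        print(f"{path}: {len(lengths)} records, "
              f"{total / 1e6:.1f} Mb total, longest {longest / 1e6:.1f} Mb")
```

Comparing the per-file totals and the longest record against the expected genome and chromosome sizes is a quick way to spot oversized haplotypes or scaffolds before looking at alignments.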

Also, I guess you plan to get to this eventually, but can you say something about what pstools is doing relative to the previous docker pipeline?

Is there a good reason to use the raw unitigs rather than the primary hifiasm contigs (or other assemblies)?

shilpagarg · 2021-02-03T10:29:22Z

Good to know. The pstools method is purely graph-based, does not collapse haplotypes, and enables routine production of phased sequences. I will be happy to help further if you could send me an email. As I mentioned, I have only tested it on human genomes, but it will be interesting to see how it does on other genomes.

Working on unitigs is better than working on contigs because it avoids random cross-chromosome or long-range chromosome connections. Hi-C information is powerful for disentangling such cases in the graph.

shilpagarg · 2021-02-03T12:35:53Z

Yes, I agree that it depends on the characteristics of the genome. Specifically, Hi-C is helpful for genomes with complex centromeres, for example, human. For small genomes with no centromeres, I understand HiFi would be good enough. Another aspect is cost-effectiveness. IMO there is no generalized method that is best for every genome.

zhoudreames · 2021-02-04T01:37:26Z

Yes, I agree that it depends on the characteristics of the genome. Specifically, Hi-C is helpful for genomes with complex centromeres, for example, human. For small genomes with no centromeres, I understand HiFi would be good enough. Another aspect is cost-effectiveness. IMO there is no generalized method that is best for every genome.

I ran my project again using pstools_1, but I got an erroneous result: the length of scaffold_0l_hap1 is ~1.5 Gb, longer than the biggest chromosome (~300 Mb). Why is this?

shilpagarg closed this as completed Feb 3, 2021