memory: core dumped at haplotype_scaffold #20

Closed
lukemn opened this issue Feb 2, 2021 · 6 comments


lukemn commented Feb 2, 2021

Hi Shilpa,

I'm running hifiasm/pstools as in #16 on a ~100 Mb genome that is expected to be mostly haploid. I'm assuming this shouldn't be a major issue? I don't really trust the base-level results of short-read Hi-C assembly/scaffolding on HiFi tigs, and I'm hoping DipAsm will do a better job of it.

I get through with some minor (I assume) complaints (a few "ERROR: key not in position table" messages during hic_mapping_haplo, and various rm errors during resolve_haplotypes), but then hit a core dump during the haplotype_scaffold stage. There are 56 utgs for each of hap1 and hap2 in pred_haplotypes.fa, each ~250 Mb. Any thoughts?

Here's that log:
start main
All above 5M: 13
All above 1.5M: 44
Update best buddy score.
Get potential connections 4.
Insert connections.
Save graphs and scores.
Nodes in graph: 2.
Left edges: 376.
Update best buddy score.
Get potential connections 4.
Insert connections.
Save graphs and scores.
Nodes in graph: 4.
Left edges: 184.
Update best buddy score.
Update best buddy score.
Get potential connections 4.
Insert connections.
Save graphs and scores.
Nodes in graph: 5.
Left edges: 304.
Update best buddy score.
Update best buddy score.
Finish get first scaffolds.
free(): invalid pointer

@shilpagarg
Owner

Thanks for pointing this out. Yes, I have seen the invalid pointer error in non-human assemblies. I am working on this and will provide an update soon.

@shilpagarg
Owner

@lukemn
Author

lukemn commented Feb 3, 2021

Works, thanks!

I get 242 Mb of hap1 and 32 Mb of hap2, and 62 Mb in broken_nodes. hap1 is much bigger than expected; there may be some bacterial contamination in there. I have genetic map-based pseudochromosomes from other assemblies, so I'll go through these files to see what looks sensible.
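
For anyone wanting to tally per-file totals like the ones above, here is a minimal sketch. It is not part of pstools; the function name `fasta_total_bp` and the toy input are made up for illustration, and it assumes plain (uncompressed) FASTA text.

```python
import io

def fasta_total_bp(handle):
    """Return (number of records, total bases) for a FASTA text stream."""
    n_records = 0
    total = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            n_records += 1  # header line starts a new record
        else:
            total += len(line)  # sequence line: count its bases
    return n_records, total

# Toy input standing in for a real file (hypothetical record names).
demo = io.StringIO(">utg1\nACGT\nAC\n>utg2\nGGG\n")
n, bp = fasta_total_bp(demo)
print(f"{n} records, {bp} bp ({bp / 1e6:.6f} Mb)")
```

For a real file, something like `with open(path) as fh: fasta_total_bp(fh)` should work, where `path` points at whichever haplotype FASTA you want to check.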

Also, I guess you plan to get to this eventually, but can you say something about what pstools is doing relative to the previous docker pipeline?

Is there a good reason not to use the primary hifiasm contigs (or other assemblies) rather than the raw unitigs?

@shilpagarg
Owner

Good to know. The pstools method is purely graph-based, without any haplotype collapses, and enables routine production of phased sequences. I will be happy to help further if you could send me an email. As I mentioned, I have only tested it on human data, but it will be interesting to see how it does on other genomes.

Working on unitigs is better than working on contigs, because it avoids any random cross-chromosome or long-range chromosome connections. Instead, Hi-C information is powerful for disentangling such cases in the graph.

@shilpagarg
Owner

Yes, I agree that it depends on the characteristics of the genome. Specifically, Hi-C is helpful for genomes with complex centromeres, for example, humans. For small genomes with no centromeres, I understand HiFi would be good enough. Another aspect is cost-effectiveness. IMO there is no generalized method that is best for every genome.

@zhoudreames

> Yes, I agree that it depends on the characteristics of the genome. Specifically, Hi-C is helpful for genomes with complex centromeres, for example, humans. For small genomes with no centromeres, I understand HiFi would be good enough. Another aspect is cost-effectiveness. IMO there is no generalized method that is best for every genome.

I ran pstools_1 again on my project, but I got an incorrect result: the length of scaffold_0l_hap1 is ~1.5 Gb, longer than the largest chromosome (~300 Mb). Why is this?
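
A check of this kind can be scripted. The following is only a sketch, not a pstools feature: the function name `oversized_records`, the threshold, and the toy records are assumptions for illustration. It lists FASTA records whose length exceeds an expected maximum chromosome size.

```python
import io

def oversized_records(handle, max_bp):
    """Yield (name, length) for each FASTA record longer than max_bp."""
    name, length = None, 0
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # Emit the previous record if it was too long.
            if name is not None and length > max_bp:
                yield name, length
            parts = line[1:].split()
            name = parts[0] if parts else ""
            length = 0
        elif line:
            length += len(line)
    # Do not forget the last record in the stream.
    if name is not None and length > max_bp:
        yield name, length

# Toy input with a tiny threshold; for real data max_bp might be ~300_000_000.
demo = io.StringIO(">scaffold_a\nACGTACGT\n>scaffold_b\nACG\n")
flagged = list(oversized_records(demo, max_bp=5))
print(flagged)
```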
