Polishing a GFA #427

Open
mmcguffi opened this issue Mar 24, 2023 · 3 comments

Comments

mmcguffi commented Mar 24, 2023

Is your feature request related to a problem? Please describe.

  • When I polish a Flye assembly with Medaka, I lose the assembly graph information, since medaka_consensus accepts FASTA files rather than GFA files (and outputs FASTA files). This connection information is often important for downstream analyses.

Describe the solution you'd like

  • Support for a GFA input file in medaka_consensus, producing a polished GFA output file.

Describe alternatives you've considered

  • I have thought that it might be possible to reconstruct a polished GFA file by using the node/edge information in the pre-polished GFA file. However, I have noticed that contigs are sometimes lost during polishing. Are contigs ever merged during polishing, or are they only deleted? I think it should be possible to reconstruct a polished GFA if they are only deleted, though I'm less sure of the feasibility of this if contigs are merged.

Thanks for the great tool!

Member

cjw85 commented Mar 24, 2023

The simple question first: contigs are never merged during polishing.

It is unlikely that we will ever implement GFA output. I'd have to refresh my memory of the details of GFA, but I believe outputting an updated GFA would require recomputing connections and overlaps between the contigs. This is not a trivial operation when the contigs have changed length (as they do during polishing, and we wish to keep containments). If there is a library out there that implements such transformations (possibly something akin to liftover), then it might be possible to embed it into medaka. Otherwise it would be a task for a standalone tool.
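
For readers who want to try the reconstruction idea from the original post outside of medaka, here is a minimal sketch of what a standalone script might look like. It is not part of medaka and does not recompute anything. It assumes the polished FASTA records carry exactly the same names as the GFA segments, and that contigs are only ever dropped, never merged. Because overlaps cannot be trusted once lengths change, link and path overlaps are reset to `*` (which GFA 1 allows for "unspecified"), and containment (`C`) lines are dropped outright since their positions cannot be carried over. The function names `read_fasta` and `rebuild_gfa` are hypothetical.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {name: sequence}; the name is the first word of the header."""
    chunks: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {k: "".join(v) for k, v in chunks.items()}


def rebuild_gfa(gfa_lines: Iterable[str], polished: Dict[str, str]) -> List[str]:
    """Swap polished sequences into a GFA 1 graph (assumption: names match)."""
    out = []
    for raw in gfa_lines:
        fields = raw.rstrip("\n").split("\t")
        kind = fields[0]
        if kind == "S":
            name = fields[1]
            if name not in polished:
                continue  # segment was dropped during polishing
            seq = polished[name]
            # Refresh the optional LN tag so it matches the new sequence length.
            tags = [f"LN:i:{len(seq)}" if t.startswith("LN:i:") else t
                    for t in fields[3:]]
            fields = ["S", name, seq] + tags
        elif kind == "L":
            if fields[1] not in polished or fields[3] not in polished:
                continue  # link touches a dropped segment
            fields[5] = "*"  # old overlap CIGAR is stale; mark as unspecified
        elif kind == "P":
            names = [step[:-1] for step in fields[2].split(",")]
            if any(n not in polished for n in names):
                continue  # path runs through a dropped segment
            if len(fields) > 3:
                fields[3] = "*"
        elif kind == "C":
            continue  # containment positions would need recomputing
        out.append("\t".join(fields))
    return out
```

Usage would be along the lines of `rebuild_gfa(open("draft.gfa"), read_fasta(open("consensus.fasta")))`, with both file names being placeholders. The result keeps the graph topology but deliberately discards overlap detail, so it is only suitable for downstream tools that tolerate `*` overlaps.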

Member

cjw85 commented Mar 24, 2023

Hmmm, even in the case of simple links, care would need to be taken in implementing this, because medaka can arbitrarily extend contig ends (not starts). This could have subtle and weird effects on the interpretation of the GFA.
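
One way to see how much this caveat matters for a given assembly is to compare contig lengths before and after polishing and list the links that touch any contig whose length changed. The sketch below does only that. It is a standalone illustration, not medaka functionality, and the function names `length_changes` and `links_needing_review` are hypothetical. A length difference alone does not say where bases were added or removed, so this flags links for manual review rather than correcting them.

```python
from typing import Dict, Iterable, List, Optional


def length_changes(draft: Dict[str, str],
                   polished: Dict[str, str]) -> Dict[str, Optional[int]]:
    """Map each draft contig to (polished length - draft length), or None if dropped."""
    return {
        name: (len(polished[name]) - len(seq)) if name in polished else None
        for name, seq in draft.items()
    }


def links_needing_review(gfa_lines: Iterable[str],
                         changes: Dict[str, Optional[int]]) -> List[str]:
    """Return GFA 1 L lines where either segment changed length, was dropped, or is unknown."""
    flagged = []
    for raw in gfa_lines:
        fields = raw.rstrip("\n").split("\t")
        if fields[0] != "L":
            continue
        # .get() returns None for unknown names, which also counts as "not unchanged".
        if changes.get(fields[1]) != 0 or changes.get(fields[3]) != 0:
            flagged.append("\t".join(fields))
    return flagged
```

Links that are not flagged connect contigs whose lengths are unchanged. That is weaker than saying the sequences are unchanged, since substitutions leave the length intact.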

Author

mmcguffi commented Oct 7, 2024

Any chance you would implement GFA polishing in medaka? This remains a hassle for several of our pipelines.
