Entropy panel unavailable if mutations aren't translated #881
Comments
I got extremely confused by this bug. I encountered it as the following error in a workflow that uses The error is:
Thrown by
This is a very serious bug: it makes it impossible to use the same genemap for Nextclade and the Nextclade reference build of monkeypox, and discrepancies like this arise.
If the problem I've described is not a real bug, or is not identical to this one, feel free to readjust the pain score. But based on my use case, in addition to the original from @jameshadfield, I've upgraded this
The workaround for me is to:
`limit_info = dict(gff_type=['gene', 'source', 'nuc'])`
`MT903344.1 Genbank source 1 197233 . + . locus_tag=nuc`
I'll put in a PR, hope that doesn't break anything. Should I open this as a separate issue, @jameshadfield?
For a detailed write-up of the bug which motivated this commit, see #881.

By storing the (nucleotide) genome annotation in the node-data produced by `augur ancestral`, we make this information available for export. Previously this information was only exported by `augur translate`, which was problematic for workflows which didn't perform translation.

No changes are needed to `augur export v2` (which may now process multiple "annotations" blocks) due to the behavior of `NodeData.deep_add_or_update`, which recurses into dicts in annotation blocks and, when confronted with non-dict values which already exist, overwrites them. This poses a potential problem: two node-data JSONs which (e.g.) define different `annotations['nuc']` coordinates will not raise any error, and the output coordinates depend on the order in which the node-data JSONs were provided to `augur export v2`.

Closes #881.
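The order dependence described in the commit message can be illustrated with a small sketch. The `deep_update` function below is a hypothetical stand-in written for this example, not augur's `NodeData.deep_add_or_update`; it only mimics the behavior the commit message describes (recurse into dicts, overwrite existing non-dict values). The two node-data dicts and the `197209` coordinate are made up for illustration.

```python
import copy


def deep_update(base, new):
    """Hypothetical stand-in: recurse into dicts, overwrite non-dict values."""
    for key, value in new.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = value
    return base


def merge_in_order(*node_data):
    """Merge node-data dicts left to right, as if given in that order."""
    merged = {}
    for nd in node_data:
        # Copy so that the inputs are never mutated or aliased by the merge.
        deep_update(merged, copy.deepcopy(nd))
    return merged


# Two illustrative node-data JSONs that disagree on the end of `nuc`.
from_ancestral = {"annotations": {"nuc": {"start": 1, "end": 197209}}}
from_translate = {"annotations": {"nuc": {"start": 1, "end": 197233}}}

a_then_t = merge_in_order(from_ancestral, from_translate)
t_then_a = merge_in_order(from_translate, from_ancestral)

# No error is raised in either case; the last file provided silently wins.
print(a_then_t["annotations"]["nuc"]["end"])
print(t_then_a["annotations"]["nuc"]["end"])
```

Under these assumptions, neither call fails, and the resulting `nuc` coordinates differ only because the argument order differs.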
Current Behaviour
For the entropy panel to be displayed in auspice we have three requirements:
1. `JSON.panels` includes `"entropy"`
2. `JSON.meta.genome_annotations` exists and includes at least a `{nuc: {start: INT, end: INT}}` object.

These typically come from an augur workflow with steps:
(i) `augur ancestral` (does not create a `nodeDataJSON.annotations` object)
(ii) `augur translate` (creates a `nodeDataJSON.annotations` object)
(iii) `augur export` (takes care of requirement 1 unless you opt-out, and converts the `nodeDataJSON.annotations` object to `JSON.meta.genome_annotations`).

However, if you choose not to translate mutations (step ii) then no `annotations` object is available for export, and thus requirement 2 is not met and the entropy panel is not displayed.
Expected behavior
A pipeline using steps (i, iii) should be valid. In other words, translating mutations should be optional.
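For anyone debugging a missing entropy panel, the two requirements listed under Current Behaviour can be checked with a rough sketch like the one below. The function name `entropy_panel_status` is hypothetical, and the lookup of `panels` both at the top level (as written in this issue) and under `meta` is an assumption made so that the check works with either layout.

```python
def entropy_panel_status(dataset):
    """Report which of the two listed requirements a dataset dict meets."""
    meta = dataset.get("meta") or {}
    # Assumption: accept `panels` at the top level or nested under `meta`.
    panels = dataset.get("panels") or meta.get("panels") or []
    nuc = (meta.get("genome_annotations") or {}).get("nuc")
    has_nuc = isinstance(nuc, dict) and all(
        isinstance(nuc.get(key), int) for key in ("start", "end")
    )
    return {
        "panels_include_entropy": "entropy" in panels,
        "has_nuc_annotation": has_nuc,
    }


# Illustrative dataset resembling a workflow that skipped `augur translate`.
untranslated = {"meta": {"panels": ["tree", "entropy"]}}

# Illustrative dataset with a nuc annotation present.
translated = {
    "meta": {
        "panels": ["tree", "entropy"],
        "genome_annotations": {"nuc": {"start": 1, "end": 197233}},
    }
}

print(entropy_panel_status(untranslated))
print(entropy_panel_status(translated))
```

In the first case only requirement 1 is met, which matches the situation described here where the panel is not displayed.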
Possible solution
The solution is not as simple as just adding an annotations block to the node-data JSON produced by `augur ancestral`, as `augur export v2` assumes that there will only be one of these blocks.
The most consistent approach would be:
- `augur ancestral` creates a `nodeDataJSON.annotations` object with `nuc` information
- `augur translate` creates a `nodeDataJSON.annotations` object with per-gene information
- `augur export v2` accepts multiple annotations blocks and combines them, accepting identical duplicate elements and exiting if there are conflicts.
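The combining behavior proposed for `augur export v2` could look roughly like the sketch below. This is not augur's implementation; `combine_annotations` and `AnnotationConflict` are hypothetical names, and the gene name and coordinates in the example are made up. A caller could catch the exception and exit with an error message.

```python
class AnnotationConflict(Exception):
    """Raised when two annotations blocks disagree about the same feature."""


def combine_annotations(blocks):
    """Combine annotations blocks, tolerating identical duplicates only."""
    combined = {}
    for block in blocks:
        for name, feature in block.items():
            if name not in combined:
                combined[name] = dict(feature)
            elif combined[name] != feature:
                raise AnnotationConflict(
                    f"conflicting annotations for {name!r}: "
                    f"{combined[name]} vs {feature}"
                )
    return combined


# Illustrative blocks: one with only `nuc`, one with `nuc` plus a gene.
ancestral_block = {"nuc": {"start": 1, "end": 197233, "strand": "+"}}
translate_block = {
    "nuc": {"start": 1, "end": 197233, "strand": "+"},
    "geneA": {"start": 100, "end": 400, "strand": "+"},
}

combined = combine_annotations([ancestral_block, translate_block])
print(sorted(combined))
```

Raising on conflict, rather than letting the last block win, makes the result independent of the order in which the node-data JSONs are supplied.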