Annotations schema updates #1281
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,34 +1,90 @@ | ||
{ | ||
"type" : "object", | ||
"$schema": "http://json-schema.org/draft-06/schema#", | ||
"title": "JSON object for the `annotations` key, typically produced by `augur translate`", | ||
"description": "Coordinates etc of genes / genome", | ||
"$id": "https://nextstrain.org/schemas/augur/annotations", | ||
"title": "Schema for the 'annotations' property (node-data JSON) or the 'genome_annotations' property (auspice JSON)", | ||
"properties": { | ||
"nuc": { | ||
"type": "object", | ||
"allOf": [{ "$ref": "#/$defs/startend" }], | ||
"properties": { | ||
"start": { | ||
"enum": [1], | ||
"$comment": "nuc must begin at 1" | ||
}, | ||
"strand": { | ||
"type": "string", | ||
"enum":["+"], | ||
"description": "Strand is optional for nuc, as it should be +ve for all genomes (-ve strand genomes are reverse complemented)", | ||
"$comment": "Auspice will not proceed if the JSON has strand='-'" | ||
} | ||
}, | ||
"additionalProperties": true, | ||
"$comment": "All other properties are unused by Auspice." | ||
} | ||
}, | ||
"required": ["nuc"], | ||
"patternProperties": { | ||
"^[a-zA-Z0-9*_-]+$": { | ||
"^(?!nuc)[a-zA-Z0-9*_-]+$": { | ||
"$comment": "Each object here defines a single CDS", | ||
"type": "object", | ||
"oneOf": [{ "$ref": "#/$defs/startend" }, { "$ref": "#/$defs/segments" }], | ||
"additionalProperties": true, | ||
"required": ["strand"], | ||
"properties": { | ||
"seqid":{ | ||
"description": "Sequence on which the coordinates below are valid. Could be viral segment, bacterial contig, etc", | ||
"$comment": "Unused by Auspice 2.0", | ||
"type": "string" | ||
"gene": { | ||
"type": "string", | ||
"description": "The name of the gene the CDS is from. Optional.", | ||
"$comment": "Shown in on-hover infobox & influences default CDS colors" | ||
}, | ||
"type": { | ||
"description": "Type of the feature. could be mRNA, CDS, or similar", | ||
"$comment": "Unused by Auspice 2.0", | ||
"type": "string" | ||
"strand": { | ||
"description": "Strand of the CDS", | ||
"type": "string", | ||
"enum": ["-", "+"] | ||
}, | ||
"start": { | ||
"description": "Gene start position (one-based, following GFF format)", | ||
"type": "number" | ||
"color": { | ||
"type": "string", | ||
"description": "A CSS color or a color hex code. Optional." | ||
}, | ||
"end": { | ||
"description": "Gene end position (one-based closed, last position of feature, following GFF format)", | ||
"type": "number" | ||
"display_name": { | ||
"type": "string", | ||
"$comment": "Shown in the on-hover info box" | ||
}, | ||
"strand": { | ||
"description": "Positive or negative strand", | ||
"description": { | ||
"type": "string", | ||
"enum": ["-","+"] | ||
Suggestion: Add Even though Auspice only cares if the value is I started this in #1279 but that PR can be closed and the change included in here.
Thanks for the review -- really appreciated! We're going to have to think through this. All annotations are interpreted by Auspice as CDSs, so a strand of I can allow them in the schema and then have Auspice filter to Update: Auspice PR now ignores any non-nuc annotation which is not explicitly +/- strand
Have shifted this conversation to #1279 |
||
"$comment": "Shown in the on-hover info box" | ||
} | ||
} | ||
} | ||
}, | ||
"$defs": { | ||
"startend": { | ||
"type": "object", | ||
"required": ["start", "end"], | ||
"properties": { | ||
"start": { | ||
"type": "integer", | ||
"minimum": 1, | ||
"description": "Start position (one-based, following GFF format)" | ||
}, | ||
"end": { | ||
"type": "integer", | ||
"minimum": 2, | ||
"description": "End position (one-based, following GFF format). This value _must_ be greater than the start." | ||
} | ||
} | ||
}, | ||
"segments": { | ||
"type": "object", | ||
"required": ["segments"], | ||
"properties": { | ||
"segments": { | ||
"type": "array", | ||
"minItems": 1, | ||
"items": { | ||
"type": "object", | ||
"allOf": [{ "$ref": "#/$defs/startend" }] | ||
} | ||
} | ||
} | ||
} | ||
|
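To make the constraints in the new schema easier to follow, here is a minimal Python sketch of the same rules using only the standard library. It is an illustration, not the validator Augur or Auspice actually uses: the names `is_startend`, `is_segments`, and `check_annotations` are hypothetical, and the `end > start` check enforces what the schema only states in a `description`.

```python
import re

# Same pattern as the schema's patternProperties key.
CDS_NAME = re.compile(r"^(?!nuc)[a-zA-Z0-9*_-]+$")


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_startend(obj):
    """Mirror of #/$defs/startend, plus the documented end > start rule."""
    if not isinstance(obj, dict):
        return False
    start, end = obj.get("start"), obj.get("end")
    return _is_int(start) and _is_int(end) and start >= 1 and end >= 2 and end > start


def is_segments(obj):
    """Mirror of #/$defs/segments: a non-empty list of start/end objects."""
    segs = obj.get("segments") if isinstance(obj, dict) else None
    return isinstance(segs, list) and len(segs) >= 1 and all(is_startend(s) for s in segs)


def check_annotations(annotations):
    """Return a list of human-readable problems (empty if none were found)."""
    errors = []
    nuc = annotations.get("nuc")
    if not is_startend(nuc) or nuc["start"] != 1:
        errors.append("nuc: must have start == 1 and an integer end")
    elif nuc.get("strand", "+") != "+":
        errors.append("nuc: strand, if given, must be '+'")
    for name, cds in annotations.items():
        if name == "nuc" or not CDS_NAME.match(name):
            continue  # keys outside the pattern are not constrained by the schema
        if not isinstance(cds, dict) or cds.get("strand") not in ("+", "-"):
            errors.append(f"{name}: strand must be '+' or '-'")
        elif is_startend(cds) == is_segments(cds):
            # oneOf: exactly one of the two shapes must match
            errors.append(f"{name}: needs either start/end or segments, not both or neither")
    return errors
```

Calling `check_annotations` on a dict with a valid `nuc` entry and CDS entries that each carry a `strand` plus either `start`/`end` or `segments` returns an empty list.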
Late to the party, but shouldn't we in principle allow each CDS fragment to have its own strandedness? Here the strand is fixed for all fragments, which will be fine in most cases, but who knows, maybe fragments sometimes come from opposite strands?
ChatGPT tells me that strandedness never changes within a CDS, so what we have here should work in all cases.
With trans-splicing it might happen, but that doesn't seem to occur in viruses, so we should be good for now: https://en.wikipedia.org/wiki/Trans-splicing
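To illustrate the shape discussed in this thread, here is a small Python sketch of a segmented CDS in which `strand` sits on the CDS object and therefore applies to every segment. The helper `cds_length` and the example coordinates are hypothetical. Note that the schema's segment items define only `start` and `end`; no per-segment `strand` is specified.

```python
def cds_length(cds):
    """Total nucleotide length of a CDS, summed over its segments."""
    # A CDS is either a single start/end pair or a list of segments.
    segments = cds.get("segments") or [{"start": cds["start"], "end": cds["end"]}]
    # Coordinates are one-based and inclusive (GFF convention), hence the +1.
    return sum(seg["end"] - seg["start"] + 1 for seg in segments)


# Hypothetical example: two segments, one strand for the whole CDS.
example_cds = {
    "strand": "-",
    "segments": [
        {"start": 101, "end": 160},
        {"start": 201, "end": 230},
    ],
}

print(cds_length(example_cds))
```

If per-segment strandedness were ever needed, it would presumably mean adding a `strand` property to the segment items, which is the change the thread decides to defer.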