Current Behavior
Nextclade provides gene map GFFs for its many datasets. These gene maps use the gene_name qualifier key to define the name of each gene (e.g., the SARS-CoV-2 gene map). Augur's utils.load_features function currently only checks for gene names stored with qualifier keys of gene and locus_tag. When it tries to parse a Nextclade GFF, the load_features function fails to find the gene qualifier, defaults to locus_tag, and then crashes with a key error when that qualifier does not exist.
Expected behavior
We should be able to load features from GFF3 files with gene names referenced by alternate qualifier keys.
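To make the failure concrete, here is a minimal sketch that mimics the lookup described under Current Behavior. It is not Augur's actual code: the function name current_gene_name_lookup, the sample qualifier dictionaries, and the choice to store qualifier values as lists are all assumptions made for illustration.

```python
def current_gene_name_lookup(qualifiers):
    """Hypothetical stand-in for the lookup described in this issue.

    Prefers the "gene" qualifier and otherwise falls back to "locus_tag"
    without checking that the fallback key exists.
    """
    key = "gene" if "gene" in qualifiers else "locus_tag"
    return qualifiers[key][0]


# Illustrative qualifiers only; these values are not copied from a real gene map.
gene_and_locus_tag = {"gene": ["S"], "locus_tag": ["example_tag"]}
nextclade_style = {"gene_name": ["S"]}

print(current_gene_name_lookup(gene_and_locus_tag))  # S

try:
    current_gene_name_lookup(nextclade_style)
except KeyError as error:
    # Neither "gene" nor "locus_tag" exists, so the fallback lookup fails.
    print(f"KeyError: {error}")  # KeyError: 'locus_tag'
```

Under these assumptions, a feature that carries only gene_name never reaches a successful lookup, which matches the crash reported here.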
How to reproduce
Steps to reproduce the current behavior:
1. Obtain a Nextclade gene map GFF (e.g., the SARS-CoV-2 gene map).
2. Import the load_features function from a Python terminal and call it with the gene map GFF.
Possible solution
1. Add a unit test for utils.load_features that attempts to load features from a Nextclade GFF.
2. Add logic to the load_features function to check for gene_name among the other default qualifiers.
Additional context
Related to #953. Although we plan to reimplement load_features with a more modern GFF parser, the solution proposed here will fix this minor but blocking bug in the short term.
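The possible solution described in this issue could look roughly like the sketch below. It is only an illustration under stated assumptions: find_gene_name, DEFAULT_GENE_KEYS, and the priority order of the keys are hypothetical and do not reflect how load_features is actually structured, and qualifier values are assumed to be lists.

```python
# Hypothetical default qualifier keys, checked in this order (the order is an assumption).
DEFAULT_GENE_KEYS = ("gene", "gene_name", "locus_tag")


def find_gene_name(qualifiers, keys=DEFAULT_GENE_KEYS):
    """Return the first gene name found under any of the given keys, or None."""
    for key in keys:
        values = qualifiers.get(key)
        if values:
            return values[0]
    # No known qualifier is present; let the caller decide how to handle it.
    return None


def test_find_gene_name_supports_gene_name_qualifier():
    # Nextclade-style feature that only defines gene_name.
    assert find_gene_name({"gene_name": ["ORF1a"]}) == "ORF1a"
    # Existing behavior is preserved for gene and locus_tag.
    assert find_gene_name({"gene": ["S"], "locus_tag": ["tag1"]}) == "S"
    assert find_gene_name({"locus_tag": ["tag1"]}) == "tag1"
    # A feature without any known qualifier no longer raises a key error.
    assert find_gene_name({"ID": ["feature-1"]}) is None
```

A real unit test would presumably call utils.load_features on a Nextclade GFF file instead of on hand-built dictionaries; the test function here only shows the shape of the check. Returning None for an unmatched feature is a design choice of this sketch that lets the caller skip or report the feature instead of crashing.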