refactor!: clean up app #474

Merged
merged 120 commits into from
Aug 7, 2023
120 commits
b17d52c
refactor: update TokenType + remove protein termination classifier
korikuzma May 10, 2023
008b632
refactor: Remove TokenMatchType (not necessary)
korikuzma May 10, 2023
c2a47bb
forgot to remove additional match_type
korikuzma May 10, 2023
111d104
refactor: clean up handling unknown tokens
korikuzma May 10, 2023
cc79fd1
refactor: Rename GeneMatchToken to GeneToken
korikuzma May 10, 2023
5cd918d
refactor: create AltType str enum
korikuzma May 10, 2023
095c9e9
refactor: update TokenType type
korikuzma May 10, 2023
b79410c
refactor: remove LookupType enum (not used)
korikuzma May 10, 2023
67ab56a
wip: add work for tokenizers
korikuzma May 22, 2023
944d6db
wip: storing progress
korikuzma May 26, 2023
a4c2bf7
wip: store progress for delins
korikuzma Jun 6, 2023
8b76419
wip: store progress for cdna insertion
korikuzma Jun 12, 2023
7857767
wip: store progress for ref agree
korikuzma Jun 12, 2023
6778521
wip: minor cleanup of validators
korikuzma Jun 12, 2023
d567c5a
wip: store progress for protein stop gain
korikuzma Jun 13, 2023
59e7942
wip: store initial work for genomic dup
korikuzma Jun 13, 2023
4dd8043
wip: store progress for genomic ambiguous dups
korikuzma Jun 20, 2023
b652db8
wip: progress for genomic del
korikuzma Jun 20, 2023
d2cf750
wip: remove canonical variation work
korikuzma Jun 22, 2023
d909427
wip: fix classifiers
korikuzma Jun 22, 2023
80dd57c
wip: handle if mane none in translators
korikuzma Jun 22, 2023
1b39114
wip: tmp stop running gh actions
korikuzma Jun 22, 2023
8dd9d92
wip: more progress for normalize
korikuzma Jun 26, 2023
df8e845
wip: store more progress
korikuzma Jun 27, 2023
450981e
wip: add genomic del ambiguous
korikuzma Jun 27, 2023
8310cdf
Merge branch 'main' into issue-332-kori-merge-main
korikuzma Jun 28, 2023
8512eea
wip: fix tokenizers
korikuzma Jun 28, 2023
6ee5110
wip: clean up classifier tests
korikuzma Jun 28, 2023
666ab23
wip: rename coding dna --> cdna
korikuzma Jun 28, 2023
f194615
wip: storing progress for dup1
korikuzma Jun 28, 2023
aa4d895
wip: dup/del progress
korikuzma Jun 29, 2023
082ebe0
wip: more progress for dup del
korikuzma Jun 30, 2023
89aee85
wip: get hgvs dup del mode tests to pass
korikuzma Jun 30, 2023
91c56b0
wip: fix bug in to_vrs
korikuzma Jun 30, 2023
e3296a5
wip: rm unused fixture + fix normalize test
korikuzma Jul 5, 2023
5a3443d
Merge branch 'main' into issue-332-kori
korikuzma Jul 13, 2023
716025d
wip: use local cool-seq-tool
korikuzma Jul 13, 2023
126709b
wip: classification.gene --> classification.gene_token
korikuzma Jul 13, 2023
ef1239e
wip: fixes to gene -> gene_token in classifiers
korikuzma Jul 13, 2023
1a9d375
wip: fix cool-seq-tool imports
korikuzma Jul 14, 2023
73eedcb
wip: fix setting gene in genomic translators
korikuzma Jul 14, 2023
e65d4d7
wip: fix case for BRAF V512E test in normalize
korikuzma Jul 14, 2023
d43f552
wip: update genomic sub for normalize
korikuzma Jul 14, 2023
511da9d
wip: rm support for some acs
korikuzma Jul 14, 2023
c043da3
wip: more progress for genomic
korikuzma Jul 16, 2023
8678516
wip: initial work for to_copy_number
korikuzma Jul 17, 2023
908f89d
wip: sort translation results
korikuzma Jul 17, 2023
66f2e55
wip: rm get_mane_valid_result
korikuzma Jul 17, 2023
8d53275
wip: sort translation result
korikuzma Jul 17, 2023
c5a1324
wip: fix gnomad vcf / genomic insertion
korikuzma Jul 17, 2023
4c36c8b
wip: fix classifier test
korikuzma Jul 18, 2023
8e31b68
wip: more fixes
korikuzma Jul 18, 2023
3866538
wip: fix gnomad vcf to protein
korikuzma Jul 19, 2023
21fcc1a
wip: fix cnv tests
korikuzma Jul 19, 2023
ff10b04
wip: make mappings an instance var
korikuzma Jul 20, 2023
2c6bc3c
wip: update initializing cst
korikuzma Jul 20, 2023
57dd0bf
wip: rm gene_tokens
korikuzma Jul 20, 2023
c8e689b
wip: clean up some todos
korikuzma Jul 20, 2023
ebf63a1
wip: more cleanup
korikuzma Jul 20, 2023
11c5672
wip: clean up some flake8 errors
korikuzma Jul 20, 2023
6a85a6c
wip: clean up del/dup ambiguous translate method
korikuzma Jul 21, 2023
299c1fd
wip: clean up genomic del/dup translator
korikuzma Jul 21, 2023
bdb8882
wip: flake8 + fix gnomad vcf to protein
korikuzma Jul 21, 2023
fcdd907
wip: more flake8
korikuzma Jul 21, 2023
9c4e929
wip: rm todos for issues
korikuzma Jul 24, 2023
1377c8d
wip: fix gnomad vcf deletions
korikuzma Jul 24, 2023
c5f0266
wip: update cool-seq-tool version
korikuzma Jul 24, 2023
b58cde3
wip: transcripts --> accessions
korikuzma Jul 24, 2023
8b1e4ef
wip: rm todos (made new issues / commented on existing)
korikuzma Jul 24, 2023
0110910
wip: coding dna --> cdna
korikuzma Jul 24, 2023
98e9629
wip: validate gene pos
korikuzma Jul 24, 2023
8a25ac4
wip: update validation checks
korikuzma Jul 24, 2023
8bba456
wip: rm todo check liftover note
korikuzma Jul 24, 2023
20fe592
wip: remove unused code
korikuzma Jul 25, 2023
88dd0f2
wip: rename instance vars + imports for vrs-python
korikuzma Jul 25, 2023
8d8e570
wip: clean up class instance vars
korikuzma Jul 25, 2023
06af434
wip: rm CodonTable class, create function in gnomad vcf to protein
korikuzma Jul 25, 2023
5219160
wip: more flake8
korikuzma Jul 25, 2023
5c4b71c
wip: HGVSDupDelModeEnum -> HGVSDupDelModeOption + fix CopyChange import
korikuzma Jul 25, 2023
d213851
wip: clean up validator tests
korikuzma Jul 25, 2023
1724266
wip: move validator fixtures
korikuzma Jul 26, 2023
fdcfe33
wip: rm duplications from to_vrs
korikuzma Jul 26, 2023
ead256d
wip: add amplification import + allow parentheses in hgvs
korikuzma Jul 26, 2023
3a14c63
wip: accidentally used wrong var name
korikuzma Jul 26, 2023
3a170aa
wip: update translator tests + fix translator bugs
korikuzma Jul 26, 2023
330f7b9
wip: add amplification validator test
korikuzma Jul 26, 2023
38354f0
wip: move classifier tests back to yaml
korikuzma Jul 26, 2023
c44575c
wip: move validator tests back to yaml
korikuzma Jul 26, 2023
55b6813
wip: refactor tokenizer tests
korikuzma Jul 26, 2023
858e2dc
wip: cleanup schemas (flake8/enum changes)
korikuzma Jul 27, 2023
802215a
wip: cleanup classifiers (flake8)
korikuzma Jul 27, 2023
187fe34
wip: flake8 for normalize + utils
korikuzma Jul 27, 2023
660fd42
wip: cleanup tests (flake8/vulture)
korikuzma Jul 27, 2023
51c8254
wip: resolve flake8 errors
korikuzma Jul 27, 2023
7a11575
wip: update gh actions
korikuzma Jul 27, 2023
c4672f9
add more comments
korikuzma Jul 27, 2023
031c36a
bump version
korikuzma Jul 27, 2023
d511da4
cleanup: replace flake8 with ruff
korikuzma Jul 29, 2023
18f7058
Merge branch 'main' into issue-332-kori
korikuzma Jul 29, 2023
c4e25eb
cleanup: add black
korikuzma Jul 29, 2023
99bc49e
style: add isort to ruff select
korikuzma Jul 29, 2023
934663d
fix: hgvs_to_copy_number_count requires baseline_copies
korikuzma Jul 29, 2023
0636f48
cicd: combine black + ruff into one job
korikuzma Jul 29, 2023
222778f
style: remove old noqa
korikuzma Jul 29, 2023
6c8c122
refactor: create methods for validating pos
korikuzma Jul 30, 2023
6859e93
refactor: add method for validating protein hgvs classification
korikuzma Jul 30, 2023
6e705d2
refactor: cdna + protein translators
korikuzma Jul 30, 2023
421ba5d
style: ignore ANN003 - missing-type-kwargs
korikuzma Jul 31, 2023
fd26395
fix: classifier import
korikuzma Jul 31, 2023
051d8f7
docs: cleanup readme
korikuzma Aug 1, 2023
60f68eb
refactor: ensure unique list of warnings in service response
korikuzma Aug 1, 2023
d6cbd64
fix: validate Allele in to_vrs_allele
korikuzma Aug 1, 2023
2e2351d
pr review changes
korikuzma Aug 1, 2023
6bcb1ed
fix: forgot to update return in to copy number variation
korikuzma Aug 1, 2023
bf3458b
update readme on why we return cdna when given gene genomic change
korikuzma Aug 1, 2023
1bb95ae
pr review changes
korikuzma Aug 1, 2023
ce7bbfb
update invalid tests for normalize + put in todo reminder
korikuzma Aug 1, 2023
ae7ef17
tests: add test for genomic delins change w gene
korikuzma Aug 2, 2023
4bac65d
tests: stop checking exact gene normalizer response
korikuzma Aug 2, 2023
f422b4d
pr review changes: update type hints + docstrings
korikuzma Aug 2, 2023
wip: remove canonical variation work
korikuzma committed Jun 22, 2023
commit d2cf750902d8c6224930e25b966e56b6edc04d28
495 changes: 0 additions & 495 deletions tests/test_to_canonical_variation.py

This file was deleted.

59 changes: 3 additions & 56 deletions variation/main.py
@@ -20,9 +20,9 @@
 from variation.schemas.hgvs_to_copy_number_schema import \
     HgvsToCopyNumberCountService, HgvsToCopyNumberChangeService
 from variation.query import QueryHandler
-from variation.schemas.normalize_response_schema \
-    import HGVSDupDelMode as HGVSDupDelModeEnum, ToCanonicalVariationFmt, \
-    ToCanonicalVariationService, TranslateIdentifierService
+from variation.schemas.normalize_response_schema import (
+    HGVSDupDelMode as HGVSDupDelModeEnum, TranslateIdentifierService
+)
 from variation.schemas.service_schema import AmplificationToCxVarService, \
     ClinVarAssembly, ParsedToCnVarService, ToGenomicService, ToCdnaService
 from .version import __version__
@@ -36,7 +36,6 @@ class Tags(Enum):

     SEQREPO = "SeqRepo"
     TO_PROTEIN_VARIATION = "To Protein Variation"
-    TO_CANONICAL = "To Canonical Variation"
     VRS_PYTHON = "VRS-Python"
     TO_COPY_NUMBER_VARIATION = "To Copy Number Variation"
     ALIGNMENT_MAPPER = "Alignment Mapper"
@@ -313,58 +312,6 @@ async def gnomad_vcf_to_protein(
"`fmt`=`hgvs` and Copy Number Count Variation.`"


-@app.get("/variation/to_canonical_variation",
-         summary="Given SPDI or HGVS, return VRSATILE Canonical Variation",
-         response_description="A response to a validly-formed query.",
-         description="Return VRSATILE Canonical Variation",
-         response_model=ToCanonicalVariationService,
-         tags=[Tags.TO_CANONICAL])
-async def to_canonical_variation(
-    q: str = Query(..., description="HGVS or SPDI query"),
-    fmt: ToCanonicalVariationFmt = Query(...,
-        description="Format of the input variation. Must be `spdi` or `hgvs`"),  # noqa: E501
-    complement: bool = Query(False, description=complement_descr),
-    do_liftover: bool = Query(False, description="Whether or not to liftover to "
-                              "GRCh38 assembly."),
-    hgvs_dup_del_mode: Optional[HGVSDupDelModeEnum] = Query(
-        HGVSDupDelModeEnum.DEFAULT, description=hgvs_dup_del_mode_decsr),
-    copy_change: Optional[CopyChange] = Query(
-        None, description=copy_change_descr),
-    baseline_copies: Optional[int] = Query(
-        None, description=baseline_copies_descr),
-    untranslatable_returns_text: bool = Query(
-        False, description=untranslatable_descr)
-) -> ToCanonicalVariationService:
-    """Return categorical variation for canonical SPDI
-
-    :param str q: HGVS or SPDI query
-    :param ToCanonicalVariationFmt fmt: Format of the input variation. Must be
-        `spdi` or `hgvs`.
-    :param bool complement: This field indicates that a categorical variation
-        is defined to include (false) or exclude (true) variation concepts matching the
-        categorical variation. This is equivalent to a logical NOT operation on the
-        categorical variation properties.
-    :param Optional[HGVSDupDelModeEnum] hgvs_dup_del_mode: Determines how to interpret
-        HGVS dup/del expressions in VRS. Must be one of: `default`, `copy_number_count`,
-        `copy_number_change`, `repeated_seq_expr`, `literal_seq_expr`
-    :param Optional[CopyChange] copy_change: copy change.
-        Only used when `fmt`=`hgvs` and Copy Number Change Variation.
-    :param Optional[int] baseline_copies: Baseline copies number
-        Only used when `fmt`=`hgvs` and Copy Number Count Variation
-    :param bool untranslatable_returns_text: `True` return VRS Text Object when
-        unable to translate or normalize query. `False` return `None` when
-        unable to translate or normalize query.
-    :return: ToCanonicalVariationService for variation query
-    """
-    q = unquote(q)
-    resp = await query_handler.to_canonical_handler.to_canonical_variation(
-        q, fmt, complement, do_liftover=do_liftover,
-        hgvs_dup_del_mode=hgvs_dup_del_mode, baseline_copies=baseline_copies,
-        copy_change=copy_change,
-        untranslatable_returns_text=untranslatable_returns_text)
-    return resp
-
-
 def _get_allele(request_body: Union[TranslateToQuery, TranslateToHGVSQuery],
                 warnings: List) -> Optional[models.Allele]:
     """Return VRS allele object from request body. `warnings` will get updated if
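The removed handler calls `unquote(q)` before handing the query to the translator. The import is outside the hunk shown, so the sketch below assumes it is `urllib.parse.unquote` from the standard library. It shows why the step matters: an HGVS expression contains `:` and `>`, which a client may percent-encode in the URL. The HGVS string is only an illustrative input.

```python
from urllib.parse import quote, unquote

# Illustrative HGVS expression; any string containing ":" or ">" behaves the same.
hgvs = "NC_000007.13:g.140453136A>T"

# A client that percent-encodes every reserved character sends this form.
encoded = quote(hgvs, safe="")

# Assumption: this is the same `unquote` the handler applies to `q`.
decoded = unquote(encoded)

print(encoded)
print(decoded)
```

Calling `unquote` on a string that is already decoded leaves it unchanged as long as it contains no `%` escapes, so the step is harmless for the HGVS and SPDI inputs shown in this diff.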
3 changes: 0 additions & 3 deletions variation/query.py
@@ -19,7 +19,6 @@
 from variation.to_vrsatile import ToVRSATILE
 from variation.gnomad_vcf_to_protein_variation import GnomadVcfToProteinVariation
 from variation.normalize import Normalize
-from variation.to_canonical_variation import ToCanonicalVariation
 from variation.to_copy_number_variation import ToCopyNumberVariation


@@ -100,6 +99,4 @@ def __init__(
             mane_transcript_mappings, codon_table]
         self.gnomad_vcf_to_protein_handler = GnomadVcfToProteinVariation(
             *to_protein_params)
-        self.to_canonical_handler = ToCanonicalVariation(
-            *to_vrs_params + [self._tlr, uta_db])
         self.to_copy_number_handler = ToCopyNumberVariation(*to_vrs_params)
1 change: 0 additions & 1 deletion variation/schemas/app_schemas.py
@@ -12,4 +12,3 @@ class Endpoint(str, Enum):
     TRANSLATE_FROM = "translate_from"
     HGVS_TO_COPY_NUMBER_COUNT = "hgvs_to_copy_number_count"
     HGVS_TO_COPY_NUMBER_CHANGE = "hgvs_to_copy_number_change"
-    TO_CANONICAL = "to_canonical"
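`Endpoint` subclasses both `str` and `Enum`, so its members compare equal to plain strings and a lookup by value fails once a member is removed. The sketch below copies only the three members visible in this hunk (the real class has more) and adds a hypothetical helper, `is_known_endpoint`, to show the effect of dropping `TO_CANONICAL`.

```python
from enum import Enum


class Endpoint(str, Enum):
    """Illustrative copy of the members visible in the hunk, after this commit."""

    TRANSLATE_FROM = "translate_from"
    HGVS_TO_COPY_NUMBER_COUNT = "hgvs_to_copy_number_count"
    HGVS_TO_COPY_NUMBER_CHANGE = "hgvs_to_copy_number_change"


def is_known_endpoint(name: str) -> bool:
    """Hypothetical helper: True if `name` is the value of an Endpoint member."""
    try:
        Endpoint(name)  # lookup by value raises ValueError for unknown names
    except ValueError:
        return False
    return True


# Members of a str-based Enum compare equal to their string values.
print(Endpoint.TRANSLATE_FROM == "translate_from")
print(is_known_endpoint("to_canonical"))
```

Any caller that still builds `Endpoint("to_canonical")` after this commit would hit the `ValueError` branch, which is one way to find leftover references.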
69 changes: 1 addition & 68 deletions variation/schemas/normalize_response_schema.py
@@ -5,8 +5,7 @@

 from pydantic import BaseModel
 from pydantic.types import StrictStr
-from ga4gh.vrsatile.pydantic.vrsatile_models import VariationDescriptor, \
-    CanonicalVariation
+from ga4gh.vrsatile.pydantic.vrsatile_models import VariationDescriptor


class HGVSDupDelMode(str, Enum):
@@ -275,69 +274,3 @@ def schema_extra(schema: Dict[str, Any],
                     "url": "https://github.com/cancervariants/variation-normalization"  # noqa: E501
                 }
             }
-
-
-class ToCanonicalVariationFmt(str, Enum):
-    """Define formats for to_canonical endpoint"""
-
-    HGVS = "hgvs"
-    SPDI = "spdi"
-
-
-class ToCanonicalVariationService(ServiceResponse):
-    """A response model for the to canonical variation service"""
-
-    query: str
-    canonical_variation: Optional[CanonicalVariation]
-
-    class Config:
-        """Configure model."""
-
-        @staticmethod
-        def schema_extra(
-                schema: Dict[str, Any],
-                model: Type["ToCanonicalVariationService"]) -> None:
-            """Configure OpenAPI schema."""
-            if "title" in schema.keys():
-                schema.pop("title", None)
-            for prop in schema.get("properties", {}).values():
-                prop.pop("title", None)
-            schema["example"] = {
-                "query": "NC_000007.14:140753335:A:T",
-                "warnings": [],
-                "canonical_variation": {
-                    "_id": "ga4gh:VCC.W0r_NF_ecKXjgvTwcMNkyVS1pB_CXMj9",
-                    "type": "CanonicalVariation",
-                    "complement": False,
-                    "variation": {
-                        "_id": "ga4gh:VA.fZiBjQEolbkL0AxjoTZf4SOkFy9J0ebU",
-                        "type": "Allele",
-                        "location": {
-                            "_id": "ga4gh:VSL.zga82-TpYiNmBESCfvDvAz9DyvJF98I-",
-                            "type": "SequenceLocation",
-                            "sequence_id": "ga4gh:SQ.F-LrLMe1SRpfUZHkQmvkVKFEGaoDeHul",
-                            "interval": {
-                                "type": "SequenceInterval",
-                                "start": {
-                                    "type": "Number",
-                                    "value": 140753335
-                                },
-                                "end": {
-                                    "type": "Number",
-                                    "value": 140753336
-                                }
-                            }
-                        },
-                        "state": {
-                            "type": "LiteralSequenceExpression",
-                            "sequence": "T"
-                        }
-                    }
-                },
-                "service_meta_": {
-                    "version": "0.2.20",
-                    "response_datetime": "2022-02-20T17:16:19.415675",
-                    "name": "variation-normalizer",
-                    "url": "https://github.com/cancervariants/variation-normalization"
-                }
-            }
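The deleted `schema_extra` hook strips the auto-generated `title` keys from the model's schema before attaching an example. A minimal sketch of that title-stripping step on a plain dictionary, without pydantic, is below. The function name `strip_titles` and the sample schema are illustrative and not part of the repository.

```python
from typing import Any, Dict


def strip_titles(schema: Dict[str, Any]) -> None:
    """Remove "title" from the schema and from each property, in place."""
    # pop() with a default is a no-op when the key is absent, so no
    # membership check is needed first.
    schema.pop("title", None)
    for prop in schema.get("properties", {}).values():
        prop.pop("title", None)


# Hand-written stand-in for a generated model schema.
schema = {
    "title": "ToCanonicalVariationService",
    "type": "object",
    "properties": {
        "query": {"title": "Query", "type": "string"},
        "warnings": {"title": "Warnings", "type": "array"},
    },
}

strip_titles(schema)
print(schema)
```

The `if "title" in schema.keys()` guard in the deleted code has the same effect as the bare `pop` used here, since `pop("title", None)` already tolerates a missing key.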