feat: add optional integration of GestaltMatcher/PEDIA (#399, #1125) #1249

Merged Sep 27, 2024 (78 commits)
Changes from 63 commits

Commits
6418176
Merge pull request #1 from bihealth/main
ahujameg Apr 3, 2023
06ac783
Merge branch 'bihealth:main' into main
ahujameg Oct 12, 2023
fd85dfb
GestaltMatcher Integration
ahujameg Nov 10, 2023
3d04399
Merge branch 'bihealth:main' into main
ahujameg Nov 10, 2023
3c8f28d
Merge branch 'main' into pedia
ahujameg Nov 10, 2023
1e57325
Restoring
ahujameg Nov 10, 2023
eeb150c
Implemented PEDIA integration into VarFish ticket #399
ahujameg Feb 7, 2024
9ce7b40
Merge branch 'varfish-org:main' into pedia
ahujameg Feb 7, 2024
f4a363a
Implemented PEDIA integration into VarFish ticket #399
ahujameg Feb 7, 2024
f0de558
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Feb 8, 2024
013e012
Updating package-lock.json
ahujameg Feb 8, 2024
7bc5ba1
Updating package-lock.json
ahujameg Feb 8, 2024
be070fe
Merge remote-tracking branch 'origin/pedia' into pedia
ahujameg Feb 8, 2024
e8c322c
Updating package-lock.json
ahujameg Feb 8, 2024
e0e38f4
Merge remote-tracking branch 'origin/pedia' into pedia
ahujameg Feb 13, 2024
54da902
Merge branch 'varfish-org:main' into pedia
ahujameg Feb 13, 2024
8c3e135
Updating package-lock.json
ahujameg Feb 13, 2024
d65950d
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Feb 13, 2024
beec4bf
Updating package-lock.json
ahujameg Feb 13, 2024
2d4a396
Merge remote-tracking branch 'origin/pedia' into pedia
ahujameg Feb 14, 2024
596c9d8
Updating package-lock.json
ahujameg Feb 14, 2024
8607eb9
Merge remote-tracking branch 'origin/pedia' into pedia
ahujameg Feb 14, 2024
5b90dce
Updating package-lock.json
ahujameg Feb 14, 2024
67f068a
Updating package-lock.json
ahujameg Feb 14, 2024
16fd695
Updating package-lock.json
ahujameg Feb 14, 2024
2132267
Updating package-lock.json
ahujameg Feb 16, 2024
6eef715
Updating package-lock.json
ahujameg Feb 16, 2024
f0e18e3
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Feb 16, 2024
5578b3c
Resolving CI failures
ahujameg Feb 16, 2024
ced1187
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Feb 16, 2024
0e3be1e
Resolving CI failures
ahujameg Feb 16, 2024
67537f2
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Feb 16, 2024
0cca9da
Fixing CI errors
ahujameg Feb 16, 2024
54ae75d
Merge remote-tracking branch 'origin/pedia' into pedia
ahujameg Feb 16, 2024
851ab06
Resolving CI failures
ahujameg Feb 16, 2024
575e950
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Feb 16, 2024
b6ee2d2
Resolving CI failures
ahujameg Feb 16, 2024
cda671b
Merge remote-tracking branch 'origin/pedia' into pedia
ahujameg Feb 19, 2024
240ad60
Resolving CI failures
ahujameg Feb 19, 2024
41e7bde
Merge remote-tracking branch 'origin/pedia' into pedia
ahujameg Feb 19, 2024
e41e89b
Resolving CI errors.
ahujameg Feb 19, 2024
ec0bcb9
Merge remote-tracking branch 'origin/pedia' into pedia
ahujameg Feb 20, 2024
7fe2842
Improving code coverage
ahujameg Feb 20, 2024
a8db61b
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Mar 8, 2024
72527a4
Merge remote-tracking branch 'upstream/main' into pedia
ahujameg Mar 8, 2024
d8c8e69
Improving code coverage
ahujameg Mar 8, 2024
95a768e
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Mar 8, 2024
c64977d
Improving code coverage
ahujameg Mar 8, 2024
a49892d
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Mar 8, 2024
2ba6004
Improving code coverage
ahujameg Mar 8, 2024
008e339
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Mar 10, 2024
8a45344
Merge branch 'varfish-org:main' into pedia
ahujameg Mar 12, 2024
f6221c3
Squashing commits.
ahujameg Mar 12, 2024
795193c
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Mar 12, 2024
eacb634
Merge latest upstream changes.
ahujameg Jul 15, 2024
e684df2
Moving ext_gestaltmatcher to backend.
ahujameg Jul 15, 2024
df3f099
Fixing test case for file export.
ahujameg Jul 16, 2024
33f7def
Update and merge latest upstream code.
ahujameg Jul 16, 2024
8d1d140
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Jul 16, 2024
d4b174a
Update and merge latest upstream code.
ahujameg Jul 16, 2024
656f6e2
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Jul 17, 2024
53c282f
Update and merge latest upstream code.
ahujameg Jul 17, 2024
2b31c80
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Jul 17, 2024
d316793
Implement review comments.
ahujameg Jul 17, 2024
6641237
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Sep 23, 2024
2998243
Merge branch 'main' into pedia
ahujameg Sep 23, 2024
718c86c
Implement review comments.
ahujameg Sep 23, 2024
6c814a4
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Sep 23, 2024
30ece82
Implement review comments.
ahujameg Sep 23, 2024
17e1cad
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Sep 23, 2024
13ec8ad
Implementing review comments
ahujameg Sep 23, 2024
4faf141
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Sep 24, 2024
6bcc360
Implementing review comments
ahujameg Sep 24, 2024
b6ba13b
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Sep 24, 2024
6ba33e6
Implementing review comments
ahujameg Sep 24, 2024
1680773
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Sep 24, 2024
5bc9379
Implementing review comments
ahujameg Sep 24, 2024
c62fe36
Merge branch 'pedia' of https://github.com/ahujameg/varfish-server in…
ahujameg Sep 26, 2024
1 change: 1 addition & 0 deletions backend/cases/views.py
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,7 @@ def get_context_data(self, *args, **kwargs):
),
"exomiser_enabled": settings.VARFISH_ENABLE_EXOMISER_PRIORITISER,
"cadd_enabled": settings.VARFISH_ENABLE_CADD,
"cada_enabled": settings.VARFISH_ENABLE_CADA,
"extra_anno_fields": extra_anno_fields,
"url_prefixes": {
"annonars": settings.VARFISH_BACKEND_URL_PREFIX_ANNONARS,
Expand Down
27 changes: 16 additions & 11 deletions backend/config/settings/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -132,6 +132,7 @@
"varannos.apps.VarannosConfig",
# Legacy apps - not used anymore!
"hgmd.apps.HgmdConfig",
"ext_gestaltmatcher.apps.ExtGestaltmatcherConfig",
]

# See: https://docs.djangoproject.com/en/dev/ref/settings/#installed-apps
Expand Down Expand Up @@ -537,6 +538,15 @@
"VARFISH_CADA_REST_API_URL", "https://cada.gene-talk.de/api/process"
)

# Enable PEDIA prioritization.
VARFISH_ENABLE_PEDIA = env.bool("VARFISH_ENABLE_PEDIA", default=False)
VARFISH_PEDIA_REST_API_URL = env.str("VARFISH_PEDIA_REST_API_URL", "http://127.0.0.1:9000/pedia")

# Enable Gestalt-based prioritization.
VARFISH_ENABLE_GESTALT_MATCHER = env.bool("VARFISH_ENABLE_GESTALT_MATCHER", default=False)
# Configure URL to GestaltMatcher REST API
VARFISH_GM_SENDER_URL = env.str("VARFISH_GM_SENDER_URL", "http://127.0.0.1:7000/")

# Enable submission of variants to CADD server.
VARFISH_ENABLE_CADD_SUBMISSION = env.bool("VARFISH_ENABLE_CADD_SUBMISSION", default=False)
# CADD version to use for submission
Expand Down Expand Up @@ -780,21 +790,16 @@ def set_logging(level):
AUTH_LDAP_SERVER_URI = env.str("AUTH_LDAP_SERVER_URI", None)
AUTH_LDAP_BIND_DN = env.str("AUTH_LDAP_BIND_DN", None)
AUTH_LDAP_BIND_PASSWORD = env.str("AUTH_LDAP_BIND_PASSWORD", None)
AUTH_LDAP_START_TLS = env.str("AUTH_LDAP_START_TLS", False)
AUTH_LDAP_CA_CERT_FILE = env.str("AUTH_LDAP_CA_CERT_FILE", None)
AUTH_LDAP_CONNECTION_OPTIONS = {**LDAP_DEFAULT_CONN_OPTIONS}
if AUTH_LDAP_CA_CERT_FILE:
AUTH_LDAP_CONNECTION_OPTIONS[ldap.OPT_X_TLS_CACERTFILE] = AUTH_LDAP_CA_CERT_FILE
AUTH_LDAP_CONNECTION_OPTIONS[ldap.OPT_X_TLS_NEWCTX] = 0
AUTH_LDAP_USER_FILTER = env.str("AUTH_LDAP_USER_FILTER", "(sAMAccountName=%(user)s)")

AUTH_LDAP_USER_SEARCH_BASE = env.str("AUTH_LDAP_USER_SEARCH_BASE", None)
AUTH_LDAP_CONNECTION_OPTIONS = LDAP_DEFAULT_CONN_OPTIONS

AUTH_LDAP_USER_SEARCH = LDAPSearch(
AUTH_LDAP_USER_SEARCH_BASE, ldap.SCOPE_SUBTREE, LDAP_DEFAULT_FILTERSTR
env.str("AUTH_LDAP_USER_SEARCH_BASE", None),
ldap.SCOPE_SUBTREE,
LDAP_DEFAULT_FILTERSTR,
)
AUTH_LDAP_USER_ATTR_MAP = LDAP_DEFAULT_ATTR_MAP
AUTH_LDAP_USERNAME_DOMAIN = env.str("AUTH_LDAP_USERNAME_DOMAIN", None)
AUTH_LDAP_DOMAIN_PRINTABLE = env.str("AUTH_LDAP_DOMAIN_PRINTABLE", AUTH_LDAP_USERNAME_DOMAIN)
AUTH_LDAP_DOMAIN_PRINTABLE = env.str("AUTH_LDAP_DOMAIN_PRINTABLE", None)

AUTHENTICATION_BACKENDS = tuple(
itertools.chain(("projectroles.auth_backends.PrimaryLDAPBackend",), AUTHENTICATION_BACKENDS)
Expand Down
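Reviewer note: the new `VARFISH_ENABLE_PEDIA` and `VARFISH_ENABLE_GESTALT_MATCHER` settings in `base.py` default to off, and the `file_export.py` changes in this PR only use a prioritizer when both the server-wide setting and the per-query option (`pedia_enabled` / `gm_enabled`) are set. The following is a minimal standalone sketch of that two-level gate. It reads the environment directly with a hypothetical `env_bool` helper instead of the project's `env.bool`, and the accepted truthy strings are an assumption for illustration only.

```python
import os


def env_bool(name, default=False):
    """Hypothetical stand-in for the project's env.bool(); not the real helper."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Assumed set of truthy spellings, chosen for this sketch only.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def is_gm_enabled(query_args):
    """Server-wide flag AND per-query flag must both be set."""
    return env_bool("VARFISH_ENABLE_GESTALT_MATCHER") and bool(query_args.get("gm_enabled"))


def is_pedia_enabled(query_args):
    """Same gate for PEDIA prioritization."""
    return env_bool("VARFISH_ENABLE_PEDIA") and bool(query_args.get("pedia_enabled"))
```

With neither environment variable set, both functions return `False` regardless of the query arguments, which matches the `default=False` in the diff above.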
Empty file.
7 changes: 7 additions & 0 deletions backend/ext_gestaltmatcher/admin.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
from django.contrib import admin

from .models import SmallVariantQueryGestaltMatcherScores, SmallVariantQueryPediaScores

# Register your models here.
admin.site.register(SmallVariantQueryGestaltMatcherScores)
admin.site.register(SmallVariantQueryPediaScores)
6 changes: 6 additions & 0 deletions backend/ext_gestaltmatcher/apps.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
from django.apps import AppConfig


class ExtGestaltmatcherConfig(AppConfig):
    default_auto_field = "django.db.models.BigAutoField"
    name = "ext_gestaltmatcher"
34 changes: 34 additions & 0 deletions backend/ext_gestaltmatcher/migrations/0001_initial.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
# -*- coding: utf-8 -*-
# Generated by Django 1.11.20 on 2023-10-20 07:18
from __future__ import unicode_literals

from django.db import migrations, models
import django.db.models.deletion


class Migration(migrations.Migration):
    dependencies = []

    operations = [
        migrations.CreateModel(
            name="SmallVariantQueryGestaltMatcherScores",
            fields=[
                (
                    "id",
                    models.AutoField(
                        auto_created=True, primary_key=True, serialize=False, verbose_name="ID"
                    ),
                ),
                ("gene_id", models.CharField(help_text="Entrez gene ID", max_length=64)),
                ("gene_symbol", models.CharField(help_text="The gene symbol", max_length=128)),
                ("priority_type", models.CharField(help_text="The priority type", max_length=64)),
                ("score", models.FloatField(help_text="The gene score")),
                (
                    "query",
                    models.ForeignKey(
                        on_delete=django.db.models.deletion.CASCADE, to="variants.SmallVariantQuery"
                    ),
                ),
            ],
        )
    ]
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
# -*- coding: utf-8 -*-
# Generated by Django 1.11.20 on 2023-11-14 07:18
from __future__ import unicode_literals

from django.db import migrations, models
import django.db.models.deletion


class Migration(migrations.Migration):
    dependencies = [("ext_gestaltmatcher", "0001_initial")]

    operations = [
        migrations.CreateModel(
            name="SmallVariantQueryPediaScores",
            fields=[
                (
                    "id",
                    models.AutoField(
                        auto_created=True, primary_key=True, serialize=False, verbose_name="ID"
                    ),
                ),
                ("gene_id", models.CharField(help_text="Entrez gene ID", max_length=64)),
                ("gene_symbol", models.CharField(help_text="The gene symbol", max_length=128)),
                ("score", models.FloatField(help_text="The gene score")),
                (
                    "query",
                    models.ForeignKey(
                        on_delete=django.db.models.deletion.CASCADE, to="variants.SmallVariantQuery"
                    ),
                ),
            ],
        )
    ]
Empty file.
43 changes: 43 additions & 0 deletions backend/ext_gestaltmatcher/models.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
from django.db import models


# Create your models here.
class SmallVariantQueryGestaltMatcherScores(models.Model):
    """Annotate ``SmallVariantQuery`` with Gestalt Matcher scores (if configured to do so)."""

    #: The query to annotate.
    query = models.ForeignKey("variants.SmallVariantQuery", on_delete=models.CASCADE)

    #: The Entrez gene ID.
    gene_id = models.CharField(max_length=64, null=False, blank=False, help_text="Entrez gene ID")

    #: The gene symbol.
    gene_symbol = models.CharField(
        max_length=128, null=False, blank=False, help_text="The gene symbol"
    )

    #: The priority type.
    priority_type = models.CharField(
        max_length=64, null=False, blank=False, help_text="The priority type"
    )

    #: The score.
    score = models.FloatField(null=False, blank=False, help_text="The gene score")


class SmallVariantQueryPediaScores(models.Model):
    """Annotate ``SmallVariantQuery`` with PEDIA scores (if configured to do so)."""

    #: The query to annotate.
    query = models.ForeignKey("variants.SmallVariantQuery", on_delete=models.CASCADE)

    #: The Entrez gene ID.
    gene_id = models.CharField(max_length=64, null=False, blank=False, help_text="Entrez gene ID")

    #: The gene symbol.
    gene_symbol = models.CharField(
        max_length=128, null=False, blank=False, help_text="The gene symbol"
    )

    #: The score.
    score = models.FloatField(null=False, blank=False, help_text="The gene score")
99 changes: 97 additions & 2 deletions backend/variants/file_export.py
Original file line number Diff line number Diff line change
Expand Up @@ -24,11 +24,15 @@
ExportProjectCasesFileBgJobResult,
SmallVariantComment,
VariantScoresFactory,
annotate_with_gm_scores,
annotate_with_joint_scores,
annotate_with_pathogenicity_scores,
annotate_with_pedia_scores,
annotate_with_phenotype_scores,
annotate_with_transcripts,
get_pedia_scores,
prioritize_genes,
prioritize_genes_gm,
unroll_extra_annos_result,
)
from .queries import (
Expand Down Expand Up @@ -122,6 +126,16 @@ def to_str(val):
("phenotype_rank", "Phenotype Rank", int),
)

HEADERS_GM_SCORES = (
("gm_score", "Gestalt Score", float),
("gm_rank", "Gestalt Rank", int),
)

HEADERS_PEDIA_SCORES = (
("pedia_score", "PEDIA Score", float),
("pedia_rank", "PEDIA Rank", int),
)

#: Names of the pathogenicity scoring header columns.
HEADERS_PATHO_SCORES = (
("pathogenicity_score", "Pathogenicity Score", float),
Expand Down Expand Up @@ -318,6 +332,14 @@ def _is_prioritization_enabled(self):
)
)

def _is_gm_enabled(self):
"""Return whether Gestalt Matcher prioritization is enabled in this query."""
return settings.VARFISH_ENABLE_GESTALT_MATCHER and self.query_args.get("gm_enabled")

def _is_pedia_enabled(self):
"""Return whether PEDIA prioritization is enabled in this query."""
return settings.VARFISH_ENABLE_PEDIA and self.query_args.get("pedia_enabled")

def _is_pathogenicity_enabled(self):
"""Return whether pathogenicity scoring is enabled in this query."""
return settings.VARFISH_ENABLE_CADD and all(
Expand Down Expand Up @@ -352,6 +374,10 @@ def _yield_columns(self, members):
header += HEADERS_TRANSCRIPTS
if self._is_prioritization_enabled() and self._is_pathogenicity_enabled():
header += HEADERS_JOINT_SCORES
if self._is_gm_enabled():
header += HEADERS_GM_SCORES
if self._is_pedia_enabled():
header += HEADERS_PEDIA_SCORES
header += HEADER_FLAGS
header += HEADER_COMMENTS
header += self.get_extra_annos_headers()
Expand Down Expand Up @@ -391,13 +417,25 @@ def _yield_smallvars(self):
_result = annotate_with_pathogenicity_scores(_result, variant_scores)
if self._is_prioritization_enabled() and self._is_pathogenicity_enabled():
_result = annotate_with_joint_scores(_result)
if self._is_gm_enabled():
gene_scores = self._fetch_gm_scores([entry.entrez_id for entry in _result])
_result = annotate_with_gm_scores(_result, gene_scores)
if self._is_pedia_enabled():
pedia_scores = self._fetch_pedia_scores(_result)
if pedia_scores:
_result = annotate_with_pedia_scores(_result, pedia_scores)
fields = {x[1].label: x[0] for x in enumerate(list(ExtraAnnoField.objects.all()))}
_result = unroll_extra_annos_result(_result, fields)
self.job.add_log_entry("Writing output file...")
total = len(_result)
steps = math.ceil(total / 10)
for i, small_var in enumerate(_result):
if self._is_prioritization_enabled() or self._is_pathogenicity_enabled():
if (
self._is_prioritization_enabled()
or self._is_pathogenicity_enabled()
or self._is_gm_enabled()
or self._is_pedia_enabled()
):
if i % steps == 0:
self.job.add_log_entry("{}%".format(int(100 * i / total)))
else:
Expand All @@ -421,7 +459,7 @@ def _fetch_gene_scores(self, entrez_ids):
if self._is_prioritization_enabled():
try:
prio_algorithm = self.query_args.get("prio_algorithm")
hpo_terms = tuple(sorted(self.query_args.get("prio_hpo_terms_curated", [])))
hpo_terms = tuple(sorted(self.query_args.get("prio_hpo_terms", [])))
return {
str(gene_id): score
for gene_id, _, score, _ in prioritize_genes(
Expand All @@ -433,6 +471,63 @@ def _fetch_gene_scores(self, entrez_ids):
else:
return {}

def _fetch_gm_scores(self, entrez_ids):
prio_gm = self.query_args.get("prio_gm")
if all((self._is_gm_enabled(), prio_gm)):
try:
return {
str(gene_id): score
for gene_id, gene_symbol, score, priority_type in prioritize_genes_gm(
prio_gm, logging=self.job.add_log_entry
)
}
except ConnectionError as e:
self.job.add_log_entry(e)
else:
return {}

def _fetch_pedia_scores(self, result):
if self._is_pedia_enabled():
try:
payloadList = []

# Build the JSON payload by reading ``result``.
for line in result:
payload = dict()

if all(
(
line.entrez_id,
hasattr(line, "phenotype_score"),
hasattr(line, "pathogenicity_score"),
hasattr(line, "gm_score"),
)
):
payload["gene_name"] = line.symbol
payload["gene_id"] = line.entrez_id

payload["cada_score"] = line.phenotype_score
payload["cadd_score"] = line.pathogenicity_score
payload["gestalt_score"] = (
0 if line.gm_score == float("inf") else line.gm_score
)

payload["label"] = False
payloadList.append(payload)

case_name = self.job.case.name
if case_name.startswith("F_"):
name = case_name[2:] # Remove the first two characters ("F_")
else:
name = case_name
scores = {"case_name": name, "genes": payloadList}

return {str(gene_id): score for gene_id, _, score in get_pedia_scores(scores)}
except ConnectionError as e:
self.job.add_log_entry(e)
else:
return {}

def _fetch_variant_scores(self, variants):
if self._is_pathogenicity_enabled():
try:
Expand Down
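Reviewer note: the payload assembled in `_fetch_pedia_scores` can be easier to reason about as a standalone function. The sketch below is not the merged code; `build_pedia_payload` is a hypothetical name, and it assumes that rows lacking an Entrez ID or any of the three scores are skipped entirely (the scraped diff has lost its indentation, so whether `append` sits inside the `if` is not certain). The key names, the `float("inf")` to `0` replacement, and the stripping of a leading `F_` from the case name are taken from the diff above.

```python
from types import SimpleNamespace


def build_pedia_payload(case_name, rows):
    """Assemble the request body for the PEDIA service from annotated result rows."""
    genes = []
    for row in rows:
        has_all_scores = all(
            hasattr(row, attr)
            for attr in ("phenotype_score", "pathogenicity_score", "gm_score")
        )
        # Assumption: incomplete rows are skipped rather than sent without scores.
        if not (row.entrez_id and has_all_scores):
            continue
        genes.append(
            {
                "gene_name": row.symbol,
                "gene_id": row.entrez_id,
                "cada_score": row.phenotype_score,
                "cadd_score": row.pathogenicity_score,
                # An infinite gestalt score is replaced by 0, as in the diff.
                "gestalt_score": 0 if row.gm_score == float("inf") else row.gm_score,
                "label": False,
            }
        )
    # Drop a leading "F_" from the case name, as in the diff.
    name = case_name[2:] if case_name.startswith("F_") else case_name
    return {"case_name": name, "genes": genes}


# Usage with stand-in rows (SimpleNamespace mimics the attribute access on result rows).
example_rows = [
    SimpleNamespace(
        entrez_id="1",
        symbol="GENE_A",
        phenotype_score=0.5,
        pathogenicity_score=20.0,
        gm_score=float("inf"),
    ),
    SimpleNamespace(entrez_id="2", symbol="GENE_B"),  # no scores: skipped
]
example_payload = build_pedia_payload("F_case1", example_rows)
```

Keeping the payload construction free of `self` and of the HTTP call would also let it be unit-tested without a running PEDIA service.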