Archived Builds
This page provides an overview of the PKT Human Disease Knowledge Graph Benchmark and lists all builds by release and build date.
👉 For additional details on the Human Disease Benchmark Builds, please see the associated manuscript: https://doi.org/10.48550/arXiv.2307.05727.
The PKT Human Disease KG was built to model mechanisms of human disease. It includes the Central Dogma and represents multiple biological scales of organization, including the molecular, cellular, tissue, and organ levels. The knowledge representation (below) was designed in collaboration with a PhD-level molecular biologist.
(Click Figure to Enlarge)
The PKT Human Disease KG was constructed using 12 OBO Foundry ontologies, 31 Linked Open Data sets, and results from two large-scale experiments (Supplementary material).
- The 12 OBO Foundry ontologies were selected to represent chemicals and vaccines (i.e., Chemical Entities of Biological Interest [ChEBI] and Vaccine Ontology [VO]), cells and cell lines (i.e., Cell Ontology [CL], Cell Line Ontology [CLO]), gene/gene product attributes (i.e., Gene Ontology [GO]), phenotypes and diseases (i.e., Human Phenotype Ontology [HPO], Mondo Disease Ontology [Mondo]), proteins, including complexes and isoforms (i.e., Protein Ontology [PRO]), pathways (i.e., Pathway Ontology [PW]), types and attributes of biological sequences (i.e., Sequence Ontology [SO]), and anatomical entities (i.e., Uber Anatomy Ontology [Uberon]). The Relation Ontology (RO) is used to provide relationships between the core OBO Foundry ontologies and database entities.
- The PKT Human Disease KG contains 18 node types (Table 1 below) and 33 edge types (listed by relation in Table 2 below). Note that these counts reflect only the node and edge types that are explicitly added to the core set of OBO Foundry ontologies; they do not include the node and edge types provided by the ontologies themselves.
- These node and edge types were used to construct 12 different PKT Human Disease benchmark KGs by varying three parameters: the Knowledge Model (class- vs. instance-based), the Relation Strategy (standard vs. inverse relations), and Semantic Abstraction (OWL-NETS yes/no, applied with or without Knowledge Model harmonization [OWL-NETS Only vs. OWL-NETS + Harmonization]).
- Benchmarks within the PheKnowLator ecosystem are different versions of a KG built under alternative knowledge models and relation strategies, with or without semantic abstraction. They let users evaluate different modeling decisions (based on the parameters described above) and examine the impact of those decisions on downstream tasks. Please note that the actual content of each KG may differ slightly across the builds.
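The count of 12 benchmark KGs is consistent with reading the three parameters as a 2 x 2 x 3 grid. The sketch below enumerates that grid. The variable names and the short value labels (for example `"none"` for a build without OWL-NETS) are illustrative assumptions and are not the identifiers used by PheKnowLator itself.

```python
from itertools import product

# Parameter values as described in the text above.
# The labels are illustrative assumptions, not PheKnowLator option names.
KNOWLEDGE_MODELS = ["class", "instance"]
RELATION_STRATEGIES = ["standard", "inverse"]
SEMANTIC_ABSTRACTIONS = ["none", "OWL-NETS only", "OWL-NETS + harmonization"]


def enumerate_benchmark_configs():
    """Return one dict per combination of the three build parameters."""
    return [
        {
            "knowledge_model": km,
            "relation_strategy": rs,
            "semantic_abstraction": sa,
        }
        for km, rs, sa in product(
            KNOWLEDGE_MODELS, RELATION_STRATEGIES, SEMANTIC_ABSTRACTIONS
        )
    ]


configs = enumerate_benchmark_configs()
for cfg in configs:
    print(cfg)
print(len(configs))  # 2 * 2 * 3 = 12
```

Treating Semantic Abstraction as a single three-valued parameter, rather than two independent yes/no flags, avoids generating a "harmonization without OWL-NETS" combination, which the description above does not list.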
Table 1. PKT Human Disease Knowledge Graph Primary Node Types.
Node Type | Uniform Resource Identifier |
---|---|
Anatomical Entities | http://purl.obolibrary.org/obo/UBERON |
Biological Processes | http://purl.obolibrary.org/obo/GO |
Catalysts | http://purl.obolibrary.org/obo/CHEBI |
Cells | http://purl.obolibrary.org/obo/CL |
Cell Lines | http://purl.obolibrary.org/obo/CLO |
Cellular Components | http://purl.obolibrary.org/obo/GO |
Chemicals | http://purl.obolibrary.org/obo/CHEBI |
Cofactors | http://purl.obolibrary.org/obo/CHEBI |
Diseases | http://purl.obolibrary.org/obo/MONDO |
Genes | http://www.ncbi.nlm.nih.gov/gene/ |
Molecular Functions | http://purl.obolibrary.org/obo/GO |
Pathways<sup>a</sup> | http://purl.obolibrary.org/obo/PW; https://reactome.org/content/detail/R-HSA- |
Phenotypes | http://purl.obolibrary.org/obo/HP |
Proteins | http://purl.obolibrary.org/obo/PR |
Sequences<sup>b</sup> | http://purl.obolibrary.org/obo/SO |
Transcripts | https://uswest.ensembl.org/Homo_sapiens/Transcript/Summary?t=ENST |
Vaccines<sup>b</sup> | http://purl.obolibrary.org/obo/VO |
Variants | https://www.ncbi.nlm.nih.gov/snp/rs |
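
The URI prefixes in Table 1 can be used to guess which node types an identifier might belong to. The sketch below is a minimal illustration under two assumptions: node URIs begin with exactly the prefixes listed in the table, and the function name `candidate_node_types` is hypothetical. Because GO and ChEBI each back three node types, a prefix alone only narrows the answer to a list of candidates. The example URIs in the usage lines are illustrative.

```python
# URI prefixes copied from Table 1. Several prefixes are shared by more than
# one node type, so each prefix maps to a list of candidate node types.
NODE_TYPE_PREFIXES = {
    "http://purl.obolibrary.org/obo/UBERON": ["Anatomical Entities"],
    "http://purl.obolibrary.org/obo/GO": [
        "Biological Processes", "Cellular Components", "Molecular Functions",
    ],
    "http://purl.obolibrary.org/obo/CHEBI": ["Catalysts", "Chemicals", "Cofactors"],
    "http://purl.obolibrary.org/obo/CL": ["Cells"],
    "http://purl.obolibrary.org/obo/CLO": ["Cell Lines"],
    "http://purl.obolibrary.org/obo/MONDO": ["Diseases"],
    "http://www.ncbi.nlm.nih.gov/gene/": ["Genes"],
    "http://purl.obolibrary.org/obo/PW": ["Pathways"],
    "https://reactome.org/content/detail/R-HSA-": ["Pathways"],
    "http://purl.obolibrary.org/obo/HP": ["Phenotypes"],
    "http://purl.obolibrary.org/obo/PR": ["Proteins"],
    "http://purl.obolibrary.org/obo/SO": ["Sequences"],
    "https://uswest.ensembl.org/Homo_sapiens/Transcript/Summary?t=ENST": ["Transcripts"],
    "http://purl.obolibrary.org/obo/VO": ["Vaccines"],
    "https://www.ncbi.nlm.nih.gov/snp/rs": ["Variants"],
}


def candidate_node_types(uri):
    """Return the node types whose Table 1 prefix matches the URI.

    Prefixes are tried longest first so that a CLO URI is not matched by
    the shorter CL prefix.
    """
    for prefix in sorted(NODE_TYPE_PREFIXES, key=len, reverse=True):
        if uri.startswith(prefix):
            return NODE_TYPE_PREFIXES[prefix]
    return []


print(candidate_node_types("http://purl.obolibrary.org/obo/CLO_0000001"))
print(candidate_node_types("http://purl.obolibrary.org/obo/GO_0008150"))
```

Distinguishing between node types that share a prefix (for example, the three GO-based types) would require information from the ontology itself, which this sketch does not attempt.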
Table 2. PKT Human Disease Knowledge Graph Primary Edge Types.
Relations | Edge Types |
---|---|
participates in (RO_0000056) / has participant (RO_0000057) | chemical-pathway; gene-pathway; protein-biological process; protein-pathway |
has function (RO_0000085) / function of (RO_0000079) | pathway-molecular function; protein-molecular function |
located in (RO_0001025) / location of (RO_0001015) | protein-anatomy; protein-cell<sup>a</sup>; protein-cellular component; transcript-anatomy; transcript-cell<sup>a</sup> |
has component (RO_0002180) | pathway-cellular component |
has phenotype (RO_0002200) / phenotype of (RO_0002201) | disease-phenotype |
has gene product (RO_0002205) / gene product of (RO_0002204) | gene-protein |
interacts with (RO_0002434) | chemical-gene; chemical-protein |
genetically interacts with (RO_0002435) | gene-gene |
molecularly interacts with (RO_0002436) | chemical-biological process; chemical-cellular component; chemical-molecular function; protein-catalyst; protein-cofactor; protein-protein |
transcribed to (RO_0002511) / transcribed from (RO_0002510) | gene-transcript |
ribosomally translates to (RO_0002513) / ribosomal translation of (RO_0002512) | transcript-protein |
causally influences (RO_0002566) / causally influenced by (RO_0002559) | variant-gene |
is substance that treats (RO_0002606) / is treated by substance (RO_0002302) | chemical-disease; chemical-phenotype |
causes or contributes to condition (RO_0003302) | gene-disease; gene-phenotype; variant-disease; variant-phenotype |
realized in response to (RO_0009501) | biological process-pathway |
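
To make Table 2 easier to query, the sketch below encodes it as a lookup from edge type to relation identifier. It rests on stated assumptions: the second relation listed in a row is read as the inverse of the first, RO terms resolve under the standard OBO PURL base, and the names `RELATIONS`, `EDGE_TO_RELATION`, and `relation_uri` are hypothetical helpers rather than part of PheKnowLator.

```python
# Assumed base for RO term URIs (standard OBO PURL pattern).
RO_BASE = "http://purl.obolibrary.org/obo/"

# (relation ID, assumed inverse ID or None, edge types), copied from Table 2.
RELATIONS = [
    ("RO_0000056", "RO_0000057", ["chemical-pathway", "gene-pathway",
                                  "protein-biological process", "protein-pathway"]),
    ("RO_0000085", "RO_0000079", ["pathway-molecular function",
                                  "protein-molecular function"]),
    ("RO_0001025", "RO_0001015", ["protein-anatomy", "protein-cell",
                                  "protein-cellular component",
                                  "transcript-anatomy", "transcript-cell"]),
    ("RO_0002180", None, ["pathway-cellular component"]),
    ("RO_0002200", "RO_0002201", ["disease-phenotype"]),
    ("RO_0002205", "RO_0002204", ["gene-protein"]),
    ("RO_0002434", None, ["chemical-gene", "chemical-protein"]),
    ("RO_0002435", None, ["gene-gene"]),
    ("RO_0002436", None, ["chemical-biological process",
                          "chemical-cellular component",
                          "chemical-molecular function", "protein-catalyst",
                          "protein-cofactor", "protein-protein"]),
    ("RO_0002511", "RO_0002510", ["gene-transcript"]),
    ("RO_0002513", "RO_0002512", ["transcript-protein"]),
    ("RO_0002566", "RO_0002559", ["variant-gene"]),
    ("RO_0002606", "RO_0002302", ["chemical-disease", "chemical-phenotype"]),
    ("RO_0003302", None, ["gene-disease", "gene-phenotype",
                          "variant-disease", "variant-phenotype"]),
    ("RO_0009501", None, ["biological process-pathway"]),
]

# Flatten the table: edge type -> (relation ID, inverse ID or None).
EDGE_TO_RELATION = {
    edge: (relation, inverse)
    for relation, inverse, edges in RELATIONS
    for edge in edges
}


def relation_uri(edge_type, inverse=False):
    """Return the RO URI for an edge type, or None if no inverse is listed."""
    relation, inverse_relation = EDGE_TO_RELATION[edge_type]
    chosen = inverse_relation if inverse else relation
    return RO_BASE + chosen if chosen is not None else None


print(len(EDGE_TO_RELATION))                       # number of edge types
print(relation_uri("gene-protein"))                # forward relation
print(relation_uri("gene-protein", inverse=True))  # listed inverse
```

For rows that list a single relation, this sketch returns `None` when an inverse is requested. How the actual builds treat those relations under the inverse Relation Strategy is not stated in this page.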
Please use the links below to access specific builds. Within a specific date you will find links to all of the KG files, metadata, and logs. Details on each of the KG file types are available here.
- v1.0.0
- v2.0.0
- v2.1.0
- v3.0.2