
Archived Builds

Tiffany J. Callahan edited this page Nov 1, 2023 · 15 revisions

PKT Human Disease Knowledge Graph Benchmark Builds


This page provides an overview of the PKT Human Disease Knowledge Graph Benchmark and lists all builds by release and build date.





PKT Human Disease Knowledge Graph


👉 For additional details on the Human Disease Benchmark Builds, please see the associated manuscript: https://doi.org/10.48550/arXiv.2307.05727.

The PKT Human Disease KG was built to model mechanisms of human disease, including the Central Dogma, and represents multiple biological scales of organization: molecular, cellular, tissue, and organ. The knowledge representation (below) was designed in collaboration with a PhD-level molecular biologist.




The PKT Human Disease KG was constructed using 12 OBO Foundry ontologies, 31 Linked Open Data sets, and results from two large-scale experiments (Supplementary material).


  • The 12 OBO Foundry ontologies were selected to represent chemicals and vaccines (i.e., Chemical Entities of Biological Interest [ChEBI] and Vaccine Ontology [VO]), cells and cell lines (i.e., Cell Ontology [CL] and Cell Line Ontology [CLO]), gene and gene product attributes (i.e., Gene Ontology [GO]), phenotypes and diseases (i.e., Human Phenotype Ontology [HPO] and Mondo Disease Ontology [Mondo]), proteins, including complexes and isoforms (i.e., Protein Ontology [PRO]), pathways (i.e., Pathway Ontology [PW]), types and attributes of biological sequences (i.e., Sequence Ontology [SO]), and anatomical entities (i.e., Uber Anatomy Ontology [Uberon]). The Relation Ontology (RO) provides the relationships between the core OBO Foundry ontologies and database entities.
  • The PKT Human Disease KG contained 18 node types (Table 1 below) and 33 edge types (listed by relation in Table 2 below). Note that these counts reflect only the node and edge types explicitly added to the core set of OBO Foundry ontologies; they do not include the node and edge types provided by the ontologies themselves.
  • These node and edge types were used to construct 12 different PKT Human Disease benchmark KGs by varying three parameters: Knowledge Model (class- vs. instance-based), Relation Strategy (standard vs. inverse relations), and Semantic Abstraction (no OWL-NETS, OWL-NETS Only, or OWL-NETS + Harmonization, where harmonization refers to Knowledge Model harmonization).
  • Benchmarks within the PheKnowLator ecosystem are different versions of a KG built under alternative knowledge models and relation strategies, with or without semantic abstraction. They let users evaluate different modeling decisions (based on the parameters mentioned above) and examine the impact of these decisions on downstream tasks. Please note that the actual content of each KG may differ slightly across the builds.
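
The count of 12 benchmark KGs follows from the three parameters described above: two knowledge models, two relation strategies, and three semantic abstraction settings (2 × 2 × 3). The sketch below simply enumerates those combinations. The string labels and the function name are illustrative assumptions and are not the identifiers used in the PheKnowLator build files.

```python
from itertools import product

# Illustrative labels only (assumption): not PheKnowLator's own identifiers.
KNOWLEDGE_MODELS = ["class", "instance"]
RELATION_STRATEGIES = ["standard", "inverse"]
SEMANTIC_ABSTRACTIONS = ["none", "owl-nets only", "owl-nets + harmonization"]


def benchmark_configurations():
    """Return every (knowledge model, relation strategy, abstraction) triple."""
    return list(
        product(KNOWLEDGE_MODELS, RELATION_STRATEGIES, SEMANTIC_ABSTRACTIONS)
    )


if __name__ == "__main__":
    # Print one line per benchmark configuration.
    for model, relations, abstraction in benchmark_configurations():
        print(f"{model:<9} {relations:<9} {abstraction}")
```

Each triple corresponds to one benchmark build variant; comparing results across triples that differ in a single position isolates the effect of that modeling decision.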

Table 1. PKT Human Disease Knowledge Graph Primary Node Types.

| Node | Universal Resource Identifier |
| --- | --- |
| Anatomical Entities | http://purl.obolibrary.org/obo/UBERON |
| Biological Processes | http://purl.obolibrary.org/obo/GO |
| Catalysts | http://purl.obolibrary.org/obo/CHEBI |
| Cells | http://purl.obolibrary.org/obo/CL |
| Cell Lines | http://purl.obolibrary.org/obo/CLO |
| Cellular Components | http://purl.obolibrary.org/obo/GO |
| Chemicals | http://purl.obolibrary.org/obo/CHEBI |
| Cofactors | http://purl.obolibrary.org/obo/CHEBI |
| Diseases | http://purl.obolibrary.org/obo/MONDO |
| Genes | http://www.ncbi.nlm.nih.gov/gene/ |
| Molecular Functions | http://purl.obolibrary.org/obo/GO |
| Pathways<sup>a</sup> | http://purl.obolibrary.org/obo/PW; https://reactome.org/content/detail/R-HSA- |
| Phenotypes | http://purl.obolibrary.org/obo/HP |
| Proteins | http://purl.obolibrary.org/obo/PR |
| Sequences<sup>b</sup> | http://purl.obolibrary.org/obo/SO |
| Transcripts | https://uswest.ensembl.org/Homo_sapiens/Transcript/Summary?t=ENST |
| Vaccines<sup>b</sup> | http://purl.obolibrary.org/obo/VO |
| Variants | https://www.ncbi.nlm.nih.gov/snp/rs |
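
Because several node types in Table 1 share a namespace (for example, three node types use the GO prefix), a URI alone narrows a node down to a set of candidate types rather than a single type. The following is a minimal sketch of such a lookup, assuming node identifiers start with the prefixes listed in the table. The function name and the example identifiers are hypothetical, and the sketch is not part of PheKnowLator.

```python
# Namespace prefixes copied from Table 1. Shared prefixes map to several
# candidate node types, so the lookup returns a list.
NODE_NAMESPACES = {
    "http://purl.obolibrary.org/obo/UBERON": ["Anatomical Entities"],
    "http://purl.obolibrary.org/obo/GO": [
        "Biological Processes", "Cellular Components", "Molecular Functions"],
    "http://purl.obolibrary.org/obo/CHEBI": [
        "Catalysts", "Chemicals", "Cofactors"],
    "http://purl.obolibrary.org/obo/CL": ["Cells"],
    "http://purl.obolibrary.org/obo/CLO": ["Cell Lines"],
    "http://purl.obolibrary.org/obo/MONDO": ["Diseases"],
    "http://www.ncbi.nlm.nih.gov/gene/": ["Genes"],
    "http://purl.obolibrary.org/obo/PW": ["Pathways"],
    "https://reactome.org/content/detail/R-HSA-": ["Pathways"],
    "http://purl.obolibrary.org/obo/HP": ["Phenotypes"],
    "http://purl.obolibrary.org/obo/PR": ["Proteins"],
    "http://purl.obolibrary.org/obo/SO": ["Sequences"],
    "https://uswest.ensembl.org/Homo_sapiens/Transcript/Summary?t=ENST": [
        "Transcripts"],
    "http://purl.obolibrary.org/obo/VO": ["Vaccines"],
    "https://www.ncbi.nlm.nih.gov/snp/rs": ["Variants"],
}


def candidate_node_types(uri):
    """Return the node types whose namespace is the longest prefix of uri."""
    matches = [prefix for prefix in NODE_NAMESPACES if uri.startswith(prefix)]
    if not matches:
        return []
    # Longest match wins, so a CLO identifier is not mistaken for a CL one.
    return NODE_NAMESPACES[max(matches, key=len)]


if __name__ == "__main__":
    # Example identifiers are illustrative only.
    print(candidate_node_types("http://purl.obolibrary.org/obo/CLO_0000031"))
    print(candidate_node_types("http://purl.obolibrary.org/obo/GO_0008150"))
```

Longest-prefix matching is used because the CL namespace is itself a prefix of the CLO namespace. Resolving a GO or ChEBI identifier to a single node type would need information beyond the URI, which this sketch does not attempt.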

Table 2. PKT Human Disease Knowledge Graph Primary Edge Types.

| Relations | Edge Types |
| --- | --- |
| participates in (RO_0000056); has participant (RO_0000057) | chemical-pathway; gene-pathway; protein-biological process; protein-pathway |
| has function (RO_0000085); function of (RO_0000079) | pathway-molecular function; protein-molecular function |
| located in (RO_0001025); location of (RO_0001015) | protein-anatomy; protein-cell<sup>a</sup>; protein-cellular component; transcript-anatomy; transcript-cell<sup>a</sup> |
| has component (RO_0002180) | pathway-cellular component |
| has phenotype (RO_0002200); phenotype of (RO_0002201) | disease-phenotype |
| has gene product (RO_0002205); gene product of (RO_0002204) | gene-protein |
| interacts with (RO_0002434) | chemical-gene; chemical-protein |
| genetically interacts with (RO_0002435) | gene-gene |
| molecularly interacts with (RO_0002436) | chemical-biological process; chemical-cellular component; chemical-molecular function; protein-catalyst; protein-cofactor; protein-protein |
| transcribed to (RO_0002511); transcribed from (RO_0002510) | gene-transcript |
| ribosomally translates to (RO_0002513); ribosomal translation of (RO_0002512) | transcript-protein |
| causally influences (RO_0002566); causally influenced by (RO_0002559) | variant-gene |
| is substance that treats (RO_0002606); is treated by substance (RO_0002302) | chemical-disease; chemical-phenotype |
| causes or contributes to condition (RO_0003302) | gene-disease; gene-phenotype; variant-disease; variant-phenotype |
| realized in response to (RO_0009501) | biological process-pathway |
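
Several rows of the relations table pair a relation with its inverse, which is what the inverse Relation Strategy draws on. As a rough sketch of the idea, the code below adds an inverse triple for each edge whose relation has an inverse listed in the table. The function name and the placeholder node labels are hypothetical. This page does not say how relations without a listed inverse are handled, so the sketch leaves those edges unchanged as an assumption.

```python
# Relation -> inverse pairs copied from the relations table (RO identifiers).
INVERSE_RELATIONS = {
    "RO_0000056": "RO_0000057",  # participates in -> has participant
    "RO_0000085": "RO_0000079",  # has function -> function of
    "RO_0001025": "RO_0001015",  # located in -> location of
    "RO_0002200": "RO_0002201",  # has phenotype -> phenotype of
    "RO_0002205": "RO_0002204",  # has gene product -> gene product of
    "RO_0002511": "RO_0002510",  # transcribed to -> transcribed from
    "RO_0002513": "RO_0002512",  # ribosomally translates to -> ribosomal translation of
    "RO_0002566": "RO_0002559",  # causally influences -> causally influenced by
    "RO_0002606": "RO_0002302",  # is substance that treats -> is treated by substance
}


def with_inverse_edges(edges):
    """Return the edges plus an inverse triple wherever an inverse is listed.

    Each edge is a (subject, relation, object) tuple. Relations with no
    listed inverse are passed through unchanged (assumption of this sketch).
    """
    expanded = []
    for subject, relation, obj in edges:
        expanded.append((subject, relation, obj))
        inverse = INVERSE_RELATIONS.get(relation)
        if inverse is not None:
            expanded.append((obj, inverse, subject))
    return expanded


if __name__ == "__main__":
    # Placeholder node labels, for illustration only.
    edges = [
        ("gene-A", "RO_0002205", "protein-B"),     # gene-protein
        ("chemical-C", "RO_0002434", "gene-A"),    # chemical-gene
    ]
    for triple in with_inverse_edges(edges):
        print(triple)
```

Under these assumptions a gene-protein edge yields two triples, while a chemical-gene edge yields one.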



Build Archive


Please use the links below to access specific builds. For each build date, you will find links to all of the KG files, metadata, and logs. Details on each of the KG file types are available here.





