
OWL NETS 2.0

Tiffany J. Callahan edited this page Dec 1, 2021 · 40 revisions

OWL-NETS 2.0

Original Repository: OWL-NETS
Example Application: OWLNETS_Example_Application.ipynb

Purpose: OWL-NETS (NEtwork Transformation for Statistical learning) is a computational method that reversibly abstracts Web Ontology Language (OWL)-encoded biomedical knowledge into a more biologically meaningful network representation. OWL-NETS generates semantically rich knowledge graphs that contain heterogeneous nodes and edges and can be used for tasks that do not require OWL semantics.

Publication for V1.0:

Callahan TJ, Baumgartner WA, Bada M, Stefanski AL, Tripodi I, White EK, Hunter LE. OWL-NETS: Transforming OWL Representations for Improved Network Inference. Pac Symp Biocomput. 2018;23:133-144. PMID:29218876; PMCID:PMC5737627


OWL-NETS 2.0: This wiki discusses an alternative and arguably more generalizable adaptation of the original project. This new version was developed as a fundamental component of the PheKnowLator project to decode OWL-encoded classes.


Overview


Problem

An ontology or knowledge graph built using OWL contains two types of entities that we'd like to decode when transforming it into an OWL-NETS representation: (1) owl:Class and (2) owl:Axiom. While each of the components shown below is needed to build a semantically rich knowledge graph, the majority of the information used to construct each object is not biologically or clinically meaningful. Thus, the goal of the current algorithm is to decode all OWL-encoded classes and axioms (like those shown below) into something more clinically or biologically meaningful.

OWL:Class

Scenario 1: owl unionOf constructor

<!-- http://purl.obolibrary.org/obo/CL_0000995 -->
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/CL_0000995">
        <owl:equivalentClass>		
            <owl:Class>
                <owl:unionOf rdf:parseType="Collection">	
                    <rdf:Description rdf:about="http://purl.obolibrary.org/obo/CL_0001021"/>
                    <rdf:Description rdf:about="http://purl.obolibrary.org/obo/CL_0001026"/>
                </owl:unionOf>				
            </owl:Class>
        </owl:equivalentClass>		
        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/CL_0001060"/>
    </owl:Class>

http://purl.obolibrary.org/obo/CL_0000995
The OWL class CL_0000995 (i.e. CD34-positive, CD38-positive common myeloid progenitor OR CD34-positive, CD38-positive common lymphoid progenitor) was built by taking the union of:

  • CL_0001021 (i.e. CD34-positive, CD38-positive common lymphoid progenitor)
  • CL_0001026 (i.e. CD34-positive, CD38-positive common myeloid progenitor)

OWL-NETS would decode this class into:

CL_0001021, rdfs:subClassOf, CL_0000995
CL_0001026, rdfs:subClassOf, CL_0000995
CL_0000995, rdfs:subClassOf, CL_0001060
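
The unionOf decoding above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the PheKnowLator implementation: the function name `decode_union` and the use of plain string tuples for triples are assumptions made for this example.

```python
SUBCLASS_OF = "rdfs:subClassOf"


def decode_union(cls, members, parents):
    """Decode a class defined as the union of `members` into plain triples.

    Hypothetical helper: each union member becomes a subclass of the
    union class, and the class keeps its own named superclasses.
    """
    triples = [(member, SUBCLASS_OF, cls) for member in members]
    triples += [(cls, SUBCLASS_OF, parent) for parent in parents]
    return triples


triples = decode_union(
    cls="CL_0000995",
    members=["CL_0001021", "CL_0001026"],
    parents=["CL_0001060"],
)
for triple in triples:
    print(", ".join(triple))
```

The printed lines match the three decoded triples listed above.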

Scenario 2: owl intersectionOf constructor

<!-- http://purl.obolibrary.org/obo/HP_0000340 -->
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/HP_0000340">
        <owl:equivalentClass>		
            <owl:Restriction>			
                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000051"/>
                <owl:someValuesFrom>
                    <owl:Class>
                        <owl:intersectionOf rdf:parseType="Collection">	
                            <rdf:Description rdf:about="http://purl.obolibrary.org/obo/PATO_0001481"/>
                            <owl:Restriction>		
                                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0000052"/>
                                <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/UBERON_0008200"/>
                            </owl:Restriction>		
                            <owl:Restriction>		
                                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0002573"/>
                                <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/PATO_0000460"/>
                            </owl:Restriction>		
                        </owl:intersectionOf>				
                    </owl:Class>
                </owl:someValuesFrom>
            </owl:Restriction>			
        </owl:equivalentClass>					
        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/HP_0000290"/>
    </owl:Class>

http://purl.obolibrary.org/obo/HP_0000340
The OWL class HP_0000340 (i.e. sloping forehead) was built by taking the intersection of:

  • PATO_0001481, RO_0000052, UBERON_0008200 (i.e. sloped, inheres in, forehead)
  • PATO_0001481, RO_0002573, PATO_0000460 (i.e. sloped, has modifier, abnormal)

OWL-NETS would decode this class into:

HP_0000340, RO_0000086, PATO_0001481
HP_0000340, RO_0000052, UBERON_0008200
HP_0000340, RO_0002573, PATO_0000460
HP_0000340, rdfs:subClassOf, HP_0000290

Scenario 3: owl restriction

<!-- http://purl.obolibrary.org/obo/GO_0000785 -->
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/GO_0000785">
        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/GO_0110165"/>
        <rdfs:subClassOf>
            <owl:Restriction>		
                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000050"/>
                <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/GO_0005694"/>
            </owl:Restriction>		
        </rdfs:subClassOf>
    </owl:Class>

http://purl.obolibrary.org/obo/GO_0000785
The OWL class GO_0000785 (i.e. chromatin) is restricted to BFO_0000050 (i.e. part of) GO_0005694 (i.e. chromosome).

OWL-NETS would decode this class into:

GO_0000785, BFO_0000050, GO_0005694
GO_0000785, rdfs:subClassOf, GO_0110165
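
To make the restriction scenario concrete, the sketch below reads the GO_0000785 snippet with Python's standard-library XML parser and emits the decoded triples. It is an assumption-laden illustration: OWL-NETS itself works on an RDFLib graph rather than on raw RDF/XML, and the helper names `short` and `decode_restrictions` are invented for this example. Only someValuesFrom restrictions nested directly under rdfs:subClassOf are handled.

```python
import xml.etree.ElementTree as ET

NS = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}
RDF = "{%s}" % NS["rdf"]
OBO = "http://purl.obolibrary.org/obo/"

XML = """<rdf:RDF xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <owl:Class rdf:about="http://purl.obolibrary.org/obo/GO_0000785">
    <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/GO_0110165"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000050"/>
        <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/GO_0005694"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>"""


def short(iri):
    """Strip the OBO namespace so identifiers read like the examples above."""
    return iri.replace(OBO, "")


def decode_restrictions(xml_text):
    """Hypothetical helper: turn named superclasses and simple
    someValuesFrom restrictions into (subject, predicate, object) triples."""
    triples = []
    root = ET.fromstring(xml_text)
    for cls in root.findall("owl:Class", NS):
        subj = short(cls.get(RDF + "about"))
        for sub in cls.findall("rdfs:subClassOf", NS):
            target = sub.get(RDF + "resource")
            if target is not None:
                # Named superclass: keep the subClassOf edge as-is.
                triples.append((subj, "rdfs:subClassOf", short(target)))
                continue
            for restriction in sub.findall("owl:Restriction", NS):
                prop = restriction.find("owl:onProperty", NS).get(RDF + "resource")
                filler = restriction.find("owl:someValuesFrom", NS).get(RDF + "resource")
                # Anonymous restriction: collapse it into one direct edge.
                triples.append((subj, short(prop), short(filler)))
    return triples


for triple in decode_restrictions(XML):
    print(", ".join(triple))
```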

OWL:Axioms

Scenario 1: owl:annotatedSource and owl:annotatedTarget are not anonymous

<owl:Axiom>
    <owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/CL_0002004"/>
    <owl:annotatedProperty rdf:resource="http://www.w3.org/2000/01/rdf-schema#subClassOf"/>
    <owl:annotatedTarget rdf:resource="http://purl.obolibrary.org/obo/CL_0000547"/>
    <oboInOwl:is_inferred rdf:datatype="http://www.w3.org/2001/XMLSchema#string">true</oboInOwl:is_inferred>
</owl:Axiom>

http://purl.obolibrary.org/obo/CL_0002004
The OWL class CL_0002004 (i.e. CD34-negative, GlyA-negative proerythroblast) has the following logical statement:

  • CL_0002004 SubClassOf CL_0000547 (CD34-negative, GlyA-negative proerythroblast subClassOf proerythroblast)

OWL-NETS would decode this axiom into:

CL_0002004, rdfs:subClassOf, CL_0000547

Scenario 2: owl:annotatedSource is anonymous and owl:annotatedTarget is not

<owl:Axiom>
    <owl:annotatedSource>
        <owl:Class>
            <owl:intersectionOf rdf:parseType="Collection">
                <rdf:Description rdf:about="http://purl.obolibrary.org/obo/UBERON_0010757"/>
                <owl:Restriction>
                    <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000050"/>
                    <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/NCBITaxon_9606"/>
                </owl:Restriction>
            </owl:intersectionOf>
            <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/UBERON_0002238"/>
        </owl:Class>
    </owl:annotatedSource>
    <owl:annotatedProperty rdf:resource="http://www.w3.org/2000/01/rdf-schema#subClassOf"/>
    <owl:annotatedTarget rdf:resource="http://purl.obolibrary.org/obo/UBERON_0002238"/>
    <oboInOwl:source rdf:datatype="http://www.w3.org/2001/XMLSchema#string">FMA</oboInOwl:source>
</owl:Axiom>

http://purl.obolibrary.org/obo/UBERON_0010757
The OWL class UBERON_0010757 (i.e. rib 8) has the following logical statements:

  • UBERON_0010757 and BFO_0000050 some NCBITaxon_9606 (rib 8 part of Homo sapiens)
  • UBERON_0010757 SubClassOf UBERON_0002238 (rib 8 subClassOf false rib)

OWL-NETS would decode this axiom into:

UBERON_0010757, BFO_0000050, NCBITaxon_9606

Scenario 3: owl:annotatedTarget is anonymous and owl:annotatedSource is not

<owl:Axiom>
    <owl:annotatedSource rdf:resource="http://purl.obolibrary.org/obo/UBERON_0002373"/>
    <owl:annotatedProperty rdf:resource="http://www.w3.org/2000/01/rdf-schema#subClassOf"/>
    <owl:annotatedTarget>
        <owl:Restriction>
            <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0002202"/>
            <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/UBERON_0010023"/>
        </owl:Restriction>
    </owl:annotatedTarget>
</owl:Axiom>

http://purl.obolibrary.org/obo/UBERON_0002373
The OWL class UBERON_0002373 (i.e. palatine tonsil) has the following logical statement:

  • UBERON_0002373 RO_0002202 some UBERON_0010023 (palatine tonsil develops from dorsal pharyngeal pouch 2)

OWL-NETS would decode this axiom into:

UBERON_0002373, RO_0002202, UBERON_0010023
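
Scenarios 1 and 3 share a simple pattern: the annotated source is a named class, and the target is either a named class or a single restriction. A minimal sketch of that pattern follows. The function `decode_axiom` and the convention of passing a restriction as a `(property, filler)` tuple are assumptions made for illustration. Anonymous sources (Scenarios 2 and 4) are deliberately not covered.

```python
def decode_axiom(source, annotated_property, target):
    """Hypothetical helper for axioms whose annotated source is a named class.

    `target` is either a class identifier (str) or a (property, filler)
    tuple standing in for an owl:Restriction with owl:someValuesFrom.
    """
    if isinstance(target, tuple):
        on_property, filler = target
        # Anonymous target: the restriction's property becomes the edge.
        return (source, on_property, filler)
    # Named target: the annotated property is kept unchanged.
    return (source, annotated_property, target)


# Scenario 1: both source and target are named classes.
print(decode_axiom("CL_0002004", "rdfs:subClassOf", "CL_0000547"))

# Scenario 3: the target is an anonymous restriction.
print(decode_axiom("UBERON_0002373", "rdfs:subClassOf",
                   ("RO_0002202", "UBERON_0010023")))
```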

Scenario 4: owl:annotatedSource and owl:annotatedTarget are both anonymous

<owl:Axiom>
    <owl:annotatedSource>
        <owl:Class>
            <owl:intersectionOf rdf:parseType="Collection">
                <rdf:Description rdf:about="http://purl.obolibrary.org/obo/UBERON_0005562"/>
                <owl:Restriction>
                    <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000050"/>
                    <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/NCBITaxon_40674"/>
                </owl:Restriction>
            </owl:intersectionOf>
            <rdfs:subClassOf>
                <owl:Restriction>
                    <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0002254"/>
                    <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/UBERON_0010028"/>
                </owl:Restriction>
            </rdfs:subClassOf>
        </owl:Class>
    </owl:annotatedSource>
    <owl:annotatedProperty rdf:resource="http://www.w3.org/2000/01/rdf-schema#subClassOf"/>
    <owl:annotatedTarget>
        <owl:Restriction>
            <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0002254"/>
            <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/UBERON_0010028"/>
        </owl:Restriction>
    </owl:annotatedTarget>
    <oboInOwl:notes rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Mammals</oboInOwl:notes>
    <oboInOwl:source rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ISBN:0073040584-table13.1</oboInOwl:source>
</owl:Axiom>

http://purl.obolibrary.org/obo/UBERON_0005562
The OWL class UBERON_0005562 (i.e. Thymus primordium) has the following logical statements:

  • UBERON_0005562 and BFO_0000050 some NCBITaxon_40674 (thymus primordium part of mammalia)
  • UBERON_0005562 SubClassOf RO_0002254 some UBERON_0010028 (thymus primordium has developmental contribution from ventral part of pharyngeal pouch 4)

OWL-NETS would decode this axiom into:

UBERON_0005562, RO_0002254, UBERON_0010028
UBERON_0005562, BFO_0000050, NCBITaxon_40674  


Algorithm

The algorithm has four goals, each of which is further explained below:

  1. Decode all OWL-encoded classes
  2. Remove all triples that contain subjects, predicates, and/or objects that are needed to ensure OWL semantics, but are not biologically meaningful
  3. Ensure decoded knowledge graph contains a single connected component
  4. Purify the decoded knowledge graph to match an input knowledge graph construction approach (i.e. subclass or instance)

Step 1: Decode OWL-Encoded Classes


A high-level overview of the algorithm is provided in the snippet of the pseudocode below.


Step 2: Remove Triples Containing OWL Semantics


  • Map owl:Class instances back to the original owl:Class
  • Remove all triples that contain a subject or object of type BNode or Literal
  • Keep triples containing any owl:ObjectProperty occurring with subject and objects that are owl:Class or owl:NamedIndividual
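
The filtering in this step can be sketched as follows, under two assumptions that the text does not spell out: (1) the intent is to drop every triple whose subject or object is a blank node or literal, and (2) rdfs:subClassOf edges are kept alongside object properties, since they appear in the decoded examples. The `Node` type and the function `filter_owl_semantics` are hypothetical names used only for this illustration.

```python
from typing import NamedTuple


class Node(NamedTuple):
    kind: str   # "iri", "bnode", or "literal"
    value: str


def filter_owl_semantics(triples, keep_predicates, entities):
    """Keep only triples that link two known classes or individuals
    through an allowed predicate (hypothetical helper)."""
    kept = []
    for subj, pred, obj in triples:
        if subj.kind != "iri" or obj.kind != "iri":
            continue  # blank nodes and literals carry OWL or annotation detail
        if pred.value not in keep_predicates:
            continue  # not an object property (or other allowed predicate)
        if subj.value in entities and obj.value in entities:
            kept.append((subj.value, pred.value, obj.value))
    return kept


raw = [
    (Node("iri", "GO_0000785"), Node("iri", "BFO_0000050"), Node("iri", "GO_0005694")),
    (Node("iri", "GO_0000785"), Node("iri", "rdfs:subClassOf"), Node("bnode", "_:b0")),
    (Node("iri", "GO_0000785"), Node("iri", "rdfs:label"), Node("literal", "chromatin")),
    (Node("iri", "GO_0000785"), Node("iri", "rdfs:subClassOf"), Node("iri", "GO_0110165")),
]

kept = filter_owl_semantics(
    raw,
    # Assumption: subClassOf is retained together with object properties.
    keep_predicates={"BFO_0000050", "rdfs:subClassOf"},
    entities={"GO_0000785", "GO_0005694", "GO_0110165"},
)
for triple in kept:
    print(", ".join(triple))
```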

Step 3: Ensure Decoded Knowledge Graph is Connected


Depending on the source ontology that you apply OWL-NETS to, it's possible that the decoded knowledge graph may contain more than a single connected component. This step ensures that the decoded knowledge graph is connected.

  • Derives a set of root nodes by searching for each node's highest ancestor concept (via rdfs:subClassOf).
    • If the node has no ancestors, all of the node's immediate neighbors are searched and the most frequently visited, highest common ancestor among the neighbors is selected. If none of the neighborhood concepts have any ancestors in common, a random ancestor concept is selected
    • If the node has more than 1 neighbor, the highest ancestor concept is selected
  • Each root node is then added to the graph as rdfs:subClassOf a user-provided URI. BFO_0000001 is the default choice
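
A simplified sketch of this step is shown below. It treats a root as any class that has subclasses but no superclass of its own, attaches each root to BFO_0000001, and counts connected components before and after. The neighbor-based fallback for nodes without ancestors is omitted, and the function names (`find_roots`, `connect_roots`, `count_components`) are assumptions made for this example rather than names taken from the implementation.

```python
SUBCLASS_OF = "rdfs:subClassOf"


def find_roots(triples):
    """Classes that appear as a superclass but never as a subclass."""
    children = {s for s, p, _ in triples if p == SUBCLASS_OF}
    parents = {o for _, p, o in triples if p == SUBCLASS_OF}
    return parents - children


def connect_roots(triples, root_uri="BFO_0000001"):
    """Attach every root to a single user-provided top-level class."""
    extra = [(root, SUBCLASS_OF, root_uri)
             for root in sorted(find_roots(triples)) if root != root_uri]
    return list(triples) + extra


def count_components(triples):
    """Count connected components, ignoring edge direction."""
    adjacency = {}
    for s, _, o in triples:
        adjacency.setdefault(s, set()).add(o)
        adjacency.setdefault(o, set()).add(s)
    seen, components = set(), 0
    for start in adjacency:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(adjacency[node] - seen)
    return components


decoded = [
    ("CL_0001021", SUBCLASS_OF, "CL_0000995"),
    ("CL_0000995", SUBCLASS_OF, "CL_0001060"),
    ("GO_0000785", SUBCLASS_OF, "GO_0110165"),
    ("GO_0000785", "BFO_0000050", "GO_0005694"),
]
connected = connect_roots(decoded)
print(count_components(decoded), "->", count_components(connected))
```

In this toy graph the Cell Ontology and Gene Ontology fragments start out as separate components and are joined once both roots point at the shared top-level class.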

Step 4: Construction Approach Purification (optional)


Currently, the program is configured to output the results from OWL-NETS in two ways: (1) run the program as-is or (2) run the program as-is with an additional step to "purify" the output by ensuring that the resulting OWL-NETS graph is completely consistent with the specified knowledge graph construction approach (i.e. subclass or instance-based). The "purified" output will include _SUBCLASS_purified_ or _INSTANCE_purified_ in the file names.

The procedure utilized to "purify" the graph is as follows:

  • Subclass Construction Approach:
    • Find all triples containing rdf:type (subj rdf:type obj)
      • Replace rdf:type with rdfs:subClassOf
      • Make subj rdfs:subClassOf all ancestors of obj
  • Instance Construction Approach:
    • Find all triples containing rdfs:subClassOf (subj rdfs:subClassOf obj)
      • Replace rdfs:subClassOf with rdf:type
      • Make subj rdf:type all ancestors of obj
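
The subclass construction approach can be sketched as below. This is a minimal illustration under stated assumptions: triples are plain string tuples, ancestors are found by following rdfs:subClassOf transitively, and the identifiers `entity_A`, `Class_B`, and `Class_C` are hypothetical placeholders. The helper names `ancestors` and `purify_subclass` are also invented for this example.

```python
SUBCLASS_OF = "rdfs:subClassOf"
RDF_TYPE = "rdf:type"


def ancestors(node, triples):
    """All transitive rdfs:subClassOf ancestors of `node`."""
    parents = {}
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            parents.setdefault(s, set()).add(o)
    found, stack = set(), list(parents.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, ()))
    return found


def purify_subclass(triples):
    """Rewrite rdf:type edges as rdfs:subClassOf and propagate them
    to every ancestor of the original object."""
    purified = set()
    for s, p, o in triples:
        if p == RDF_TYPE:
            purified.add((s, SUBCLASS_OF, o))
            purified.update((s, SUBCLASS_OF, a) for a in ancestors(o, triples))
        else:
            purified.add((s, p, o))
    return purified


graph = [
    ("entity_A", RDF_TYPE, "Class_B"),      # hypothetical identifiers
    ("Class_B", SUBCLASS_OF, "Class_C"),
]
for triple in sorted(purify_subclass(graph)):
    print(", ".join(triple))
```

The instance construction approach would mirror this by swapping the roles of the two predicates.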



ASSUMPTIONS:
Don't Decode

  • Classes built using the owl:complementOf constructor
  • Triples containing annotations
  • Triples that contain oneOf (e.g. IAO_0000225)
  • Triples containing an owl:ObjectProperty or owl:Class that signifies negation (e.g. lacks_part, disjointWith)

Decode

  • The following property types: someValuesFrom, onClass, hasSelf, hasValue, allValuesFrom
  • Triples containing cardinality constraints, but ignore cardinality

To determine owl:ObjectProperties in decoded owl:intersectionOf or owl:unionOf constructors:

  • RO_0000086 (has quality): If subject is NOT a PATO term and object IS a PATO term
  • Provided onProperty: If both subject and object ARE PATO terms AND there is an onProperty provided
  • rdfs:subClassOf (subclass build) / rdf:type (instance build):
    • If both subject and object ARE PATO terms AND there is not an onProperty
    • If both subject and object ARE NOT PATO terms AND there is not an onProperty
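
One possible reading of these rules is sketched below. It assumes that a provided onProperty always takes precedence, which is how the HP_0000340 example reads, although the list above only states this for PATO-to-PATO pairs. The one combination the list does not cover (PATO subject, non-PATO object, no onProperty) returns None. The function names `is_pato` and `select_property` are hypothetical.

```python
HAS_QUALITY = "RO_0000086"


def is_pato(term):
    """True for PATO identifiers such as PATO_0001481."""
    return term.startswith("PATO_")


def select_property(subject, obj, on_property=None, construction="subclass"):
    """Pick the predicate for a decoded intersectionOf/unionOf member.

    Assumption: an explicit onProperty wins over every other rule.
    """
    if on_property is not None:
        return on_property
    subject_is_pato, object_is_pato = is_pato(subject), is_pato(obj)
    if not subject_is_pato and object_is_pato:
        return HAS_QUALITY
    if subject_is_pato == object_is_pato:
        # Both PATO or both non-PATO: fall back on the construction approach.
        return "rdfs:subClassOf" if construction == "subclass" else "rdf:type"
    return None  # combination not covered by the rules above


print(select_property("HP_0000340", "PATO_0001481"))
print(select_property("HP_0000340", "UBERON_0008200", "RO_0000052"))
print(select_property("PATO_0001481", "PATO_0000460", "RO_0002573"))
print(select_property("CL_0001021", "CL_0000995"))
```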




Inputs and Outputs:


  • Output Data:
    • A NetworkX MultiDiGraph
    • An RDF graph containing all of the OWL-encoded triples (Step 1) and the triples containing OWL semantics (Step 2), serialized in N-Triples (nt) format
    • A Hash Map Storing Transformation information:
    {'owl_nets': {
                   'decoded_classes': {},
                   'complementOf': {},
                   'cardinality': {},
                   'negation': {},
                   'misc': {}},
     'disjointWith': {},
     'filtered_triples': set(),
     '<<knowledge construction approach>>_approach_purified': set()}


Running OWL-NETS Outside of PheKnowLator

Jupyter Notebook: OWLNETS_Example_Application.ipynb

To run OWL-NETS on a graph or ontology without running pkt_kg, you need to: (1) fork or clone the PheKnowLator GitHub repository; (2) provide an RDFLib Graph() object or a file path to the object you want to transform; (3) provide a path to where the output should be written; and (4) provide a filename (e.g. owl_nets_output). From the PheKnowLator directory, run the following code:

from rdflib import Graph
from pkt_kg.owlnets import OwlNets

# load ontology
hp_graph = Graph().parse('path/to/file/hp.owl')

# instantiate class
owl_nets = OwlNets(graph=hp_graph, write_location='resources/', filename='/hpo_test')

# run the method
owl_nets.run_owl_nets()
