27 August 2012

Reasoning with the Variation Ontology using Apache Jena #OWL #RDF

The Variation Ontology (VariO) "is an ontology for standardized, systematic description of effects, consequences and mechanisms of variations".
In this post I will use the Apache Jena library for RDF to load this ontology. A reasoner will then use it to extract, from a set of variations, those whose type is a sub-class of a given class of variation.

Loading the ontology

The OWL ontology is available for download here: http://www.variationontology.org/download/VariO_0.979.owl. A new ontology model (OntModel) is created and the OWL file is read into it.
OntModel ontModel = ModelFactory.createOntologyModel();
InputStream in = FileManager.get().open(VO_OWL_URL);
ontModel.read(in, "");
in.close();

Creating a Reasoner

An OWL reasoner is then created and bound to the ontology model loaded above:
Reasoner reasoner = ReasonerRegistry.getOWLReasoner();
reasoner = reasoner.bindSchema(ontModel);

Creating a random set of variations

A new RDF model is created to hold a few random variation instances. Each instance gets a random 'my:chromosome' property, a random 'my:position' property, and one of the following types:
  • vo:VariO_0000029 "modified amino acid", a sub-Class of vo:VariO_0000028 ("post translationally modified protein")
  • vo:VariO_0000030 "spliced protein", a sub-Class of vo:VariO_0000028 ("post translationally modified protein")
  • vo:VariO_0000033 "effect on protein subcellular localization". It is NOT a sub-class of vo:VariO_0000028
Random rand=new Random();
com.hp.hpl.jena.rdf.model.Model instances = ModelFactory.createDefaultModel();
instances.setNsPrefix("vo",VO_PREFIX);
instances.setNsPrefix("my",MY_URI);

for(int i=0;i< 10;++i)
 {
 Resource subject= null;
 Resource rdftype=null;
 switch(i%3)
    {
    case 0:
       {
       //modified amino acid
       subject=instances.createResource(AnonId.create("modaa_"+i));
       rdftype=instances.createResource(VO_PREFIX+"VariO_0000029");
       break;
       }
    case 1:
       {
       //spliced protein
       subject=instances.createResource(AnonId.create("spliced_"+i));
       rdftype=instances.createResource(VO_PREFIX+"VariO_0000030");
       break;
       }
    default:
       {
       //effect on protein subcellular localization
       subject=instances.createResource(AnonId.create("subcell_"+i));
       rdftype=instances.createResource(VO_PREFIX+"VariO_0000033");
       break;
       }
    }
 instances.add(subject, RDF.type, rdftype);
 instances.add(subject, hasChromosome, instances.createLiteral("chr"+(1+rand.nextInt(22))));
 instances.add(subject, hasPosition, instances.createTypedLiteral(rand.nextInt(1000000)));
 }
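
To make the round-robin in the loop above explicit, here is a small plain-Java sketch (no Jena involved) that reproduces only the label assignment done by switch(i%3). The class and method names are mine, invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration of the i%3 type assignment; this is not Jena code. */
class TypeAssignmentSketch {
    /** Returns the anonymous-node label the loop would use for index i. */
    static String labelFor(int i) {
        switch (i % 3) {
            case 0:
                return "modaa_" + i;    // typed as VariO_0000029
            case 1:
                return "spliced_" + i;  // typed as VariO_0000030
            default:
                return "subcell_" + i;  // typed as VariO_0000033
        }
    }

    /** Labels for indices 0..n-1, in loop order. */
    static List<String> labels(int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < n; ++i) {
            out.add(labelFor(i));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(labels(10));
    }
}
```

With ten instances this gives four modaa_ labels, three spliced_ labels and three subcell_ labels, so seven of the ten instances have a type that sits below vo:VariO_0000028.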

Reasoning

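A new inference model is created from the reasoner and the variation instances. The query lists every statement whose object is vo:VariO_0000028; with inference enabled, this includes the rdf:type statements derived for instances of its sub-classes. A filter then keeps only the subjects that have both a "my:chromosome" and a "my:position" property, i.e. our variation instances.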
InfModel model = ModelFactory.createInfModel (reasoner, instances);
ExtendedIterator<Statement> sti = model.listStatements(
        null, null, model.createResource(VO_PREFIX+"VariO_0000028"));
sti=sti.filterKeep(new Filter<Statement>()
      {
      @Override
      public boolean accept(Statement stmt)
         {
         return   stmt.getSubject().getProperty(hasChromosome)!=null &&
               stmt.getSubject().getProperty(hasPosition)!=null
               ;
         }
      });
Loop over the iterator and print the result:
while(sti.hasNext() )
         {
         Statement stmt = sti.next();
         System.out.println("\t+ " + PrintUtil.print(stmt));
         Statement val=stmt.getSubject().getProperty(hasChromosome);
         System.out.println("\t\tChromosome:\t"+val.getObject());
         val=stmt.getSubject().getProperty(hasPosition);
         System.out.println("\t\tPosition:\t"+val.getObject());
         }
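
The inference that matters for this query is the RDFS sub-class rule: if x has type C and C is a sub-class of D, then x also has type D. Below is a minimal plain-Java sketch of that single rule. It is not how Jena's OWL reasoner is implemented, the map holds only the two sub-class axioms quoted earlier in this post, and the class and method names are hypothetical.

```java
import java.util.Map;

/** Plain-Java illustration of the RDFS sub-class rule; this is not Jena code. */
class SubClassSketch {
    // Direct parent of each class. Only the two axioms quoted in the post are
    // listed here (an assumption: the real ontology contains many more).
    static final Map<String, String> PARENT = Map.of(
            "VariO_0000029", "VariO_0000028",
            "VariO_0000030", "VariO_0000028");

    /** True if type is ancestor itself or reaches it by following parent links. */
    static boolean isA(String type, String ancestor) {
        for (String c = type; c != null; c = PARENT.get(c)) {
            if (c.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (String t : new String[] {"VariO_0000029", "VariO_0000030", "VariO_0000033"}) {
            System.out.println(t + " is a VariO_0000028: " + isA(t, "VariO_0000028"));
        }
    }
}
```

In this sketch VariO_0000029 and VariO_0000030 are accepted and VariO_0000033 is rejected, which matches the selection the inference model performs on the generated instances.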

Result

   + (spliced_7 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
      Chromosome:   chr7
      Position:   134172^^http://www.w3.org/2001/XMLSchema#int
   + (spliced_4 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
      Chromosome:   chr13
      Position:   674316^^http://www.w3.org/2001/XMLSchema#int
   + (spliced_1 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
      Chromosome:   chr22
      Position:   457596^^http://www.w3.org/2001/XMLSchema#int
   + (modaa_9 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
      Chromosome:   chr12
      Position:   803303^^http://www.w3.org/2001/XMLSchema#int
   + (modaa_6 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
      Chromosome:   chr15
      Position:   794137^^http://www.w3.org/2001/XMLSchema#int
   + (modaa_3 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
      Chromosome:   chr14
      Position:   34487^^http://www.w3.org/2001/XMLSchema#int
   + (modaa_0 rdf:type http://purl.obolibrary.org/obo/VariO_0000028)
      Chromosome:   chr15
      Position:   536371^^http://www.w3.org/2001/XMLSchema#int
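
The positions are printed with a ^^ suffix because they are typed literals: the lexical form, then ^^, then the datatype URI. The plain-Java sketch below splits such a printed string back into its two parts. The helper names are hypothetical, and inside Jena I would rather read the value through the Literal accessors (for example getInt()) than parse the printed form.

```java
/** Plain-Java illustration of the "lexical^^datatype" form seen in the output. */
class TypedLiteralSketch {
    /** Part before the last "^^", or the whole string for a plain literal. */
    static String lexicalForm(String printed) {
        int sep = printed.lastIndexOf("^^");
        return sep < 0 ? printed : printed.substring(0, sep);
    }

    /** Datatype URI after the last "^^", or null for a plain literal. */
    static String datatype(String printed) {
        int sep = printed.lastIndexOf("^^");
        return sep < 0 ? null : printed.substring(sep + 2);
    }

    public static void main(String[] args) {
        String printed = "134172^^http://www.w3.org/2001/XMLSchema#int";
        int position = Integer.parseInt(lexicalForm(printed));
        System.out.println(position + " has datatype " + datatype(printed));
    }
}
```

The chromosome values carry no suffix because they were created as plain literals with createLiteral, whereas the positions were created with createTypedLiteral.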

Full source code

import java.io.IOException;
import java.io.InputStream;
import java.util.Random;

import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.rdf.model.AnonId;
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.rdf.model.Statement;
import com.hp.hpl.jena.reasoner.Reasoner;
import com.hp.hpl.jena.reasoner.ReasonerRegistry;
import com.hp.hpl.jena.util.FileManager;
import com.hp.hpl.jena.util.PrintUtil;
import com.hp.hpl.jena.util.iterator.ExtendedIterator;
import com.hp.hpl.jena.util.iterator.Filter;
import com.hp.hpl.jena.vocabulary.RDF;

public class VariationOntologyReasoner
  {
  private static final String VO_PREFIX="http://purl.obolibrary.org/obo/";
  private static final String MY_URI="urn:my:ontology";
  private static final String VO_OWL_URL="http://www.variationontology.org/download/VariO_0.979.owl";
  private Reasoner reasoner;
  static final private Property hasChromosome=ModelFactory.createDefaultModel().createProperty(MY_URI,"chromosome");
  static final private Property hasPosition=ModelFactory.createDefaultModel().createProperty(MY_URI,"position");
  
  private VariationOntologyReasoner() throws IOException
    {
    OntModel ontModel = ModelFactory.createOntologyModel();
     InputStream in = FileManager.get().open(VO_OWL_URL);
    ontModel.read(in, "");
    in.close();
    this.reasoner = ReasonerRegistry.getOWLReasoner();
    this.reasoner=this.reasoner.bindSchema(ontModel);
    }
  
  private void run()
    {
    Random rand=new Random();
    com.hp.hpl.jena.rdf.model.Model instances = ModelFactory.createDefaultModel();
    instances.setNsPrefix("vo",VO_PREFIX);
    instances.setNsPrefix("my",MY_URI);

    for(int i=0;i< 10;++i)
      {
      Resource subject= null;
      Resource rdftype=null;
      switch(i%3)
        {
        case 0:
          {
          //modified amino acid
          subject=instances.createResource(AnonId.create("modaa_"+i));
          rdftype=instances.createResource(VO_PREFIX+"VariO_0000029");
          break;
          }
        case 1:
          {
          //spliced protein
          subject=instances.createResource(AnonId.create("spliced_"+i));
          rdftype=instances.createResource(VO_PREFIX+"VariO_0000030");
          break;
          }
        default:
          {
          //effect on protein subcellular localization
          subject=instances.createResource(AnonId.create("subcell_"+i));
          rdftype=instances.createResource(VO_PREFIX+"VariO_0000033");
          break;
          }
        }
      instances.add(subject, RDF.type, rdftype);
      instances.add(subject, hasChromosome, instances.createLiteral("chr"+(1+rand.nextInt(22))));
      instances.add(subject, hasPosition, instances.createTypedLiteral(rand.nextInt(1000000)));
      }
    
    InfModel model = ModelFactory.createInfModel (reasoner, instances);
    ExtendedIterator<Statement> sti = model.listStatements(null, null, model.createResource(VO_PREFIX+"VariO_0000028"));
    sti=sti.filterKeep(new Filter<Statement>()
        {
        @Override
        public boolean accept(Statement stmt)
          {
          return  stmt.getSubject().getProperty(hasChromosome)!=null &&
              stmt.getSubject().getProperty(hasPosition)!=null
              ;
          }
        });
    while(sti.hasNext() )
      {
      Statement stmt = sti.next();
      System.out.println("\t+ " + PrintUtil.print(stmt));
      Statement val=stmt.getSubject().getProperty(hasChromosome);
      System.out.println("\t\tChromosome:\t"+val.getObject());
      val=stmt.getSubject().getProperty(hasPosition);
      System.out.println("\t\tPosition:\t"+val.getObject());
      }
    }
  
  public static void main(String[] args) throws Exception
    {
    VariationOntologyReasoner app=new VariationOntologyReasoner();
    app.run();
    }
}

That's it,

Pierre





