WikiPathways-SPARQL-book

WikiPathways RDF Data Model

The WikiPathways RDF content consists of two parts: the GPMLRDF, a direct RDF representation of the original GPML in which the WikiPathways pathways are stored, and the WPRDF, which contains the interpretable biology stored in those pathways.

This section describes both parts, because not all information in the GPML can be interpreted biologically and the GPML itself still has use cases of its own.

Pathways

Central to a pathway database are, of course, the pathways. Pathways in the WPRDF are of type wp:Pathway. The following query lists the title and organism of each pathway:

SPARQL sparql/listAllPathways.rq

PREFIX dc:      <http://purl.org/dc/elements/1.1/> 
PREFIX wp:     <http://vocabularies.wikipathways.org/wp#>
SELECT DISTINCT (str(?title) as ?pathway) (str(?label) as ?organism)
WHERE {
 ?pw a wp:Pathway ;
     dc:title ?title ;
     wp:organismName ?label .
}
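
The same two predicates can also be used to summarize the database. The query below is a minimal sketch that counts the pathways per organism; it relies only on wp:Pathway and wp:organismName as used above, and the variable names ?organism and ?pathways are chosen here for illustration.

```sparql
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>

# Count the distinct pathways for each organism name,
# listing the organisms with the most pathways first.
SELECT (str(?label) AS ?organism) (COUNT(DISTINCT ?pw) AS ?pathways)
WHERE {
  ?pw a wp:Pathway ;
      wp:organismName ?label .
}
GROUP BY ?label
ORDER BY DESC(?pathways)
```

Counting with DISTINCT avoids counting a pathway twice if it matches the pattern more than once.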

Resources of this type have the following RDF predicates:

SPARQL sparql/listAllPathwayProperties.rq

PREFIX wp:     <http://vocabularies.wikipathways.org/wp#>
SELECT DISTINCT ?predicate
WHERE {
 ?pw a wp:Pathway ;
     ?predicate [] .
}
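
A list of predicates does not show how widely each one is used. As a sketch, the query can be extended to count the matching triples per predicate; the variable name ?uses is an illustrative choice.

```sparql
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>

# For every predicate found on a wp:Pathway resource,
# count the triples that use it, most frequent first.
SELECT ?predicate (COUNT(?pw) AS ?uses)
WHERE {
  ?pw a wp:Pathway ;
      ?predicate [] .
}
GROUP BY ?predicate
ORDER BY DESC(?uses)
```

A predicate with a count close to the number of pathways is likely present on almost every pathway, while a low count suggests an optional property.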

Articles

Similarly, we can list all PubMed identifiers with the pathways they occur in:

SPARQL sparql/listAllPubMedIDs.rq

PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT DISTINCT ?pathway ?pubmed WHERE {
  ?pubmed a       wp:PublicationReference ;
          dcterms:isPartOf ?pathway
} ORDER BY ?pathway LIMIT 500
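
The same pattern can be aggregated to see which pathways cite the most articles. This is a sketch under the assumption that every publication reference is linked to its pathway with dcterms:isPartOf, as in the query above; the variable name ?references and the limit of 20 are arbitrary choices.

```sparql
PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX dcterms: <http://purl.org/dc/terms/>

# Count the distinct publication references per pathway
# and show the 20 pathways with the most references.
SELECT ?pathway (COUNT(DISTINCT ?pubmed) AS ?references)
WHERE {
  ?pubmed a wp:PublicationReference ;
          dcterms:isPartOf ?pathway .
}
GROUP BY ?pathway
ORDER BY DESC(?references)
LIMIT 20
```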

Articles in the WPRDF are of type wp:PublicationReference and have the following predicates:

SPARQL sparql/listAllArticleProperties.rq

PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT DISTINCT ?predicate WHERE {
  ?pubmed a wp:PublicationReference ;
          ?predicate [] .
}

Genes, proteins, and metabolites

Pathways contain biological entities: genes, proteins, metabolites, complexes, and more. Even pathways themselves can be entities in pathways. All entities are represented in the data model as resources of type wp:DataNode. Because there are so many, we list only 100 data nodes here:

SPARQL sparql/list100DataNodes.rq

PREFIX rdfs:   <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wp:     <http://vocabularies.wikipathways.org/wp#>
SELECT DISTINCT ?node ?label
WHERE {
 ?node a wp:DataNode ;
     rdfs:label ?label .
} LIMIT 100
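
To get a feeling for the kinds of entities behind these data nodes, the types can be counted. The sketch below assumes that the more specific types (for example for genes, proteins, and metabolites) are asserted on the same resources next to wp:DataNode; if the data only relies on subclass reasoning, the result will be less informative.

```sparql
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>

# List every type asserted on a data node together with
# the number of distinct data nodes that carry it.
SELECT ?type (COUNT(DISTINCT ?node) AS ?nodes)
WHERE {
  ?node a wp:DataNode ;
        a ?type .
}
GROUP BY ?type
ORDER BY DESC(?nodes)
```

The row for wp:DataNode itself gives the total number of data nodes, which puts the other counts in proportion.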

DataNodes have the following predicates:

SPARQL sparql/listAllDataNodePredicates.rq

PREFIX wp:     <http://vocabularies.wikipathways.org/wp#>
SELECT DISTINCT ?predicate
WHERE {
 ?node a wp:DataNode ;
     ?predicate [] .
}

Metabolites

Metabolites are typed as wp:Metabolite, a subclass of wp:DataNode, so the data node predicates apply to them as well. Restricting the query to metabolites shows which predicates this subset of data nodes uses:

SPARQL sparql/listAllMetabolitePredicates.rq

PREFIX wp:     <http://vocabularies.wikipathways.org/wp#>
SELECT DISTINCT ?predicate
WHERE {
 ?metabolite a wp:Metabolite ;
     ?predicate [] .
}
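
The metabolites themselves can be listed in the same way as the data nodes earlier. This sketch assumes that metabolites carry an rdfs:label like other data nodes; the limit of 100 is again only there to keep the result short.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wp:   <http://vocabularies.wikipathways.org/wp#>

# List 100 metabolites with their labels.
SELECT DISTINCT ?metabolite ?label
WHERE {
  ?metabolite a wp:Metabolite ;
              rdfs:label ?label .
}
LIMIT 100
```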

References