Building Semantic Queries
- Last Updated: May 18, 2026
- MarkLogic Server
- Version 12.0
MarkLogic supports semantic data in the form of triples, which describe the edges of a graph. It is a powerful data model that you will want to explore. See Understand Semantic Graphs for more detail than we provide here.
Briefly, triples allow you to encode interconnected “facts” in a subject-predicate-object form to express a domain of knowledge from which you can infer other “facts.” For example, from these two triples,
John (subject) Lives In (predicate) London (object) and
London (subject) Is In (predicate) England (object),
we can infer the “fact” that John Lives In England without having that “fact” explicitly stored anywhere in our database.
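To make the inference step concrete, here is a minimal sketch in plain JavaScript. It does not use MarkLogic's inference engine. The `inferLivesIn` function and the single rule it applies are assumptions made for illustration only.

```javascript
// Two stored facts, each as a subject-predicate-object triple.
const storedFacts = [
  { subject: 'John', predicate: 'Lives In', object: 'London' },
  { subject: 'London', predicate: 'Is In', object: 'England' },
];

// Assumed rule (illustrative only): if X "Lives In" Y and Y "Is In" Z,
// then X "Lives In" Z.
function inferLivesIn(facts) {
  const inferred = [];
  for (const a of facts) {
    if (a.predicate !== 'Lives In') continue;
    for (const b of facts) {
      if (b.predicate === 'Is In' && b.subject === a.object) {
        inferred.push({ subject: a.subject, predicate: 'Lives In', object: b.object });
      }
    }
  }
  return inferred;
}

const inferredFacts = inferLivesIn(storedFacts);
console.log(inferredFacts);
```

The inferred triple is computed on demand from the two stored triples; it is never added to `storedFacts`.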
We can also use triples to standardize our data, drawing on publicly available vocabularies such as naming conventions or official abbreviations.
Triples are normally queried with a language called SPARQL.
Optic provides two Data Accessor Functions for triples queries:
- fromTriples() directly accesses the triples so that you do not need SPARQL to make simple triple pattern matches.
- fromSPARQL() lets you use SPARQL to write the more complex and expressive graph queries needed for searching nested taxonomy structures.
We want to find all our employees in the Northeast. Unfortunately, we only have state data in our employee documents. Fortunately, we do have documents containing semantic triples:
ex:CT rdfs:isDefinedBy "CT" ;
a ex:State ;
skos:broader ex:Northeast ;
skos:prefLabel "Connecticut" .
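These four triples share the subject ex:CT. The subject, the predicates, and the objects ex:State and ex:Northeast are IRIs (Internationalized Resource Identifiers), written here in prefixed form. The predicates come from predefined vocabularies: RDFS and SKOS, shown here, and RDF, whose rdf:type predicate is abbreviated as a.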
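Written out in full, the abbreviated notation above corresponds to four separate triples. The following plain JavaScript sketch shows that expansion. It assumes the ex: prefix maps to the https://example.com/semantics/geo# base used in the queries on this page, and the `expand` helper is hypothetical, not a MarkLogic function.

```javascript
// Prefix-to-base-IRI map. The ex: base is an assumption taken from the
// queries on this page; the other three are the standard W3C namespaces.
const prefixes = {
  ex: 'https://example.com/semantics/geo#',
  rdf: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
  rdfs: 'http://www.w3.org/2000/01/rdf-schema#',
  skos: 'http://www.w3.org/2004/02/skos/core#',
};

// Hypothetical helper: turn "prefix:name" into a full IRI.
function expand(term) {
  const [prefix, name] = term.split(':');
  return prefixes[prefix] + name;
}

const subject = expand('ex:CT');
const ctTriples = [
  { subject, predicate: expand('rdfs:isDefinedBy'), object: 'CT' },
  // "a" in the abbreviated notation is shorthand for rdf:type.
  { subject, predicate: expand('rdf:type'), object: expand('ex:State') },
  { subject, predicate: expand('skos:broader'), object: expand('ex:Northeast') },
  { subject, predicate: expand('skos:prefLabel'), object: 'Connecticut' },
];
console.log(ctTriples);
```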
One of these facts is that a given state has an official two-letter abbreviation, which our employee documents use to identify each employee's state. Another is that a given state belongs to a particular region, such as the one we need, Northeast. Together, these facts let us relate the state data in our employee documents to the region data in our triples documents.
So, with this triples data, we can find all our employees in the Northeast in two steps:
The first step is to produce a row sequence of official codes for states in the Northeast.
An Optic query like this one returns up to 100 rows for triples matching the given patterns:
// Load the Optic API
const op = require('/MarkLogic/optic');
// Prefixers expand short names into full IRIs
const ex = op.prefixer('https://example.com/semantics/geo#');
const rdfs = op.prefixer('http://www.w3.org/2000/01/rdf-schema#');
const skos = op.prefixer('http://www.w3.org/2004/02/skos/core#');
// Column shared by both patterns
const state = op.col('state');
op.fromTriples([
    op.pattern(state, skos('broader'), ex('Northeast')),
    op.pattern(state, rdfs('isDefinedBy'), op.col('code'))
  ])
  .offsetLimit(0, 100)
  .result();
We used this query to find all states whose broader definition is Northeast, then, for each found state, to find its official state code:
- We defined three prefixers:
  - ex is the base IRI for our triples.
  - rdfs is the base IRI for the RDFS vocabulary.
  - skos is the base IRI for the SKOS vocabulary.
- We defined two columns with col(). They will both appear in our result:
  - col() identifies the column in its argument.
  - Before the query, we defined state.
  - When it was needed within a query function parameter, we defined code.
- The Data Accessor Function fromTriples() returns a row for each triple matching the given pattern specified in the pattern() functions:
  - The first pattern() function finds triples with any subject if broader is the predicate and Northeast is the object.
  - The second pattern() function finds triples with any object if its subject matches one of the states found by the first pattern() and its predicate is isDefinedBy.
- The Operator Function offsetLimit() restricts results returned. The first parameter specifies the number of results to skip; the second, the number of results to return. So, (0, 100) returns the first 100 results.
- The Executor Function result() executes the query and returns the results as a row sequence.
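The walkthrough above can be mimicked on an in-memory array to show how the two patterns combine through the shared state column. This is a plain JavaScript sketch, not Optic code, and it says nothing about how MarkLogic actually evaluates the query. The sample triples, including ex:TX and ex:South, are hypothetical.

```javascript
const EX = 'https://example.com/semantics/geo#';
const BROADER = 'http://www.w3.org/2004/02/skos/core#broader';
const DEFINED_BY = 'http://www.w3.org/2000/01/rdf-schema#isDefinedBy';

// Hypothetical sample data as [subject, predicate, object] arrays.
// ex:TX and ex:South are invented here to show a non-matching state.
const sampleTriples = [
  [EX + 'CT', BROADER, EX + 'Northeast'],
  [EX + 'CT', DEFINED_BY, 'CT'],
  [EX + 'MA', BROADER, EX + 'Northeast'],
  [EX + 'MA', DEFINED_BY, 'MA'],
  [EX + 'TX', BROADER, EX + 'South'],
  [EX + 'TX', DEFINED_BY, 'TX'],
];

// Pattern 1: any subject whose skos:broader object is ex:Northeast.
const states = sampleTriples
  .filter(([, p, o]) => p === BROADER && o === EX + 'Northeast')
  .map(([s]) => s);

// Pattern 2: for each matched state, bind its rdfs:isDefinedBy object to "code".
const rows = sampleTriples
  .filter(([s, p]) => p === DEFINED_BY && states.includes(s))
  .map(([s, , o]) => ({ state: s, code: o }));

// Counterpart of offsetLimit(0, 100): skip 0 rows, then keep at most 100.
const offset = 0;
const limit = 100;
const page = rows.slice(offset, offset + limit);
console.log(page);
```

Because both patterns use the same state column, only subjects that satisfy both patterns produce a row, which is why ex:TX is excluded.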
Here are rows 1-4 of the 11-row x 2-column result:
{
"state": "https://example.com/semantics/geo#CT",
"code": "CT"
}
{
"state": "https://example.com/semantics/geo#DE",
"code": "DE"
}
{
"state": "https://example.com/semantics/geo#MA",
"code": "MA"
}
{
"state": "https://example.com/semantics/geo#MD",
"code": "MD"
}
- There is one row for each of the 11 Northeastern US states:
  - Its state column contains the IRI for the triples graph node.
  - Its code column contains the official state code.
- You could suppress the state column with the select('code') operator function.
- The rows are in an unspecified order, which could change between executions. You can specify row order with the orderBy() operator function.
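The effect of keeping only the code column and fixing the row order can be pictured with ordinary array operations. This is a plain JavaScript sketch of the outcome, not of the Optic operators themselves. The input rows are hypothetical and deliberately out of order.

```javascript
// Hypothetical rows in the shape of the result shown above.
const unorderedRows = [
  { state: 'https://example.com/semantics/geo#MA', code: 'MA' },
  { state: 'https://example.com/semantics/geo#CT', code: 'CT' },
  { state: 'https://example.com/semantics/geo#DE', code: 'DE' },
];

// Keep only the code column, then sort ascending by code.
const codes = unorderedRows
  .map(({ code }) => ({ code }))
  .sort((a, b) => a.code.localeCompare(b.code));
console.log(codes);
```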
We could have used this fromSPARQL() query to get the same state codes (as written, it returns code and region columns rather than state and code):
op.fromSPARQL(`
PREFIX ex: <https://example.com/semantics/geo#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?code ?region FROM <https://example.com/semantics/geo> WHERE {
?state skos:broader* ?region .
?state rdfs:isDefinedBy ?code .
FILTER (?region = ex:Northeast)
}
`)
.offsetLimit(0, 100)
.result();
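We would have used it instead of fromTriples() if the triples we were interested in were nested in a deeper hierarchy, because SPARQL has the property path operator *, which matches zero or more repetitions of a predicate. Used here on skos:broader, it lets the query follow broader links through any number of levels, so it finds all descendants of Northeast, not just its direct children.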
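To illustrate what following a predicate through any number of levels means, here is a small plain JavaScript sketch of reachability over broader links. It is not how MarkLogic evaluates SPARQL. The hierarchy, including ex:NewEngland, ex:NY, and ex:South, is hypothetical.

```javascript
// Hypothetical broader links (child -> parents). ex:NewEngland is an
// invented intermediate level between a state and its region.
const broader = {
  'ex:CT': ['ex:NewEngland'],
  'ex:NewEngland': ['ex:Northeast'],
  'ex:NY': ['ex:Northeast'],
  'ex:TX': ['ex:South'],
};

// True if `target` is reachable from `node` over zero or more broader links.
function reaches(node, target, seen = new Set()) {
  if (node === target) return true; // the zero-step case
  if (seen.has(node)) return false; // guard against cycles
  seen.add(node);
  return (broader[node] || []).some((parent) => reaches(parent, target, seen));
}

const matches = Object.keys(broader).filter((n) => reaches(n, 'ex:Northeast'));
console.log(matches);
```

A single-step match would find only ex:NewEngland and ex:NY here; following the links repeatedly also finds ex:CT, which sits two levels below ex:Northeast.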
Either way, we have completed the first step toward finding all our employees in the Northeast. Our second step is to join this triples data with our existing employee data in a multi-model query. The next section describes two ways to accomplish this step.
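Conceptually, the second step is an inner join on the state code. The plain JavaScript sketch below shows only that idea; it is not the Optic join. The employee records and their field names are hypothetical.

```javascript
// Rows from step one, in the shape of the result shown earlier.
const stateRows = [
  { state: 'https://example.com/semantics/geo#CT', code: 'CT' },
  { state: 'https://example.com/semantics/geo#MA', code: 'MA' },
];

// Hypothetical employee records; the field names are assumptions.
const employees = [
  { name: 'Employee A', state: 'CT' },
  { name: 'Employee B', state: 'TX' },
  { name: 'Employee C', state: 'MA' },
];

// Inner join: keep employees whose state code appears in stateRows.
const northeastCodes = new Set(stateRows.map((row) => row.code));
const northeastEmployees = employees.filter((e) => northeastCodes.has(e.state));
console.log(northeastEmployees.map((e) => e.name));
```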
Shortest Path
Optic supports magic properties added as part of SPARQL support in MarkLogic Server. One of them, op:shortest-path() (shortestPath() in JavaScript), finds the shortest path between two nodes in a given graph. See Graph Algorithms in Understand Semantic Graphs.
Assuming the same dataset shown in the following diagram, we want to find the shortest path from the node labeled "john" to "maia":
// Calculate the shortest path between john and maia
const op = require('/MarkLogic/optic');
const ken = op.prefixer('http://example.org/kennedy');
const start = op.col('start');
const predicate = op.col('predicate');
const end = op.col('end');
const path = op.col('path');
const length = op.col('length');
op.fromTriples([op.pattern(start, predicate, end)])
  .shortestPath(start, end, path, length)
  .where(op.and(op.eq(start, ken('john')), op.eq(end, ken('maia'))))
  .result();
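For intuition about what a shortest-path computation does, here is a breadth-first search over a tiny in-memory graph in plain JavaScript. It is a sketch of the general technique only; it makes no claim about MarkLogic's algorithm or about the shape of its path column. Only the node names "john" and "maia" come from the text; the edges and the intermediate nodes a, b, and c are hypothetical.

```javascript
// Hypothetical directed edges as [from, to] pairs.
const edges = [
  ['john', 'a'], ['a', 'maia'],
  ['john', 'b'], ['b', 'c'], ['c', 'maia'],
];

// Breadth-first search: the first time `end` is dequeued, the path that
// reached it uses the fewest edges.
function shortestPath(edgeList, start, end) {
  const next = new Map();
  for (const [from, to] of edgeList) {
    if (!next.has(from)) next.set(from, []);
    next.get(from).push(to);
  }
  const queue = [[start]];
  const visited = new Set([start]);
  while (queue.length > 0) {
    const path = queue.shift();
    const node = path[path.length - 1];
    if (node === end) return { path, length: path.length - 1 };
    for (const neighbor of next.get(node) || []) {
      if (!visited.has(neighbor)) {
        visited.add(neighbor);
        queue.push([...path, neighbor]);
      }
    }
  }
  return null; // no path exists
}

const found = shortestPath(edges, 'john', 'maia');
console.log(found);
```

In this hypothetical graph the two-edge route through a is returned in preference to the three-edge route through b and c.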