Web Fundamentals
The Semantic Web & Linked Data

Ruben Verborgh, Ghent University, imec, IDLab

Except where otherwise noted, the content of these slides is licensed under a Creative Commons Attribution 4.0 International License.

[Cover of Scientific American, May 2001]
©2001 Scientific American

The Semantic Web is a layer
on top of the existing Web.

The Semantic Web layer is integrated
into the existing Web.

Tim Berners-Lee proposed
4 principles to publish Linked Data.

The Linked Data principles resemble
the REST uniform interface constraints.

  1. Uniquely identify resources.
  2. Provide representations
    of those resources to clients.
  3. Each message you send
    should be self-describing.
  4. Hypermedia controls
    must afford next steps.

Information & non-information resources
should be uniquely identifiable.

Using HTTP URIs ensures that
anybody can look up the resource.

Dereferencing a URI should lead to
useful information about that resource.

By including links to other resources,
we create a Web of Data.

The basic information unit in Linked Data
is a link from one resource to another.

Each of those two resources
is identified by an HTTP URI.

To simplify their display,
we abbreviate URIs using prefixes.

In contrast to typical Web links, these links
are typed with a URI we can dereference.

This means that a link type (property)
is also a resource we can describe.

In addition to resources,
link targets can also be literal values.

By linking resources together this way,
we create a Web of Linked Data.

Prefixes are a convention,
so they can be chosen freely.

prefix.cc lists several common ones:

rdf   http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs  http://www.w3.org/2000/01/rdf-schema#
owl   http://www.w3.org/2002/07/owl#
foaf  http://xmlns.com/foaf/0.1/
dbr   http://dbpedia.org/resource/
dbo   http://dbpedia.org/ontology/

Additionally, we will use ex for examples.
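
As a rough illustration of how such abbreviations work, the Python sketch below expands a prefixed name into a full IRI using the prefixes listed above. The `PREFIXES` dictionary and the `expand` function are hypothetical helpers written for this example, and the namespace chosen for `ex` is an assumption.

```python
# Prefix table copied from the list above.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dbr": "http://dbpedia.org/resource/",
    "dbo": "http://dbpedia.org/ontology/",
    "ex": "http://example.org/",  # assumption: namespace for the ex prefix
}


def expand(prefixed_name: str) -> str:
    """Replace the prefix of a name such as 'foaf:knows' by its namespace IRI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"Unknown prefix: {prefix}")
    return PREFIXES[prefix] + local
```

For instance, `expand("foaf:knows")` yields `http://xmlns.com/foaf/0.1/knows`.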

An immense amount of Linked Data
is available on the Web for reuse.

No Linked Data set is ever complete.
We make the open-world assumption.

The Dublin Core terms are a set of
15 common metadata properties.

Schema.org is a single vocabulary
that covers many different fields.

The Resource Description Framework is
a model for data interchange on the Web.

This Linked Data fact can be
represented as an RDF triple.

We define the triple by its components:

subject    IRI – dbr:Tim_Berners-Lee
predicate  IRI – foaf:knows
object     IRI – dbr:Ted_Nelson

There are 3 types of RDF terms:
named nodes, blank nodes, and literals.

Triples consist of RDF terms as follows:

named node  a resource, identified by an IRI (for subjects, predicates, objects)
blank node  an unnamed resource (for subjects and objects)
literal     a value, with a datatype (IRI) or language (for objects only)
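
The position rules above can be sketched as a small Python data model. The class and field names (`NamedNode`, `BlankNode`, `Literal`, `Triple`, `obj`) are assumptions made for this example and do not follow any particular RDF library.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class NamedNode:
    iri: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    value: str
    datatype: Optional[str] = None  # datatype IRI
    language: Optional[str] = None  # language tag such as "en"


Term = Union[NamedNode, BlankNode, Literal]


@dataclass(frozen=True)
class Triple:
    subject: Term
    predicate: Term
    obj: Term

    def __post_init__(self):
        # Subjects: named or blank nodes; predicates: named nodes only;
        # objects: any of the three term types.
        if not isinstance(self.subject, (NamedNode, BlankNode)):
            raise TypeError("subject must be a named node or blank node")
        if not isinstance(self.predicate, NamedNode):
            raise TypeError("predicate must be a named node")
        if not isinstance(self.obj, (NamedNode, BlankNode, Literal)):
            raise TypeError("object must be an RDF term")
```

Constructing a `Triple` with a `Literal` in subject position raises a `TypeError`, which mirrors the rule that literals are for objects only.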

This RDF triple has
a literal as an object.

We define the triple by its components:

subject    IRI – dbr:Tim_Berners-Lee
predicate  IRI – foaf:givenName
object     literal – Tim with language en

This RDF triple has
a literal as an object.

We define the triple by its components:

subject    IRI – dbr:Tim_Berners-Lee
predicate  IRI – dbo:birthDate
object     literal – 1955-06-08 with datatype xsd:date

This is an RDF graph
consisting of a set of triples.

An RDF dataset has one default graph
and zero or more named graphs.

Several standard syntaxes for RDF exist.
Some of them have multi-graph support.

N-Triples is a line-based syntax
supporting only a default graph.

# Every non-empty line represents a triple or comment.
# IRIs are enclosed in angle brackets (< and >).
<http://dbpedia.org/resource/Tim_Berners-Lee> <http://xmlns.com/foaf/0.1/knows> <http://dbpedia.org/resource/Ted_Nelson>.
# Literals are enclosed in double quotation marks (")
# and optionally end with @ and a language tag.
<http://dbpedia.org/resource/Tim_Berners-Lee> <http://xmlns.com/foaf/0.1/givenName> "Tim"@en.
# Alternatively, they end with ^^ and a datatype IRI.
<http://dbpedia.org/resource/Tim_Berners-Lee> <http://dbpedia.org/ontology/birthDate> "1955-06-08"^^<http://www.w3.org/2001/XMLSchema#date>.
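
Because the syntax is line-based, a simplified reader fits in a few lines. The Python sketch below is not a conforming N-Triples parser: it assumes there are no escape sequences in IRIs or literals, and the function name `parse_ntriples` and its tuple-based output are choices made for this example.

```python
import re

# Simplified term patterns (assumption: no escape sequences are used).
IRI = r"<([^<>\s]*)>"
BNODE = r"_:(\w+)"
LITERAL = r'"([^"\\]*)"(?:@([A-Za-z]+(?:-[A-Za-z0-9]+)*)|\^\^<([^<>\s]*)>)?'
LINE = re.compile(
    rf"^\s*(?:{IRI}|{BNODE})\s*{IRI}\s*(?:{IRI}|{BNODE}|{LITERAL})\s*\.\s*(?:#.*)?$"
)


def parse_ntriples(text):
    """Return a list of (subject, predicate, object) tuples.

    IRIs are returned as plain strings, blank nodes as '_:label',
    and literals as (value, language, datatype) tuples.
    """
    triples = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = LINE.match(stripped)
        if match is None:
            raise ValueError(f"Cannot parse line: {line!r}")
        s_iri, s_bnode, predicate, o_iri, o_bnode, value, lang, dtype = match.groups()
        subject = s_iri if s_iri is not None else f"_:{s_bnode}"
        if o_iri is not None:
            obj = o_iri
        elif o_bnode is not None:
            obj = f"_:{o_bnode}"
        else:
            obj = (value, lang, dtype)
        triples.append((subject, predicate, obj))
    return triples
```

Applied to the three example lines above, this returns three tuples, with the literals represented as `("Tim", "en", None)` and `("1955-06-08", None, "http://www.w3.org/2001/XMLSchema#date")`.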

Turtle is a superset of N-Triples
with prefixes and abbreviations.

# Declare prefixes before use (hint: prefix.cc).
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# The predicate a abbreviates rdf:type.
# A semi-colon ; reuses the subject.
# A comma , reuses the subject and predicate.
dbr:Tim_Berners-Lee a foaf:Person;
                    foaf:knows dbr:Ted_Nelson,
                               dbr:Wendy_Hall.
# There are 3 triples above.

Turtle includes syntactic sugar
to write blank nodes.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# The following lines all state something is named Tim.
_:x235 foaf:name "Tim"@en.  # blank node label
[] foaf:name "Tim"@en.      # empty blank node
[ foaf:name "Tim"@en ].     # blank node with properties
# Something named Tim knows something named Wendy.
[ foaf:name "Tim"@en ] foaf:knows [ foaf:name "Wendy"@en ].

# The label-based syntax allows cross-references within
# the same document, and is also supported in N-Triples.

Turtle includes syntactic sugar
to write (head/tail) lists.

PREFIX ex: <http://example.org/>
# Note also a shorthand for writing numbers.
ex:MyLottery ex:luckyNumbers (5 14 15).
# This corresponds to:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ex:MyLottery ex:luckyNumbers _:list1.
_:list1 rdf:first "5"^^xsd:integer.
_:list1 rdf:rest  _:list2.
_:list2 rdf:first "14"^^xsd:integer.
_:list2 rdf:rest  _:list3.
_:list3 rdf:first "15"^^xsd:integer.
_:list3 rdf:rest  rdf:nil.
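
The expansion of a list into head/tail triples can be sketched in Python as follows. It uses the standard `rdf:first`, `rdf:rest`, and `rdf:nil` terms; the function name `list_to_triples` and the generated blank node labels are assumptions for this example.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_to_triples(items, label_prefix="list"):
    """Return (head, triples) for an RDF collection holding `items`.

    Blank node labels such as '_:list1' are generated for illustration;
    the items themselves are passed through unchanged.
    """
    if not items:
        return RDF + "nil", []  # the empty list is rdf:nil itself
    nodes = [f"_:{label_prefix}{i}" for i in range(1, len(items) + 1)]
    triples = []
    for index, (node, item) in enumerate(zip(nodes, items)):
        rest = nodes[index + 1] if index + 1 < len(nodes) else RDF + "nil"
        triples.append((node, RDF + "first", item))  # head of this cell
        triples.append((node, RDF + "rest", rest))   # tail of this cell
    return nodes[0], triples
```

A list of three items thus produces six triples, and the last cell points to `rdf:nil`.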

N-Quads is a superset of N-Triples
with support for named graphs.

# Triples in the default graph look like N-Triples.
<urn:ex:s1> <urn:ex:p1> <urn:ex:o1>.
<urn:ex:s1> <urn:ex:p2> "abc".

# Triples in named graphs have a fourth element.
<urn:ex:s2> <urn:ex:p1> <urn:ex:o2> <urn:ex:GraphA>.
<urn:ex:s2> <urn:ex:p2> "xyz" <urn:ex:GraphB>.
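
To show how quads relate to a dataset with one default graph and several named graphs, the Python sketch below groups quads by their fourth element. Using `None` as the key for the default graph and the name `group_by_graph` are assumptions of this example.

```python
from collections import defaultdict

DEFAULT_GRAPH = None  # assumption: None stands for the default graph


def group_by_graph(quads):
    """Group (s, p, o, g) tuples into a mapping from graph name to triple set."""
    dataset = defaultdict(set)
    dataset[DEFAULT_GRAPH]  # a dataset always has a default graph, possibly empty
    for s, p, o, g in quads:
        dataset[g].add((s, p, o))
    return dict(dataset)


# The four statements from the N-Quads example above.
quads = [
    ("urn:ex:s1", "urn:ex:p1", "urn:ex:o1", DEFAULT_GRAPH),
    ("urn:ex:s1", "urn:ex:p2", '"abc"', DEFAULT_GRAPH),
    ("urn:ex:s2", "urn:ex:p1", "urn:ex:o2", "urn:ex:GraphA"),
    ("urn:ex:s2", "urn:ex:p2", '"xyz"', "urn:ex:GraphB"),
]
dataset = group_by_graph(quads)
```

Here `dataset` has three keys: the default graph with two triples, and `urn:ex:GraphA` and `urn:ex:GraphB` with one triple each.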

TriG is a superset of Turtle (not N-Quads)
with support for named graphs.

PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
# Triples in the default graph look like Turtle.
dbr:Tim_Berners-Lee a foaf:Person;
                    foaf:knows dbr:Ted_Nelson.
# Named graphs are indicated by a graph statement.
<http://example.org/graphs/Fiction> {
    dbr:Clark_Kent a foaf:Person;
                     foaf:nick "Superman"@en.
}

JSON-LD is a JSON syntax to represent
an RDF dataset, supporting named graphs.

JSON-LD provides additional interpretation
on top of the JSON specification.

JSON-LD documents look almost
like regular JSON documents.

{
  "@context": "http://schema.org/",
  "@id": "http://dbpedia.org/resource/Tim_Berners-Lee",
  "givenName": "Tim",
  "knows": [{
    "@id": "http://dbpedia.org/resource/Ted_Nelson",
    "givenName": "Ted"
  }]
}

JSON-LD documents can be approached
like regular JSON documents.

{
  "@context": "http://schema.org/",
  "@id": "http://dbpedia.org/resource/Tim_Berners-Lee",
  "givenName": "Tim",
  "knows": [{
    "@id": "http://dbpedia.org/resource/Ted_Nelson",
    "givenName": "Ted"
  }]
}
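
Approaching the document as regular JSON means any JSON parser suffices. The following Python sketch uses only the standard `json` module and treats `@context` and `@id` as ordinary keys; the variable names are chosen for this example.

```python
import json

document = json.loads("""
{
  "@context": "http://schema.org/",
  "@id": "http://dbpedia.org/resource/Tim_Berners-Lee",
  "givenName": "Tim",
  "knows": [{
    "@id": "http://dbpedia.org/resource/Ted_Nelson",
    "givenName": "Ted"
  }]
}
""")

# Plain dictionary and list access, with no RDF processing involved.
given_name = document["givenName"]
acquaintances = [person["givenName"] for person in document["knows"]]
```

A client that ignores the `@`-prefixed keys still obtains the names, at the cost of not knowing which vocabulary the keys refer to.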

JSON-LD documents can be approached
as RDF triples (or quads).

# These triples are equivalent to the JSON-LD example.
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX schema: <http://schema.org/>

dbr:Ted_Nelson schema:givenName "Ted".
dbr:Tim_Berners-Lee schema:givenName "Tim";
                    schema:knows dbr:Ted_Nelson.

The XML-based syntax for RDF
represents triples, but not named graphs.

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:schema="http://schema.org/">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Ted_Nelson">
    <schema:givenName>Ted</schema:givenName>
  </rdf:Description>
  <rdf:Description rdf:about="http://dbpedia.org/resource/Tim_Berners-Lee">
    <schema:givenName>Tim</schema:givenName>
    <schema:knows rdf:resource="http://dbpedia.org/resource/Ted_Nelson"/>
  </rdf:Description>
</rdf:RDF>

Choose the right RDF syntax based on
graph support and client technology.

RDFa allows extending generic HTML
and XML documents with RDF triples.

This RDFa example interleaves
HTML markup with RDF triples.

<div vocab="http://xmlns.com/foaf/0.1/" typeof="Person">
  <p>
    <span property="name">Alice Birpemswick</span>,
    (<a property="mbox" href="mailto:alice@example.com">alice@example.com</a>)
  </p>
  <ul>
    <li property="knows" typeof="Person">
      <a property="homepage" href="https://example.com/bob/">Bob</a>
    </li>
    <li property="knows" typeof="Person" resource="https://example.com/people/#eve">
      <span property="name">Eve</span>
    </li>
  </ul>
</div>

The extracted RDFa data is regular RDF
that can be converted to other formats.

[] a foaf:Person;
   foaf:name "Alice Birpemswick";
   foaf:mbox <mailto:alice@example.com>;
   foaf:knows [ a foaf:Person;
                foaf:homepage <https://example.com/bob/> ],
              <https://example.com/people/#eve>.

<https://example.com/people/#eve> a foaf:Person;
    foaf:name "Eve".

RDF on the Web can be found in webpages
and through content negotiation.
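
Content negotiation comes down to sending an `Accept` header with the desired media type. The Python sketch below only builds such a request and does not send it; the helper name `rdf_request` is an assumption, and whether a given server honours the header depends on that server.

```python
from urllib.request import Request


def rdf_request(url, media_type="text/turtle"):
    """Build (but do not send) a GET request asking for an RDF representation."""
    return Request(url, headers={"Accept": media_type})


request = rdf_request("http://dbpedia.org/resource/Tim_Berners-Lee")
# Passing `request` to urllib.request.urlopen would perform the actual lookup.
```

Other media types, such as `application/ld+json` for JSON-LD or `application/n-triples` for N-Triples, can be passed as `media_type`.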

RDF Schema is an RDF vocabulary
to model RDF vocabularies.

Practitioners in the RDF world often
refer to vocabularies as ontologies.

RDFS defines the basic building blocks
to construct RDF vocabularies.

rdfs:label is a property that gives
a human-readable name to a resource.

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

foaf:knows rdfs:label "knows"@en, "kent"@nl, "connaît"@fr.

rdfs:label rdfs:label "label"@en.

rdfs:comment is a property that clarifies
human-readable meaning and usage.

# rdf, rdfs, and foaf prefixes omitted for brevity
foaf:knows rdfs:label "knows"@en;
           rdfs:comment "A person known by this person (indicating some level of reciprocated interaction between the parties)."@en.

rdfs:comment rdfs:label "comment"@en;
             rdfs:comment "A description of the subject resource.".

rdfs:seeAlso is a property to express
some link between two resources.

# rdf, rdfs, and foaf prefixes omitted for brevity
foaf:givenName rdfs:seeAlso foaf:familyName.

rdf:type is a property stating that
a resource is an instance of a class.

# rdf, rdfs, and foaf prefixes omitted for brevity
<#me> rdf:type foaf:Person.
rdf:type rdf:type rdf:Property.
# Turtle and TriG allow a in predicate position.
<#me> a foaf:Person.
rdf:type a rdf:Property.

rdfs:Resource is a class
of which everything is an instance.

# rdf, rdfs, and foaf prefixes omitted for brevity
<#me> a rdfs:Resource.
foaf:Person a rdfs:Resource.
rdfs:Resource a rdfs:Resource.
rdf:type a rdfs:Resource.

# Even literals are resources (Turtle cannot express this).
# "Tim"@en a rdfs:Resource.

rdfs:Class is a class for resources
that conceptually define a set of things.

# rdf, rdfs, and foaf prefixes omitted for brevity
foaf:Person a rdfs:Class.
rdfs:Resource a rdfs:Class.
rdfs:Class a rdfs:Class.
# Classes can serve as objects of rdf:type triples.
<#me> a foaf:Person.

# The following triples are semantically incorrect.
# rdfs:seeAlso a rdfs:Class.
# <#me> a rdfs:Class.

rdf:Property is a class for resources
that can be used as triple predicates.

# rdf, rdfs, and foaf prefixes omitted for brevity
foaf:knows a rdf:Property.
rdf:type a rdf:Property.
rdf:Property a rdfs:Class.
# Properties can serve as predicates of triples.
<#Tim> foaf:knows <#Ted>.

# The following triples are semantically incorrect.
# rdfs:Class a rdf:Property.
# rdf:Property a rdf:Property.

rdfs:Literal is a class for resources
that have a literal value.

# rdf, rdfs, and foaf prefixes omitted for brevity

# Unfortunately, we cannot express this in Turtle.
# "Tim"@en a rdfs:Literal.
# 5 a rdfs:Literal.
# 2.7 a rdfs:Literal.

# The following triples are semantically incorrect.
# foaf:Person a rdfs:Literal.
# rdfs:Literal a rdfs:Literal.

rdfs:subClassOf is a property stating
all members of a class belong to another.

# rdf, rdfs, and foaf prefixes omitted for brevity
<#ComputerScientist> a rdfs:Class.
foaf:Person a rdfs:Class.
rdfs:Resource a rdfs:Class.
rdfs:Class a rdfs:Class.

<#ComputerScientist> rdfs:subClassOf foaf:Person.
foaf:Person rdfs:subClassOf foaf:Agent.
foaf:Person rdfs:subClassOf rdfs:Resource.
rdfs:Class rdfs:subClassOf rdfs:Resource.
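
Since rdfs:subClassOf is transitive, a member of <#ComputerScientist> also belongs to foaf:Agent. The Python sketch below computes all superclasses of a class from the statements above; the dictionary layout and the function name `superclasses` are assumptions made for this example.

```python
def superclasses(cls, subclass_of):
    """Return every class reachable from `cls` via rdfs:subClassOf."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


# rdfs:subClassOf statements from the example, as child -> parents.
SUBCLASS_OF = {
    "<#ComputerScientist>": {"foaf:Person"},
    "foaf:Person": {"foaf:Agent", "rdfs:Resource"},
    "rdfs:Class": {"rdfs:Resource"},
}
```

Calling `superclasses("<#ComputerScientist>", SUBCLASS_OF)` gives foaf:Person, foaf:Agent, and rdfs:Resource.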

rdfs:domain is a property that states
the class of possible subjects of a property.

# rdf, rdfs, and foaf prefixes omitted for brevity
foaf:img rdfs:domain foaf:Person.
foaf:img rdfs:domain rdfs:Resource.
rdf:type rdfs:domain rdfs:Resource.

rdfs:domain rdfs:domain rdf:Property.

rdfs:range is a property that states
the class of possible objects of a property.

# rdf, rdfs, and foaf prefixes omitted for brevity
foaf:img rdfs:range foaf:Image.
foaf:img rdfs:range rdfs:Resource.
rdf:type rdfs:range rdfs:Class.
rdf:type rdfs:range rdfs:Resource.

rdfs:range rdfs:range rdfs:Class.

rdfs:subPropertyOf is a property stating
a property is more specific than another.

# rdf, rdfs, and foaf prefixes omitted for brevity
<#hasFriend> rdfs:subPropertyOf foaf:knows.
rdfs:range rdfs:subPropertyOf rdfs:seeAlso.
rdfs:domain rdfs:subPropertyOf rdfs:seeAlso.

Knowledge of RDFS will help you
understand most vocabularies.

In particular, read the following vocabularies:

The Web Ontology Language (OWL) provides concepts for detailed ontologies.

OWL defines additional constraints
for individuals, properties, and classes.

OWL defines its own version
of resources and classes.

An IRI uniquely identifies a resource,
but one resource can have many IRIs.

Typical properties can either take
a literal or a named node as object.

Inverse properties express a triple
in the opposite direction.

A functional property restricts the objects
for a given subject to be identical.

Functional properties have strong effects,
so you have to understand them well.

What is the logical consequence of the following?

ex:Julia ex:hasSpouse ex:Cathy.
ex:Julia ex:hasSpouse ex:John.
ex:hasSpouse a owl:FunctionalProperty.

It might be counterintuitive, but the conclusion is:

ex:Cathy owl:sameAs ex:John.

To arrive at a contradiction, explicitly define inequality:

ex:Cathy owl:differentFrom ex:John.
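
The reasoning above can be mimicked with a few lines of Python. This is a sketch of this single inference only, not an OWL reasoner: it does not compute the transitive closure of owl:sameAs, and the function names are assumptions made for this example.

```python
from itertools import combinations


def functional_consequences(triples, functional_properties):
    """Derive the owl:sameAs pairs implied by functional properties."""
    objects = {}
    for s, p, o in triples:
        if p in functional_properties:
            objects.setdefault((s, p), set()).add(o)
    same_as = set()
    for values in objects.values():
        # Two objects for the same subject and property must be identical.
        for a, b in combinations(sorted(values), 2):
            same_as.add(frozenset((a, b)))
    return same_as


def is_contradictory(same_as, different_from):
    """A pair that is both owl:sameAs and owl:differentFrom is a contradiction."""
    return any(frozenset(pair) in same_as for pair in different_from)
```

With the two ex:hasSpouse triples, the first function returns the pair ex:Cathy and ex:John; adding the owl:differentFrom statement makes `is_contradictory` return `True`.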

OWL contains similar properties for
symmetry, reflexivity, and transitivity.

OWL allows defining classes
based on (properties of) other classes.

ex:Single owl:equivalentClass [ a owl:Class;
    owl:intersectionOf (foaf:Person [
        a owl:Class, owl:Restriction;
        owl:onProperty ex:hasPartner;
        owl:maxCardinality 0
    ])
].

SPARQL (SPARQL Protocol and RDF Query Language) enables querying & updating RDF datasets.

The SPARQL language defines
forms a query can take.

There are currently 4 read-only query forms:

SELECT     find values that satisfy conditions
CONSTRUCT  create triples that satisfy conditions
ASK        check whether data exists
DESCRIBE   show information about a resource

The main building block of a SPARQL query
is a Basic Graph Pattern (BGP).

This query finds artists
influenced by Picasso.

PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?person WHERE {
  ?person a dbo:Artist.
  ?person foaf:name ?name.
  ?person dbo:influencedBy dbr:Pablo_Picasso.
}

A query engine will try to find mappings
such that the entire BGP is satisfied.

When the mappings are substituted in the BGP,
the dataset should contain triples as follows:

Evaluating this query against DBpedia
returns possible mappings.
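
How an engine might find such mappings can be sketched in Python with a naive nested-loop matcher. This is not how production engines work, and the tiny dataset is invented for illustration (the `ex:` resources and their names are not DBpedia data); the function names are assumptions as well.

```python
def match_pattern(pattern, triple, mapping):
    """Extend `mapping` so that `pattern` equals `triple`, or return None."""
    extended = dict(mapping)
    for pattern_term, triple_term in zip(pattern, triple):
        if pattern_term.startswith("?"):  # a variable
            if extended.setdefault(pattern_term, triple_term) != triple_term:
                return None  # variable already bound to another value
        elif pattern_term != triple_term:
            return None
    return extended


def evaluate_bgp(patterns, triples):
    """Return all mappings under which every pattern occurs in `triples`."""
    mappings = [{}]
    for pattern in patterns:
        next_mappings = []
        for mapping in mappings:
            for triple in triples:
                extended = match_pattern(pattern, triple, mapping)
                if extended is not None:
                    next_mappings.append(extended)
        mappings = next_mappings
    return mappings


# The BGP of the Picasso query, with `a` written out as rdf:type.
BGP = [
    ("?person", "rdf:type", "dbo:Artist"),
    ("?person", "foaf:name", "?name"),
    ("?person", "dbo:influencedBy", "dbr:Pablo_Picasso"),
]
# Invented toy data: only ex:artist1 satisfies all three patterns.
TOY_DATA = {
    ("ex:artist1", "rdf:type", "dbo:Artist"),
    ("ex:artist1", "foaf:name", '"Artist One"'),
    ("ex:artist1", "dbo:influencedBy", "dbr:Pablo_Picasso"),
    ("ex:artist2", "rdf:type", "dbo:Artist"),
    ("ex:artist2", "foaf:name", '"Artist Two"'),
}
```

Evaluating `evaluate_bgp(BGP, TOY_DATA)` yields a single mapping that binds `?person` to ex:artist1, because ex:artist2 lacks the dbo:influencedBy triple.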

A CONSTRUCT query
returns matching triples.

An ASK query returns a boolean stating
whether the pattern exists in the dataset.

A DESCRIBE query returns (non-specified)
contextual information for resources.

In addition to BGPs,
SPARQL queries can contain modifiers.

LIMIT     only return the first n results
OPTIONAL  specifies a left join
FILTER    selects based on an expression
ORDER BY  sorts results based on an expression

The purpose of the SPARQL protocol
is sending queries and receiving results.
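
In the SPARQL protocol, a query can be sent over HTTP GET by passing it in the `query` parameter of the endpoint URL. The Python sketch below builds such a request without sending it; the helper name `sparql_get_request` is an assumption, and the public DBpedia endpoint serves only as an example target.

```python
from urllib.parse import urlencode
from urllib.request import Request


def sparql_get_request(endpoint, query):
    """Build (but do not send) a SPARQL protocol query request via HTTP GET."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the JSON serialization of SELECT/ASK results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


QUERY = "ASK { ?s ?p ?o }"
request = sparql_get_request("https://dbpedia.org/sparql", QUERY)
# Passing `request` to urllib.request.urlopen would execute the query.
```

The query string is percent-encoded by `urlencode`, so it can contain braces, question marks, and whitespace.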

Semantic Web reasoning is an agent’s
ability to verify and discover facts.

Some reasoners are tailored to a task,
others can/need to be extended.

RDFS reasoners can make entailments
based on the RDFS semantics.

For example, the presence of

<#Tim> foaf:knows <#Wendy>.
foaf:knows rdfs:domain foaf:Person.

will result in an extra triple

<#Tim> a foaf:Person.
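
This entailment can be reproduced with a small Python sketch that applies only the rdfs:domain rule, once. It is not a complete RDFS reasoner, and the function name `entail_domain` and the use of prefixed names as plain strings are assumptions made for this example.

```python
def entail_domain(triples):
    """Apply the rdfs:domain rule once and return only the newly derived triples."""
    # Collect (property, class) pairs from all rdfs:domain statements.
    domains = {(s, o) for s, p, o in triples if p == "rdfs:domain"}
    derived = set()
    for s, p, o in triples:
        for prop, cls in domains:
            if p == prop:
                # The subject of a triple using `prop` is an instance of `cls`.
                derived.add((s, "rdf:type", cls))
    return derived - set(triples)


graph = {
    ("<#Tim>", "foaf:knows", "<#Wendy>"),
    ("foaf:knows", "rdfs:domain", "foaf:Person"),
}
```

Here `entail_domain(graph)` contains exactly one triple, stating that <#Tim> has type foaf:Person.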

OWL reasoners can make entailments
based on the OWL semantics.

For example, the presence of

<#me> ex:hasPartner <#SignificantOther>.

together with the class restrictions we saw earlier
ensures that the following triple cannot be true:

<#me> a ex:Single.

Rule-based reasoners allow you
to choose and define your own rules.

Notation3 (N3) is a rule-based language
defined as a superset of Turtle.

We can define rules
for very specific situations.

Defining rules at a higher level
increases their reusability.

Defining rules at the ontological level
makes a vocabulary declarative.

I don’t need to fight
to prove I’m right.

I don’t need to be forgiven.

The Who – Baba O'Riley