How Solid aims to impact the Web (and AI with it)

Ruben Verborgh

Ghent University – imec

To computers,
the hard is easy
and the easy is hard.

paraphrased from Moravec’s Paradox

[Cover of Scientific American, May 2001] — ©2001 Scientific American

Mom needs to see a specialist
and then has to have a series
of physical therapy sessions.
Biweekly or something.
I’m going to have my agent
set up the appointments.

The Semantic Web, 2001

The Semantic Web has suffered
from a chicken-and-egg problem.

No apps, because no data.
No data, because no apps.

AI needs data.
The more the better.

Data is in the hands of the happy few.
Getting more data is hard.

The future does not consist of
a small number of huge data sets.

Instead, I believe we will see
a huge number of small data sets.

The big data crisis
Rewiring people, data, and apps
Networked intelligence

The big data crisis
- The Big Data fallacy
- Losing control and innovation
Rewiring people, data, and apps
Networked intelligence

Not even so long ago, we were promised
Big Data would solve all of our problems.

Bringing heaps of data together in one place
will reveal new patterns and insights.
- Find things you didn’t know you didn’t know.
Big Data has driven a fundamental shift
in how enterprises deal with data.
Is it technologically so surprising?
- Of course patterns emerge from so much data.
- It would be more surprising if it didn’t!

Big Data comes with big problems.
Many of them are not technological.

How do we obtain this much data in the first place?
- Acquisition costs are high.
How do we legally justify its storage?
- GDPR compliance is difficult and expensive.
How do we keep on getting more data?
- You need to be the one with the Biggest Data.

The data you can collect
isn’t always the data you want.

Supermarkets know better than me:
- what I have bought
- what I want to buy
With Big Data, one day they’ll know:
- exactly how many steps and breaths I’m taking, and where
But what they really want to know:
- what I am still buying from their competitors

The current Big Data craziness
is killing meaningful innovation.

A small company creates AI
that matches people to jobs.
- Their AI depends on data.
Today, a large part of their activities
necessarily goes to data harvesting.
- But they actually want to do AI!
Getting bigger means competing with LinkedIn.
- They have already lost that fight before even starting.

The Big Data fallacy is the belief that you can
keep on throwing more data at a problem.

Obtaining such enormous amounts of data
is staggeringly expensive.
Adding more and more data to the pile
is even more expensive.
Sooner or later, you will inevitably reach
the point of diminishing returns.
- More data does not deliver proportionally more insights.

The big data crisis
- The Big Data fallacy
- Losing control and innovation
Rewiring people, data, and apps
Networked intelligence

The Web strives to be universal
through independence of many factors.

Anyone can use the Web, regardless of:
hardware
- desktop
- phone
- tablet
- watch
- …
software
- operating system
- browser
- app
- …
Developers are free to innovate.
- build for the Web
- standards provide interoperability

The Web brings freedom of expression
to everyone across the world.

Anyone can say anything about anything.
We all have our own spaces,
so we don’t have to agree.
We can link to opinions of others
to discuss them.

The Web brings permissionless innovation
at a global scale.

Anyone can build anything for any reason.
You don’t need anyone’s permission
to join the Web and launch a new idea.
- not the case in app stores

Our data has become centralized
in a handful of Web platforms.

People’s personal blogs
are now on Facebook and Twitter.
- great user experience
- but we lost control
This has far-reaching consequences for privacy.
It endangers the Web’s universality.
- Sign in with Facebook to see this content.
- Facebook works better with the native app.

Within the walled gardens of Web apps,
you have to move either data or people.

The current massive centralization
hurts diversity, innovation, and choice.

If you can build only one API integration
- will it be facebook.com or private-identity-provider.org?
Developers depend on centralized platforms
for data and identity.
- …or they have to become such a platform themselves
People lose control of their data
and cannot easily switch to other apps.
- innovation cannot attract locked-in customers

The big data crisis
Rewiring people, data, and apps
- Decoupling apps and data
- The Solid ecosystem
Networked intelligence

Solid aims to restore choice.

The Solid ecosystem enables people to pick the apps they need, while
storing their data wherever they want.

People control their data, and share it
with the apps and people they choose.

People choose where they store
every single piece of data they produce.

They can grant apps and people access
to very specific parts of their data.

Separating app and storage competition
drives permissionless innovation.

There will not be less data.
There will be more.

Supermarkets can see competitors’ data.
- Under GDPR, consumers can get their data out.
- They can choose to share it… for a better deal!
AI companies do not have to be harvesters.
- Ask for data when you need it.
- Focus on what you really want to do.

The big data crisis
Rewiring people, data, and apps
- Decoupling apps and data
- The Solid ecosystem
Networked intelligence

Solid is not a company or organisation.
Solid is not (just) software.

Solid is an ecosystem.
- standards enable interoperability
Solid is a movement.
- shifting the app builder mindset
Solid is a community.
- different people, companies, and organisations

The Solid server acts as a data pod
that stores and guards your data.

a regular Web server
- with support for access control
- with support for Linked Data
application-agnostic
- build any application
- application-specific logic resides in clients
just like your website
- your data can be opened with any app

A typical data pod can contain
any data you create or need online.

profile 👤
photos 🖼
comments 🗣
likes 👍
… ✨

Solid clients are browser or native apps
that read from or write to your data pod.

You give apps permission.
- Choose very precisely what they can access.
Friends give you permission.
- Choose very precisely what you can access.
Apps deliver a unified experience.
- Browse your friends’ pictures along with yours.
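
The permission model above can be illustrated with a toy in-memory pod. This is only a sketch under stated assumptions: the `Pod` class, its method names, and the example paths and identifiers are hypothetical, and it does not reproduce Solid's actual access control mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Toy pod (hypothetical): maps resource paths to agents allowed to read them."""
    readers: dict[str, set[str]] = field(default_factory=dict)

    def grant_read(self, path: str, agent: str) -> None:
        # The pod owner decides, per resource, who gets access.
        self.readers.setdefault(path, set()).add(agent)

    def can_read(self, path: str, agent: str) -> bool:
        # Access is denied unless it was granted explicitly.
        return agent in self.readers.get(path, set())


# Illustrative identifiers only (assumptions, not real resources).
MARIE = "https://example.org/people/marie#me"
PHOTO_APP = "https://photos.example/app#id"

pod = Pod()
pod.grant_read("/photos/holiday.jpg", MARIE)
pod.grant_read("/photos/holiday.jpg", PHOTO_APP)

marie_sees_photo = pod.can_read("/photos/holiday.jpg", MARIE)
app_sees_profile = pod.can_read("/profile/card", PHOTO_APP)
```

The point of the sketch is the granularity: access is granted per resource and per agent, so the photo app can read one photo without seeing the profile.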

Any app you can envision,
you can build with Solid.

calendar 📅
social feed 👥
photo sharing 📸
conference organization system 🎤
… ✨

The Solid server and several apps exist
and are usable for developers.

Solid server
- store your data online with access control
- free storage at solid.community and inrupt.net
apps
- data browser
- contacts
- photos
- meeting organizer
- …
libraries
- authentication
- data processing
- …

The big data crisis
Rewiring people, data, and apps
Networked intelligence

The Web is a good platform
for data publication,
but a pretty bad platform
for data consumption.
Frank van Harmelen

Interoperability challenges in Solid
are tackled with Linked Data in RDF.

If we all store our own data,
how do we connect it to others’ data?
How can apps share data,
without too many prior agreements?
How do we integrate data
from multiple data pods?

With JSON-LD, every piece of data
can link to any other piece of data.

{
  "@context":  "https://www.w3.org/ns/activitystreams",
  "id":        "#ruben-likes-pfia2019",
  "type":      "Like",
  "actor":     "https://ruben.verborgh.org/profile/#me",
  "object":    "https://www.irit.fr/pfia2019/#this",
  "published": "2019-06-02T12:00:00Z"
}
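
As a minimal sketch of what "linking" means here, the snippet below loads the document above and resolves its link-valued fields to absolute URLs using only the Python standard library. The document URL is a hypothetical pod location, and treating `id`, `actor`, and `object` as links is hard-coded from the ActivityStreams vocabulary; a real application would use a full JSON-LD processor instead.

```python
import json
from urllib.parse import urljoin

# Hypothetical location of this document inside a data pod (an assumption).
DOCUMENT_URL = "https://ruben.verborgh.org/likes/pfia2019"

like = json.loads("""
{
  "@context":  "https://www.w3.org/ns/activitystreams",
  "id":        "#ruben-likes-pfia2019",
  "type":      "Like",
  "actor":     "https://ruben.verborgh.org/profile/#me",
  "object":    "https://www.irit.fr/pfia2019/#this",
  "published": "2019-06-02T12:00:00Z"
}
""")


def absolute_links(document: dict, base_url: str) -> dict:
    """Resolve the link-valued fields of a Like against the document URL."""
    return {key: urljoin(base_url, document[key]) for key in ("id", "actor", "object")}


links = absolute_links(like, DOCUMENT_URL)
```

The relative `id` becomes a URL inside the pod, while `actor` and `object` already point to resources on other servers. Each of the three can be dereferenced independently of where the others live.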

Data shapes and their semantics
enable layered compatibility.

{
  "@context":  "https://www.w3.org/ns/activitystreams",
  "id":        "#ruben-likes-pfia2019",
  "type":      "Like",
  "actor":     "https://ruben.verborgh.org/profile/#me",
  "object":    "https://www.irit.fr/pfia2019/#this",
  "published": "2019-06-02T12:00:00Z"
}

Different source data
can be concatenated.

{
  "@context":  "https://www.w3.org/ns/activitystreams",
  "@graph": [{
    "type":      "Like",
    "actor":     "https://ruben.verborgh.org/profile/#me",
    "object":    "https://www.irit.fr/pfia2019/#this",
    "published": "2019-06-02T12:00:00Z"
  },{
    "type":      "Like",
    "actor":     "https://example.org/people/marie#me",
    "object":    "https://www.irit.fr/pfia2019/#this",
    "published": "2019-06-02T12:05:00Z"
  }]
}
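
The concatenation above could be sketched in Python as follows. The function name `concatenate` is hypothetical, and the sketch assumes all inputs share the same context and carry no relative identifiers (those would have to be made absolute before merging).

```python
from typing import Any

CONTEXT = "https://www.w3.org/ns/activitystreams"


def concatenate(documents: list[dict[str, Any]]) -> dict[str, Any]:
    """Combine JSON-LD documents that share one context into a single @graph."""
    graph: list[dict[str, Any]] = []
    for document in documents:
        if document.get("@context") != CONTEXT:
            raise ValueError("this sketch only merges documents with the same context")
        if "@graph" in document:
            # The document already holds several nodes: append them all.
            graph.extend(document["@graph"])
        else:
            # A single-node document: keep every field except the context.
            graph.append({k: v for k, v in document.items() if k != "@context"})
    return {"@context": CONTEXT, "@graph": graph}


ruben_like = {
    "@context": CONTEXT,
    "type": "Like",
    "actor": "https://ruben.verborgh.org/profile/#me",
    "object": "https://www.irit.fr/pfia2019/#this",
    "published": "2019-06-02T12:00:00Z",
}
marie_like = {
    "@context": CONTEXT,
    "type": "Like",
    "actor": "https://example.org/people/marie#me",
    "object": "https://www.irit.fr/pfia2019/#this",
    "published": "2019-06-02T12:05:00Z",
}

merged = concatenate([ruben_like, marie_like])
```

No schema mapping is needed because both sources already use the same vocabulary and global identifiers.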

Decentralized apps have many back-ends.
Back-ends work with many apps.

The current approach to building apps
does not play well with decentralization.

When clients do not bind to HTTP requests,
APIs can evolve independently of app logic.
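
One way to picture this decoupling is app logic written against an abstract data source rather than against specific HTTP requests. The sketch below is illustrative only: `LikeSource`, `InMemorySource`, and `count_likers` are hypothetical names, and the in-memory source stands in for a pod-backed one that would fetch data over HTTP.

```python
from typing import Iterable, Protocol


class LikeSource(Protocol):
    """Anything that can answer: which likes exist for this object?"""
    def likes_of(self, obj: str) -> Iterable[dict]: ...


class InMemorySource:
    """Stand-in back-end; a pod-backed source could offer the same method."""
    def __init__(self, likes: Iterable[dict]) -> None:
        self._likes = list(likes)

    def likes_of(self, obj: str) -> list[dict]:
        return [like for like in self._likes if like["object"] == obj]


def count_likers(sources: Iterable[LikeSource], obj: str) -> int:
    """App logic: count distinct actors across all sources, unaware of transport."""
    actors = set()
    for source in sources:
        for like in source.likes_of(obj):
            actors.add(like["actor"])
    return len(actors)


EVENT = "https://www.irit.fr/pfia2019/#this"
pod_a = InMemorySource([{"actor": "https://ruben.verborgh.org/profile/#me", "object": EVENT}])
pod_b = InMemorySource([
    {"actor": "https://example.org/people/marie#me", "object": EVENT},
    {"actor": "https://example.org/people/marie#me", "object": "https://example.org/other"},
])

likers = count_likers([pod_a, pod_b], EVENT)
```

Because `count_likers` only depends on the `likes_of` method, a back-end can change how it serves data without the app logic changing.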

Decentralization needs replication
for realistic performance.

Current networks are centered
around the aggregator.

We need to create network flows
to and from the aggregator.

The individual network nodes
need to become the source of truth.

Aggregators need to become part
of a larger network.

Agents serve as a crucial
but transparent layer in the network.

Agents’ main responsibility is
sustaining a network between nodes.

The big data crisis
Rewiring people, data, and apps
Networked intelligence

The future is not huge.
It is small.

We are prepared to tackle huge,
but not a lot of small.

Can we build artificial intelligence
that works on a lot of small data?

Can we build the intelligence
for agents to sustain a network?