Themes from the 2nd London Linked Data Meetup
Yesterday I joined about 200 other developers, librarians, information architects, journalists and project managers at the 2nd London Linked Data Meetup.
Although there were as many perspectives on this new field as there were attendees, there were some strong common themes running throughout all the talks, workshops and conversations.
Thing-centric
Building applications with Linked Data is a departure from the established traditions. Instead of focusing on documents and transactions, the linked data approach is to start with the things which matter to the users, and design the application around those. These things provide the basic interaction model and URI structure of the application, and everything else adds more data to describe them.
The data consumer chooses, not the publisher
Echoing Tim Berners-Lee’s demand for raw data now, there was a lot of talk about publishing the raw data and letting the consumers of that data decide what to do with it.
Publishing data has usually involved wrapping it up in a particular view, reflecting the publisher’s priorities. But presenting the raw data as a set of Linked Data statements means the consumers can request the view they want, instead of performing complex processing to re-tabulate the data.
This shifts the costs in the process of publishing and using data: publishers have to spend a bit more time publishing, but consumers spend a lot less time messing around with data before they can start using it. Overall, using Linked Data enables many more uses to be cost effective, which can only be a good thing for exciting applications and social transparency.
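To make the "consumer chooses the view" idea concrete, here is a minimal sketch in Python. It assumes the raw data arrives as simple (subject, predicate, object) statements; the example.org URIs, the property names and the figures are all invented for illustration, and a real application would more likely use an RDF library and a query language such as SPARQL.

```python
from collections import defaultdict

# Hypothetical raw statements: (subject, predicate, object).
# The example.org URIs, property names and values are invented for illustration.
STATEMENTS = [
    ("http://example.org/school/1", "name", "Hill Road Primary"),
    ("http://example.org/school/1", "borough", "Camden"),
    ("http://example.org/school/1", "pupils", 210),
    ("http://example.org/school/2", "name", "Canal Street School"),
    ("http://example.org/school/2", "borough", "Hackney"),
    ("http://example.org/school/2", "pupils", 340),
    ("http://example.org/school/3", "name", "Park Lane Academy"),
    ("http://example.org/school/3", "borough", "Camden"),
    ("http://example.org/school/3", "pupils", 450),
]


def describe(statements):
    """Group statements by subject, giving one record per thing."""
    things = defaultdict(dict)
    for subject, predicate, obj in statements:
        things[subject][predicate] = obj
    return dict(things)


def total_by(statements, group_predicate, value_predicate):
    """One consumer-chosen view: sum a value, grouped by another property."""
    totals = defaultdict(int)
    for record in describe(statements).values():
        totals[record[group_predicate]] += record[value_predicate]
    return dict(totals)


# Two different views built from the same raw statements.
pupils_per_borough = total_by(STATEMENTS, "borough", "pupils")
names = sorted(record["name"] for record in describe(STATEMENTS).values())
```

The publisher only had to emit the statements. Whether they become a per-borough total or an alphabetical list of names is decided entirely on the consumer's side.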
Decentralisation
The common formats and naming schemes for Linked Data give the data consumer the ability to combine data sets together and process them as one. Rather than having a centralised source for everything, the Linked Data is decentralised and distributed across many sources.
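As a rough illustration of why shared naming matters, the sketch below merges two independently published sets of statements and follows a link from one into the other. Both data sets, their example.org URIs and the property names are hypothetical; the only point is that a URI used in one source can be looked up in another once the statements are pooled.

```python
# Two hypothetical sources, published separately.
# The URIs and property names are invented for illustration.
WILDLIFE = {
    ("http://example.org/species/red-fox", "label", "Red fox"),
    ("http://example.org/species/red-fox", "habitat",
     "http://example.org/habitat/woodland"),
}
HABITATS = {
    ("http://example.org/habitat/woodland", "label", "Woodland"),
    ("http://example.org/habitat/woodland", "climate", "temperate"),
}


def merge(*sources):
    """Combining data sets is just a union of their statements."""
    merged = set()
    for source in sources:
        merged |= source
    return merged


def objects(statements, subject, predicate):
    """All objects for a given subject and predicate, in sorted order."""
    return sorted(o for s, p, o in statements if s == subject and p == predicate)


def habitat_climates(statements, species):
    """Follow species -> habitat -> climate links across the statements."""
    climates = []
    for habitat in objects(statements, species, "habitat"):
        climates.extend(objects(statements, habitat, "climate"))
    return climates


combined = merge(WILDLIFE, HABITATS)
fox_climates = habitat_climates(combined, "http://example.org/species/red-fox")
```

Neither source can answer the question on its own. The wildlife data knows the habitat's URI but not its climate, and the answer only appears once both sets are processed as one.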
Each of these data sets is curated by people who know and care about them, taking into account the special requirements of the subject area. Even an organisation as great as the BBC cannot create authoritative data across the breadth of human knowledge, so they’re increasingly using these public data sets by republishing them with attribution, and actively contributing to them where improvements are necessary.
This decentralisation is good for the web, the quality of the data, and the users. But just as making a query across two traditional relational databases is tricky, so is making a query across multiple linked data sources held on separate servers. Users want data to be aggregated so they can query across these data sets without having to deal with the complexity of actually doing it. There’s some work going on to address this, including Uberblic (a play on ‘Überblick’, German for ‘overview’), which creates a live copy of public data sources and makes it trivial to query across them.
There is a danger that decentralisation encourages de facto monopolies, such as Wikipedia, which is in effect an encyclopaedia monopoly. This is mitigated, to some degree, by the tendency of users to choose data sources which are open and have the greatest commitments to freedom.
URI minting
The use of URIs as identifiers in Linked Data means that someone has to ‘mint’ them. Yet anyone can do this, so duplicates abound. It’s a big problem for data consumers, so much so that the rather useful sameAs service lists no fewer than 26 different URIs that refer to London.
The consensus was that other people’s URIs should be reused, as long as the source was going to be around and the URIs were persistent. The changing nature of Wikipedia URIs, and thus DBpedia URIs, was a concern. While the URIs will still ‘exist’ in the sense that they are usable as identifiers in data sets, it’s important that the source collections continue to evolve to reflect the subject matter.
But no one suggested that there should be a central source of URIs. It’s an important principle that anyone can mint a URI, but they have to do it responsibly.
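On the consumer side, one way to cope with duplicate URIs is to treat "same as" statements as equivalences and collapse each group to a single representative. The sketch below does this with a small union-find; it is an illustration rather than how any particular service works, and the URIs and the `canonical_uris` helper are hypothetical.

```python
def canonical_uris(same_as_pairs):
    """Map every URI to one representative of its 'same as' group."""
    parent = {}

    def find(uri):
        # Follow parent links to the group's root, shortening the path as we go.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the alphabetically smaller URI as the representative,
            # so the result does not depend on the order of the pairs.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    return {uri: find(uri) for uri in list(parent)}


# Hypothetical 'same as' statements gathered from several publishers.
PAIRS = [
    ("http://example.org/a/London", "http://example.com/places/london"),
    ("http://example.com/places/london", "http://example.net/id/2643743"),
    ("http://example.org/a/Paris", "http://example.com/places/paris"),
]

mapping = canonical_uris(PAIRS)
```

Rewriting every statement through `mapping` before merging data sets means that facts published under different URIs for London end up attached to the same thing. Picking the alphabetically smallest URI is an arbitrary choice made here for determinism; in practice a consumer would probably prefer the URI from the source they trust to be persistent.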
Provenance and Attribution of linked data
How do you know your data is authoritative and accurate? How should it be attributed in applications? Can it be trusted? These are ongoing issues which are still to be resolved.
While the provenance of data is a problem which exists regardless of the format of data, attribution within a complex combination of data from many Linked Data sources is an interesting problem. When the BBC use Wikipedia entries on their Wildlife Finder, they take the reasonable approach of saying “if you find this inaccurate or offensive, go fix it yourself on Wikipedia”.
Building infrastructure, converting data
Linked Data is still fairly new, and the community seems focused on building the infrastructure and converting other data sources to Linked Data formats. The possibilities for using the Linked Data concepts to build the data sets in the first place aren’t, in general, being explored.
Almost by chance, at OneIS we found that the Linked Data approach is a wonderful way of representing human knowledge. It provides a natural way of describing facts about things, and most importantly, describing the real world links between them. This results in a rich structure which can be used for really precise search when you combine the best of keyword free-text search with the Linked Data semantic graph.
Our research into Information Usability and practical experience with end users shows that, with the right user interface, anyone can create really high quality data in Linked Data formats and use it in their day-to-day work. As a means of describing what an organisation “knows”, it cannot be beaten.
I’ve heard of just one other organisation, Swirrl, who are actively working on general end user applications rather than developer-focused infrastructure. Annoyingly, I managed to miss the Swirrl people at the meetup.
We need to think more about end user applications which create Linked Data, and I hope others are interested in starting a discussion about the opportunities it presents.
Next!
There was talk of the next Meetup happening in a few months’ time. Whenever it is, I’m looking forward to meeting everyone again.
Many thanks to Georgi Kobilarov and Silver Oliver for organising the day.
