Linking Data

This post is adapted from a post on my work site.

The World Wide Web lets anyone make a website and link it to any other. This openness, and the ground-up approach of making the sites and links, has led to the mass of interconnected web pages we see today. It is one of the strengths of the web. However, as more and more data makes its way online, computers are finding it difficult to fully understand the connections between different data sets. That is where Tim Berners-Lee's idea of Linked Data comes in. Linked Data is a way of describing and storing data on the web so that others (people or computers) can see how the data on web page A is conceptually connected to the data on web page B.

I like the idea but for a while I’ve been hitting a problem: I couldn't work out how to actually go about creating Linked Data or using it. Most of the websites about Linked Data are very technical and launch into talk of specifications, schemas and ontologies at the slightest provocation. That sort of stuff scares me as it isn't usually written for people who don't already understand it and often seems to lead to an endless chain of documents to read. If, after two or three hours of reading technical documents, I still don't know how to do something basic, I tend to go and find something else to do.

That is where I was with Linked Data a couple of weeks ago; nice idea but not a clue how to get started. Then, via a conversation with Doug Burke, I noticed that schools in England and Wales have been included in the UK government’s first foray into Linked Data (Scotland and Northern Ireland have separate education systems and aren’t included yet). As my new job at LCOGT has many school users in the UK (through the Faulkes Telescope Project), that seemed like a good place for me to finally get my feet wet.

Education.data.gov.uk provides a web address (URI) for each school. At the page for a specific school (e.g. Clifton High School), data about that school can be seen and, importantly, understood by special software. To get started I had to find the URI for each school that was in the LCOGT database. This involved learning some SPARQL (a query language that is apparently similar to SQL) so that I could search their school database. It turned out that our own data quality wasn’t great, with some schools being listed with slightly different names, numbers or postcodes compared to those in the government database. However, after a bit of manual effort, we got URIs for 684 schools. That meant we could start doing some interesting things.
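As a rough sketch of that search-and-match step (not the code actually used at LCOGT), the Python below builds a SPARQL query for schools whose label contains a name fragment, then picks the best candidate for a local record despite small differences in name or postcode. The use of rdfs:label, the record layout, the 0.8 threshold and the example.org URIs are all assumptions for illustration; the real dataset has its own vocabulary, and CONTAINS/LCASE need a SPARQL 1.1 endpoint.

```python
import re
from difflib import SequenceMatcher

# Assumption: schools carry an rdfs:label. The real dataset may use other predicates.
QUERY_TEMPLATE = """PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?school ?label WHERE {{
  ?school rdfs:label ?label .
  FILTER(CONTAINS(LCASE(STR(?label)), "{fragment}"))
}}
LIMIT 20"""


def build_query(name_fragment):
    """Return SPARQL text that searches labels for a lower-cased name fragment."""
    # Escape backslashes and double quotes so the fragment is a valid string literal.
    fragment = name_fragment.lower().replace("\\", "\\\\").replace('"', '\\"')
    return QUERY_TEMPLATE.format(fragment=fragment)


def normalise_postcode(postcode):
    """Upper-case a postcode and remove whitespace, so 'ab1 2cd' equals 'AB1 2CD'."""
    return re.sub(r"\s+", "", postcode).upper()


def normalise_name(name):
    """Lower-case a name, turn punctuation into spaces and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())


def best_match(local, candidates, threshold=0.8):
    """Return the URI of the candidate most like the local record, or None.

    Both 'local' and each candidate are dicts with 'name' and 'postcode' keys
    (candidates also have 'uri'). This layout is hypothetical.
    """
    if not candidates:
        return None
    wanted = normalise_postcode(local["postcode"])
    same_postcode = [c for c in candidates
                     if normalise_postcode(c["postcode"]) == wanted]
    # Postcodes can be wrong too, so fall back to every candidate if none agree.
    pool = same_postcode or candidates
    local_name = normalise_name(local["name"])
    scored = [(SequenceMatcher(None, local_name, normalise_name(c["name"])).ratio(), c)
              for c in pool]
    score, candidate = max(scored, key=lambda pair: pair[0])
    return candidate["uri"] if score >= threshold else None
```

Anything that comes back as None would still need the "bit of manual effort" described above.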

The first thing I did was to download the longitude and latitude of every school that we had a Linked Data address for. I then gridded these and made a heat map (the redder an area, the more schools are in that bit of the country) for English and Welsh schools. The result looks fairly similar to a map showing population density so the good news is that the Faulkes Telescope Project doesn't appear to have much bias in which parts of England and Wales register.
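The gridding step can be sketched in a few lines of Python. This is only an illustration of the idea, assuming a simple grid in degrees (the 0.5 degree cell size is an arbitrary choice, and cells of equal angular size do not cover equal areas). The coordinates in the usage note are made up, not real school positions.

```python
import math
from collections import Counter


def grid_counts(points, cell_deg=0.5):
    """Count (latitude, longitude) points per grid cell.

    Returns a Counter keyed by (row, col), where row and col are the integer
    cell indices obtained by flooring latitude and longitude over the cell size.
    """
    counts = Counter()
    for lat, lon in points:
        row = math.floor(lat / cell_deg)
        col = math.floor(lon / cell_deg)
        counts[(row, col)] += 1
    return counts
```

For example, grid_counts([(51.46, -2.62), (51.47, -2.61), (53.4, -1.4)]) puts the first two points in the same cell and the third in a cell of its own. The counts per cell are what would then be mapped to a colour scale for the heat map.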



That shows the start of what is possible once data gets linked. Of course, at this point we were just consuming Linked Data and I thought I should help create some. So we added some Linked Data within the web pages for observations and users. Although it is not visible to a person viewing the web page, it does show up in special software.

Last week I started experimenting with sharing data properly through a special Linked Data format known as RDF. I’m still not entirely sure of the best way to put information into RDF but I’m creating examples of how it might look and hoping some Linked Data experts might be able to give me some pointers (i.e. corrections rather than links to yet more documentation).
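To give a flavour of what RDF looks like, here is a minimal Python sketch that writes statements in N-Triples, one of the simplest RDF serialisations: each line is a subject, a predicate and an object, ending with a full stop. The example.org URIs are hypothetical placeholders, not real LCOGT addresses, and the choice of the Dublin Core "title" and "creator" terms is an assumption about what might suit an observation rather than a recommendation.

```python
DCTERMS = "http://purl.org/dc/terms/"


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def triple(subject, predicate, obj, literal=False):
    """Format one N-Triples statement.

    'obj' is written as a quoted string when literal=True, otherwise as a URI.
    """
    if literal:
        obj_text = '"' + escape_literal(obj) + '"'
    else:
        obj_text = "<" + obj + ">"
    return "<" + subject + "> <" + predicate + "> " + obj_text + " ."


# Hypothetical observation and user URIs, for illustration only.
observation = "http://example.org/observation/1"
lines = [
    triple(observation, DCTERMS + "title", "Whirlpool Galaxy", literal=True),
    triple(observation, DCTERMS + "creator", "http://example.org/user/42"),
]
ntriples = "\n".join(lines)
```

Pointing the "creator" statement at a URI rather than a plain name is what makes the data linked: the same URI can be the subject of further statements about that user.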

These are just the first baby steps towards making Linked Data at LCOGT. I've already started wondering what we could do if the Simbad or ADS databases provided Linked Data.

Posted in astro blog by Stuart on Monday 29th Nov 2010 (15:54 GMT)

Comments: Linking Data

Note from a Web expert: your example file returns a 404 ;-)

Posted by David Larlet on Wednesday 01st Dec 2010 (05:09 UTC)

David, indeed it did. That is what comes from reusing my post from another site and not correcting relative links. Should be fixed now.

Posted by Stuart on Wednesday 01st Dec 2010 (08:47 UTC)
