Archive for the ‘sparql’ Category

GRDDL Is Out, How To Integrate With SPARQL

Tuesday, September 11th, 2007

GRDDL is out, giving us a mechanism for telling software how to convert documents on the web into RDF. In short, GRDDL lets you link an XSLT transform to your XHTML page; the transform converts the XHTML into an RDF document. For more information, start at the GRDDL Primer.

(The irony is that while you can use XSLT to convert into RDF, you can’t use XSLT to convert RDF into something else with complete certainty, because the same RDF graph can be written as RDF/XML in many different ways.)
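To make that concrete, here is a small sketch in Python (my choice of language; the resource URI and title are made up for illustration). Both snippets are valid RDF/XML for the same single triple, but a lookup that expects a `dc:title` child element, the way a naive XSLT match would, only finds the title in the first one.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# The triple written with a property element...
AS_ELEMENT = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="http://example.org/hotel">
    <dc:title>Example Hotel</dc:title>
  </rdf:Description>
</rdf:RDF>
"""

# ...and the same triple written with a property attribute.
AS_ATTRIBUTE = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="http://example.org/hotel" dc:title="Example Hotel"/>
</rdf:RDF>
"""


def title_via_child_element(rdfxml):
    """Mimic a naive XSLT match on rdf:Description/dc:title (hypothetical helper)."""
    root = ET.fromstring(rdfxml)
    node = root.find(f"{{{RDF}}}Description/{{{DC}}}title")
    return None if node is None else node.text


print(title_via_child_element(AS_ELEMENT))
print(title_via_child_element(AS_ATTRIBUTE))
```

An RDF parser would hand back the same triple for both inputs; it is only the XML-level view that differs, which is exactly what XSLT sees.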

The primer includes a few examples of using SPARQL to query the RDF document generated by a GRDDL transform. Here’s an example from the primer:


PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rev: <http://www.purl.org/stuff/rev#>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT DISTINCT ?rating ?name ?region ?hotelname

FROM <http://www.w3.org/TR/grddl-primer/hotel-data.rdf>

WHERE {
?x rev:hasReview ?review;
	vcard:ADR ?address;
	vcard:FN ?hotelname .
?review rev:rating ?rating .
?address vcard:Locality ?region.

FILTER (?rating > "2").

?review rev:reviewer ?reviewer.
?reviewer foaf:name ?name;
	foaf:homepage ?homepage

}

Looking at the FROM line, you see that we are referencing an RDF document by URI. However, if we are using GRDDL, that document doesn’t exist until after we perform any transforms.

This means we can’t use GRDDL directly in our SPARQL queries, as there isn’t a physical RDF document to reference.

However, the ever-useful GRDDL Service, an online web service (lower case web service :) that generates RDF from documents using GRDDL, lets us integrate GRDDL-enabled documents directly into our SPARQL queries.

Let’s replace the FROM clause in our original SPARQL query with a GRDDL Service URI that points directly at the XHTML document (instead of the pre-generated "middle man" RDF document).


PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rev: <http://www.purl.org/stuff/rev#>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT DISTINCT ?rating ?name ?region ?hotelname

FROM <http://www.w3.org/2007/08/grddl/?docAddr=http%3A%2F%2Fwww.w3.org%2FTR%2Fgrddl-primer%2Fhotel-data.html&output=rdfxml>

WHERE {
?x rev:hasReview ?review;
	vcard:ADR ?address;
	vcard:FN ?hotelname .
?review rev:rating ?rating .
?address vcard:Locality ?region.

FILTER (?rating > "2").

?review rev:reviewer ?reviewer.
?reviewer foaf:name ?name;
	foaf:homepage ?homepage

}

There, now isn’t that much more webby? I think the success of GRDDL lies with its integration into existing RDF toolkits. Otherwise, it’s a two-step process to get documents off the web, transformed into RDF, and then into RDF tools.
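The FROM URI in that second query is just the GRDDL Service address with the XHTML document's URL percent-encoded into the `docAddr` parameter. A minimal Python sketch of building it follows; the function name is my own, and `docAddr` and `output` are simply the parameters visible in the URI above, not a documented API.

```python
from urllib.parse import urlencode

# Service address as it appears in the query above.
GRDDL_SERVICE = "http://www.w3.org/2007/08/grddl/"


def grddl_service_uri(doc_uri, output="rdfxml"):
    """Build a GRDDL Service URI for doc_uri (hypothetical helper)."""
    # urlencode percent-encodes ':' and '/' so the document URL
    # survives as a single query parameter value.
    query = urlencode({"docAddr": doc_uri, "output": output})
    return f"{GRDDL_SERVICE}?{query}"


print(grddl_service_uri("http://www.w3.org/TR/grddl-primer/hotel-data.html"))
```

The result can be dropped straight into a FROM clause between angle brackets.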

For XHTML documents, though, my money is with RDFa. I think linking XSLT to XHTML is just too complicated and brittle (hmm, rhymes with GRDDL) for the masses OR for the tools. RDFa at least lets me directly embed the markup inside my XHTML documents, which makes it much easier to change when I change the XHTML. Plus, as my tools will dynamically generate the XHTML (think template languages for web frameworks) I can easily embed the RDFa right into the templates. Plus, I’m already using CSS and CSS classes, which RDFa encourages, so I can ride off of that investment.

SPARQL Via HTTP Methods

Sunday, March 4th, 2007

Querying the web might get a bit easier by marrying SPARQL directly to HTTP. TripleSoup, a promising proposal at Apache, aims to expose Triple Stores (RDF databases) directly via HTTP.

This reminds me of URIQA, which is an effort to provide native HTTP methods for accessing metadata about a certain resource. URIQA is interesting because it allows you to say:

MGET /foo HTTP/1.1

which means “Retrieve the metadata for resource `/foo`.”

It looks like TripleSoup is a bit different, in that the URI in the request methods is some type of application. TripleSoup seems to be a gateway directly into the triple store, whereas URIQA masks the concept of talking to the triple store. In URIQA, it looks like the triple store *is* the server you are connecting to. With TripleSoup, the triple store is located at the URI you are sending requests to.

URIQA’s advantage is that you don’t need to know the URI of the application or triple store; you can just send an MGET to the resource. Of course, URIQA doesn’t handle queries with SPARQL.
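As a rough illustration of how little the client needs to know, here is a Python sketch that builds the raw bytes of an MGET request. The host name and helper name are made up, and this only constructs the request; whether a given server answers MGET depends on it supporting URIQA.

```python
def uriqa_mget_request(host, path):
    """Return the raw HTTP/1.1 bytes for an MGET on path (hypothetical helper)."""
    lines = [
        f"MGET {path} HTTP/1.1",  # same resource path as a GET, different verb
        f"Host: {host}",
        "Connection: close",
        "",  # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(uriqa_mget_request("example.org", "/foo").decode("ascii"))
```

The only thing that changes from an ordinary GET is the method name; the resource URI stays the same, which is the whole appeal.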

My first question with TripleSoup is, how would I discover the URI that I can use for querying? It’s the same problem that URIQA tries to solve, “I know the URI for the resource, but I want to get its metadata.” I can ask that question in SPARQL, but who do I ask?

Best of luck to the TripleSoup team, really looking forward to the code.

DARQ - Federated Queries with SPARQL

Friday, June 30th, 2006

Use DARQ to query the entire web as a single large database. It’s federated SPARQL querying.

SPARQL Protocol

Thursday, January 26th, 2006

> SPARQL is a query language and protocol for RDF. This document specifies the SPARQL Protocol; it uses WSDL 2.0 to describe a means for conveying SPARQL queries to an SPARQL query processing service and returning the query results to the entity that requested them.

Of interest is that there are HTTP bindings specifically mentioned in the WSDL 2.0 document.
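For a feel of what the HTTP binding looks like from the client side, here is a small Python sketch that builds a GET URL for a query. I am assuming the `query` and `default-graph-uri` parameter names used by the SPARQL Protocol Recommendation; the endpoint address and the function name are hypothetical.

```python
from urllib.parse import urlencode


def sparql_get_url(endpoint, query, default_graph_uri=None):
    """Build a SPARQL Protocol style GET URL (hypothetical helper)."""
    params = [("query", query)]
    if default_graph_uri is not None:
        # Optional: names the RDF document to use as the default graph.
        params.append(("default-graph-uri", default_graph_uri))
    return f"{endpoint}?{urlencode(params)}"


url = sparql_get_url(
    "http://example.org/sparql",  # made-up endpoint
    "SELECT ?s WHERE { ?s ?p ?o } LIMIT 10",
    default_graph_uri="http://www.w3.org/TR/grddl-primer/hotel-data.rdf",
)
print(url)
```

Fetching that URL with any HTTP client is all it takes to run the query, assuming the endpoint implements the binding.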