A solution to the RDF publishing dilemma
The Semantic Web is a vision of the Web in which computer programs can process not only the syntactic structure but also the semantic meaning of Web pages. To achieve this, we invented knowledge representation languages like RDF and OWL. The idea is that we will use these languages to describe the metadata about a Web page and the semantic meaning of its content. One question remains unanswered: how should we publish RDF documents on the Web?
I call this the RDF publishing dilemma. In the Semantic Web, should a content creator publish an explicit semantic description of an HTML page in a separate RDF document, or should the semantic information be embedded within the HTML page itself? There are pros and cons associated with either approach.
If the semantic information is described in a separate RDF document, editing and managing the documents becomes simpler. RDF documents are treated like other Web documents: each has its own URL, and there is no messy mixing of RDF and HTML syntax. However, this approach has some disadvantages. Version control becomes more complex because we need to keep the RDF document and the HTML document consistent with each other. It also discourages Web designers from adopting the Semantic Web idea, because many see the creation of RDF documents as an extra task that gives no immediate benefit.
On the other hand, if semantic information is embedded within the HTML pages, version control is simpler and the barrier for Web designers to create semantic documents is lower. Adding semantics to an HTML page is simply a matter of adding new tags to the existing page. But this approach has its own problem. Embedding semantic information in a Web page (e.g., RDF + HTML) imposes significant overhead on computer programs that process the document: extra logic must be implemented to parse and extract the RDF description from the HTML page.
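To give a feel for the extra parsing logic mentioned above, here is a minimal Python sketch that pulls subject/property/value triples out of an HTML page. It is only an illustration under loud assumptions: it understands just the `about`, `property`, and `content` attributes, it ignores element nesting and prefix resolution, and the class name, function name, and sample page are all hypothetical. A real RDFa processor has to do considerably more.

```python
from html.parser import HTMLParser


class RdfaSketchParser(HTMLParser):
    """Collects triples from a tiny, simplified subset of RDFa.

    Assumption: the subject is the most recent `about` value seen,
    which ignores proper element scoping.
    """

    def __init__(self, base_uri):
        super().__init__()
        self.subject = base_uri  # default subject: the page itself
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "about" in attributes:
            self.subject = attributes["about"]
        if "property" in attributes and "content" in attributes:
            self.triples.append(
                (self.subject, attributes["property"], attributes["content"])
            )


def extract_triples(html, base_uri):
    """Return the (subject, property, value) triples found in `html`."""
    parser = RdfaSketchParser(base_uri)
    parser.feed(html)
    parser.close()
    return parser.triples


# Hypothetical sample page, made up for this illustration.
PAGE = """
<html><body>
  <div about="http://example.org/photo/1">
    <span property="dc:title" content="Harbor at dusk"></span>
    <span property="geo:lat" content="39.28"></span>
  </div>
</body></html>
"""

triples = extract_triples(PAGE, "http://example.org/page")
for triple in triples:
    print(triple)
```

Even this toy version needs its own parser class and state tracking, which is the kind of overhead every consuming program would have to carry.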
I came across a post on Ivan’s blog that describes a simple solution to the RDF publishing dilemma. You can read about the details in Ivan’s post. The basic idea is that Web publishers use RDFa to describe semantic information in an HTML page. Instead of requiring computer programs to parse and extract RDF information from the page, the Web server is configured to serve an RDF version of the HTML page by using an RDFa-to-RDF translator and some Apache rewrite rules.
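The decision such a server has to make can be sketched in a few lines of Python. This is not the configuration from Ivan’s post, which uses Apache rewrite rules and may key on something other than the Accept header. It is a rough sketch that assumes the server chooses by looking for `application/rdf+xml` in the request’s Accept header and redirects to a translator service. The translator URL, the function names, and the choice of a 303 redirect are all assumptions made for illustration.

```python
from urllib.parse import quote

# Hypothetical RDFa-to-RDF translator endpoint (an assumption, not from the post).
TRANSLATOR_URL = "http://example.org/rdfa2rdf?uri="


def wants_rdf(accept_header):
    """Return True if the Accept header lists application/rdf+xml.

    Simplification: quality values (q=...) and wildcards are ignored.
    """
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type == "application/rdf+xml":
            return True
    return False


def choose_response(page_url, accept_header):
    """Return (status, location) for a request to an RDFa-annotated page."""
    if wants_rdf(accept_header):
        # Send RDF clients to the translator, passing the page URL along.
        return (303, TRANSLATOR_URL + quote(page_url, safe=""))
    # Everyone else gets the HTML page itself.
    return (200, page_url)


print(choose_response("http://example.org/page.html", "application/rdf+xml"))
print(choose_response("http://example.org/page.html", "text/html"))
```

The point of the design is that the publisher maintains only the HTML+RDFa page, while RDF-consuming clients never have to parse HTML themselves.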

Hi,
The whole point of RDFa was to allow authors to add RDF directly to mark-up languages such as HTML, XHTML, SVG and so on, so that publishing the separate RDF/XML document became unnecessary. Of course there will be situations where the RDF/XML document will be the primary source of data, and so transforming that into XHTML+RDFa is a good solution (as Ivan describes).
But I think for many use-cases the RDF/XML step will eventually become unnecessary. For example, to provide client-side tools or RDFa parsers with access to the location at which some photographs were taken (perhaps in some system like Flickr), takes barely a line or two of server-side code; each image on a page, whether in a profile, slideshow, search results or whatever would simply have a couple of extra (RDFa) attributes added to indicate the location and these could then be consumed by browser plug-ins or RDF pipelines.
So although RDFa plays perfectly nicely with RDF–as you rightly point out–it’s big advantage is that it is very easy for anyone to publish RDF without having to get into hairy server configuration. You could even add location data to your blog, for example, even without having full control over the server.
All the best,
Mark
–
Mark Birbeck, webBackplane
mark.birbeck@webBackplane.com
http://webBackplane.com/mark-birbeck
Comment by Mark Birbeck — February 25, 2008 @ 3:24 pm
Sorry…and by “it’s big advantage” I of course mean “its big advantage”.
Comment by Mark Birbeck — February 25, 2008 @ 3:27 pm
[…] reading: RealTech; Geospatial Semantic Web; Nova Spivack on the Meaning and Future of the Semantic Web; Nova Spivack on Collective Intelligence […]
Pingback by Semantic thoughts #2 : business|bytes|genes|molecules — March 4, 2008 @ 2:40 am