rel="source" might not be so awesome! We already have alternatives.

Today I learned from The Changelog news blog about Jeremy Keith’s proposal to add a relation between a project/document and its source code by using rel="source" somewhere in the HTML markup. Sounds good, but it would only be an unspecific relation between a document or project or whatever and some unspecific type of source code container :) I really miss the improved semantics here. And we already have options to describe it.

Keith wrote:

I got chatting to Aral about a markup pattern that’s become fairly prevalent since the rise of Github: linking to the source code for a website or project. You know, like when you see “fork me on Github” links. […] We were talking about how it would be nice to have some machine-readable way of explicitly marking up those kind of links, whether they’re in the head of the document, or visible in the body. Sounds like a job for the rel attribute, I thought. […] I’ve proposed rel=”source”.

The “fork me on Github” link has at least 3 different meanings:

  1. the website of a software project contains a relation to the public source code repository of the software
  2. the link points to a repository where the content of the website is managed; this link is the same on all pages of the website
  3. the link points to the related source of the currently shown page deep in the repository

A human can see the difference; a machine cannot. If we add markup to make the relation machine-readable, rel="source" won’t do the job. A machine needs a very clear description of “X has a relation to Y”: it needs to know about X, Y and the relation. A machine needs an object, a predicate and a subject to understand the fact.
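To make the “X has a relation to Y” point concrete, here is a minimal sketch in Python. The Statement class, its field names and the example URLs are made up for illustration; they are not part of any Microformats or RDF library. It only shows which parts of the fact a bare rel link can fill in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """One fact of the form 'X has a relation to Y' (hypothetical model)."""
    x: Optional[str]               # the thing being described: the page? the project?
    relation: str                  # the kind of relation between X and Y
    y: str                         # the linked resource
    y_type: Optional[str] = None   # what Y is: archive, repository, single file, ...


def from_rel_link(rel: str, href: str) -> Statement:
    # A bare rel link carries only the relation name and the target URL.
    # It says nothing explicit about X or about the type of Y.
    return Statement(x=None, relation=rel, y=href)


def is_complete(statement: Statement) -> bool:
    # A consumer can act on the fact only if it knows X and the type of Y.
    return statement.x is not None and statement.y_type is not None


# Example URLs are placeholders, not real projects.
link = from_rel_link("source", "https://example.org/project.git")
explicit = Statement(
    x="https://example.org/project",
    relation="repository",
    y="https://example.org/project.git",
    y_type="git repository",
)
```

With these definitions, is_complete(link) is False while is_complete(explicit) is True: the rel link leaves a consumer guessing at two of the four fields.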

Lack of the object

The brainstorming page about rel="source" on the Microformats wiki describes the use case as:

When an author links to a project’s (or document’s) source code (e.g. on GitHub, Google Code, etc.) a rel value of “source” could be used to explicitly define that relationship.

That already describes the problem: is it the source code of the (software) project, or is it a link to the source of the document (the content of the web page)? X (the object) is not defined here. A simple rel="source" does not support any entailment; no logical consequence follows from it. Machines would have to guess what the object is.

Lack of the subject and its type

While the brainstorming page discusses the label, a machine only cares about the semantic meaning: “sfuzcfzcsz” could just as well be the name of the property that relates a software project to its download archive. rel="source" doesn’t say anything about the subject or its type; it only says that the related resource has something to do with source code.

  • Is it the source code in a zip/tar/xyz archive?
  • Is it a file containing uncompressed source code?
  • Is it a repository address you can check out or clone from? Is it Git, Mercurial, SVN, …?
  • Is it a relation to the homepage of a code repository (e.g. most GitHub links)?
  • Is the link about a specific version, branch or tag of the source code?

Where to put the relation in?

You may be able to use rel="source" on link and anchor elements in HTML. Now please try to use it on a PNG image (e.g. a computer-generated fractal) to describe a relation to the source code of the generating program. Good luck! There is no Microformat vocabulary to wrap it in. This leads to the next problem:

What about listings?

If you have a listing of software projects on one web page, how can you set the correct relations? By using rel="source" multiple times on various links? A machine would only see multiple source relations on one HTML document, because there is currently no proper vocabulary in MF1 or MF2 that you can use to describe the software project or web document. Maybe h-item, but according to the spec “h-item is almost never used on its own.”
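The listing problem can be sketched with Python’s standard html.parser module. The RelSourceCollector class, the collect_sources helper and the two-project listing are invented for this illustration; a real consumer might work differently. The point is only what survives when a parser looks at nothing but the rel values.

```python
from html.parser import HTMLParser


class RelSourceCollector(HTMLParser):
    """Collects the href of every <a> or <link> whose rel contains 'source'."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel is a space-separated token list; a valueless attribute yields None.
        rels = (attributes.get("rel") or "").split()
        if tag in ("a", "link") and "source" in rels and attributes.get("href"):
            self.sources.append(attributes["href"])


# Hypothetical listing page with two projects (placeholder URLs).
LISTING = """
<ul>
  <li>Project A <a rel="source" href="https://example.org/a.git">code</a></li>
  <li>Project B <a rel="source" href="https://example.org/b.git">code</a></li>
</ul>
"""


def collect_sources(html):
    parser = RelSourceCollector()
    parser.feed(html)
    parser.close()
    return parser.sources


sources = collect_sources(LISTING)
```

Here sources ends up as a flat list of two URLs, both attached to the page as a whole. Which URL belongs to “Project A” and which to “Project B” is only in the surrounding text, which the rel value does not reach.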

Already existing and working alternatives

Regarding a “rel use case repository”, Keith writes:

The benefit of having one centralised for this is that you can see if someone else has had the same idea as you. Then you can come to agreement on which value to use, so that everyone’s using the same vocabulary instead of just making stuff up.

Good advice :) The RDF and Microdata communities came up with alternative options months or years ago:

  • The DOAP project aims to “create an XML/RDF vocabulary to describe software projects, and in particular open source projects.” The DOAP vocabulary provides properties like repository to describe relations to repositories, downloads, wikis, issue lists and a lot more. You can add DOAP to your HTML via RDFa and Microdata.

  • The schema.org vocabulary provides concepts like CreativeWork, SoftwareApplication and Code, and properties like codeRepository and downloadUrl, also usable in your HTML markup via RDFa and Microdata syntax.

  • Update (Feb 21): there is already a rel-vcs Microformat that is used and supported by various tools. But it is not listed in “the official registry for rel values.” Of course it shares most of the problems with rel-source regarding semantics.
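As a rough sketch of the schema.org option from the list above, the following Python function renders one Microdata block per project. The function name project_microdata and the example project are made up. The type Code and the property codeRepository are the names mentioned in this post, so check the current schema.org documentation before relying on them.

```python
from html import escape


def project_microdata(name, repository_url):
    """Render a Microdata snippet that ties one project to its repository.

    Hypothetical helper: itemscope marks the thing being described, itemtype
    states what it is, and itemprop names the relation to the linked URL.
    """
    return (
        '<div itemscope itemtype="http://schema.org/Code">\n'
        f'  <span itemprop="name">{escape(name)}</span>\n'
        f'  <a itemprop="codeRepository" href="{escape(repository_url)}">source</a>\n'
        '</div>'
    )


# Placeholder project, not a real one.
snippet = project_microdata("Project A", "https://example.org/a.git")
```

Because every project gets its own itemscope, a listing page can repeat the block once per project and each codeRepository link stays attached to the right item.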

We don’t need to live in a religious Microformats-centric world with the Microformats wiki as our bible. :) And rel="source" would probably not be so awesome because:

  • it does not add enough semantics for machines to understand the correct fact
  • there are already alternatives that work

Top 7 from the 23rd week: RDFa, Microdata, PDF obfuscation & skateboarding

Last week’s link roundup was planned for Sunday, but the post got lost in the space-time continuum. Now it’s here:

Semantic Web

  • RDF 1.1 Concepts and Abstract Syntax was published as a W3C Working Draft. It defines the RDF data model, introduces new datatypes for HTML fragments and language-tagged strings, and reworks the XML datatype.
  • schema.org was launched 1 year ago. Dan Brickley wrote a nice roundup, “SemTech, RDFa, Microdata and more”, about what has happened since then, and he gives an outlook on future developments like schema.org 1.0.
  • RDFa.info added an RDFa playground editor, a helpful tool if you wanna test your RDFa markup, or to learn RDFa. Especially the graphical view could help a lot.

Developers zone

  • In “OMG-WTF-PDF” Julia Wolf talked about PDF obfuscation and critical backdoors. It’s from 2010 but still interesting and important for your security.
  • gmaps.js allows you to use the potential of Google Maps in a simple way. No more extensive documentation or large amount of code.
  • Anchor CMS is “built for art-directed posts.” Basically it is a very simple blog system that handles posts and pages, using a WordPress-like API but without all the ballast. Funny that it is licensed under WTFPL.

Now, if you wanna know where skateboarding, innovation, hacking and FLOSS meet then check out Rodney Mullen at TEDxUSC on “How Context Shapes Content”:

(Source: delicious.com)