Description
Section 1.4 of the microformats2 parsing specification outlines how to parse link elements (<a>, <link>, etc.) for rel values and defines the JSON output structure.
The rels structure is reasonably straightforward and maps one-to-one with matched elements:
<a rel="author" href="http://example.com/a">author a</a>
<a rel="author" href="http://example.com/b">author b</a>
<a rel="in-reply-to" href="http://example.com/1">post 1</a>
<a rel="in-reply-to" href="http://example.com/2">post 2</a>
<a rel="alternate home"
href="http://example.com/fr"
media="handheld"
hreflang="fr">French mobile homepage</a>
…results in…
{
"rels": {
"author": [ "http://example.com/a", "http://example.com/b" ],
"in-reply-to": [ "http://example.com/1", "http://example.com/2" ],
"alternate": [ "http://example.com/fr" ],
"home": [ "http://example.com/fr" ]
}
}
The parsing rules break down slightly when compiling results for the rel-urls structure. For each unique URL, the resulting JSON hash should include a key rels whose value is an array of the rel values found across all matched link elements with that URL. The spec also defines rules for parsing various attributes (hreflang, media, title, and type) and the node's text value. These extended attributes are specified as strings (not arrays), so when several elements share a URL only one value per attribute can survive, resulting in data loss and a seemingly inconsistent parsing pattern.
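To make the asymmetry concrete, here is a minimal Python sketch of the aggregation as I read it. It is not taken from any of the parsers; the `collect_rels` function and the pre-parsed `links` records are hypothetical, and the "first value wins" handling of the extended attributes is an assumption about how a string-valued key has to behave when several elements share a URL.

```python
from urllib.parse import urljoin

# Extended attributes that the spec stores as single strings per URL.
EXTENDED_ATTRS = ("hreflang", "media", "title", "type")


def collect_rels(links, base_url=""):
    """Build "rels" and "rel-urls" from simplified, pre-parsed link records.

    `links` is a hypothetical input shape: dicts with "rel", "href",
    optional extended attributes, and optional "text".
    """
    rels = {}
    rel_urls = {}
    for link in links:
        url = urljoin(base_url, link["href"])
        entry = rel_urls.setdefault(url, {"rels": []})
        for value in link["rel"].split():
            # rel values are aggregated into arrays, without duplicates.
            urls = rels.setdefault(value, [])
            if url not in urls:
                urls.append(url)
            if value not in entry["rels"]:
                entry["rels"].append(value)
        # Assumption: first value wins, so values from later elements
        # pointing at the same URL are silently dropped.
        for key in EXTENDED_ATTRS + ("text",):
            if key in link and key not in entry:
                entry[key] = link[key]
    return {"rels": rels, "rel-urls": rel_urls}


# Three elements pointing at one URL: a <link> with no text and two <a> elements.
links = [
    {"rel": "me", "href": "https://sixtwothree.org", "text": ""},
    {"rel": "me", "href": "https://sixtwothree.org", "text": "Jason Garber"},
    {"rel": "home", "href": "https://sixtwothree.org", "text": "Go back home"},
]
result = collect_rels(links)
```

Under this reading, both rel values are kept for the URL, but text ends up as the empty string from the first element and the two anchor texts are discarded.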
Parser Results
Parser developers have implemented this feature with differing results.
Given the markup:
<link rel="me" href="https://sixtwothree.org">
<a rel="me" href="https://sixtwothree.org">Jason Garber</a>
<a rel="home" href="https://sixtwothree.org">Go back home</a>
…the parsers produce differing JSON results.
Go
{
"items": [],
"rels": {
"home": ["https://sixtwothree.org"],
"me": ["https://sixtwothree.org"]
},
"rel-urls": {
"https://sixtwothree.org": {
"rels": ["me"]
}
}
}
PHP
{
"items": [],
"rels": {
"me": ["https://sixtwothree.org"],
"home": ["https://sixtwothree.org"]
},
"rel-urls": {
"https://sixtwothree.org": {
"text": "Jason Garber",
"rels": ["home", "me"]
}
}
}
Python
{
"items": [],
"rels": {
"me": ["https://sixtwothree.org"],
"home": ["https://sixtwothree.org"]
},
"rel-urls": {
"https://sixtwothree.org": {
"text": "",
"rels": ["home", "me"]
}
}
}
Ruby
{
"items": [],
"rels": {
"me": ["https://sixtwothree.org"],
"home": ["https://sixtwothree.org"]
},
"rel-urls": {
"https://sixtwothree.org": {
"rels": ["home"],
"text": "Jason Garber"
}
}
}
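For a quick side-by-side view, the short Python sketch below tabulates the rel-urls entry each parser returned. The values are copied from the outputs above; the `summarize` helper is hypothetical and sorts the rels so that ordering differences alone are not counted.

```python
# "rel-urls" entries for https://sixtwothree.org, copied from the outputs above.
outputs = {
    "Go": {"rels": ["me"]},
    "PHP": {"text": "Jason Garber", "rels": ["home", "me"]},
    "Python": {"text": "", "rels": ["home", "me"]},
    "Ruby": {"rels": ["home"], "text": "Jason Garber"},
}


def summarize(entry):
    # Sort rels to ignore ordering; None marks a missing "text" key.
    return (tuple(sorted(entry["rels"])), entry.get("text"))


summaries = {name: summarize(entry) for name, entry in outputs.items()}
for name, (rels, text) in summaries.items():
    print(f"{name:<6} rels={list(rels)} text={text!r}")

# Number of distinct (rels, text) combinations across the four parsers.
distinct = len(set(summaries.values()))
```

No two parsers agree on both the rels array and the text value for this URL.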
Note: The Node parser on microformats.io appears to be offline.
So…
The test suite's rel tests appear to conform to the spec as it's written today. What I'd like help sorting out is what seems like an arbitrary (or at least undocumented) decision to only aggregate rel attribute values in the rel-urls result structure. The extended attributes are, per the spec, worth capturing, but not worth capturing as arrays. That seems strange.
Can someone shed some light on the subject and/or can we update the spec to be more clear or to change behavior?
Edit 1: #39 is tangentially related to this, as well.
Edit 2: #32 is also related to this.