What is Cognition?

The ultimate aim is to build a metadata-centric browser written in Perl with wxPerl and using either Mozilla Gecko or Apple WebKit as a rendering engine.

But that's a long way off. For now, it's just a parser for metadata embedded in HTML.

Cognition is licensed under the GNU GPL version 3.

So what kind of metadata can it understand?

It supports metadata embedded using the following methods:

Many of these methods make use of namespaces. Standard XML namespaces are mostly understood, and namespaces may also be declared using the RFC 2731 convention (e.g. <link rel="schema.DC">). You may run into problems if you define the same prefix differently in different parts of the document. A number of namespaces are also predefined, so that markup like <meta name="DC.creator"> will "just work" even if the author never explicitly defined the DC prefix.
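As a rough illustration of the prefix handling described above, here is a minimal Python sketch (not Cognition's actual Perl code). It assumes a small table of predefined prefixes, lets RFC 2731 style <link rel="schema.PREFIX"> declarations add to or override that table, and expands <meta name="PREFIX.term"> into a full URI. The names PREDEFINED, MetaNamespaceParser and extract_meta are made up for this example, and matching prefixes case-insensitively is an assumption.

```python
from html.parser import HTMLParser

# Assumed predefined prefixes; Cognition's real table is not shown in the text.
PREDEFINED = {
    "DC": "http://purl.org/dc/elements/1.1/",
    "FOAF": "http://xmlns.com/foaf/0.1/",
}


class MetaNamespaceParser(HTMLParser):
    """Collect <meta name="PREFIX.term"> values, expanding PREFIX to a URI."""

    def __init__(self):
        super().__init__()
        self.prefixes = dict(PREDEFINED)
        self.properties = []  # list of (expanded_name, content) pairs

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link":
            # RFC 2731 style declaration: <link rel="schema.DC" href="...">
            rel = attrs.get("rel") or ""
            if rel.lower().startswith("schema.") and attrs.get("href"):
                prefix = rel[len("schema."):].upper()
                self.prefixes[prefix] = attrs["href"]
        elif tag == "meta":
            name = attrs.get("name") or ""
            prefix, dot, term = name.partition(".")
            uri = self.prefixes.get(prefix.upper())
            if dot and uri:
                self.properties.append((uri + term, attrs.get("content")))


def extract_meta(html):
    """Return the namespace-expanded <meta> properties found in html."""
    parser = MetaNamespaceParser()
    parser.feed(html)
    parser.close()
    return parser.properties
```

With this sketch, extract_meta('<meta name="DC.creator" content="Alice">') resolves the name against the predefined DC prefix even though the document never declares it. A declaration only affects <meta> elements that come after it, which is one way the "same prefix defined differently" problem can bite.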

A number of Microformats are also understood:

The document's structure is inferred from heading elements (<h1> to <h6>), and a tree is built from them. The tree also includes semantically used tables (i.e. tables which have @summary or <caption>), figures (see the figure microformat) and XOXO lists. Sections inferred from headings are automatically given funny-looking identifiers (e.g. <>) for use in RDF @about and @resource, unless they already have an @id.
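The heading-to-tree step can be sketched roughly as follows. This is an illustrative Python version, not the real implementation: it only looks at <h1> to <h6> (tables, figures and XOXO lists are left out), and the "section-N" identifiers are a hypothetical stand-in, since the actual identifier format is not shown here. OutlineParser and build_outline are names invented for the example.

```python
import re
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Build a nested section tree from <h1>..<h6> elements."""

    def __init__(self):
        super().__init__()
        self.root = {"level": 0, "title": None, "id": None, "children": []}
        self.stack = [self.root]  # currently open sections, outermost first
        self.current = None       # heading being read, if any
        self.counter = 0

    def handle_starttag(self, tag, attrs):
        match = re.fullmatch(r"h([1-6])", tag)
        if match:
            self.current = {
                "level": int(match.group(1)),
                "title": "",
                "id": dict(attrs).get("id"),
                "children": [],
            }

    def handle_data(self, data):
        if self.current is not None:
            self.current["title"] += data

    def handle_endtag(self, tag):
        if self.current is None or not re.fullmatch(r"h[1-6]", tag):
            return
        node, self.current = self.current, None
        node["title"] = node["title"].strip()
        if node["id"] is None:
            # Hypothetical identifier scheme; the real format is not shown.
            self.counter += 1
            node["id"] = f"section-{self.counter}"
        # Close every open section at the same or a deeper level.
        while self.stack[-1]["level"] >= node["level"]:
            self.stack.pop()
        self.stack[-1]["children"].append(node)
        self.stack.append(node)


def build_outline(html):
    """Return the root of the section tree inferred from headings."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.root
```

The root node has level 0, so the stack never empties: every heading ends up as a child of the nearest preceding heading with a smaller level, and an existing @id is kept as-is.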

Other miscellaneous buzzwords that Cognition uses or can grok are:

All this data is internally represented in a namespace-aware RDF-triple-like structure. The predefined values for the rel and rev attributes in HTML 3.2 onwards (including HTML5 drafts) are automatically pulled into the XHTML namespace. Microformatted data is assigned logical namespaces (e.g. hCalendar data is given the namespace "urn:ietf:rfc:2445#", the namespace of the iCalendar standard, from which it inherits its names).
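To make the triple representation a bit more concrete, here is a small Python sketch of how rel and rev links might become (subject, predicate, object) triples. It is an illustration only: the namespace URI http://www.w3.org/1999/xhtml/vocab# is an assumption (the text just says "the XHTML namespace"), only the HTML 4.01 link types are recognised, and LinkTripleParser and extract_link_triples are invented names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed namespace URI for predefined rel/rev values.
XHV = "http://www.w3.org/1999/xhtml/vocab#"

# Link types defined by HTML 4.01; later drafts add more.
HTML4_LINK_TYPES = {
    "alternate", "stylesheet", "start", "next", "prev", "contents",
    "index", "glossary", "copyright", "chapter", "section",
    "subsection", "appendix", "help", "bookmark",
}


class LinkTripleParser(HTMLParser):
    """Turn rel/rev attributes on <a> and <link> into triples."""

    def __init__(self, base):
        super().__init__()
        self.base = base
        self.triples = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        target = urljoin(self.base, href)
        for attr in ("rel", "rev"):
            for token in (attrs.get(attr) or "").lower().split():
                if token not in HTML4_LINK_TYPES:
                    continue
                predicate = XHV + token
                if attr == "rel":
                    # rel: this document --predicate--> target
                    self.triples.append((self.base, predicate, target))
                else:
                    # rev: target --predicate--> this document
                    self.triples.append((target, predicate, self.base))


def extract_link_triples(html, base):
    """Return triples for the predefined rel/rev values found in html."""
    parser = LinkTripleParser(base)
    parser.feed(html)
    parser.close()
    return parser.triples
```

Note that rev simply swaps subject and object, so both attributes end up in the same structure.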

On the horizon: support for external RDF data (linked to with rel="meta"), support for RDF data types (e.g. xsd:dateTime), support for more microformats, and the ability to export data from microformats (e.g. export hCard as vCard).

Note that both HTML and XHTML are supported equally. The stuff that, strictly speaking, should not work in HTML (e.g. XML namespaces, RDFa) does work: HTML is treated as if it were funny-looking XHTML.

What can Cognition do with this metadata?

Not a lot, really. It can spit it out as a Perl data structure, or export it as RDF.
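For a feel of what "export it as RDF" could look like, here is a minimal Python sketch that writes triples out as N-Triples. The choice of N-Triples is mine for the sake of a short example; the text does not say which RDF serialisation Cognition produces. The four-element tuple layout and the name to_ntriples are likewise assumptions, and the URIs are assumed to be already valid.

```python
def to_ntriples(triples):
    """Serialise (subject, predicate, object, object_is_uri) tuples as N-Triples."""
    lines = []
    for subject, predicate, obj, object_is_uri in triples:
        if object_is_uri:
            rendered = f"<{obj}>"
        else:
            # Escape the characters that N-Triples string literals require.
            escaped = (
                obj.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r")
            )
            rendered = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {rendered} .\n")
    return "".join(lines)
```

Each triple becomes one line, with URIs in angle brackets and plain literals in double quotes.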

Toby Inkster