Discussion:
Multiple keywords in dc:subject
James Holderness
2005-12-01 15:44:07 UTC
I've been adding tagging/category support to my RSS aggregator and I've
noticed that a lot of people use the dc:subject element for tagging entries
in their RSS feeds (Atom too). The problem is that nobody does it
consistently and I can't seem to come up with a safe way of parsing
dc:subject that will work for everyone.

According to the Dublin Core Usage Guide, when using multiple keywords you
should "either separate terms with semi-colons or use separate iterations of
the Subject element". [1]

I've never seen semi-colons used in an RSS feed, but multiple subject
elements are quite common (e.g. Danny Ayers' blog [2]) and I've also seen
commas used as a separator (e.g. Simon Willison's blog [3]). Both these
cases are fairly straightforward to parse (as long as Danny doesn't include
commas in his categories, which I think is a fairly safe assumption, if not
a guaranteed one). However, the real problem arises when people include
multiple keywords in a single subject element separated only by a space
(e.g. flickr [4] and del.icio.us [5]).
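
As a rough sketch of the two straightforward cases (repeated subject
elements, and semi-colon or comma separators within one element), something
like the following would do. The function name and the sample item are
made up for illustration; only the Dublin Core namespace URI and the tag
names "Semantic Web" and "Web Services" are real.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core element set (the usual dc: prefix).
DC_NS = "http://purl.org/dc/elements/1.1/"


def subjects_from_item(item_xml: str) -> list[str]:
    """Collect keywords from every dc:subject element in one item.

    Hypothetical helper: handles repeated elements plus ';' and ','
    separators, but deliberately does NOT split on spaces.
    """
    item = ET.fromstring(item_xml)
    keywords = []
    for subject in item.iter(f"{{{DC_NS}}}subject"):
        text = subject.text or ""
        # Semi-colons per the Dublin Core guide, commas as seen in practice.
        for part in re.split(r"[;,]", text):
            part = part.strip()
            if part:
                keywords.append(part)
    return keywords


# Illustrative item, not copied from any real feed.
SAMPLE_ITEM = """<item xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:subject>Semantic Web</dc:subject>
  <dc:subject>Web Services, RSS</dc:subject>
</item>"""

print(subjects_from_item(SAMPLE_ITEM))
# ['Semantic Web', 'Web Services', 'RSS']
```

This leaves the space-separated case untouched, which is exactly where the
trouble starts.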

If I split subject elements into multiple keywords whenever there is a space,
I'll break Danny's feed and Simon's feed (they use tags such as "Semantic
Web" and "Web Services"). However, if I treat a subject element with no
special separators as a single keyword/category, then I'll break both flickr
and del.icio.us (which I would imagine are two of the biggest providers of
tagged feeds). I'm fairly sure flickr and del.icio.us are doing it wrong, but
I'm not sure it would be very easy to get them to fix their code.
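
To make the conflict concrete, here is a small sketch of the two candidate
rules applied to both kinds of input. The function names are hypothetical,
and the space-separated sample string is invented to stand in for the
flickr/del.icio.us style rather than taken from either feed.

```python
def split_on_spaces(text: str) -> list[str]:
    """Rule A: every whitespace-separated word is its own keyword."""
    return text.split()


def keep_whole(text: str) -> list[str]:
    """Rule B: a subject with no special separator is a single keyword."""
    text = text.strip()
    return [text] if text else []


# One multi-word tag, as used in Danny's and Simon's feeds.
multi_word_tag = "Semantic Web"
# Invented example: three separate tags joined only by spaces.
space_separated = "sunset beach travel"

# Rule A breaks the multi-word tag into two unrelated keywords.
print(split_on_spaces(multi_word_tag))   # ['Semantic', 'Web']
print(split_on_spaces(space_separated))  # ['sunset', 'beach', 'travel']

# Rule B keeps the multi-word tag but merges the three tags into one.
print(keep_whole(multi_word_tag))        # ['Semantic Web']
print(keep_whole(space_separated))       # ['sunset beach travel']
```

Neither rule gets both inputs right, and nothing in the element text itself
says which convention the publisher intended.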

So what is the solution? Any suggestions would be greatly appreciated.

Regards
James

[1] http://dublincore.org/documents/usageguide/elements.shtml#subject
[2] http://dannyayers.com/feed/rdf/
[3] http://simon.incutio.com/syndicate/rss1.0
[4] http://www.flickr.com/groups/interestingness/pool/feed/?format=atom_03
[5] http://del.icio.us/rss/
