Discussion:
[FeedValidator] Validating dates
Bill Kearney
2005-11-24 15:21:11 UTC
Permalink
While that might be useful for HTML web pages it seems like a somewhat
bad idea for what's intended as machine-readable XML files.
Surely the XML files we're talking about (feeds) are just as much or as
little 'machine-readable' as HTML web pages? I said the framework is
there, not that I use it to generate my feed dates!
Well, how a web framework might 'adjust' its output for incoming HTML
browsers is probably quite a bit different than how XML output for a feed
should be produced. All readers are supposed to be able to process a feed.
There aren't expected to be different versions based on different incoming
RSS reader user-agents. Where it REALLY gets messy is how the embedded RSS
tools inside Firefox and IE7 all look like a browser to the web server.
Thus the web server diligently adulterates the contents of the XML thinking
it's HTML. Bad idea. Works great for web pages though.
Most humans aren't going to see the html source without a browser
processing it either.
That's not the point, the point is a web server framework may well have to
adjust the HTML output depending on what gee-whiz server-side and client
integration features are being used. As in, if it's an IE browser then use
ActiveX, while if it's plain old Lynx then don't bother. Two ends of the
extreme, of course. No RSS readers need such adjustments.

There is, potentially, some argument possible that certain reader programs
might benefit from being sent differing amounts of data. If you knew the
remote client was using a browser that had no ability to process something
then leaving it out might be an idea to consider. The only trouble is
you're then stuck in an endless loop making sure your framework stays up to
date with the remote client's feature set. And then it's back to the same
mess we have in the 'browser wars'.

Suppose I had control over, let's say, a set of remote users all on cellphone
devices. If I knew that their reader programs had no ability to deal with the
same range of data as a regular desktop reader it might be worth stripping it,
if only to save a few bytes on the transfers. But I'd be more inclined to
just use a different URL so they don't end up with subscription problems.
As in, the set they have on the phone ends up being different and then they
can't sync intelligently. If anything it'd probably be better to use some
sort of gateway or proxy that did it instead of doing it at the source.
Atom even goes so far as to explicitly state which bits are 'language
sensitive' and which bits aren't. I'd have thought the bits that are
are crying out to be translated according to the standard HTTP
mechanisms.
Oh you're wandering down a very messy road when you talk about language
issues in feeds.
It's probably fair to say you don't want to process the data at all. At
least nowhere near the same way as how the HTML pages might get handled.
Yes, that is probably fair. In my case, our feed generating scripts
don't go through the same templating mechanism as our web-page
generating ones (where our web designers have complete freedom over
which bits get translated and which don't, and over the structure of the
output). But they do share the same code for interpreting HTTP language
headers and fetching translated content from our database. I use it
when it's appropriate, <atom:content> being the most obvious example.
It's apparent that not everything will benefit from being 'interpreted' in
that manner. For example, in a feed there's absolutely no need to alter how
a date appears. It's 2822 or 8601 style, period. No localized names of
months, days or ordering changes are needed. For a web page, however, it
would seem like a perfectly reasonable idea. Take an internally stored date
and present it to a german-reading user with language specific wording and
culturally appropriate ordering. That same user on a feed, however, would
have that done for them by the reader program at their end.
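
To make the fixed-format point concrete, here is a minimal sketch of producing feed dates without any locale-aware formatting. It assumes Python and a timezone-aware datetime as input; the function name `feed_dates` is made up for illustration and is not anything from the framework under discussion.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def feed_dates(moment: datetime) -> dict:
    """Return fixed-format date strings for a feed.

    Assumes `moment` is timezone-aware. Neither call consults the
    process locale, so month and day names are always English.
    """
    utc = moment.astimezone(timezone.utc)
    return {
        # RFC 2822 style, as used by RSS 2.0 <pubDate>.
        "rss": format_datetime(utc),
        # RFC 3339 (an ISO 8601 profile), as used by Atom <updated>.
        "atom": utc.isoformat(timespec="seconds"),
    }


stamp = datetime(2005, 11, 24, 15, 21, 11, tzinfo=timezone.utc)
dates = feed_dates(stamp)
print(dates["rss"])   # Thu, 24 Nov 2005 15:21:11 +0000
print(dates["atom"])  # 2005-11-24T15:21:11+00:00
```

Localized presentation of the same moment is then left to the reader program, as described above.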

Using translation mechanisms on web pages has worthwhile benefits, but not
without carefully crafting the HTML pages to use it properly. A feed might
benefit similarly, but not without similarly careful processing.

-Bill Kearney
Syndic8.com




Yahoo! Groups Links

<*> To visit your group on the web, go to:
http://groups.yahoo.com/group/rss-dev/

<*> To unsubscribe from this group, send an email to:
rss-dev-unsubscribe-***@public.gmane.org

<*> Your use of Yahoo! Groups is subject to:
http://docs.yahoo.com/info/terms/
p***@public.gmane.org
2005-11-24 22:07:06 UTC
Permalink
[If this was Usenet I'd set Followups to rss-dev or similar; this is
certainly OT for feedvalidator-users.]
Post by Bill Kearney
While that might be useful for HTML web pages it seems like a somewhat
bad idea for what's intended as machine-readable XML files.
Surely the XML files we're talking about (feeds) are just as much or as
little 'machine-readable' as HTML web pages? I said the framework is
there, not that I use it to generate my feed dates!
Well, how a web framework might 'adjust' its output for incoming HTML
browsers is probably quite a bit different than how XML output for a feed
should be produced.
Bill, I think we're mostly agreeing violently. I don't ever change my
output (xml or html) according to User-Agent, and I think that's about
as evil on the web as it would be for feeds.
Post by Bill Kearney
All readers are supposed to be able to process a feed. There aren't
expected to be different versions based on different incoming RSS reader
user-agents.
Absolutely!
Post by Bill Kearney
Where it REALLY gets messy is how the embedded RSS
tools inside Firefox and IE7 all look like a browser to the web server.
Thus the web server diligently adulterates the contents of the XML thinking
it's HTML. Bad idea. Works great for web pages though.
Does it? Mostly it just seems to prevent me from using the browser I
want to to do my online banking! Perhaps there's a lot of really good
User-Agent sniffing going on that works so well I'm not aware of it...

[stuff about User-Agent sniffing in feeds which I don't disagree with]
Post by Bill Kearney
Atom even goes so far as to explicitly state which bits are 'language
sensitive' and which bits aren't. I'd have thought the bits that are
are crying out to be translated according to the standard HTTP
mechanisms.
Oh you're wandering down a very messy road when you talk about language
issues in feeds.
Oh yes (but that is the only thing I was talking about).

HTTP language negotiation works perfectly for direct subscriptions (if
supported by the reader), but not so well (I imagine) for web-based
aggregators like Syndic8. For Syndic8 I could provide different
'autodiscovery' <atom:links> to feeds with a hard-coded language
parameter in the url, but that defeats my beautiful HTTP language
negotiation if someone subscribes to one of those hard coded links using
a standalone reader.

And then there's the issue of <atom:id>s...
Post by Bill Kearney
It's probably fair to say you don't want to process the data at all. At
least nowhere near the same way as how the HTML pages might get handled.
Yes, that is probably fair. In my case, our feed generating scripts
don't go through the same templating mechanism as our web-page
generating ones (where our web designers have complete freedom over
which bits get translated and which don't, and over the structure of the
output). But they do share the same code for interpreting HTTP language
headers and fetching translated content from our database. I use it
when it's appropriate, <atom:content> being the most obvious example.
It's apparent that not everything will benefit from being 'interpreted' in
that manner. For example, in a feed there's absolutely no need to alter how
a date appears. It's 2822 or 8601 style, period.
Absolutely: what the OP was apparently doing was unquestionably wrong.
But I'm not doing that!
Post by Bill Kearney
No localized names of months, days or ordering changes are needed.
It's not simply 'not needed': if you do that you get an invalid feed.
Post by Bill Kearney
For a web page, however, it
would seem like a perfectly reasonable idea.
For displayed dates sure, but there's at least one place in HTML where
you have machine interpreted dates just like you do in a feed. (<meta
name="date" ... />)

If you had a date in the free-form text of your RSS <description>, it is
perfectly reasonable IMHO to translate it if you can. (Along with the
rest of the text.)

In my case it doesn't work like that because my <description>s are just
blobs of free-form text with no internal structure or templating. But
that's a limitation of my system rather than a deliberate choice.

[...]
Post by Bill Kearney
Using translation mechanisms on web pages has worthwhile benefits, but not
without carefully crafting the HTML pages to use it properly. A feed might
benefit similarly, but not without similarly careful processing.
That's one of the good things about Atom. While it might be obvious to
you and me which bits of a feed may be translated, and which bits should
never be, when I wrote my Atom feed generator I didn't need to think
about it because I could translate (some of) only those elements
explicitly marked in the spec as 'language sensitive'.

Regards,

Peter
--
Peter Robinson
<http://www.ticketswitch.com/> Concerts, sport and theatre tickets



Bill Kearney
2005-11-24 23:22:04 UTC
Permalink
Post by p***@public.gmane.org
Bill, I think we're mostly agreeing violently. I don't ever change my
output (xml or html) according to User-Agent, and I think that's about
as evil on the web as it would be for feeds.
Heh, indeed, violently so. Always thought that was an odd saying.
Post by p***@public.gmane.org
Does it? Mostly it just seems to prevent me from using the browser I
want to to do my online banking! Perhaps there's a lot of really good
User-Agent sniffing going on that works so well I'm not aware of it...
Some frameworks can do effective things based on knowing the incoming
browser. One example is the ASP.NET 2.0 framework, which understands how to
send user controls to the remote client based on the user-agent. But I'd
imagine there are more "broken" frameworks that get it wrong (or are
mis-applied) than situations where it works.
Post by p***@public.gmane.org
HTTP language negotiation works perfectly for direct subscriptions (if
supported by the reader), but not so well (I imagine) for web-based
aggregators like Syndic8. For Syndic8 I could provide different
'autodiscovery' <atom:links> to feeds with a hard-coded language
parameter in the url, but that defeats my beautiful HTTP language
negotiation if someone subscribes to one of those hard coded links using
a standalone reader.
There's no defined mechanism for negotiating language with feeds. That and
plenty of readers are multilingual. I've always found it's better to let
users of a given language select one in that language independently of
whatever 'default' language might be used. That and none of the readers
(I'm aware of) make proper use of language requesting in the first place.
I'm not sure that's a bad thing.
Post by p***@public.gmane.org
Absolutely: what the OP was apparently doing was unquestionably wrong.
But I'm not doing that!
Well, I don't want to characterize anything as "unquestionably wrong", just
not an ideal practice in the context of feeds. While it's true the dates
were being improperly adulterated I'm sure the other aspects of the
framework might offer useful mechanisms.
Post by p***@public.gmane.org
Post by Bill Kearney
No localized names of months, days or ordering changes are needed.
It's not simply 'not needed': if you do that you get an invalid feed.
We once again agree. I was more illustrating the point of why something
might want to adjust between database values and display formats. There's
no need to do this for feeds but it certainly has merit on an HTML page.
Post by p***@public.gmane.org
For displayed dates sure, but there's at least one place in HTML where
you have machine interpreted dates just like you do in a feed. (<meta
name="date" ... />)
Excellent point.
Post by p***@public.gmane.org
If you had a date in the free-form text of your RSS <description>, it is
perfectly reasonable IMHO to translate it if you can. (Along with the
rest of the text.)
Yes, quite useful. Presuming you've effectively identified the actual
cultural/localization methods used on the requesting client.
Post by p***@public.gmane.org
In my case it doesn't work like that because my <description>s are just
blobs of free-form text with no internal structure or templating. But
that's a limitation of my system rather than a deliberate choice.
Yeah, it might be fair to say one could take the idea "too far".
Post by p***@public.gmane.org
That's one of the good things about Atom. While it might be obvious to
you and me which bits of a feed may be translated, and which bits should
never be, when I wrote my Atom feed generator I didn't need to think
about it because I could translate (some of) only those elements
explicitly marked in the spec as 'language sensitive'.
I'm curious, what tools have you seen that effectively handle them?

-Bill Kearney




p***@public.gmane.org
2005-11-26 16:41:18 UTC
Permalink
Bill Kearney <wkearney-***@public.gmane.org> wrote:

[snips]
Post by Bill Kearney
Post by p***@public.gmane.org
HTTP language negotiation works perfectly for direct subscriptions (if
supported by the reader), but not so well (I imagine) for web-based
aggregators like Syndic8. For Syndic8 I could provide different
'autodiscovery' <atom:links> to feeds with a hard-coded language
parameter in the url, but that defeats my beautiful HTTP language
negotiation if someone subscribes to one of those hard coded links using
a standalone reader.
There's no defined mechanism for negotiating language with feeds.
That's like saying there's no defined mechanism for caching, compression
or encryption. Feeds delivered by HTTP have a mechanism available;
it's just that people like you and me have to decide whether using it
makes sense.
Post by Bill Kearney
That and plenty of readers are multilingual.
You mean a bilingual French & German speaker might want to read one feed
in French and one in German even if both feeds are available in both
languages? Certainly you're right that the 'transparent' HTTP mechanism
doesn't cope with that at all.
Post by Bill Kearney
I've always found it's better to let users of a given language select one
in that language independently of whatever 'default' language might be
used.
Yes, that probably makes sense, but until <atom:link> there was no
uniform way to do that. My feeds do have a cgi variable to set an
explicit language choice (actually, it gets prepended to the priority
list from HTTP).
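
A rough sketch of that arrangement, for anyone following along: an explicit language from the URL is placed ahead of whatever the HTTP Accept-Language header asks for, with a default if nothing matches. This is only an illustration in Python under stated assumptions; the function names and the simplified header parsing are hypothetical, not the actual scripts described in this post.

```python
def parse_accept_language(header: str) -> list:
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag without a q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as "not wanted"
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(header, available, explicit=None, default="en"):
    """Pick a feed language: explicit URL choice first, then the header."""
    candidates = [explicit.lower()] if explicit else []
    candidates += parse_accept_language(header)
    for tag in candidates:
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # fall back from "de-at" to "de"
        if primary in available:
            return primary
    return default


available = {"en", "fr", "de"}
print(choose_language("fr;q=0.5, de;q=0.9", available))                 # de
print(choose_language("fr;q=0.5, de;q=0.9", available, explicit="fr"))  # fr
print(choose_language("", available))                                   # en
```

A reader that sends no Accept-Language header at all simply gets the default, which is the fallback behaviour discussed later in the thread.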
Post by Bill Kearney
That and none of the readers (I'm aware of) make proper use of
language requesting in the first place. I'm not sure that's a bad thing.
You may be right!
Post by Bill Kearney
Post by p***@public.gmane.org
Absolutely: what the OP was apparently doing was unquestionably wrong.
But I'm not doing that!
Well, I don't want to characterize anything as "unquestionably wrong", just
not an ideal practice in the context of feeds. While it's true the dates
were being improperly adulterated I'm sure the other aspects of the
framework might offer useful mechanisms.
Heh :) All I meant was "unquestionably wrong" is the (language-based)
adulteration of <pubDate> (or whatever it was). No disrespect intended
to the OP. He was clueful enough to try to validate his feed, and he's
probably fixed that tiny bug by now...
Post by Bill Kearney
Post by p***@public.gmane.org
That's one of the good things about Atom. While it might be obvious to
you and me which bits of a feed may be translated, and which bits should
never be, when I wrote my Atom feed generator I didn't need to think
about it because I could translate (some of) only those elements
explicitly marked in the spec as 'language sensitive'.
I'm curious, what tools have you seen that effectively handle them?
<atom:link language="xx"> for translation autodiscovery? I haven't
looked, but Atom is fairly new and even if there aren't any now there
may well be in the future. Besides it's a bit of a chicken & egg
situation. Err, now I think about it, I haven't actually implemented
that in my feeds anyway...
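
For reference, the attribute that RFC 4287 defines on atom:link for this purpose is hreflang, and the spec says that together with rel="alternate" it implies a translated version. A minimal sketch of generating such links follows, assuming Python's standard ElementTree; the function name and the "lang" query parameter are hypothetical choices for illustration only.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def translation_links(base_url, languages):
    """Build one atom:link rel="alternate" element per translated feed."""
    links = []
    for lang in languages:
        link = ET.Element("{%s}link" % ATOM_NS)
        link.set("rel", "alternate")
        link.set("type", "application/atom+xml")
        link.set("hreflang", lang)
        # Hypothetical URL scheme: a "lang" query parameter picks the language.
        link.set("href", "%s?lang=%s" % (base_url, lang))
        links.append(link)
    return links


links = translation_links("http://example.org/feed", ["en", "de"])
for link in links:
    print(link.get("hreflang"), link.get("href"))
```

Whether any reader acts on these links is a separate question, as noted above.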

As for traditional HTTP-based language negotiation, Safari and Firefox
certainly do it in their feed readers (as you'd expect). I assume other
browser-based readers will also. I'm slightly disappointed to see that
NetNewsWire doesn't. Shame really since IMHO it would be perfect for a
desktop aggregator like that.

Regards,

Peter
--
Peter Robinson
<http://www.ticketswitch.com/> Concerts, sport and theatre tickets



Bill Kearney
2005-11-26 18:06:00 UTC
Permalink
Post by p***@public.gmane.org
Post by Bill Kearney
There's no defined mechanism for negotiating language with feeds.
That's like saying there's no defined mechanism for caching, compression
or encryption. Feeds delivered by HTTP have a mechanism available;
it's just that people like you and me have to decide whether using it
makes sense.
It's not defined in how RSS feeds are handled. What's possible with
making HTTP requests in a browser is not necessarily mimicked in RSS
readers. This is both a good and a bad thing, depending on what features
you're after.
Post by p***@public.gmane.org
Post by Bill Kearney
That and plenty of readers are multilingual.
You mean a bilingual French & German speaker might want to read one feed
in French and one in German even if both feeds are available in both
languages? Certainly you're right that the 'transparent' HTTP mechanism
doesn't cope with that at all.
And it's been my experience that no one solution works reliably enough not
to antagonize a really large part of the audience. Language is such a
tricky area. Made worse by developers often using only one language (often
English, but not always).
Post by p***@public.gmane.org
Heh :) All I meant was "unquestionably wrong" is the (language-based)
adulteration of <pubDate> (or whatever it was). No disrespect intended
to the OP. He was clueful enough to try to validate his feed, and he's
probably fixed that tiny bug by now...
Yes, I didn't want the postings to the thread to look like anyone was being
overly harsh toward the original posting. I'm not one for political
correctness but I have seen where discussions about language often get
dramatically misinterpreted. I don't think either of us are missing the
point. Hopefully nobody else will either.
Post by p***@public.gmane.org
As for traditional HTTP-based language negotiation, Safari and Firefox
certainly do it in their feed readers (as you'd expect). I assume other
browser-based readers will also.
Well, when you ass-u-me.... I wouldn't. I've seen some really bad examples
of misconfigured browsers. A worst-case scenario is how to deal with data
pasted into a Korean web page served from an Arabic configured http server
using a Danish configured browser on a Russian configured OS. There are four
different places where "language" and "encoding" are being set, which one's
correct? From what I've seen various tools make their assumptions based on
different parts, sometimes it almost seems random. Thus I think it leads to
the classic problem of people developing a "fixed" opinion on what's correct
based on what are often entirely INcorrect tool behaviors. Basically, "my
favorite tool says this is OK so all the rest of you are wrong", more or
less. I don't have any miracle answers here, just warnings.
Post by p***@public.gmane.org
I'm slightly disappointed to see that
NetNewsWire doesn't. Shame really since IMHO it would be perfect for a
desktop aggregator like that.
Again, I think it's due to the usual situation of the developer being
focused on a given language and not being aware of or interested in
addressing some of the more subtle aspects of multilingual access. I'm sure
if the issue was brought to any developer's attention and explained
effectively they'd certainly be willing to consider adding features for it.
But, much like character encoding, language is a really poorly understood
facet of XML.

-Bill Kearney
Syndic8.com




p***@public.gmane.org
2005-11-27 17:52:49 UTC
Permalink
Bill Kearney <ml_yahoo-***@public.gmane.org> wrote:

[...]
Post by Bill Kearney
Post by p***@public.gmane.org
As for traditional HTTP-based language negotiation, Safari and Firefox
certainly do it in their feed readers (as you'd expect). I assume other
browser-based readers will also.
Well, when you ass-u-me.... I wouldn't.
Heh :)

But supposing I'm wrong in that IE (for example) chooses not to use HTTP
Accept-Language (or whatever it's called) when it's fetching a feed. I
don't think it should change what I'm doing. All that will happen is
that I'll send back feeds in English (my own default) unless a language
is explicitly specified in the url.

[...]
Post by Bill Kearney
I don't have any miracle answers here, just warnings.
Sensible ones!

Regards,

Peter



Bill Kearney
2005-11-27 18:03:36 UTC
Permalink
Post by p***@public.gmane.org
But supposing I'm wrong in that IE (for example) chooses not to use HTTP
Accept-Language (or whatever it's called) when it's fetching a feed. I
don't think it should change what I'm doing. All that will happen is
that I'll send back feeds in English (my own default) unless a language
is explicitly specified in the url.
Actually I'll confess to not being entirely sure just how the /current/ crop
of readers and browsers handle language headers during requests for
resources. What they "should" be doing compared to how they've done it in
the past or even as the specs currently dictate is an open question. My
commentary is based more on experience with seeing how various developers
have misinterpreted how all the pieces actually fit and how they were
'supposed to' fit.
Post by p***@public.gmane.org
Post by Bill Kearney
I don't have any miracle answers here, just warnings.
Sensible ones!
Well, thanks. I'd certainly like to see more tools really doing a better
job of language handling. But without more noise from more users and other
developers I'm afraid it'll continue to languish unappreciated and
underdeveloped. Sort of a shame really, as the specs have been remarkably
well reasoned in how things are 'supposed to be' handled. Few tools have
ever really met up with the specs.

-Bill Kearney




p***@public.gmane.org
2005-11-27 18:59:50 UTC
Permalink
I'd certainly like to see more tools really doing a better job of language
handling. But without more noise from more users and other developers I'm
afraid it'll continue to languish unappreciated and underdeveloped. Sort
of a shame really, as the specs have been remarkably well reasoned in how
things are 'supposed to be' handled. Few tools have ever really met up
with the specs.
Isn't that always the way!

Peter




Bill Kearney
2005-11-26 18:37:38 UTC
Permalink
Post by p***@public.gmane.org
Post by Bill Kearney
There's no defined mechanism for negotiating language with feeds.
That's like saying there's no defined mechanism for caching, compression
or encryption. Feeds delivered by HTTP have a mechanism available;
it's just that people like you and me have to decide whether using it
makes sense.
It's not defined in how RSS feeds are handled. What's possible with
making HTTP requests in a browser is not necessarily mimicked in RSS
readers. This is both a good and a bad thing, depending on what features
you're after.
Post by p***@public.gmane.org
Post by Bill Kearney
That and plenty of readers are multilingual.
You mean a bilingual French & German speaker might want to read one feed
in French and one in German even if both feeds are available in both
languages? Certainly you're right that the 'transparent' HTTP mechanism
doesn't cope with that at all.
And it's been my experience that no one solution works reliably enough not
to antagonize a really large part of the audience. Language is such a
tricky area. Made worse by developers often using only one language (often
English, but not always).
Post by p***@public.gmane.org
Heh :) All I meant was "unquestionably wrong" is the (language-based)
adulteration of <pubDate> (or whatever it was). No disrespect intended
to the OP. He was clueful enough to try to validate his feed, and he's
probably fixed that tiny bug by now...
Yes, I didn't want the postings to the thread to look like anyone was being
overly harsh toward the original posting. I'm not one for political
correctness but I have seen where discussions about language often get
dramatically misinterpreted. I don't think either of us are missing the
point. Hopefully nobody else will either.
Post by p***@public.gmane.org
As for traditional HTTP-based language negotiation, Safari and Firefox
certainly do it in their feed readers (as you'd expect). I assume other
browser-based readers will also.
Well, as the saying goes "when you ass-u-me...." I wouldn't. I've seen
some really bad examples of misconfigured browsers. A worst-case scenario
is how to deal with data pasted into a Korean web page served from an Arabic
configured http server using a Danish configured browser on a Russian
configured OS. There are four different places where "language" and
"encoding" are being set, which one's correct? From what I've seen various
tools make their assumptions based on different parts, sometimes it almost
seems random. Thus I think it leads to the classic problem of people
developing a "fixed" opinion on what's correct based on what are often
entirely INcorrect tool behaviors. Basically, "my favorite tool says this
is OK so all the rest of you are wrong", more or less. I don't have any
miracle answers here, just warnings.

I'm certainly in favor of feeds and tools having new features.
Post by p***@public.gmane.org
I'm slightly disappointed to see that
NetNewsWire doesn't. Shame really since IMHO it would be perfect for a
desktop aggregator like that.
Again, I think it's probably due to the usual situation of the developer
being focused on a given language and not being aware of or interested in
addressing some of the more subtle aspects of multilingual access. I'm sure
if the issue was brought to any developer's attention and explained
effectively they'd certainly be willing to consider adding features for it.
But, much like character encoding, language is a really poorly understood
facet of XML.

-Bill Kearney
Syndic8.com



