
When plain text page is treated as HTML


comp.infosystems.www.authoring.html - discuss HTML authoring here 

Posted by Eric Lindsay on November 24, 2005, 8:16 am


This may be too far off topic, but I was looking at this page
http://www.hixie.ch/advocacy/xhtml about XHTML problems by Ian Hickson.

It is served as text/plain, according to Firefox:
Response Headers - http://www.hixie.ch/advocacy/xhtml

Date: Wed, 23 Nov 2005 21:36:06 GMT
Server: Apache/1.3.33 (Unix) DAV/1.0.3 mod_fastcgi/2.4.2
mod_gzip/1.3.26.1a PHP/4.3.10 mod_ssl/2.8.22 OpenSSL/0.9.7e
Vary: Accept-Encoding,User-agent
X-Pingback: http://tracking.damowmow.com/
Content-Language: en-GB-Hixie
Last-Modified: Sat, 17 Sep 2005 12:16:19 GMT
Etag: "17063c7-4a12-432c0913"
Accept-Ranges: bytes
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/plain; charset=utf-8
Content-Encoding: gzip
Content-Length: 7452

200 OK

The page displays in Firefox and in Opera as if the text were
surrounded by pre tags. In Safari 2, the page displays as a single long
(but word-wrapped) string, as if Safari were treating it as HTML markup.

The interesting point to me is that the displayed contents are
incomplete. Safari has the contents, as looking at source confirms.
The places where the contents are not displayed are

...

which is replaced by *

...

which is not replaced by anything.

The document as displayed truncates on the next paragraph, when it
encounters


Given that the script element never closes, it seems reasonable to hide
the contents.

So my question is, should a browser display a file served as text/plain
the way Firefox and Opera do, or should a browser look deep inside the
file for HTML (or other tags) the way Safari does?

Or should it use some heuristic to second guess the server, given the
number of servers that do not correctly identify content-type?

If a browser pays attention only to the content-type as provided by the
server, what should it do about a file.css served as text/html instead
of text/css? Or isn't that a problem when the css file could be
considered to be included in the html file that calls it?

--
http://www.ericlindsay.com

Posted by Sherm Pendley on November 23, 2005, 8:42 pm




sherm--

--
Cocoa programming in Perl: http://camelbones.sourceforge.net
Hire me! My resume: http://www.dot-app.org

Posted by Eric Lindsay on November 25, 2005, 6:51 am




Thanks Sherm. I'm not a developer, so I can only use the bug reporting
menu item in Safari.

--
http://www.ericlindsay.com

Posted by Alan J. Flavell on November 23, 2005, 11:04 pm


On Thu, 24 Nov 2005, Eric Lindsay wrote:


I've often seen plain-text documents from Hixie, but I must admit
I hadn't looked at their headers.

[...]

Which is at least *suggestive* that there might be other variants
available, although we don't know what they are...

But a visit to http://www.hixie.ch/advocacy/ shows a conventional
directory listing. If there's any alternative version served out to
other browsers or in other character encodings, it would have to be
done by some kind of server conversion...? *Do* note that
accept-language is *not* one of the negotiation dimensions according
to that Vary header, even though there appears to be a French
translation available in the directory listing.
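
To see which negotiation dimensions a Vary header names, one can simply split it on commas. A minimal Python sketch follows; the function name is made up for illustration, and the header value is the one quoted in the original post.

```python
def vary_dimensions(vary_header: str) -> set:
    """Return the request header names listed in a Vary header, lowercased."""
    return {name.strip().lower() for name in vary_header.split(",") if name.strip()}


# Value taken from the response headers quoted earlier in the thread.
dims = vary_dimensions("Accept-Encoding,User-agent")

# The server says the response may vary on these two request headers only.
print(sorted(dims))
print("accept-language" in dims)
```

Header names are case-insensitive, which is why the sketch lowercases them before comparing.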


Well no, it displays "as plain text". There are big differences
between the two assertions, when the material contains markup and
&-notations - which this does.
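
To make the difference concrete, here is a small Python sketch. The sample string and function names are invented for illustration and are not taken from the page under discussion. As text/plain, every character is shown as written; dropped between pre tags, the same characters would still be parsed as HTML, so tags are interpreted and &-notations are decoded unless the text is escaped first.

```python
import html
from html.parser import HTMLParser

# Hypothetical sample containing markup and an &-notation.
sample = "Use <em>this</em> &amp; that"


class _TextOnly(HTMLParser):
    """Collects only the character data an HTML parser would hand to a renderer."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_as_plain_text(text: str) -> str:
    # text/plain: nothing is interpreted.
    return text


def visible_inside_pre(text: str) -> str:
    # Inside pre, the content is still HTML: tags vanish, entities are decoded.
    parser = _TextOnly()
    parser.feed(text)
    parser.close()
    return "".join(parser.chunks)


def escape_for_pre(text: str) -> str:
    # What would have to be written inside pre to look the same as text/plain.
    return html.escape(text, quote=False)


print(visible_as_plain_text(sample))
print(visible_inside_pre(sample))
print(visible_inside_pre(escape_for_pre(sample)))
```

Only the last call reproduces the plain-text appearance, because the text was escaped before being treated as HTML.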


Booooooh!


This is fun stuff, but you really mustn't let yourself be so grossly
diverted from making real web pages, or you'll risk ending up like me
- posting too much about pedantic detail, and never getting around to
updating my sadly obsolescent web pages. Not good.


Of course.


Sigh. I've been battering on about the mandate of RFC2616, but
somehow it doesn't seem to have sunk home. See the notes below the
table at
http://ppewww.ph.gla.ac.uk/~flavell/www/content-type.html#browconf ,
which now take you directly to the relevant section of (the W3C's
HTML-ised copy of) RFC2616 -
http://www.w3.org/Protocols/rfc2616/rfc2616-sec7.html#sec7.2.1


Absolutely and utterly not. RFC2616 forbids it.


It would still be permissible for a browser to say to its user "excuse
me, this content seems to be the wrong type. At some risk to your
security, I could try to guess this, are you prepared to take that
chance?". What RFC2616 is ruling out is that a client agent should
take it upon itself to unilaterally second-guess, without informed
consent from its user. That's my best interpretation, anyway.
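
That reading can be sketched in a few lines of Python. Everything here is hypothetical: the function names, the crude HTML check, and the `ask_user` callback are illustrations, not any browser's actual logic. The declared Content-Type wins; guessing happens only when no Content-Type was sent at all (which RFC2616 section 7.2.1 allows), or when the user has explicitly agreed to it.

```python
from email.message import Message
from typing import Callable, Optional


def declared_type(content_type_header: str) -> str:
    """Extract the bare media type, e.g. 'text/plain', from a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


def looks_like_html(body: bytes) -> bool:
    """Deliberately crude heuristic, for illustration only."""
    head = body[:512].lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))


def choose_type(
    content_type_header: Optional[str],
    body: bytes,
    ask_user: Callable[[], bool] = lambda: False,
) -> str:
    if content_type_header is None:
        # No declared type: guessing from the content is permitted.
        return "text/html" if looks_like_html(body) else "application/octet-stream"
    declared = declared_type(content_type_header)
    # Declared type present: override it only with the user's informed consent.
    if declared != "text/html" and looks_like_html(body) and ask_user():
        return "text/html"
    return declared


body = b"<html><body>hello</body></html>"
print(choose_type("text/plain; charset=utf-8", body))
print(choose_type("text/plain; charset=utf-8", body, ask_user=lambda: True))
print(choose_type(None, body))
```

The default `ask_user` always declines, so without an explicit opt-in the server's declaration is never overridden.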


Per RFC2616, it's mandated to ignore it, i.e. to render the HTML
without it, and Mozilla does so[1]: that's correct behaviour.
Unfortunately, some other browsers are not so cautious. The web would
be a better place if they were.

[1] at least in its Standards mode.
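
A sketch of that rule in Python, with made-up names: a stylesheet is applied only when its declared type is text/css. The `strict` flag is an assumption standing in for the Standards-mode caveat in the footnote; it does not describe any browser's actual code.

```python
from email.message import Message


def declared_type(content_type_header: str) -> str:
    """Extract the bare media type from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    return msg.get_content_type()


def should_apply_stylesheet(content_type_header: str, strict: bool = True) -> bool:
    if declared_type(content_type_header) == "text/css":
        return True
    # Wrong declared type: ignored when strict; a lenient mode (assumed here)
    # would apply the stylesheet anyway.
    return not strict


print(should_apply_stylesheet("text/css"))
print(should_apply_stylesheet("text/html"))
print(should_apply_stylesheet("text/html", strict=False))
```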

Posted by Alan J. Flavell on November 23, 2005, 11:09 pm


On Wed, 23 Nov 2005, Alan J. Flavell wrote:


Sorry, I shot my mouth off too quickly on that point. It wasn't
"accept-charset" in that header, it was "accept-encoding". That's why
his server has sent gzip-ed content, because the browser said it was
willing to accept that encoding. Nothing to do with
character-encoding ("charset"). Sorry for that - spotted my mistake
just too late!
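
For anyone else inclined to mix the two up, here is a minimal Python sketch of the two separate steps. The function name is invented, and the fallback to iso-8859-1 when no charset is given is an assumption based on the HTTP/1.1 default for text types. Content-Encoding is undone first to recover the bytes of the document; only then does the charset from Content-Type say how to turn those bytes into characters.

```python
import gzip
from email.message import Message
from typing import Mapping


def decode_body(headers: Mapping[str, str], raw: bytes) -> str:
    # Step 1: Content-Encoding describes compression of the body.
    if headers.get("Content-Encoding", "").strip().lower() == "gzip":
        raw = gzip.decompress(raw)

    # Step 2: the charset parameter of Content-Type describes the characters.
    msg = Message()
    msg["Content-Type"] = headers.get("Content-Type", "text/plain")
    charset = msg.get_content_charset() or "iso-8859-1"  # assumed default
    return raw.decode(charset)


# Header values as quoted in the original post; the body is a made-up example.
headers = {
    "Content-Type": "text/plain; charset=utf-8",
    "Content-Encoding": "gzip",
}
wire_bytes = gzip.compress("plain text, na\u00efvely served".encode("utf-8"))
print(decode_body(headers, wire_bytes))
```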

--
Post in haste, repent at leisure...

