I think you guys converted me...

I've been doing a little research of my own into this whole HTML vs.
XHTML business (I know, old news to all of you) and I'm finally
beginning to see what the real problem is. Naturally I understood that
serving XHTML as text/html means you aren't even taking advantage of the
benefits of XHTML, but I figured I was at least "preparing" myself for
the future.

Anyhow, after finding several sites very much against XHTML (but some in
favor), and especially after reading http://hixie.ch/advocacy/xhtml, it
makes a lot more sense to me now why XHTML is considered more or less a
useless language.

But I do have one question still: I know that the main benefit of XHTML
will eventually be that you can integrate other XML-based languages
inside the markup, but what are some simple examples of why/how you
would even do this? Is this something a normal person might do on his
site, or is this a more advanced area that is used by businesses, etc.?


Re: I think you guys converted me...

[quoted text snipped]

How: http://www.spartanicus.utvinternet.ie/mixed_namespace.xhtml
(requires an SVG-enabled browser such as Opera)

Note that you can achieve the same result with HTML by embedding content
such as SVG:  http://www.spartanicus.utvinternet.ie/single_namespace.htm
(again requires an SVG-enabled browser such as Opera)

I'm not aware of any significant advantage that the mixed namespace
method has over the embedding method.
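
To make that concrete: the mixed namespace method drops SVG markup
straight into the XHTML via its own namespace, while the embedding
method references an external SVG file. Roughly (dimensions and the
file name invented for illustration):

  <!-- mixed namespace: inline SVG, served as application/xhtml+xml -->
  <svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
    <circle cx="50" cy="50" r="40" fill="red" />
  </svg>

  <!-- embedding: works from an ordinary text/html page -->
  <object data="circle.svg" type="image/svg+xml"
    width="100" height="100"></object>

Either way the browser needs native SVG support; the two differ only
in how the SVG reaches it.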


Re: I think you guys converted me...

On Sat, 4 Feb 2006, Spartanicus wrote:

[quoted text snipped]

Just for the record, both of your examples are also rendered by
Firefox 1.5.

Re: I think you guys converted me...

[quoted text snipped]

And they're also rendered properly in Firefox's little brother, Camino.



Re: I think you guys converted me...

On Fri, 03 Feb 2006 21:31:15 -0500, John Salerno wrote:

[quoted text snipped]

XHTML is demonstrably very far from useless.

I work mainly on content-management systems, not hand-authoring directly
to the web. My page content might move through half a dozen separate
processes before finally reaching the web. Through all of these steps,
using XHTML rather than HTML is a _massive_ benefit to me. There is
just no question of this - XML processing tools are cheap, easy and
powerful, everything that SGML completely failed to deliver.

For the final output, I can transform to HTML, and in some ways this
is even easier than XHTML (it's hard to generate good Appendix C
XHTML from most XSLT tools).

Appendix C XHTML is a kludge, but it's a kludge that works on the web
and allows XHTML to be served in a way that's at no disadvantage
compared to HTML.  Now if I'm already going to be using XHTML
internally, then who benefits from pushing it into yet another format
just to serve out?  I need a good argument for going back to HTML over
Appendix C.
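
(For anyone who hasn't waded through it, Appendix C mostly amounts to
small syntactic habits that keep the XML serialisation digestible by
tag-soup HTML parsers. A rough sketch of the main ones:

  <br />                    (space before the slash in empty elements)
  <p>text</p>               (never minimise non-empty elements to <p />)
  <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
                            (specify the language both ways)

plus leaving out the XML declaration, which some older browsers choke
on.)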

Hixie's position is pure sophistry. He constructs a valid argument
against a proposition no-one is really advocating (authoring XHTML so it
can be treated as either XML or HTML simultaneously), then attacks the
real situation (serving XHTML to existing HTML browsers) on the basis of
spec subtleties that no credible browser ever supported. I'm still
waiting for a screenshot of a browser demonstrating the "infamous
<br/>" problem (a strict SGML parser treating the slash as a null end
tag, so <br/> renders as a line break followed by a stray ">").

In particular he creates two notable straw men:
 * The "xmlns" attribute is invalid HTML4.
 * The XHTML DOCTYPEs are not valid HTML4 DOCTYPEs.

Why should we care that the xmlns attribute is invalid?  Why should we
care about validation at all?  Validation is useful because it's
objective and simple (you're either valid or not), but equally there is
no benefit accruing from validation _itself_, only from achieving a
standard of "objective compatibility" that is most easily achieved by
demonstrating validity.  In the case of xmlns though, the
well-established rules on ignoring unknown attributes can handle things
perfectly adequately.
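
(Concretely, the root element of an Appendix C document looks like:

  <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">

and an HTML4 parser following the usual ignore-what-you-don't-know
convention simply treats that as <html lang="en"> - no harm done,
whatever a validator says about it.)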

As to the doctype validity issue, this is bogus. The XHTML doctype
is perfectly valid, it's merely different. It's also been well
established for some years now and is widely understood by browsers (to
the extent that browsers do anything useful with a doctype anyway).
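
(Side by side, the two Strict doctypes:

  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
      "http://www.w3.org/TR/html4/strict.dtd">

  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Browsers treat each as little more than a magic string for picking a
rendering mode.)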

If we take Hixie's own position of "ivory tower SGML purist who hasn't
even noticed the M$oft barbarians at the gate", then doctypes have
always been flexible and extensible by SGML's rules. Now in web terms
this was a bad approach and never really worked (despite the number of
late-'90s HTML editors that tried). A custom doctype gains you nothing
on the web: it's not used by an SGML parser to extend HTML, and the
very best you can hope for is that a doctype is seen as one of a
handful of magic identifier strings that the browser recognises.
Perhaps it's a sadness that the web chose never to use this feature
and go down this route (personally I don't think so, but that's for
another thread). However, I do not have to listen to arguments against
varying a doctype in a respectable manner from someone who is
simultaneously castigating me for SHORTTAG!  Take your pick - the
ivory tower standards or the real web practices - you can't pick and
choose whichever happens conveniently to support your own position.

Hixie is asking me to throw away the processing capabilities of XML in
favour of pleasing the tiny handful of SGML-anoraks who even understand
what the problem is. This is no bargain.

Meanwhile the rest of the world sees MS Office and Dreamweaver as
appropriate HTML authoring tools, despite their absolutely glaring
holes.  The enemy here is bad and bogus markup with no structure
whatsoever, not XHTML.

Re: I think you guys converted me...

[Convincing pro XHTML argument snipped]
What is wrong with Dreamweaver 8 in the context of your argument?
Pages that I write using it validate using both the built-in validator
and the one at http://validator.w3.org/.

Re: I think you guys converted me...

On Sat, 4 Feb 2006, Andy Dingley wrote:

[quoted text snipped]

Good point.

In practical terms, yes, although it can reasonably be argued that it
*relies* on at least one browser bug.

[quoted text snipped]

Didn't you just give at least one answer to that point, a moment ago?

[quoted text snipped]

Whether you agree with his supporting arguments or not, there's
certainly one point where he's got it spot-on.  Vast swathes of
so-called Appendix-C XHTML are in fact unfit to be called XHTML -
they're nothing more than XHTML-ish-flavoured tag-soup - the very
thing that XML claimed it was going to save us from.

The clue is that those who promote the use of XHTML - amongst authors
who have no idea why they are making that choice - have taken us from
a situation where there was one horrible legacy of HTML-flavoured tag
soup, to a situation where there are two horrible legacies of tag
soup, with none of the benefits that were claimed for XHTML.  Most of
that stuff is useless as real XHTML anyway - it only gets rendered
tolerably because it's being parsed as "HTML with a deliberate bug".

As you have said yourself, it's easier to emit good HTML than it is to
emit good Appendix-C-compatible XHTML/1.0, *even* when your internal
process is XML-based.  And, since the latter offers *no benefits
whatever to the existing web* as compared to the former (and even
relies on a widespread browser bug, and brings with it some quite
unnecessary additional complications), why not just keep on emitting
HTML, *until* the web is ready to deploy real XHTML with some real
additional benefits relative to either flavour of "text/html" ?

Otherwise, I'd venture a hunch that XHTML (at least most of what
currently purports to be XHTML) is due to fester in its own dreck,
alongside the festering HTML-flavoured tag soup legacy, and we'll need
some alternative clean solution (don't ask me what it might be), in
place of the one which XML claimed to offer but which seems to be
failing - except for a few commendable exceptions ("present company",
and all that).

[quoted text snipped]

A pity, then, that I didn't keep a screen shot of emacs-w3 before it
got deliberately broken to avoid the problem.  You don't have to
believe me, but it's nevertheless true.  A web search reminds me that
we were discussing it in 2001, but I'm not sure just when emacs-w3 got
nobbled in that way.  At that time, Toby Speight (for one) evidently
considered that the popular browsers were broken because of their
failure to implement this non-optional feature of SGML.

[quoted text snipped]

Then we get into *real* sophistry, for example that HTML purports to
be an application of SGML while at the same time ruling out constructs
which SGML forbids to be ruled out.  But this line of argument would
get us nowhere, if you only care about "what works in practice" never
mind the theory.

[quoted text snipped]

I don't think so.  Here's his key advice:

|| If you use XHTML, you should deliver it with the
|| application/xhtml+xml MIME type. If you do not do so, you should
|| use HTML4 instead of XHTML.

If you interpret that word "use" to refer to what you deliver to the
web, *irrespective* of your internal process, then it seems to me to
be good advice, and consistent with what you said already.

He's asking you, for the time being, to do what you already described
above - have your process emit good HTML.  You evidently don't have
any sympathy for the various pillars of the argument which he used to
support that advice, but it seems, from what you said above, that this
part of the advice is consistent with what you yourself said.

Your internal processes may be interesting to discuss, but in the
final analysis they're no concern of the web user: *their* only
justified concern is the quality of your final product as emitted from
your web server. As far as I'm concerned, you'd be welcome to code in
well-structured LaTeX, whatever, and convert that to HTML for the web
- the criterion being the quality of the final result, no matter what
your internal process.

[quoted text snipped]

Indeed.  But we now have a widespread practical demonstration (as if
it wasn't obvious that this was going to happen) that encouraging
tag-soup cooks to cook a different flavour of tag-soup goes nowhere
towards improving the quality of the web.

I'd have to blame the W3C for failing to foresee the consequences of
offering a transition path from HTML to so-called XHTML, instead of
making it plain that it was meant to be a clean break from an
unwelcome legacy.  That they went on to offer specifications for
"Transitional" and "Frameset" XHTML just made things worse.


Re: I think you guys converted me...

On Sat, 4 Feb 2006 15:45:43 +0000, "Alan J. Flavell" wrote:

[quoted text snipped]

No, I gave one answer for one possible set of circumstances (simplistic
use of XSLT).  There's more to XML than XSLT, and there are ways to
serialise XSLT's output other than the default.
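
(In XSLT that's a one-line switch on the serialiser, e.g.

  <xsl:output method="xml" />    <!-- well-formed XML/XHTML out -->
  <xsl:output method="html" />   <!-- HTML out: no trailing slashes etc. -->

so the same transform can feed either format.)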

[quoted text snipped]

This is certainly true, but is it any worse than HTML?

Is XHTML expected to be any more parseable by a non-error-correcting
XML parser than a similar situation for HTML with an SGML parser?  In
many ways XHTML _is_ better here - the well-formedness condition is
self-evident in the absence of a DTD and is easily tested by even a
crude editor.  Mangled tags are the sort of trivia that's either
perfect, or else we're allowed to be brutal in recovering from the
error.
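
(A checker needs no DTD at all to reject

  <p><b>overlapping</p></b>

and accept

  <p><b>properly nested</b></p>

- it's pure bracket matching, which is exactly why it's cheap to
test.)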

The more subtle problem, and where tag soup really arises, is with
SGML. Clever DTD-based parsing rules are all very well when they're
done properly, but how often are they?

I saw this fragment (abbreviated) lately, together with a highly
confusing validation report (maybe in this ng.) and a plaintive cry
about CSS problems.
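
It ran something like this (abbreviated, attribute values from
memory):

  <head>
  <title>Example</title>
  <basefont size="2">
  <link rel="stylesheet" type="text/css" href="style.css">
  </head>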


Now why does the validator claim so vehemently that <link> has the
problem?  Only someone who is familiar with the obscure <basefont>
_and_ with SGML parsing behaviour can understand this.

This is a problem inherent in the use of optional elements (sometimes),
and particularly in optional closing tags. In XML they're mandatory, so
that the document can be correctly parsed into its infoset, even without
knowing the DTD.

In XML, <basefont> could never implicitly close the <head>; there
would always have to be an explicit </head> and <body>. An XML parser
would thus report the error as being about <basefont> having been
placed inside the <head> (and <link> is thus correct), rather than
SGML's behaviour of seeing <basefont> as implying the automatic start
of <body> and thus (incorrectly) seeing <link> as misplaced.
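
(Written as XHTML, the same fragment makes the placement unambiguous:

  <head>
  <title>Example</title>
  <basefont size="2" />
  <link rel="stylesheet" type="text/css" href="style.css" />
  </head>

and the complaint lands on <basefont>, where the actual mistake is.)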

SGML is all very clever, but it's no bloody use!  Real people, in suits
and ties, just can't work it.

[quoted text snipped]

I don't recall XML ever claiming that. XHTML might have done, but this
is an aberration from the HTML "random hand-coding with bad editors"
camp.  XML (unlike HTML and the web generally) has usually been quite
reasonable about compliance - well-formed at least, if not actually
valid.

RSS seems to have suffered from HTML contagion by proximity and is
probably the most badly formed dialect out there.

[quoted text snipped]

_Three_ flavours of tag soup!  Let's not leave RSS out of this - as far
as character-level and syntactic encoding goes, it's by far the worst.

[quoted text snipped]

No, only in the case of trivial XSLT use.

There are many other ways I could be generating XHTML for output. The
popular PHP & template methods, and even the expensive Obtree CMS,
generate garbage with no pretence at XML well-formedness, because they
really are pure-text writeln-based output.

[quoted text snipped]

Certainly. But will the solution to this necessarily require the
protocol itself to be thrown away?

IMHO, we _will_ gradually improve the average validation quality of most
web sites. This will be driven by non-desktop devices and the resulting
quality of the auto-transcoding of content onto them. Once big operators
realise that a valid and fluid site looks good on a phone as well as in
a PowerPoint presentation, then they'll slowly start to drop the rigid
pixelated PSD designs of recent years and look towards validity too.
Geocities homepages won't even notice.

Hixie's key point seems to be that premature use of XHTML, done badly,
will be damaging to XHTML in the long run.  This is a reasonable view,
although I don't believe it myself.  I also doubt that Hixie believes it
either, given his attempts to really throw a clog into XHTML with his
HTML 5 schism.

[quoted text snipped]

Does it?  I'd always understood that it was inspired by SGML, but had
long since conceded that it wasn't strictly a valid SGML application. I
certainly don't weep for the passing of SHORTTAG, because (for whatever
reason) it clearly is no longer part of HTML.

I don't much care whether doctypes are references or identifiers either.
Identifiers are obviously less flexible, but they seem to be adequate
for the web's purposes. There's also a long and complex argument that a
flexible DTD conveys no benefit anyway, unless you also bundle some sort
of processing model along with it - <marquee> doesn't become renderable
just because you've added it to a DTD, only if you've also bound it to
some rendering behaviour.
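
(The DTD half of that is trivial - a declaration along the lines of

  <!ELEMENT MARQUEE - - (%inline;)*>
  <!ATTLIST MARQUEE behavior (scroll|slide|alternate) scroll>

- content model invented for the example - would make <marquee>
validate against a custom DTD, but no browser acquires scrolling
behaviour just from reading it, which is the whole point.)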

The XHTML doctypes though _are_ already widespread and recognised.
Hixie's position fails because they're either permitted by SGML's rules,
or they're already commonplace enough to stand as opaque identifiers.

[quoted text snipped]

Isn't that what XHTML 2.0 is about?  And that's _far_ worse!

#1A1A1A is the new black

Re: I think you guys converted me...

On Sun, 5 Feb 2006, Andy Dingley wrote:

[you made some points that I respectfully disagree with, but
there seems nothing to be gained by anyone if we get bogged down
in them, so I'll leave them be.  But a few points seem to call for
a response.]
[quoted text snipped]

It depends what criteria you take into account.  It's certainly no
better than HTML, but I'd say that in a number of respects it's worse.

Bear in mind that - in a practical sense - HTML served as text/html
has to be parsed by some kind of tag-soup slurper with masses of
error-fixup code; whereas we were told (by some, at least) that XHTML
was going to put an end to the need for all that fixup code - just a
simple parser, and predictable rendering routines.

It seems to me inevitable that when the masses do get it into their
heads to switch from text/html to application/xhtml+xml, there's going
to be massive clamouring for all these tag-soup documents to be
rendered "correctly" (in *their* sense of correctly, i.e. "looks the
same as what MSIE used to do"), just like the mess that developed with

[quoted text snipped]

If you wanted HTML without omitted tags, you could have had it with
SGML all along.  If you wanted to eliminate SHORTTAG, you could do
that in SGML too.
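
(Both knobs live in the SGML machinery. A sketch, borrowing the
syntax of the HTML 4.01 SGML declaration and DTD:

  MINIMIZE
   OMITTAG  NO    -- no omitted start or end tags anywhere --
   SHORTTAG NO    -- no NET-style short forms such as <br/ --

  <!ELEMENT P - - (%inline;)*>
        -- "- -" requires both tags; HTML 4 declares P as "- O",
           end tag omissible --

Flip those and an SGML parser demands explicit tagging just as
strictly as XML does.)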

I'm not proposing that one should start on that now; I'm just saying
that you shouldn't use problems for which SGML *does have* a solution,
as your basis for saying that SGML is unsuitable.

[quoted text snipped]

Taking out the parts for which SGML does have a solution, then, your
argument is based just on XML's concept of well-formedness.

[quoted text snipped]

There's certainly far more in SGML than HTML needs.

[quoted text snipped]

That's what these detailed arguments boil down to, indeed.

[quoted text snipped]


I haven't quite worked out how that fits into any picture yet, so I'm
reserving judgment.

[quoted text snipped]

How else would you interpret this, then?

|| An HTML document is an SGML document that meets the constraints of
|| this specification.

