
HTML Tidy vs. HTML Validator

comp.infosystems.www.authoring.html
Posted by VK on March 4, 2006, 8:02 am


Hi,

After the W3C's response to my request I'm still unclear about the
discrepancies between Tidy and the Validator. It started with the
<IFRAME> issue, but there are more as far as I know. Anyway, this very
basic HTML page:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Strict//EN"
"http://www.w3.org/TR/html401/strict.dtd">
<html>
<head>
<title>Demo</title>
<meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1">
</head>
<body>
<iframe src="http://www.w3.org"></iframe>
</body>
</html>

gives 0 errors / 0 warnings in Tidy. At the same time the W3C HTML
Validator tells me "This page is not Valid -//W3C//DTD HTML 4.01
Strict//EN!", which is totally correct as there is no IFRAME in HTML
Strict. I also understand Tidy's behavior, though, because no one needs
a validator that chokes on IFRAME - no one would use it then.

Nevertheless Tidy is linked on the w3.org front page and on
<http://www.w3.org/People/Raggett/tidy/>, which looks like a direct
endorsement to me.

In my request to W3C I asked to add IFRAME to HTML Strict, but the
response was that the HTML DTDs are frozen, so everything will stay as
it is.

My question is then: is this behavior of Tidy an illegitimate adjustment
made by its creators, or is it a "looseness" informally blessed by the
W3C? If Mr. Raggett himself could elaborate on this issue, that would be
great.


Posted by Lars Eighner on March 4, 2006, 8:17 am


In our last episode,
the lovely and talented VK
broadcast on comp.infosystems.www.authoring.html:

> Hi,

> After the W3C's response to my request I'm still unclear about the
> discrepancies between Tidy and the Validator.

Tidy is a lint and a prettyprinter. It doesn't parse.

> It started with the <IFRAME> issue, but there are more as far as I
> know. Anyway, this very basic HTML page:

><!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Strict//EN"
> "http://www.w3.org/TR/html401/strict.dtd">
><html>
><head>
><title>Demo</title>
><meta http-equiv="Content-Type"
> content="text/html; charset=iso-8859-1">
></head>
><body>
><iframe src="http://www.w3.org"></iframe>
></body>
></html>

> gives 0 errors / 0 warnings in Tidy. At the same time the W3C HTML
> Validator tells me "This page is not Valid -//W3C//DTD HTML 4.01
> Strict//EN!", which is totally correct as there is no IFRAME in HTML
> Strict. I also understand Tidy's behavior, though, because no one needs
> a validator that chokes on IFRAME - no one would use it then.

Why don't you use loose then? If you have to have IFRAME there are DTDs
that have it, and 4.01 loose is one of them.

> Nevertheless Tidy is linked on the w3.org front page and on
> <http://www.w3.org/People/Raggett/tidy/>, which looks like a direct
> endorsement to me.

It is a good lint and prettyprinter. But it doesn't parse. You can even
add tags to Tidy.

> In my request to W3C I asked to add IFRAME to HTML Strict, but the
> response was that the HTML DTDs are frozen, so everything will stay as
> it is.

The response should have been: if you add elements like IFRAME to
strict, why not call it loose? There is a DTD with IFRAME. It is called
loose.

> My question is then: is this behavior of Tidy an illegitimate
> adjustment made by its creators, or is it a "looseness" informally
> blessed by the W3C? If Mr. Raggett himself could elaborate on this
> issue, that would be great.

--
Lars Eighner usenet@larseighner.com http://www.larseighner.com/
"Fascism should more properly be called corporatism, since it is the
merger of state and corporate power."-Benito Mussolini * When you write the
check to pay your taxes, remember there are two l's in "Halliburton."

Posted by VK on March 4, 2006, 8:57 am



Lars Eighner wrote:
> Tidy is a lint and a prettyprinter. It doesn't parse.

It does: change the doctype from Strict to Frameset or Transitional and
different tags will or will not give you warnings. Also, the explanation
section in the Tidy window is called "HTML Validator". Try using, say,
the "wrap" attribute on a form textarea and you'll get an error - in the
"HTML Validator" section. Sorry, but that is much more than a
"prettyprinter" would do. It is a validator - or a program pretending to
be one.


> Why don't you use loose then? If you have to have IFRAME there are DTDs
> that have it, and 4.01 loose is one of them.

My question was not about what to use. My question was about two
different outcomes (valid / invalid) for the very same page using HTML
Tidy and HTML Validator.

If HTML Tidy is not a validator, then it should say nothing about
<textarea wrap="soft"...>, as that is none of its business.
If it is (besides anything else) a validator, then it should scream
about both wrap and iframe - in HTML Strict.

If it's "mostly validator but just a prettyprinter in some selected
cases" then that should be spelled out somewhere - with the list of
exceptions. Does that make sense?


Posted by David Dorward on March 4, 2006, 9:10 am


VK wrote:
> Lars Eighner wrote:
>> Tidy is a lint and a prettyprinter. It doesn't parse.
>
> It does: change the doctype from Strict to Frameset or Transitional and
> different tags will or will not give you warnings. Also, the
> explanation section in the Tidy window is called "HTML Validator".

That is what it says - however it doesn't compare the markup to a DTD,
everything is (as far as I know) internalised, and it makes many errors. It
cannot be trusted as a validator.

> Try using, say, the "wrap" attribute on a form textarea and you'll get
> an error - in the "HTML Validator" section. Sorry, but that is much
> more than a "prettyprinter" would do. It is a validator - or a program
> pretending to be one.

As Lars said - it is a lint.

> If HTML Tidy is not a validator, then it should say nothing about
> <textarea wrap="soft"...>, as that is none of its business.

Not being a validator doesn't prevent it from performing error checking that
a validator would also do.

> If it's "mostly validator but just a prettyprinter in some selected
> cases" then that should be spelled out somewhere - with the list of
> exceptions. Does that make sense?

The documentation for tidy is misleading. It shouldn't be marked as a
validator, but as an error checking tool (which doesn't cover everything
that a validator would cover).
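
To illustrate why the page from the first post is reported as invalid
under Strict, here is a minimal Python sketch. It is not how Tidy or the
W3C Validator work internally: a real validator reads the DTD named in
the doctype, while this toy uses a hand-written, partial set of the
elements that HTML 4.01 defines only in the Transitional DTD. The names
TRANSITIONAL_ONLY, TagCollector and strict_violations are made up for
this example.

```python
from html.parser import HTMLParser

# Assumption: a hand-copied, partial list of elements that HTML 4.01
# defines in the Transitional (loose) DTD but not in Strict.
TRANSITIONAL_ONLY = {
    "applet", "basefont", "center", "dir", "font", "iframe",
    "isindex", "menu", "noframes", "s", "strike", "u",
}


class TagCollector(HTMLParser):
    """Records the name of every start tag seen in the document."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)  # HTMLParser passes tag names in lowercase


def strict_violations(markup):
    """Return the elements used in markup that Strict does not define."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return sorted(set(collector.tags) & TRANSITIONAL_ONLY)


page = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Strict//EN"
"http://www.w3.org/TR/html401/strict.dtd">
<html><head><title>Demo</title></head>
<body><iframe src="http://www.w3.org"></iframe></body></html>"""

print(strict_violations(page))  # ['iframe']
```

A hard-coded table like this is exactly the "internalised" approach: it
only knows what its author put into it, so it can disagree with the DTD.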

--
David Dorward <http://blog.dorward.me.uk/> <http://dorward.me.uk/>
Home is where the ~/.bashrc is

Posted by Toby Inkster on March 4, 2006, 9:28 am


VK wrote:
> Lars Eighner wrote:
>
>> Tidy is a lint and a prettyprinter. It doesn't parse.
>
> Sorry, but that is much more than a "prettyprinter" would do. It is a
> validator - or a program pretending to be one.

Lars didn't say it was just a prettyprinter. He said it was a *linter* and
prettyprinter.

A linter does many of the same things that a validator does, though it
will allow some technically invalid things, which the author of the linter
decided were allowable, and may object to some technically valid things
that the author of the linter decided were objectionable.

Another difference between a linter and a validator is that the linter
will attempt to fix the errors it finds, whereas a validator will just
tell you about them.

Tidy is not a validator.
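
The distinction above can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the element set and the
policy (which elements are tolerated, discouraged or rewritten) are
hypothetical and are not Tidy's actual rules, and the function names are
made up for the example.

```python
def validate(tags, dtd_elements):
    """Validator: report every element the DTD lacks, change nothing."""
    return [f"error: <{t}> is not defined"
            for t in tags if t not in dtd_elements]


def lint(tags, dtd_elements, tolerated, discouraged, fixes):
    """Linter: apply the author's own policy and rewrite what it can."""
    messages, output = [], []
    for t in tags:
        if t in fixes:
            # The linter repairs the markup instead of only reporting it.
            messages.append(f"fixed: <{t}> replaced with <{fixes[t]}>")
            t = fixes[t]
        elif t not in dtd_elements and t not in tolerated:
            messages.append(f"warning: <{t}> is not defined")
        elif t in discouraged:
            messages.append(f"warning: <{t}> is valid but discouraged")
        output.append(t)
    return messages, output


# Hypothetical toy DTD and policy, for illustration only.
dtd = {"p", "b", "em", "div"}
tags = ["p", "iframe", "b", "center"]

print(validate(tags, dtd))
print(lint(tags, dtd, tolerated={"iframe"}, discouraged={"b"},
           fixes={"center": "div"}))
```

With this policy the validator reports both iframe and center, while the
linter lets iframe pass silently, complains about the valid b, and
rewrites center.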

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Now Playing ~ ./police/every_breath_you_take.ogg

