Numbering the lines in HTML

99% of my HTML is generated by CGI scripts.

A common problem is finding the HTML that you want to change (in your
browser; View Source/Firebug/etc) then not knowing where it came from
in the CGI Script.

It would be easy to modify the CGI script such that any line which
starts with an HTML tag gets a prefix such as "<!-- 65 -->" indicating
that this particular line of HTML was generated in line #65 of my
CGI script.

Is there any conceivable way that inserting a comment such as this,
before what is *guaranteed* to be another HTML tag, could cause a
problem for the browser?

Obviously, anything which interrogates the actual HTML could fall foul
of such a change, but that's a manageable problem.
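
A minimal sketch of the prefixing idea, assuming the CGI script is written in Python (the thread does not say which language is used) and that all output goes through one helper. The name emit is hypothetical. It looks up the line number of its caller and writes the comment in front of any line that starts with a tag:

```python
import inspect
import re
import sys

# A line "starts with an HTML tag" if its first non-blank characters open a
# start or end tag. This deliberately skips "<!--" and "<!DOCTYPE".
_STARTS_WITH_TAG = re.compile(r"\s*</?[A-Za-z]")


def emit(line, out=sys.stdout):
    """Write one line of HTML, prefixed with the caller's source line number."""
    if _STARTS_WITH_TAG.match(line):
        # f_back is the frame that called emit(); f_lineno is the line of that call.
        caller_line = inspect.currentframe().f_back.f_lineno
        out.write(f"<!-- {caller_line} -->")
    out.write(line + "\n")


emit("<table>")              # written with a "<!-- N -->" prefix
emit("some text, no tag")    # written unchanged
```

The number in the comment is the line of the emit() call itself, so it only helps if each call emits a small piece of markup.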

Steve Swift

Re: Numbering the lines in HTML

2012-03-01 13:10, Swifty wrote:

Browsers can have very strange bugs, but it would be extremely odd for
one to choke on a normal simple comment that contains just digits (no
"--", no ">", nothing that could conceivably mess up a silly wowser).

Anything that processes HTML source, such as a search engine or a
browser add-on, can be expected to deal with normal simple comments. Any
software that fails to do that will fail miserably with a huge number of
HTML documents.

On the other hand, comments aren't particularly structured, and it is
difficult to use information in them. Comments can be retrieved from the
DOM tree using scripting, but they cannot be styled, for example.
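
As an offline illustration of how much work it takes to use a comment, here is a rough sketch in Python (not browser scripting) that uses the standard html.parser module to pair each numeric comment such as "<!-- 65 -->" with the start tag that follows it. The class name LineCommentFinder and the sample markup are made up for the example:

```python
from html.parser import HTMLParser


class LineCommentFinder(HTMLParser):
    """Pair each all-digit comment with the next start tag in the source."""

    def __init__(self):
        super().__init__()
        self.pending = None  # line number taken from the most recent comment
        self.found = []      # (tag name, generating line number) pairs

    def handle_comment(self, data):
        text = data.strip()
        # Only comments that consist of digits count as line markers.
        self.pending = int(text) if text.isdecimal() else None

    def handle_starttag(self, tag, attrs):
        if self.pending is not None:
            self.found.append((tag, self.pending))
            self.pending = None


finder = LineCommentFinder()
finder.feed("<!-- 65 --><table><tr><td>x</td></tr></table>\n<!-- 70 --><p>bla bla")
finder.close()
print(finder.found)
```

The parser has to remember state between the comment and the tag, which is exactly the bookkeeping an attribute on the tag itself would avoid.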

If you use attributes instead, the information in them can be addressed
in a simple manner in scripting and even used in styling. If you can
arrange, for example, to add an attribute to a start tag instead of
emitting a comment right before it, you could produce markup like

<p data-l=65>bla bla

Attributes starting with data- are suited for such use, since according
to the HTML5 drafts they are effectively reserved for "private use" and
will never get any defined semantics in HTML or default processing in
browsers (my wording; but this seems to be the idea).
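
One possible way to generate such markup, sketched in Python on the assumption that the generating script can pass each fragment through a helper before printing it. The helper name with_line_attr is hypothetical, and the attribute name data-l is taken from the example above:

```python
import inspect
import re

# The first start tag in a fragment: "<" plus the whole tag name.
_FIRST_START_TAG = re.compile(r"<[A-Za-z][^\s/>]*")


def with_line_attr(markup):
    """Return markup with data-l="N" added to its first start tag.

    N is the line number of the call to with_line_attr(). Fragments that
    contain no start tag are returned unchanged.
    """
    caller_line = inspect.currentframe().f_back.f_lineno
    return _FIRST_START_TAG.sub(
        lambda m: f'{m.group(0)} data-l="{caller_line}"', markup, count=1
    )


print(with_line_attr("<p>bla bla"))
print(with_line_attr("<table border=1>"))
```

The attribute value is quoted here, which is always allowed; the unquoted form shown above is also valid for plain digits.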

In addition to just finding the attributes in HTML source, you could
e.g. turn them to visible information using CSS like

[data-l] { position: relative; }
[data-l]:before { position: absolute; top: 0; left: 0;
   border: solid #333 1px; font-weight: bold; padding: 0 0.2em;
   content: attr(data-l); color: #040; background: #ffd; opacity: 0.7; }
[data-l]:hover:before  { opacity: 1; }

It looks a bit messy, but the idea is simple: using CSS, you can, with
the usual CSS caveats, make normally invisible information in the markup
visible. And if you use this just for debugging, it is sufficient that
the styling works in _your_ browser.

If the generation of HTML markup is unbuffered so that you need to emit
line number information before knowing which tag will be generated, you
could put the attribute in an extra element with empty content,
<a data-l=65></a>
But this could be riskier, since elements with empty content might
disturb rendering (and might be disallowed in some contexts, e.g. you
must not put <a> in the <head> part or between <option> elements).

Yucca, /

Re: Numbering the lines in HTML

On Thu, 01 Mar 2012 14:26:57 +0200, "Jukka K. Korpela" wrote:

Thanks for that suggestion (and the comforting words about my proposed
change).

In some of my pages, I've inserted lines in the format:

... such that a simple program can find all the published data in the
page by scanning for lines which start: "<!-- DATA "

This was a courtesy to other members in my team, who wanted to extract
data from my webpage, but we were all concerned about the consequences
if I changed the active HTML in the page.
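
A "simple program" of that kind might look roughly like the Python sketch below. Only the "<!-- DATA " prefix comes from the description above; the function name data_lines, the name=value payloads in the sample page, and the assumption that each comment closes on the same line are all invented for illustration:

```python
DATA_PREFIX = "<!-- DATA "


def data_lines(html_text):
    """Return the payload of every line that starts with '<!-- DATA '."""
    payloads = []
    for line in html_text.splitlines():
        if line.startswith(DATA_PREFIX):
            payload = line[len(DATA_PREFIX):].rstrip()
            # Assumption: the comment is closed with "-->" on the same line.
            if payload.endswith("-->"):
                payload = payload[:-3].rstrip()
            payloads.append(payload)
    return payloads


# Hypothetical page; the real payload format is not shown in the thread.
page = """<html>
<!-- DATA temperature=21 -->
<p>Visible content that may change freely</p>
<!-- DATA name=foo -->
</html>"""
print(data_lines(page))
```

Because the scan ignores everything except the marked lines, the visible HTML can be changed without breaking the extraction.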

So, I could now adopt the data- attribute, albeit at some slight extra
cost to my colleagues.

Steve Swift

Re: Numbering the lines in HTML

2012-03-01 15:25, Swifty wrote:

That's possible, and should pose no functional problems (though some
email address harvesters _could_ scan comments, too). A more structured
approach would be to present the data in <meta> tags. However, there is
no working standard for them. And on the other hand, some <meta> tags
_are_ recognized by some robots and other programs; this involves both
risks and possibilities, of course.

As a curiosity, Lynx still seems to recognize tags like
so that the user can just hit the "C" key to launch an e-mail writing
interface, with the destination address automatically picked up from the
tag. But other browsers probably all ignore it, and the rev attribute is
being phased out (in HTML5).

In HTML5, metadata is being wikidized*):
But I don't see anything suitable for expressing the author's email
address (<meta name=author ...> is specifically for giving the author's
name). I think this reflects the unrealistic and uncontrolled nature of
the approach.

But you could use e.g.

According to HTML5, you can even scatter <meta> tags around the body,
and browsers allow that, too; they just don't care in the least about
most <meta> tags. This way, you could indicate the authorship of various
parts of an HTML document. It's not superior to using comments, just a
little more structured, and the information could be made visible, when
desired, relatively easily, on some browsers. (You would probably need
meta { display: inline; } since meta tags may have display: none as
the default.)

*) to wikidize: to use a wiki system to do something purported to look
like standardization.

Yucca, /

Re: Numbering the lines in HTML

On 3/1/2012 4:10 AM, Swifty wrote:

I don't know the nature of the work you do, but my first thought on
reading this was, "doesn't he use a templating system?"

Most of the HTML I generate via CGI scripts sits in templates that are
external to the script.

I would think that copious comments in your CGI script would better
serve the purpose.

Re: Numbering the lines in HTML

On Thu, 01 Mar 2012 08:18:46 -0700, Scott Bryce wrote:

Ah, but that doesn't make it easier to find the code in the CGI that
generated that <TABLE> tag. Some of my scripts generate HTML with
hundreds of <TABLE> tags. Putting the comment just before the <TABLE>
tags tells me exactly where to go in the CGI script.

And anyway, my scripts already have copious comments in them. They are
there so that I will have a chance of understanding how they work in
six months' time. At the rate I'm going, I may be lucky to understand
the use of a pencil in six months' time... :-(

Steve Swift
