encoding of scripts



using HTML 4.01 (not xhtml), I have recently discovered that this:

<script>var x='</script>';</script>

is not valid HTML - the fact that there is an end script tag in quotes
causes the parser to stop recognising the script. initially my reaction was
that this is not a surprise because I had failed to HTML encode the script
contents, so my second attempt was this:

<script>var x='&lt;/script&gt;';</script>

however this DOES NOT WORK - the variable ends up containing the literal
text &lt;/script&gt; instead of </script>.

can someone point me at the part of the W3C specification that states how
script tags are parsed differently from other tags in HTML?

interestingly i have also discovered that this:

<script>if (3<5);</script>

IS valid html (and seems even to be valid XHTML) even though it is not valid
XML.


Re: encoding of scripts

Andy Fish schreef:

What about:

<script>var x='<\/script>';</script>
Mind the added \

Erwin Moller

Re: encoding of scripts


http://www.w3.org/TR/html4/sgml/dtd.html#Script :

<!ENTITY % Script "CDATA" -- script expression -->


<!ELEMENT SCRIPT - - %Script;          -- script statements -->


Apart from the missing required "type" attribute, yes.  The content
type of the script element in HTML4 is CDATA, which means everything
up to the first occurrence of </ is read as-is.
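To make that rule concrete, here is a minimal sketch (not a real SGML parser) of where the CDATA content of a script element would end. It assumes the refinement described in the HTML 4.01 appendix on script data, where the terminating "</" has to be followed by a letter. The helper name findScriptContentEnd is made up for this illustration.

```javascript
// Hypothetical helper: given the text that follows the <script> start tag,
// return the index where the CDATA content ends ("</" followed by a letter),
// or -1 if no such sequence exists.
function findScriptContentEnd(afterStartTag) {
  return afterStartTag.search(/<\/[a-zA-Z]/);
}

// The original example: the "</script>" inside the string ends the element.
const broken = "var x='</script>';</script>";
// The backslash variant: "<\/" is not an end-tag opener, so the content
// runs up to the real end tag.
const fixed = "var x='<\\/script>';</script>";

console.log(broken.slice(0, findScriptContentEnd(broken))); // var x='
console.log(fixed.slice(0, findScriptContentEnd(fixed)));   // var x='<\/script>';
```

The quotes play no part in the search, which is why the first example is cut off in the middle of the string literal.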


This is not possible since XHTML is XML.

The content type of the script element in XHTML1 is PCDATA, which means that
your original idea of using

var x = '&lt;foo&gt;';

means the same as

var x = '<foo>';

in a raw javascript file.  Note that this doesn't actually work "in
the wild", because most users have broken browsers (eg: IE).

The best thing to do is to never ever have anything in your script
elements and only include scripts in separate files.


Re: encoding of scripts

On Mon, 2 Jun 2008, Andy Fish wrote:


In how many newsgroups did you multipost?

Re: encoding of scripts

Scripsit Andy Fish:


The fact that there is an end tag causes that. Quotes do not matter.
They are just data characters in this context.


By HTML 4.01 rules, yes. There the content model is CDATA, which means
that entity references are not recognized, and "&" is just a data
character.


They aren't. The _content_ of the <script> _element_ is special. This
can be found in the HTML 4.01 specs simply by looking at the description
of that element; it points to the definition of script data (%Script;),
which refers to an appendix that explains ways to overcome the "</"
problem, such as prefixing "/" with "\" in JavaScript. In JavaScript,
you could also write
var x='<'+'/script>';
but that looks a bit more hackish.
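As a quick check of the two workarounds, the following sketch (variable names are arbitrary) shows that both produce the same string at run time, even though neither source text contains the "</" sequence:

```javascript
// Backslash variant: "\/" is just an escaped "/" inside a JavaScript string.
const escaped = '<\/script>';
// Concatenation variant: the "<" and the "/" live in separate literals.
const concatenated = '<' + '/script>';

console.log(escaped === concatenated); // true
console.log(escaped.length);           // 9
```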


No it isn't, but that's due to the lack of the type="..." attribute. If
you fix that, then it is valid. That's because the digit "5" isn't a
name start character.


It isn't valid in XHTML, since by XHTML rules, "<" must not appear in
any context as such except as the starting character of a tag.

In XHTML, the content model of <script> is #PCDATA, so _there_ you could
use &lt; to stand for "<". But it's not wise to use XHTML as the
delivery format of a web page, because IE does not support XHTML.


It would be impossible for a document to be non-valid XML if it is valid
XHTML. This immediately follows from the _definition_ of validity.

There is a simple way to get rid of such complexities: write your script
into an external file and refer to it via <script type="text/javascript"
src="..."></script>.

Jukka K. Korpela ("Yucca")

Re: encoding of scripts

thanks for all the replies - i understand it all now

unfortunately i can't write all my scripts in separate js files because this
is all javascript that i'm generating on the fly on the server, but i have
amended my quoting/encoding functions to detect '</' and split it into 2
concatenated strings
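For anyone who wants to do something similar, here is a rough sketch of such a quoting function. It is written in JavaScript purely for illustration (the thread does not say which language runs on the server), the name quoteForInlineScript is hypothetical, and it only deals with the "</" problem discussed in this thread, not with every character that might need care in generated script.

```javascript
// Hypothetical helper: turn arbitrary text into JavaScript source for a
// string expression that is safe to place inside an HTML 4.01 <script>.
function quoteForInlineScript(text) {
  // JSON.stringify produces a double-quoted literal with quotes,
  // backslashes and control characters already escaped.
  const literal = JSON.stringify(text);
  // Wherever "</" occurs, close the literal after "<" and start a new one
  // at "/", so the emitted source never contains the sequence "</".
  return literal.split('</').join('<"+"/');
}

const source = quoteForInlineScript("</script>");
console.log(source); // "<"+"/script>"
```

The same effect can be had by replacing "</" with "<\/" instead of splitting the literal, which keeps the output a single string.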


