
reading Big5 as ASCII: bad idea?

comp.infosystems.www.authoring.html
Posted by enrique on March 14, 2005, 7:39 am


Our server-side software is reading in Big5-encoded data as ASCII when
the web pages are generated. It seems to work most of the time, since
the HTML meta tag is declaring Big5 as the charset. However, every now
and then certain ASCII characters, like the quote (") for example, get
read in and create Javascript errors when the browser renders them.

I think this is a direct side effect of processing our Big5-encoded
files as ASCII. Can anyone confirm my suspicions on this?

I'm thinking perhaps the software should be reading these files as
binary, Big5-encoded data instead of ASCII, so that the Chinese content
won't be converted to HTML-reserved characters like the troublesome
quote.

Another question: I see in the character chart for Big5 that Latin
letters and characters are supported, including the quote character.
Will a Big5-encoded quote character (not the ASCII quote) cause
Javascript issues as well? I'm hoping that so long as Chinese-only
content is contained in HTML tags using the "lang" attribute set
appropriately (for Chinese), the browser won't attempt to render the
Big5-encoded quote as a Javascript string delimiter.

Thank you.

epp
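
For readers following the byte-level question, here is a rough sketch of why scanning Big5 data one byte at a time can misfire. It assumes the usual Big5 layout (a lead byte in 0x81-0xFE followed by a trail byte in 0x40-0x7E or 0xA1-0xFE); the helper names `findAsciiByte` and `findByteNaive` are made up for this illustration and are not part of the poster's software. Under that layout a trail byte can equal the ASCII backslash (0x5C) but never the ASCII double quote (0x22), since trail bytes start at 0x40.

```javascript
// Assumed Big5 layout: lead byte 0x81-0xFE, then one trail byte.
function isBig5Lead(byte) {
  return byte >= 0x81 && byte <= 0xfe;
}

// Hypothetical helper: offsets where `target` appears as a standalone
// ASCII byte, skipping the trail byte of every two-byte character.
function findAsciiByte(bytes, target) {
  const offsets = [];
  for (let i = 0; i < bytes.length; i++) {
    if (isBig5Lead(bytes[i]) && i + 1 < bytes.length) {
      i++; // the next byte is a trail byte, not ASCII
      continue;
    }
    if (bytes[i] === target) offsets.push(i);
  }
  return offsets;
}

// Hypothetical helper: the "read it as ASCII" view, where every byte
// is treated as its own character.
function findByteNaive(bytes, target) {
  const offsets = [];
  bytes.forEach((b, i) => {
    if (b === target) offsets.push(i);
  });
  return offsets;
}

// A real ASCII quote, one two-byte Big5 character whose trail byte
// happens to be 0x5C, then another real ASCII quote.
const sample = Uint8Array.from([0x22, 0xa5, 0x5c, 0x22]);

const naiveBackslashes = findByteNaive(sample, 0x5c); // sees a "backslash"
const realBackslashes = findAsciiByte(sample, 0x5c);  // sees none
const realQuotes = findAsciiByte(sample, 0x22);       // both quotes are real
```

So code that escapes or strips bytes while treating the data as ASCII can damage a two-byte character, whereas a 0x22 byte in the data is, under this layout, a genuine quote that was already in the content.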



Posted by Harlan Messinger on March 14, 2005, 12:06 pm


enrique wrote:
> Our server-side software is reading in Big5-encoded data as ASCII when
> the web pages are generated. It seems to work most of the time, since
> the HTML meta tag is declaring Big5 as the charset. However, every now
> and then certain ASCII characters, like the quote (") for example, get
> read in and create Javascript errors when the browser renders them.
>
> I think this is a direct side effect of processing our Big5-encoded
> files as ASCII. Can anyone confirm my suspicions on this?

Big 5 doesn't have anything to do with it. Even if the data was ASCII
and was being read as ASCII, if you're generating code like

        var address = "58 "Q" Street";

because a data field reads

        58 "Q" Street

then you will break the Javascript. On *any* platform this is an issue
with data that contains symbols that are also used within the code to
delimit that data. They need to be escaped. In the example above, you'd
have to have

        var address = "58 \"Q\" Street";

Switching to single quotes as the string delimiter wouldn't do any good,
because then you'd have

        var name = 'Florence O'Malley';

which you'd have to change to

        var name = 'Florence O\'Malley';
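
One way to get this escaping without hand-placing backslashes is to let a serializer build the literal. The sketch below is only an illustration, not the original poster's server code: `toJsStringLiteral`, `streetAddress`, and `customerName` are made-up names, and it assumes the page generator can run JavaScript (most server-side languages offer an equivalent JSON or string-escaping routine). It relies on `JSON.stringify`, which emits a double-quoted literal with embedded quotes and backslashes escaped, and additionally escapes `<` on the assumption that the output lands inside an inline script block.

```javascript
// Hypothetical helper: turn arbitrary text into a JavaScript string literal.
function toJsStringLiteral(value) {
  return JSON.stringify(String(value))
    // Assumption: output goes inside <script>, so keep "</script>" from
    // appearing literally in the generated source.
    .replace(/</g, "\\u003c")
    // Line separators that older JavaScript engines reject inside literals.
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

const streetAddress = '58 "Q" Street';
const customerName = "Florence O'Malley";

// Generated source lines, ready to be written into the page.
const addressLine = "var address = " + toJsStringLiteral(streetAddress) + ";";
const nameLine = "var name = " + toJsStringLiteral(customerName) + ";";

// addressLine is: var address = "58 \"Q\" Street";
// nameLine is:    var name = "Florence O'Malley";
```

Because the helper always emits double-quoted literals, the apostrophe in the second value needs no escaping, and the embedded double quotes in the first are escaped automatically.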

[follow-ups to comp.lang.javascript]

