
represent any Unicode character by means of a markup string coded in us-ascii

Subject: represent any Unicode character by means of a markup string coded in us-ascii | Author: lkrubner | Date: 05-27-2005
Posted by lkrubner on May 27, 2005, 10:08 pm


>Alan J. Flavell          Oct 7 2004, 1:44 pm
>>On Thu, 7 Oct 2004, Shmuel (Seymour J.) Metz wrote:
>> >I think you mean "multiple character encoding schemes".
>> Yes, although a different character set would imply a different
>> encoding scheme.
>
>Absolutely not. That's the whole point!
>
>In (X)HTML you can (if you so choose) represent any Unicode character
>by means of a markup string coded in us-ascii, even. The use of other
>encoding schemes is merely a convenience when the desired character
>repertoire fits a particular pattern, but whichever encoding scheme
>you choose, you still - in principle - have access to any other
>Unicode character you need, by means of &-notation.

I could change any Unicode character to its html notation, if only I
had a way to find out the Unicode value of the characters in the string
I'm given. But given a random set of string inputs, possibly copied and
pasted from WordPerfect or Microsoft Word or BBEdit on a Mac, I don't
know how to find the Unicode value of those characters.



Posted by Alan J. Flavell on May 28, 2005, 2:48 pm


On Sat, 27 May 2005, lkrubner@geocities.com wrote:

> I could change any Unicode character to its html notation, if only I
> had a way to find out the Unicode value of the characters in the
> string I'm given.

What's the context here? In order to know what "characters" you have
been given, you need to know what encoding they are represented in. If
they're not an encoding of Unicode itself, then you can normally refer
to the appropriate cross-mapping table at the Unicode site to
determine the corresponding hexadecimal Unicode value. That's the
value that you'd need (converted to decimal if you so choose) in the
&#...; representation in HTML.
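
A minimal sketch of that in Python, assuming you already know which
encoding the bytes are in. The function name to_ncr and the
windows-1252 sample bytes are made up for illustration; they are not
from this thread. Python's built-in codecs stand in for the
cross-mapping tables here.

```python
def to_ncr(raw: bytes, encoding: str) -> str:
    """Decode raw bytes, then return us-ascii text in which every
    non-ASCII character is written as a hexadecimal &#x...; reference."""
    text = raw.decode(encoding)  # bytes -> Unicode, via the codec's mapping table
    out = []
    for ch in text:
        cp = ord(ch)  # the Unicode code point of this character
        if cp < 128:
            out.append(ch)  # plain us-ascii passes through unchanged
        else:
            out.append(f"&#x{cp:X};")
    return "".join(out)


# Hypothetical input: curly quotes around "cafe" with e-acute, in windows-1252.
sample = b"\x93caf\xe9\x94"
print(to_ncr(sample, "windows-1252"))  # &#x201C;caf&#xE9;&#x201D;
```

Note that this sketch does not escape markup-significant characters
such as & and <, and that decoding the same bytes with the wrong
encoding name will silently give different code points.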

> But given a random set of string inputs, possibly copied and pasted
> from WordPerfect or Microsoft Word or BBEdit on a Mac, I don't know
> how to find the Unicode value of those characters.

If you're talking about forms submission, then the usual arrangement
is that the characters are submitted using the same character encoding
as the page containing the form they're submitted from.
For working with modern browsers, I'd normally recommend that you use
utf-8 for that. (That's no good with Netscape 4.*, though.)

http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html

(But if you've been sent utf-8 and you're willing to store files in
utf-8 then you don't really *have* to use &#...; representation
anyway. It's your choice, really.)
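
The two options could be sketched like this in Python, assuming the
form data really did arrive as utf-8. The variable names and the
sample text are invented for illustration. The "xmlcharrefreplace"
error handler is part of Python's standard codec machinery and writes
decimal &#...; references.

```python
# Stand-in for the bytes a utf-8 page's form would submit (hypothetical data).
submitted = "\u03a9 and \u00e9".encode("utf-8")

text = submitted.decode("utf-8")  # back to Unicode characters

# Option 1: store as utf-8, with no &#...; notation needed.
as_utf8 = text.encode("utf-8")

# Option 2: store as us-ascii, with non-ASCII characters as decimal references.
as_ascii = text.encode("ascii", "xmlcharrefreplace")

print(as_ascii)  # b'&#937; and &#233;'
```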

You're then reliant on what the client platform actually does when
copy/pasting from another application window into the form.

That can have some unexpected glitches, since Word (especially older
versions) has a nasty habit of changing to a non-standard font, e.g.
Symbol, and inserting a Latin letter (e.g. W) to get a symbol (e.g.
Omega or Ohm sign). This doesn't really work in HTML - MS of course
will fool its users by repeating the error in MSIE, but a properly
conforming www-compatible browser will display the W that the markup
asked for - not the symbol that was intended.
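
If you do know that a particular run of text was set in the Symbol
font, one possible repair is to map each letter to the Unicode
character it was meant to show. The sketch below is only that: the
table is a tiny, partial excerpt of the Symbol layout, the function
name is invented, and a plain-text paste usually no longer tells you
which runs were in Symbol in the first place.

```python
# Partial, illustrative excerpt of the Symbol font layout:
# Latin letter typed -> Unicode character that Symbol displays for it.
SYMBOL_TO_UNICODE = {
    "W": "\u03a9",  # GREEK CAPITAL LETTER OMEGA
    "a": "\u03b1",  # GREEK SMALL LETTER ALPHA
    "b": "\u03b2",  # GREEK SMALL LETTER BETA
    "m": "\u03bc",  # GREEK SMALL LETTER MU
    "p": "\u03c0",  # GREEK SMALL LETTER PI
}


def symbol_run_to_unicode(run: str) -> str:
    """Translate a run known to be in Symbol; unmapped characters are left alone."""
    return "".join(SYMBOL_TO_UNICODE.get(ch, ch) for ch in run)


omega = symbol_run_to_unicode("W")
print(f"&#x{ord(omega):X};")  # &#x3A9;
```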

