Welsh language - ISO-8859-1 or Unicode ?

comp.infosystems.www.authoring.html
Thread started by Simon on 06-24-2008
Posted by Harlan Messinger on June 24, 2008, 2:42 pm
Simon wrote:
> Hello -
>
> I'm working on a team that is planning to add Welsh language support to a
> large existing IT system which is partially web-based and
> English-language-only so far. I've heard that 2 characters in Welsh
> (w-circumflex and y-circumflex) are not supported in our default ISO-8859-1
> character set, so a partial move to Unicode for internal storage of text
> might be required.
>
> I haven't yet found a Welsh-language website that uses these 2 characters,
> so are they actually used much in Welsh? Is not supporting them likely to
> cause problems?

It could be a support problem (though I don't know why, given the
availability of UTF-8 as well as the option of numeric character
references): see the note at the bottom of

http://www.menai.ac.uk/clicclic/

As made clear at

http://www.cs.cf.ac.uk/fun/welsh/Lesson01.html

the circumflex really is supposed to appear in these locations. (Note
that even on this page, section 1.2 explains that because of support
issues, they are using their own ugly work-around for accented
characters.) Examples are given: "ty^" = "house", along with the pair
"gw^ydd" = "goose" and "gwy^dd" = "trees", which are pronounced differently.
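
As an illustration of the two options mentioned above (UTF-8 and numeric character references), here is a minimal Python sketch. The helper names `can_encode` and `to_ncr` are made up for this example and are not part of any system discussed in the thread.

```python
# The four circumflexed Welsh letters under discussion.
WELSH_LETTERS = "\u0174\u0175\u0176\u0177"  # Ŵ ŵ Ŷ ŷ


def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def to_ncr(text):
    """Replace each non-ASCII character with a decimal numeric character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


for ch in WELSH_LETTERS:
    print(
        ch,
        f"U+{ord(ch):04X}",
        "latin-1 ok" if can_encode(ch, "iso-8859-1") else "not in latin-1",
        ch.encode("utf-8"),
        to_ncr(ch),
    )
```

Under these assumptions, a word such as "tŷ" can be kept in an ASCII-only or ISO-8859-1 page as `t&#375;`, while in UTF-8 the letter simply takes two bytes.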

Posted by Andreas Prilop on June 25, 2008, 4:33 am
Simon wrote:

> I've heard that 2 characters in Welsh (w-circumflex and y-circumflex)
> are not supported in our default ISO-8859-1 character set, so a
> partial move to Unicode for internal storage of text might be required.
>
> I haven't yet found a Welsh-language website that uses these 2 characters,
> so are they actually used much in Welsh? Is not supporting them likely to
> cause problems?

ISO-8859-1 does not even contain a euro sign (€), which seems to be
an even stronger argument to move to Unicode asap than the missing
Ŵ ŵ Ŷ ŷ for Welsh.
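
The claim about the euro sign can be checked with a short Python sketch. The function name `encoded_bytes` is hypothetical, and the list of encodings is chosen only for comparison.

```python
EURO = "\u20ac"  # EURO SIGN


def encoded_bytes(ch, encoding):
    """Return the bytes for ch in the given encoding, or None if it has no slot."""
    try:
        return ch.encode(encoding)
    except UnicodeEncodeError:
        return None


for name in ("iso-8859-1", "iso-8859-15", "cp1252", "utf-8"):
    print(name, encoded_bytes(EURO, name))
```

ISO-8859-15 and windows-1252 each give the euro sign a single byte, but neither adds the Welsh letters, so only a Unicode encoding covers both cases.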

Posted by Jukka K. Korpela on June 25, 2008, 12:35 pm
Scripsit Andreas Prilop:

> ISO-8859-1 does not even contain a euro sign (€), which seems to be
> an even stronger argument to move to Unicode asap than the missing
> Ŵ ŵ Ŷ ŷ for Welsh.

Not really, because
a) the UK does not use the euro currency
b) the euro sign can conveniently be written using the entity reference
&euro;
c) the euro sign should not be used in normal text, according to
reputable language authorities; instead, the currency name should be
written, except perhaps in tables and other contexts where saving space
is crucial.

For commercial pages oriented towards countries using the euro, the euro
sign is needed, but it’s not really comparable to the issue of letters
needed for proper writing of a language.

--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/


Posted by Andreas Prilop on June 25, 2008, 12:47 pm
On Wed, 25 Jun 2008, Jukka K. Korpela wrote:

>> ISO-8859-1 does not even contain a euro sign (€), which seems
>> to be an even stronger argument to move to Unicode asap
>
> Not really, because
> a) the UK does not use the euro currency

By that logic, they won't need a dollar sign on their keyboards.

> b) the euro sign can conveniently be written using the entity
> reference &euro;

In HTML. But IIRC, the OP wrote of some "large existing IT system"
with internal ISO-8859-1 character set. I wonder if one could
write &euro; there.
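
To make the distinction concrete, here is a small Python sketch. It assumes a hypothetical ISO-8859-1 storage layer (the real system is not described in the thread) and uses only the standard-library `html.unescape`.

```python
import html

# The entity text is plain ASCII, so an ISO-8859-1 store accepts it as-is.
markup = "Price: &euro;10, t&#375; = house"
stored = markup.encode("iso-8859-1")

# Only something that parses HTML turns the references into characters.
rendered = html.unescape(stored.decode("iso-8859-1"))
print(rendered)

# The decoded characters themselves still do not fit the internal charset.
try:
    rendered.encode("iso-8859-1")
    fits = True
except UnicodeEncodeError:
    fits = False
print("decoded text fits ISO-8859-1:", fits)
```

The stored value is six ASCII characters per euro sign, which is harmless on an HTML page but shows up literally in any non-HTML consumer such as a report, an export file, or a database sort or search.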

Posted by Blinky the Shark on June 25, 2008, 6:49 pm
Andreas Prilop wrote:

<snip>

For the record, I see no unusual font behavior in the From field from that
post.


--
Blinky
Is your ISP dropping Usenet?
Need a new feed?
http://blinkynet.net/comp/newfeed.html

