
Welsh language - ISO-8859-1 or Unicode?

Posted by Simon on June 24, 2008, 1:00 pm
Hello -

I'm working on a team that is planning to add Welsh language support to a
large existing IT system which is partially web-based and
English-language-only so far. I've heard that 2 characters in Welsh
(w-circumflex and y-circumflex) are not supported in our default ISO-8859-1
character set, so a partial move to Unicode for internal storage of text
might be required.
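
A quick way to check the claim above is to try encoding the Welsh circumflexed vowels as ISO-8859-1 and see which ones fail. This is only a minimal sketch: the helper name `unencodable` and the character list are assumptions made for illustration, not part of the system being discussed.

```python
# Welsh vowels that can carry a circumflex, lower and upper case.
WELSH_CIRCUMFLEX = "âêîôûŵŷÂÊÎÔÛŴŶ"

def unencodable(text, encoding="iso-8859-1"):
    """Return the distinct characters in text that the encoding cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing

print(unencodable(WELSH_CIRCUMFLEX))  # ['ŵ', 'ŷ', 'Ŵ', 'Ŷ']
```

Only w-circumflex and y-circumflex (in both cases) are reported; the other five circumflexed vowels already fit in ISO-8859-1.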

I haven't yet found a Welsh-language website that uses these 2 characters,
so are they actually used much in Welsh? Is not supporting them likely to
cause problems?

Thanks



Posted by Simon on June 24, 2008, 1:09 pm
> Hello -
>
> I'm working on a team that is planning to add Welsh language support to a
> large existing IT system which is partially web-based and
> English-language-only so far. I've heard that 2 characters in Welsh
> (w-circumflex and y-circumflex) are not supported in our default ISO-8859-1
> character set, so a partial move to Unicode for internal storage of text
> might be required.
>
> I haven't yet found a Welsh-language website that uses these 2 characters,
> so are they actually used much in Welsh? Is not supporting them likely to
> cause problems?
>
> Thanks
>

I've just found a webpage that uses y-circumflex at the end of the third
paragraph, so it can't be that uncommon:
http://news.bbc.co.uk/welsh/hi/newsid_7460000/newsid_7462500/7462534.stm

This webpage uses ISO-8859-1 with entities for the y-circumflex. Using
entities would be very messy in my application, so if support for these
characters is needed, I would have to go for Unicode.
I guess my question still is: would not supporting these 2 characters be
considered bad practice for a Welsh-language business application?



Posted by Geoff Berrow on June 24, 2008, 7:44 pm
Simon's message contained the following:

>
>Unfortunately (for me) that webpage uses character entities to represent the
>characters outside ISO-8859-1. This isn't really a workable approach for me,
>because the text I'm displaying will be stored and processed in various
>databases and applications (web and non-web). I will probably end up storing
>and processing the data using UCS-2 or similar and generating webpages in
>UTF-8.


Surely you can add the character entities using a script when the pages
are generated?
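
As a rough illustration of that suggestion, the text can be kept as Unicode internally and converted only at output time, so the stored data never contains entities. The sketch below assumes a Python page generator; the function name `to_latin1_html` is hypothetical.

```python
import html

def to_latin1_html(text):
    """Escape HTML markup characters, then encode as ISO-8859-1.

    Characters outside ISO-8859-1 become numeric character references.
    """
    escaped = html.escape(text, quote=False)
    return escaped.encode("iso-8859-1", errors="xmlcharrefreplace")

print(to_latin1_html("Dŵr Cymru"))  # b'D&#373;r Cymru'
print(to_latin1_html("tŷ"))         # b't&#375;'
```

The escaping has to happen before the encoding step, otherwise the ampersands in the generated references would need special handling.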
--
Geoff Berrow 0110001001101100010000000110
001101101011011001000110111101100111001011
100110001101101111001011100111010101101011

Posted by anahata on June 25, 2008, 3:40 am
On Tue, 24 Jun 2008 19:15:48 +0100, Simon wrote:

>
> I will probably
> end up storing and processing the data using UCS-2 or similar and
> generating webpages in UTF-8.

I'll add my vote for UTF-8 as the way to go if there's a choice. Either
way there'll be some problems but UTF-8 is likely to be more future-proof
in the long run.
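
For what it's worth, converting between the two forms mentioned in the quote is mechanical. A minimal sketch, assuming the stored data is little-endian 16-bit text; Python's `utf-16-le` codec is used here as a stand-in for UCS-2, which matches it for characters in the Basic Multilingual Plane. The function name `ucs2_to_utf8` is made up for this example.

```python
def ucs2_to_utf8(data):
    """Convert little-endian 16-bit Unicode bytes to UTF-8 bytes for page output."""
    return data.decode("utf-16-le").encode("utf-8")

word = "dŵr"
stored = word.encode("utf-16-le")   # how the text might sit in internal storage
page = ucs2_to_utf8(stored)         # what would be sent in a UTF-8 web page

print(len(stored), len(page))       # 6 4
assert page.decode("utf-8") == word # the round trip loses nothing
```

The generated page would also need to declare its encoding, for example with a `Content-Type: text/html; charset=utf-8` header.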

Oh, and I have encountered the need for a w with circumflex, but that was
an old song title so it might have been an archaic Welsh form.

--
Anahata
anahata@treewind.co.uk ==//== 01638 720444
http://www.treewind.co.uk ==//== http://www.myspace.com/maryanahata


Posted by Dr J R Stockton on June 25, 2008, 3:48 pm
In comp.infosystems.www.authoring.html message <kv6dnRvtYfLOa_zVnZ2dnUVZ8hednZ2d@posted.plusnet>, Wed, 25 Jun 2008 02:40:03, anahata wrote:
>
>Oh, and I have encountered the need for a w with circumflex, but that was
>an old song title so it might have been an archaic Welsh form.

A search for "Welsh Water" rapidly locates <http://www.welshwater.com/>,
in the foot of which is "Dwr Cymru Cyf 2008." Copy'n'paste into here
has not reproduced the circumflex over the w; but graphic copy'n'paste
into Paint, zoomed, reveals it well.

See also <http://www.dwrcymru.co.uk/Welsh/Contactus/index.asp>, or
<http://cy.wikipedia.org/wiki/Hafan>.

ISTM unlikely that the Welsh could manage without their word for water.

--
(c) John Stockton, nr London UK. replyYYWW merlyn demon co uk Turnpike 6.05.
Web <URL:http://www.uwasa.fi/~ts/http/tsfaq.html> -> Timo Salmi: Usenet Q&A.
Web <URL:http://www.merlyn.demon.co.uk/news-use.htm> : about usage of News.
No Encoding. Quotes precede replies. Snip well. Write clearly. Mail no News.
