uc() and utf8

I have some data in a file encoded in windows-1250 (cp1250). My script is
written in UTF-8 and runs on Perl 5.8.x. I need to convert some strings to
uppercase, but uc() fails. What is the right way?

This example illustrates the problem.

use strict;
use utf8;
use Unicode::Lite;

print "Content-Type: text/html\n\n<html><head>\n",
    "<META HTTP-EQUIV=\"Content-Type\" CONTENT=\"text/html; charset=utf-8\">",
    "</head><body>\n";

# string in cp1250 codepage
my $wintxt="\xec\x9a\xe8\xf8\x9e\xfd\xe1\xed\xe9";

# convert to utf8
my $utftxt=convert('CP1250','UTF8',$wintxt);

print "<br>Win: $wintxt, ",uc($wintxt),"<br>utf8:$utftxt, ",uc($utftxt),"\n";
print "</body></html>\n";

Petr Vileta, Czech Republic
(My server rejects all messages from Yahoo and Hotmail. Send me your mail from
another non-spammer site please.)

Please reply to <petr AT practisoft DOT cz>

Re: uc() and utf8

Petr Vileta wrote:
I suspect that convert() does not turn on the UTF-8 flag of its return
value, so uc() treats the result as a string of bytes rather than
characters. I suggest you check out perldoc Encode. The functionality you
want is probably this:
my $utftxt = Encode::decode( 'CP1250', $wintxt );

I suggest you also look into, and play around with, the Encode functions
is_utf8(), _utf8_on(), _utf8_off() and from_to(). They will give you a
good overall picture of how Perl handles UTF-8, but you probably won't need
them here.
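
To make the suggestion above concrete, here is a minimal sketch of the whole round trip, assuming the Encode module that ships with Perl 5.8 and the same cp1250 bytes as in the question. The variable names are only illustrative, and the final encode() step assumes the page is sent as UTF-8 bytes.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The same cp1250 bytes as in the original example.
my $wintxt = "\xec\x9a\xe8\xf8\x9e\xfd\xe1\xed\xe9";

# decode() turns the bytes into a Perl character string.
my $chars = decode('cp1250', $wintxt);

# uc() now works on characters and follows Unicode case rules.
my $upper = uc $chars;

# Encode back to UTF-8 bytes before printing to the browser.
my $utf8_bytes = encode('UTF-8', $upper);
print $utf8_bytes, "\n";
```

The point of the sketch is the order of operations: decode on input, work on characters, encode on output.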

Re: uc() and utf8

Petr Vileta wrote:
See "perldoc perllocale".
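
As a rough sketch of the locale route that perllocale describes: with "use locale" in effect, uc() on a byte string follows the current LC_CTYPE locale. The locale name below is an assumption; names differ between systems and the locale may not be installed at all, so check what your server provides (for example with "locale -a" on Unix).

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use locale;    # uc() on byte strings now follows LC_CTYPE

# Hypothetical locale name; adjust to whatever your system offers.
my $wanted = 'cs_CZ.CP1250';

# setlocale() returns undef if the locale could not be set.
my $got = setlocale(LC_CTYPE, $wanted);
warn "Locale $wanted not available, uc() may not change these bytes\n"
    unless defined $got;

# cp1250 bytes, kept as bytes (no decoding on this route).
my $wintxt = "\xec\x9a\xe8\xf8\x9e\xfd\xe1\xed\xe9";
my $upper  = uc $wintxt;

# Still cp1250 bytes; convert to UTF-8 separately before sending
# them in a page declared as charset=utf-8.
print $upper, "\n";
```

This keeps the data in cp1250 throughout, so it only helps if a matching locale exists on the server.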

Gunnar Hjalmarsson