[newbie] Problems with character output.

Hi group.

I've installed ActivePerl on Windows XP and I'm having some problems. I've
searched the documentation at ActiveState but found nothing on this topic.

When a string contains a non-English character (for example á; that's
a-acute, if you don't see it) and the script prints that string to the
screen, I get a garbage character, as if the cmd shell didn't support
this character. But I can type it directly at a shell prompt and it
shows OK.

I've tried playing with locale (use locale;) and perl uses es_es (that's
my system's locale, aka es_ES.1252), so those characters should print
out. The Windows locale is also set to Spanish.

I tried "use utf8;", "use iso-8859-1;" and "use latin1;" to no avail.
With utf-8 I get a whole lot of warnings and instead of weird chars I
get a blank space.

I've also tried encoding my strings:

use Encode;
$u = "\xE1"; # The value of this string is an "a" acute (byte 0xE1 in latin1)
$s = decode("latin1", $u);
print $s, "\n";

gives no result. Instead of an "a tilde" I get a Greek Beta. I've also
tried "iso-8859-1" and "windows-1252", but to the same effect. I'm quite
lost. This is just a wild guess: could there be any problem with the
console itself? Could I be *so* lucky to find a bug?

I'd really appreciate any help on this.


Re: [newbie] Problems with character output.


The editor and your console use different code pages.
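
As a minimal sketch of what that mismatch implies, assuming the script
file was saved as Windows-1252 and the console reports code page 850
when you run chcp (both names are assumptions, and to_console is a
hypothetical helper), the bytes can be converted from one code page to
the other before printing:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumptions: the editor saved the file as Windows-1252 and the
# console is on code page 850. Adjust both names to match your system.
my $source_cp  = "cp1252";
my $console_cp = "cp850";

# Hypothetical helper: turn bytes as the editor wrote them into the
# bytes the console expects for the same characters.
sub to_console {
    my ($bytes) = @_;
    return encode($console_cp, decode($source_cp, $bytes));
}

my $u = "\xE1";    # a-acute as a Windows-1252 byte
print to_console($u), "\n";
```

A related option is to decode the input once and put an encoding layer
on the output handle, e.g. binmode(STDOUT, ":encoding(cp850)"), so that
every print is converted without calling a helper.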

John                               MexIT: http://johnbokma.com/mexit/
                           personal page: http://johnbokma.com/
        Experienced programmer available: http://castleamber.com/
            Happy Customers: http://castleamber.com/testimonials.html

Re: [newbie] Problems with character output.

On Sat, 9 Oct 2004, Reven wrote:

Your problem is in using a command window (which by default is
effectively providing an MS-DOS environment).

I think you mean "a-acute" (in iso-8859-1 or windows-1252 coding, that
would be 0xE1)

Here's a clue.  Visit
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/
and inspect the CP437 (USA national DOS codepage) or CP850
(multinational DOS codepage) tables, where you will find that 0xE1
represents "small letter sharp-s" (the German double-s character),
which has fooled you by looking rather like a Greek beta.
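
The table lookup can also be checked from Perl itself. This is only a
sketch using the Encode module (codepoint_in is a hypothetical helper
name): it interprets the same byte 0xE1 under each code page and
reports the Unicode code point it stands for. U+00E1 is a-acute and
U+00DF is the sharp-s.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The single byte a Latin-1 / Windows-1252 editor stores for a-acute.
my $byte = "\xE1";

# Hypothetical helper: decode the bytes under the named code page and
# format the first resulting character as a Unicode code point.
sub codepoint_in {
    my ($codepage, $bytes) = @_;
    return sprintf("U+%04X", ord(decode($codepage, $bytes)));
}

# Windows/ISO encodings first, then the two DOS code pages.
for my $cp (qw(cp1252 iso-8859-1 cp437 cp850)) {
    printf "%-10s 0xE1 -> %s\n", $cp, codepoint_in($cp, $byte);
}
```

The first two lines of output should show U+00E1 and the last two
U+00DF, which is why the console shows something that looks like a
Greek beta.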

In a sense, yes.  This is a longstanding misunderstanding, which
Microsoft have not put much effort into documenting for the end user:
right from the start of MS Windows, the DOS command window has
implemented the MS-DOS character "code pages", which pre-date current
8-bit character coding conventions (such as iso-8859-x and
windows-125y for various x and y).

At best it could be described as "documented as broken", but the
documentation is very hard to find if you don't know what you're
looking for.  I haven't studied this issue specifically in XP, but I
first met it in Win95, and again later (and somewhat differently) in

You may be able to fool it by changing your DOS window font from its
initial setting (does yours say "Raster Fonts", as my Win2000 system
is doing?) to e.g. "Lucida Console".  However, doing that globally
might have some unpleasant effects on any software which was actually
designed to run under DOS (for example, DOS box-drawing characters
will come out funky).

There may be some useful terms in this posting that you can use to
Google for other answers related to this issue.  Good luck.
