
CGI.pm: encoding problems

Posted by Ben Bullock on June 9, 2006, 10:53 am


I have a problem with inputting UTF-8 text through a textarea using CGI.pm.
The problem concerns UTF-8, so apologies for posting something with Chinese
characters in it.

The following code is a minimal working example of the problem, with a lot of
extraneous material removed. It needs to be run under a web server to see
the problem. When the text is submitted using the form, the default text of
Chinese characters (the numerals from one to four) is munged into gibberish,
and the test of the input, which checks whether it consists of valid Chinese
numerals, fails:

Input text:

一二三四

Output of program:

Input 一二三四 was not a valid number

Thank you very much for any assistance, suggestions or advice about this
problem.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Begin script (to end of message)

#!/usr/bin/perl
use warnings;
use strict;
use CGI;
use utf8;
binmode (STDOUT, ":utf8");
my $query = CGI->new();
$query->charset('UTF-8');
print $query->header();
my $kanji;
if ($query->param('kanji')) {
    my $inputnumber = $query->param('kanji');
    if ($inputnumber =~ /^([一二三四五六七八九十]+)$/) {
        $kanji = $1;
    } else {
        print "<p>Input $inputnumber was not a valid number</p>";
        $kanji = "";
    }
} else {
    $kanji = "一二三四";
}
print $query->start_form(-method => 'POST', -action => $query->url());
print $query->textarea(-name => 'kanji',
                       -default => $kanji);
print $query->submit();
print $query->endform();
print "<table><tr>\n<th>Value</th><td>",
    $kanji, "</td></tr>\n", "</table>\n</form>\n<p>\n";
print $query->end_html();


Posted by Dr.Ruud on June 9, 2006, 5:16 pm


Ben Bullock wrote:

> use warnings;
> use strict;
> use CGI;
> use utf8;
> binmode (STDOUT, ":utf8");

Try to replace those 5 lines with these (reordered) 4:

use strict;
use warnings;
use encoding 'utf8' ;
use CGI;

This would also set the PerlIO layer of STDIN to ':utf8'.

See perldoc encoding.
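
For comparison, here is a minimal sketch of the explicit route, assuming the underlying issue is that the form value arrives as undecoded UTF-8 bytes while the regex (under "use utf8") is written in characters. The helper name valid_kanji_number is hypothetical, and the submitted value is simulated with Encode rather than read through CGI.pm:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                       # the regex literals below are character data
use Encode qw(decode encode);

# Hypothetical helper (not part of CGI.pm): decode a raw form value from
# UTF-8 bytes into characters, then validate it against the kanji numerals.
sub valid_kanji_number {
    my ($raw) = @_;
    my $text = decode('UTF-8', $raw);
    return $text =~ /^([一二三四五六七八九十]+)$/ ? $1 : undef;
}

# Assumption: simulate the undecoded bytes a browser would submit for the
# default text, instead of reading them from a real request.
my $raw = encode('UTF-8', '一二三四');

# The raw bytes are compared as Latin-1 code points, so the character class
# of kanji cannot match them; the decoded string matches.
print "raw bytes match:    ",
    ($raw =~ /^[一二三四五六七八九十]+$/ ? "yes" : "no"), "\n";
print "decoded text match: ",
    (defined valid_kanji_number($raw) ? "yes" : "no"), "\n";
```

In the original script the same idea would mean passing the result of $query->param('kanji') through decode('UTF-8', ...) before the regex test.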

--
Affijn, Ruud

"Gewoon is een tijger."



Posted by Mumia W. on June 9, 2006, 8:13 pm


Dr.Ruud wrote:
> Ben Bullock wrote:
>
>> use warnings;
>> use strict;
>> use CGI;
>> use utf8;
>> binmode (STDOUT, ":utf8");
>
> Try to replace those 5 lines with these (reordered) 4:
>
> use strict;
> use warnings;
> use encoding 'utf8' ;
> use CGI;
>
> This would also set the PerlIO layer of STDIN to ':utf8'.
>
> See perldoc encoding.
>

I still get the problem when running Ben's program. The problem is that
using the CGI module to initialize the textarea works the first time and
not the second; however, bypassing CGI.pm and writing the textarea
directly using print seems to work consistently.
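
One possible explanation, offered as an assumption rather than something confirmed in this thread: CGI.pm form fields are sticky, so once a 'kanji' parameter has been submitted, textarea() fills the box from the submitted value rather than from -default. If that value is still undecoded bytes, the :utf8 layer on STDOUT encodes it a second time. A minimal sketch of that double encoding, using only Encode and a simulated submission:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use Encode qw(decode encode);

# Assumption: simulate the undecoded bytes of the submitted default text.
my $submitted = encode('UTF-8', '一二三四');

# An output layer such as :utf8 sees a string of Latin-1 characters here,
# not four kanji, and encodes each one again. Calling encode() on the byte
# string has the same effect.
my $double = encode('UTF-8', $submitted);

# Decoding the input first means the output is encoded exactly once.
my $once = encode('UTF-8', decode('UTF-8', $submitted));

printf "submitted:                   %d bytes\n", length $submitted;
printf "re-encoded without decoding: %d bytes\n", length $double;
printf "decoded, then encoded:       %d bytes\n", length $once;
```

If this is what is happening, decoding the parameter before use, or passing -override => 1 to textarea() so that -default wins over the sticky value, should make the CGI.pm-generated textarea behave like the hand-printed one.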

The bug might be logic-related, but it's more likely CGI.pm-related.

There is a "hint" that the CGI.pm on my Debian Sarge system is not UTF-8
ready. This appears at the top of every page of output:
<?xml version="1.0" encoding="iso-8859-1"?>

This happens even when the HTTP header says utf8.


Posted by Ben Bullock on June 10, 2006, 1:41 am


Thanks to Dr. Ruud and Mumia W. for their replies. With Dr. Ruud's suggestion
I was able to get this working, but I also noticed a couple of interesting
phenomena while debugging this program. As Mumia W. says, the text in the box
is filled in incorrectly. Also, if I use my own "<input>" box the input is
mangled, and if I use the function-style ("straight") calls of CGI.pm rather
than the object-oriented ones, things stop working again, so it does look
rather like there is something wrong inside CGI.pm. If anyone is interested,
let me know and I'll post example code.

Thanks again.


Posted by Mumia W. on June 10, 2006, 9:20 am


Ben Bullock wrote:
> Thanks to Dr. Ruud and Mumia W. for their replies. With Dr. Ruud's
> suggestion I was able to get this working, but I also noticed a couple of
> interesting phenomena while debugging this program. As Mumia W. says, the
> text in the box is filled in incorrectly. Also, if I use my own "<input>"
> box the input is mangled, and if I use the function-style ("straight")
> calls of CGI.pm rather than the object-oriented ones, things stop working
> again, so it does look rather like there is something wrong inside CGI.pm.
> If anyone is interested, let me know and I'll post example code.
>
> Thanks again.
>

How were you able to get it working? Re-ordering the prologue and using
utf8 didn't work for me.

