Click here to get back home

DBD::mysql and UTF-8

 HomeNewsGroups | Search | About
 comp.lang.perl.modules    Post an article   get this group's latest topics as an RSS feed add this group's latest topics to your My MSN content add this group's latest topics to your My Yahoo content
Subject Author Date
DBD::mysql and UTF-8 yorhel 08-13-2005
Posted by YorHel on August 14, 2005, 2:26 am
Please log in for more thread options



YorHel wrote:
> xhoster@gmail.com wrote:
> > > xhoster@gmail.com wrote:
> >
> > > well... I don't get the exact same data when putting something in de
> > > DB, as getting something from the DB, after a lot of testing, this is
> > > the only way I found.
> >
> > I think you maybe you are testing the wrong thing. Since you want Perl to
> > play nice with others, you should do the experimentation on that focus,
> > i.e. getting Perl to read what others have written, rather than getting
> > Perl to read what Perl has written.
> >
> > >
> > > >
> > > > > That works like a charm within perl, the problem is though, that the
> > > > > data isn't really stored as I want it.
> > > >
> > > > How do you know?
> > > >
> > >
> > > Sorry, forgot to mention that the command-line 'mysql' gives me the
> > > same output as phpMyAdmin, and I have added
> > > 'default-character-set=utf8' at /etc/my.cnf, so I can assume that that
> > > client gives me the "real" output.
> >
> > Is this in the [client] section or one of the server sections? (My client
> > can't even read the /etc/my.cnf file, so of course doesn't care what it
> > says.)
>
> $ mysql --help | grep default-character-set
> default-character-set utf8
>
> seems ok...
>
> >
> > If you insert the utf-8 data using phpMyAdmin and then select it with
> > command line mysql, does it show you what you expected?
> >
>
> Yes, phpMyAdmin and cli mysql give me the same results.
>
> > Anyway, if I knew exactly what Perl was inserting[1], what it was getting
> > back[2], what mysql thought it had[3], and what php thought it had, I would
> > be able to offer more help. Or, if you accept php as being the gold
> > standard, insert something with php and then retrieve it with Perl and use
> > syntax [2] on it.
> >
> > > >
> > > > You just reported that your method works like a charm with mysql and
> > > > Perl.
> > > >
> > >
> > > But I also like to use other tools on the same database, so the IS a
> > > problem :)
> >
> > Yes, but I'm not sure who the problem is with :(.
> >
> > [1] print join ",", map ord, split //, $somevar;
> > [2] print join ",", map ord, split //, $result; # do both before and after
> > # the decode call.
> > [3] SELECT ascii(mid(text,1,1)) FROM test WHERE name = 'test'
> > SELECT ascii(mid(text,2,1)) FROM test WHERE name = 'test'
> > SELECT ascii(mid(text,3,1)) FROM test WHERE name = 'test'
> > etc.
> >
>
> $rows = $dbh->do("UPDATE test SET text = '$somevar' WHERE name =
> 'test'");
> print "\n[1] " . join ",", map ord, split //, $somevar;
>
> $obj = $dbh->prepare("SELECT text FROM test WHERE name = 'test'");
> $obj->execute();
> my $result = ($obj->fetchrow_array())[0];
> print "\n[2] be4 " . join ",", map ord, split //, $result;
> $result = decode('UTF-8', $result);
> print "\n[2] aft " . join ",", map ord, split //, $result;
>
> ## Same data, but inserted with phpMyAdmin
> my $obj = $dbh->prepare("SELECT text FROM test WHERE name =
> 'fromphp'");
> $obj->execute();
> my $fromphp = ($obj->fetchrow_array())[0];
> print "\n[2] php " . join ",", map ord, split //, $fromphp;
>
> my $count = 1; my @res;
> while(1) {
> $obj = $dbh->prepare("SELECT ascii(mid(text, $count, 1))" .
> "FROM `test` WHERE name = 'test'");
> $obj->execute();
> my $res = ($obj->fetchrow_array())[0];
> last if !$res;
> push(@res, $res);
> $count++;
> }
> print "\n[3] " . join ",", @res;
>
>
> gives me
>
> [1] 235,228,227,232,223,231,236,297,237,238
> [2] be4
> 195,171,195,164,195,163,195,168,195,159,195,167,195,172,196,169,195,173,195,174
> [2] aft 235,228,227,232,223,231,236,297,237,238
> [2] php 235,228,227,232,223,231,236,63,237,238
> [3]
> 195,194,195,194,195,194,195,194,195,197,195,194,195,194,195,194,195,194,195,194
>
> It seems to me that DBD::mysql just won't send/receive UTF-8, what I
> get from the php-inserted row is almost the same as $somevar, except
> that it doesn't give me all characters (and therefore isn't real UTF-8)
>

Hate to reply to myself, but after some more googling, I found that
when I cal a
$dbh->do("SET NAMES 'utf8'");
right after the DBH->connect(), I can put the real UTF-8-data in the
DB, so phpMyAdmin and cli mysql both give the real UTF-8 output,
instead of those weird characters. With this, it doesn't matter whether
I call the decode() on $somevar before sending the UPDATE query, the
data will still be inserted as UTF-8.
But then again, I need to set the utf8-flag on $result with decode(),
to get the well-formatted data, which sound like a hack to me, am I not
supposed to get the UTF-8-ed data when I call $obj->fetchrow_array()?
And the use of the "SET NAMES 'utf8'"-call also seems like an hack to
me, why do I need to use that when the database I am using is already
defined as "UTF-8".

Ah well, I am glad I have a "solution" to the problem now :)


> Thnx for your help so far :)
>
> > Xho
> >
> > --
> > -------------------- http://NewsReader.Com/ --------------------
> > Usenet Newsgroup Service $9.95/Month 30GB



Posted by Knut Haugen on August 14, 2005, 7:57 pm
Please log in for more thread options


[ YorHel ]

[ on UTF-8 problems with DBD::Mysql ]

> Hate to reply to myself, but after some more googling, I found that
> when I cal a
> $dbh->do("SET NAMES 'utf8'");
> right after the DBH->connect(), I can put the real UTF-8-data in the
> DB, so phpMyAdmin and cli mysql both give the real UTF-8 output,
> instead of those weird characters. With this, it doesn't matter whether
> I call the decode() on $somevar before sending the UPDATE query, the
> data will still be inserted as UTF-8.
> But then again, I need to set the utf8-flag on $result with decode(),
> to get the well-formatted data, which sound like a hack to me, am I not
> supposed to get the UTF-8-ed data when I call $obj->fetchrow_array()?
> And the use of the "SET NAMES 'utf8'"-call also seems like an hack to
> me, why do I need to use that when the database I am using is already
> defined as "UTF-8".

I have experienced the very same problem and the reason, as far as I
could determine, is that Mysql has something called 'client character
set' and 'connection character set' which get set on a client
connection. Since DBD::Mysql doesn't read the .my.cnf file the
defaults are used, which are swedish latin 1. One would think (and
hope) that mysql would use database charset or table charset or even
default charset, but it doesn't seem to do so (I ran 4.1.12 when
doing this).

See <URL: http://dev.mysql.com/doc/mysql/en/charset-connection.html>
for further details (it doesn't cover perl specifics though).

I also resorted to "SET NAMES 'utf8'" which solved the issue. I,
luckily, didn't have to use encode and decode since the data was UTF-8
on the way into the db, and should be printed as UTF-8 also.

--
Knut
Matchbox cars and soda cans


Posted by Slaven Rezic on August 13, 2005, 9:28 pm
Please log in for more thread options


yorhel@gmail.com writes:

> Hello,
>
> I have a perl (5.8.7) application which tries to put UTF-8 data in a
> mysql (4.1.13) database (with database collation = utf8_general_ci),
> using DBI (1.48) and DBD::mysql (3.0002), like this:
>
> use Encode;
> my $utf8data = decode('UTF-8', $somevar);

This function should be used if $somevar is a variable which contains
utf-8 data in octets to convert it into perl characters.

> my $rows = $dbh->do("UPDATE test SET text = '$utf8data' WHERE name =
> 'test'");

I am not sure that either DBI or DBD::mysql or mysql can handle perl
characters. Probably you should rather supply UTF-8 octets here.

> I can get the same data from the database like this:
>
> my $obj = $dbh->prepare("SELECT text FROM test WHERE name = 'test'");
> $obj->execute();
> my $result = ($obj->fetchrow_array())[0];
> $result = decode('UTF-8', $result);

Again decode?

Regards,
Slaven

--
Slaven Rezic - slaven <at> rezic <dot> de

tkrevdiff - graphical display of diffs between revisions (RCS, CVS or SVN)
http://ptktools.sourceforge.net/#tkrevdiff


Similar ThreadsPosted
Problem with DBD::DB2 and UTF8. April 14, 2006, 11:31 am
UTF8 on DBI with Perl April 1, 2007, 11:30 am
[RESOLVED] DBD::mysql unresolved symbol _intel_fast_memcpy & mysql-standard-5.0.22-linux-i686-icc-glibc23 August 29, 2006, 6:33 pm
MySQL 5.0 and Perl DBD-MySQL December 3, 2005, 1:10 am
Embedded mysql with DBD::mysql July 30, 2008, 10:15 am
DBD:mysql doesn't read mysql option file /etc/my.cnf file January 27, 2005, 11:19 pm
Problem with DBD-mysql November 20, 2004, 2:34 am
DBD::mysql problem November 24, 2004, 12:24 am
problem installing DBD-mysql January 10, 2005, 5:51 pm
To install mysql for DBI in Perl 5.6 May 3, 2005, 2:28 am

Our other projects:

Art Dolls, Fairies and Mermaids - Sunnyfaces.net

Roy's Linux, Programming and Search Engines messages

1-Script XML SitemapXML Sitemap