Unicode characters in e-mail / database

In reply to Kimmo Laine:

Unless next_page.php generates PHP, the script with this include will
only get HTML.


    // next_page.php: emits PHP source text rather than plain HTML
    if (isset($_GET['foo'])) {
      echo '<?php echo $_GET[\'foo\']; ?>';
    } else {
      echo '<?php echo \'Not available\'; ?>';
    }

File not found: (R)esume, (R)etry, (R)erun, (R)eturn, (R)eboot

Re: Unicode characters in e-mail / database

That is a problem if you do not know which encoding the text was typed
in, so you will have to determine the encoding first. For example, utf-8
bytes are perfectly valid iso-8859-1 bytes, but they display as
completely different characters.
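
To make that concrete, here is a minimal sketch, assuming the mbstring extension is loaded. The sample character is my own choice and not taken from the thread.

```php
<?php
// "é" encoded as utf-8 is the two bytes C3 A9.
$utf8Bytes = "\xC3\xA9";

// Misread those bytes as iso-8859-1, then convert to utf-8 for display:
// each byte becomes a character of its own ("Ã" followed by "©").
$misread = mb_convert_encoding($utf8Bytes, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8Bytes), "\n"; // c3a9
echo bin2hex($misread), "\n";   // c383c2a9
```

No error is raised at any point, which is why the wrong guess goes unnoticed until someone looks at the output.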

I wonder how anyone can enter Russian in a form that only supports
iso-8859-1. BUT, even if the website is served in iso-8859-1 (are you
really sure?), you can put an "accept-charset" attribute on the form
element in HTML. So no, the site's encoding does not really have to
change. If you want to render all languages on that site though, I would
recommend changing it.
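
A rough sketch of the accept-charset idea, assuming mbstring is available. The function names, the field name and "save.php" are hypothetical and only there for illustration.

```php
<?php
// Hypothetical helper: builds a form that asks the browser to submit utf-8,
// whatever encoding the page itself is served in.
function renderCommentForm(string $action): string
{
    $safeAction = htmlspecialchars($action, ENT_QUOTES, 'UTF-8');
    return '<form method="post" action="' . $safeAction . '" accept-charset="utf-8">'
         . '<textarea name="comment"></textarea>'
         . '<input type="submit" value="Send">'
         . '</form>';
}

// Hypothetical check for the receiving script: is the input well-formed utf-8?
function isUtf8(string $input): bool
{
    return mb_check_encoding($input, 'UTF-8');
}

echo renderCommentForm('save.php'), "\n";
```

Checking the submitted bytes on the receiving side is a cheap safeguard, since the attribute is only a request to the browser.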

What encoding is the text? And what encoding does the server expect?

In e-mail, just as with a web page, you state the encoding in the
Content-Type header (for instance, Content-Type: text/plain; charset=utf-8).
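
For PHP's mail() function, that could look roughly like the sketch below. The helper name and the example.com addresses are placeholders of mine, and the actual send is commented out.

```php
<?php
// Hypothetical helper: headers for a plain-text utf-8 message.
function utf8MailHeaders(string $from): string
{
    return implode("\r\n", [
        'From: ' . $from,
        'MIME-Version: 1.0',
        'Content-Type: text/plain; charset=utf-8',
        'Content-Transfer-Encoding: 8bit',
    ]);
}

$headers = utf8MailHeaders('sender@example.com');

// The body must really be utf-8 bytes, so this file has to be saved as utf-8.
// mail('recipient@example.com', 'Test', "Привет\n", $headers);

echo $headers, "\n";
```

The subject here is plain ASCII on purpose; non-ASCII header values need their own encoding step, which this sketch leaves out.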

The real problem with encodings is that there is a difference between a
text and a string. A string is just a chain of bytes, whereas a text is
a chain of bytes with an encoding. Every program or system I know of
stores texts as strings, so this means that you will have to track the
encoding used in a separate fashion. By far the easiest way to go is to
dictate the preferred encoding to the browser and the database.
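
The string versus text distinction shows up directly in PHP, as in this small sketch (assuming mbstring is loaded): the bytes never change, only the encoding you claim for them.

```php
<?php
$bytes = "\xC3\xA9"; // the same two bytes throughout

echo strlen($bytes), "\n";                  // 2: strlen counts bytes
echo mb_strlen($bytes, 'UTF-8'), "\n";      // 1: read as utf-8, one character
echo mb_strlen($bytes, 'ISO-8859-1'), "\n"; // 2: read as iso-8859-1, two characters
```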

If you want MySQL to use utf-8, send "SET NAMES utf8;" as the first
statement on each new connection. If you want to talk utf-8 with a
browser, send a Content-Type header or configure the charset in php.ini
(the default_charset setting). To support multiple PHP servers, you can
query the current charset with the ini_get function (beware that
iso-8859-1 is used when this setting is empty).
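
Put together, those steps might look like the sketch below. It assumes the mysqli extension; the function names and connection parameters are hypothetical, and the iso-8859-1 fallback simply follows the remark above rather than any documented PHP behaviour.

```php
<?php
// Fallback as described above: treat an empty or missing setting as iso-8859-1.
function effectiveCharset($iniValue): string
{
    return ($iniValue === false || $iniValue === '') ? 'iso-8859-1' : $iniValue;
}

// Hypothetical connection helper; host, credentials and database are placeholders.
// It is defined here but not called, so no database is needed to run the file.
function connectUtf8(string $host, string $user, string $pass, string $dbName): mysqli
{
    $db = new mysqli($host, $user, $pass, $dbName);
    $db->query('SET NAMES utf8'); // first statement after connecting
    return $db;
}

$charset = effectiveCharset(ini_get('default_charset'));

// Only send the header when running under a web server.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: text/html; charset=' . $charset);
}

echo $charset, "\n";
```

mysqli also offers a set_charset() method, which could replace the raw query.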

Hope this helps.
Willem Bogaerts

Application smith
Kratz B.V.

Re: Unicode characters in e-mail / database

Thank you both for your replies. After switching to the
HTMLMimeMail class and using the content-type headers suggested, my
HTML and plain text e-mails are now showing the characters correctly.

My database problem, however, remains. Even after using "SET NAMES
utf8" I'm having the same problem - as I understand it this should be
the first query I execute after establishing the connection? Should I
be using any specific collation (the table is latin1_swedish_ci at the
moment) - I'm guessing it should be switched to UTF-8?

Final question, is there any way to retrieve old data from the
database (which is broken in the same fashion) and restore it to the
correct Unicode values?

Thanks again for the rapid responses.

Michael Price

Re: Unicode characters in e-mail / database

In reply to Michael Price:

Data is not "broken". Data is in a different character encoding than the one
you're using.

You *need* to know the character encoding used, and then apply an encoding
conversion function.
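
As an illustration only: suppose the old rows turn out to be utf-8 text that was misread as iso-8859-1 and re-encoded once. That is a common cause but nothing in this thread confirms it, so treat it as an assumption. With mbstring, undoing one such round could be sketched like this (the function name is hypothetical):

```php
<?php
// Hypothetical repair helper: undo one round of "utf-8 read as iso-8859-1".
function undoLatin1Misread(string $stored): string
{
    $candidate = mb_convert_encoding($stored, 'ISO-8859-1', 'UTF-8');

    // Accept the result only if it is well-formed utf-8 and converting it
    // back reproduces the stored bytes; otherwise leave the value untouched.
    $roundTrip = mb_convert_encoding($candidate, 'UTF-8', 'ISO-8859-1');
    if (mb_check_encoding($candidate, 'UTF-8') && $roundTrip === $stored) {
        return $candidate;
    }
    return $stored;
}

$stored = "\xC3\x83\xC2\xA9"; // what "é" looks like after the misread
echo bin2hex(undoLatin1Misread($stored)), "\n"; // c3a9
```

The two checks keep correctly stored values from being mangled. Try it on a copy of the data first, and note that MySQL's latin1 is closer to Windows-1252 than to iso-8859-1, so real rows may need that encoding name instead.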

Iván Sánchez Ortega -ivansanchez-algarroba-escomposlinux-punto-org-

Now listening to: Yonderboi - Bar Lounge Classics Vol. 3 CD 1 (2002) - [1]
Road Movie (6:50) (96.000000%)
