Non-English text question


I'm working on an application that stores web page content.  Generally I'm
turning the whole page into base64 for ease of storage (in a TEXT field).

But I have another field, filled by a routine which opens a socket to the
page, pulls down the HTML source, runs strip_tags and other PHP cleansing
functions on it, and inserts the remaining words into a MySQL TEXT column
as plain text (not converted to base64).

I run into a problem with foreign languages when I do a mysqldump.
Some of the characters are outside the ASCII range, and I can't simply "cat"
the dump file back into a MySQL database.
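
To make the difference between the two columns concrete, here is a minimal
sketch.  It's in Python rather than PHP purely for illustration, and it
assumes the pages are UTF-8 and that the reload ends up being read as a
single-byte charset such as latin1; neither is confirmed.  The sample
string and all variable names are made up.

```python
import base64

# Hypothetical sample of stripped page text in a non-Latin alphabet.
page_text = "Привет, мир"

# Assumption: the page content is UTF-8 encoded.
utf8_bytes = page_text.encode("utf-8")

# The base64 column holds only ASCII characters, so it passes through
# a dump and reload untouched whatever charset is in effect.
b64_column = base64.b64encode(utf8_bytes).decode("ascii")
b64_is_ascii = all(ord(ch) < 128 for ch in b64_column)

# The plain-text column contains bytes >= 0x80, which depend on the
# charset used when the dump is written and read back.
raw_has_high_bytes = any(b >= 0x80 for b in utf8_bytes)

# Simulate reading those bytes back under the wrong charset (latin1).
garbled = utf8_bytes.decode("latin-1")

# The base64 column still decodes to the original text.
restored = base64.b64decode(b64_column).decode("utf-8")

print("base64 is pure ASCII:", b64_is_ascii)
print("raw text has high bytes:", raw_has_high_bytes)
print("wrong-charset reload matches original:", garbled == page_text)
print("base64 round trip matches original:", restored == page_text)
```

If that assumption holds, it would explain why the base64 column never
gives trouble while the plain-text one does.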

How do folks who work with non-Latin alphabets deal with this?  Thanks,

Paul Bramscher
