Posted by Paul Bramscher on June 14, 2005, 2:31 pm
I'm working on an application which stores web page content. Generally I'm
turning the whole page into base64 for ease of storage (into a TEXT field).
But I have a second field, populated by code which opens a socket to the
page, pulls down the HTML source, runs strip_tags and other PHP cleansing
functions on it, and inserts the remaining words into a MySQL TEXT column
as plain text (not converted to base64).
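As a rough illustration of the two storage paths described above, here is a minimal Python sketch. The application itself is PHP, so this is not the actual code: the names `encode_page` and `extract_words` and the sample page are hypothetical, and the tag stripping only loosely approximates what PHP's strip_tags does. It assumes the page is UTF-8.

```python
import base64
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes; a loose stand-in for PHP's strip_tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def encode_page(html: str) -> str:
    """Whole page as base64: the stored value contains only ASCII characters."""
    return base64.b64encode(html.encode("utf-8")).decode("ascii")


def extract_words(html: str) -> str:
    """Tag-stripped text: non-ASCII characters are kept exactly as they are."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left behind by the removed markup.
    return " ".join("".join(parser.parts).split())


# Hypothetical sample page containing non-ASCII characters.
page = "<html><body><p>Grüße aus <b>München</b></p></body></html>"
stored_b64 = encode_page(page)     # safe to dump and reload as plain ASCII
stored_text = extract_words(page)  # still contains multi-byte characters
```

The base64 column is pure ASCII, so it passes through a dump and reload unchanged. The plain-text column is the one whose bytes depend on the character set used along the way.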
I encounter a problem with foreign languages when I do a mysqldump.
Some of the characters are non-ASCII and I can't merely "cat"
the dump file back into a MySQL database.
How do folks who use non-Latin alphabets deal with this? Thanks,
Paul Bramscher
brams006@umn.edu