Posting UTF-8 form data with WWW::Mechanize and LWP


I just spent all day figuring this out so I am posting online so other
people with similar problems know how to solve it.

Basically, I had UTF-8 Korean data that I wanted to submit through a
form using WWW::Mechanize, which uses LWP. However, I found that this
line here:

        $q =~ s/([^$URI::uric])/$URI::Escape::escapes{$1}/go;

in the URI code was trashing my values: the encoded data being sent to
the server was truncated. It turns out that the %escapes hash only has
entries for characters 0-255, not for the higher code points that UTF-8
text contains. I confirmed this by finding these lines in the
URI::Escape source:

# Build a char->hex map
for (0..255) {
    $escapes{chr($_)} = sprintf("%%%02X", $_);
}
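
To see why a table like this drops characters, here is a minimal
sketch using only core Perl. The local %escapes hash and the simplified
character class [^A-Za-z0-9] are stand-ins I made up for illustration;
the real code uses %URI::Escape::escapes and $URI::uric.

```perl
use strict;
use warnings;

# Same shape as the table quoted above: entries only for chr(0) .. chr(255).
my %escapes;
$escapes{ chr $_ } = sprintf '%%%02X', $_ for 0 .. 255;

# A space is in the table; a Hangul syllable (U+D55C) is not.
my $space  = exists $escapes{' '}        ? 1 : 0;
my $hangul = exists $escapes{"\x{D55C}"} ? 1 : 0;

# A missing key interpolates as an empty string, so the character
# silently disappears from the escaped value.
my $q = "a \x{D55C}";
{
    no warnings 'uninitialized';
    $q =~ s/([^A-Za-z0-9])/$escapes{$1}/g;
}
print "$q\n";   # prints "a%20"
```

The Korean character is gone from the result, which matches the
truncation described above.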

Thus, at the beginning of my script, I added:

    for (256 .. 65535) {
        $URI::Escape::escapes{chr($_)} =
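
The right-hand side of that assignment is cut off in this copy of the
post. A minimal sketch of what such a loop could look like, assuming
the intent is to map each character to the percent-encoded bytes of its
UTF-8 form. The helper name utf8_escape and the local %escapes hash are
hypothetical; in a real script the target would be
%URI::Escape::escapes.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not part of URI::Escape): percent-encode each
# UTF-8 byte of a single character.
sub utf8_escape {
    my ($char) = @_;
    return join '',
        map { sprintf '%%%02X', $_ }
        unpack 'C*', encode('UTF-8', $char);
}

# Local stand-in for %URI::Escape::escapes, extended past 255.
my %escapes;
for my $cp (256 .. 65535) {
    next if $cp >= 0xD800 && $cp <= 0xDFFF;   # skip surrogate code points
    $escapes{ chr $cp } = utf8_escape(chr $cp);
}

print $escapes{"\x{D55C}"}, "\n";   # prints "%ED%95%9C"
```

URI::Escape also exports uri_escape_utf8, which does this encoding per
string; whether it removes the need to patch the table depends on how
the installed URI version builds query strings, so treat that as
something to check rather than a given.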

Further, I ran all UTF-8 encoded content I received back through this:

    my $content = pack "U0C*", unpack "C*", $mech->content;

before attempting to parse it (for example, running regular expressions
over foreign-language data).
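
The pack/unpack line tells Perl to treat the received bytes as UTF-8
characters. A minimal sketch of the same step using the core Encode
module instead, assuming the server really sent UTF-8. The byte string
here is a made-up stand-in for what $mech->content might return.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Stand-in for raw response bytes: the UTF-8 encoding of U+D55C U+AE00.
my $bytes = "\xED\x95\x9C\xEA\xB8\x80";

# Turn the bytes into a Perl character string.
my $content = decode('UTF-8', $bytes);

# Character-level operations now behave as expected.
my $byte_count = length $bytes;      # 6 bytes
my $char_count = length $content;    # 2 characters
my $all_hangul = $content =~ /\A\p{Hangul}+\z/ ? 1 : 0;

print "$byte_count bytes, $char_count characters, all Hangul: $all_hangul\n";
```

Newer LWP versions also provide decoded_content on the response object,
which may make a manual decode step unnecessary.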

I couldn't find anything else on the web about using WWW::Mechanize
with UTF-8 form data. What I have done above seems to work, but I know
it is a hack. If anyone has better suggestions, or if I've missed some
built-in facility of URI::Escape or LWP, please reply. Again, I'm
posting all this so that anyone who runs across a similar problem can
solve it faster than I did.

