
handling UTF-8 characters in LWP module

comp.lang.perl.modules
Posted by devs on August 31, 2006, 10:39 pm


hello,
i am trying to write a bot to download wikipedia articles using
WWW::Wikipedia, a subclass of LWP::UserAgent. pages returned by the
wikipedia server contain utf8 characters such as LATIN CAPITAL LETTER
O WITH DIAERESIS. however, i see that the lwp module is not handling
the search results as utf8 encoded. i see that the character Ö is
treated as three individual bytes and not as a single character. how
do i specify that the lwp useragent must handle utf8 chars?

thanks in advance,
dave

