Posted by harryfmudd [AT] comcast [DOT] net on December 9, 2006, 2:43 pm
harryfmudd [AT] comcast [DOT] net wrote:
> Al wrote:
>
>> Hi,
>> Any suggestions for handling Asian characters from the original Excel?
>> Perl's binmode setting handles accented characters fine, but once you
>> go beyond the first 256 code points, it seems that the Spreadsheet::Read
>> Perl module may have no way of knowing what Excel's encoding is.
>>
>> I'd like to read an Excel file that has Asian characters, process it
>> with Perl, and then write a CSV or XML file (UTF-8 encoded) with the
>> Asian content intact.
>>
>> A
>
>
> I'm not an expert on non-ASCII character sets, so the following is
> somewhat provisional. But the thread has been fallow for about a day and
> a half, and I figure if I say something horribly wrong someone will jump
> at the opportunity to correct me.
>
> Anyhow, this is what I _think_ the situation is.
>
> I've never used Spreadsheet::Read, but the docs look like it's an
> umbrella module, and under the hood it selects the correct module to
> read the spreadsheet you gave it. The docs also seem to say that for
> Excel it's Spreadsheet::ParseExcel.
>
> Spreadsheet::ParseExcel apparently will take a filehandle instead of a
> spreadsheet name, giving you the opportunity to set the encoding you
> want when you open the input file or when you binmode() it. See the docs
> for Encode::PerlIO.
>
> I could have sworn I saw documentation somewhere in the Encode-related
> modules for a subroutine that would try to guess the encoding of a chunk
> of text, but at the moment I can't find it.
>
> Tom Wyant
It's Encode::Guess. Duh.
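
For what it's worth, here is a minimal sketch of how Encode::Guess might be used, as provisional as the rest of this post. It assumes you already have the text as a raw byte string; whether Spreadsheet::ParseExcel hands cell values back that way is not something I have checked. The helper name decode_guessed, the suspect list, and the sample data are made up for illustration. guess_encoding() returns an encoding object on success and a plain error string on failure, so the result has to be tested with ref().

```perl
use strict;
use warnings;
use Encode qw(encode);
use Encode::Guess;    # exports guess_encoding() by default

# Hypothetical helper: decode raw bytes with whichever suspect encoding fits.
# Returns the decoded text and the name of the encoding that was chosen.
sub decode_guessed {
    my ($bytes, @suspects) = @_;
    my $enc = guess_encoding($bytes, @suspects);

    # On failure guess_encoding() returns an error string, not an object.
    ref $enc or die "Cannot guess encoding: $enc\n";

    return ($enc->decode($bytes), $enc->name);
}

# Sample input (an assumption): three kanji encoded as Shift_JIS,
# standing in for legacy data of unknown encoding.
my $bytes = encode('shiftjis', "\x{65E5}\x{672C}\x{8A9E}");
my ($text, $name) = decode_guessed($bytes, qw(shiftjis euc-jp));

# Write the result as UTF-8, which is what the CSV or XML output needs.
binmode STDOUT, ':encoding(UTF-8)';
print "guessed $name: $text\n";
```

The guess only works if the right encoding is among the suspects, and short strings can match more than one suspect, in which case you get the error string listing the candidates instead of an object.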
Tom Wyant