Newbie Character encoding problem with XML::Parser

Well, not really a newbie.

I resurrected an old tool I had written a year or so ago that parses XML
documents (using XML::Parser) and displays some structure data in
various Tk widgets (including a Tk::Text window).

Back when I wrote it, it worked just fine (on Perl 5.6.1, XML::Parser
ver 2.31, libexpat 0.1.0). Fine being defined as XML::Parser and the
underlying expat lib not messing with character entities. So a dash,
encoded as &#150; in the source document, would be displayed as &#150;
in the Tk::Text widget and eventually saved as &#150;.

So now I've moved my program to a new platform (Perl 5.8.0, XML::Parser ver
2.31, libexpat 1.5.0). Now something (I've verified that it's either
XML::Parser or expat) is rewriting the character entities. The dash is being
converted to 0x96, which appears in Tk as -1 <- weird characters.

I can trap the output of the Tk widgets and translate it back to
character entities. But I'd rather find a way to stop it. Is there any way
to make the new Perl/expat operate like the old one? The XML::Parser::Expat
options 'NoExpand' and 'ProtocolEncoding' don't seem to affect this.
Any ideas?  
Paul Hovnanian
Ask not for whom the <CONTROL-G> tolls.