
How to detect text charset (UTF-8 or Latin-1)

Subject Author Date
How to detect text charset (UTF-8 or Latin-1) Thomas Armstrong 01-15-2008
Posted by Thomas Armstrong on January 15, 2008, 11:59 am
>>
>> [snip]
>>
>> > if I print "$1\n",
>> > the file prints just fine. But, if I do something like print "$1 after
>> > \n", the whole output is messed up. If I print "before $1\n", nothing
>> > prints at all. If I print "before $1 after\n", only after prints.
>>
>> not really sure, but could be a rogue "\r" in $1,


> There
> is a rogue carriage return (0xd) in the string

> Is there something I can do to deal with this
> situation?


Repair the corrupted file:

perl -p -i -e 'tr/\r//d' bad_file


--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
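
For anyone who cannot rewrite the file in place, the same tr/\r//d idea can be applied to the captured text before printing. What follows is only a minimal sketch, not code from the thread: the sample line, the regex, and the helper name strip_cr are assumptions made for illustration. It shows that a line with a DOS-style ending leaves a trailing carriage return in $1 (a terminal then moves the cursor back to column 0, so later text overwrites earlier text), and that deleting it restores the expected output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: one line with a DOS-style (CR LF) line ending.
my $line = "value\r\n";

# Return a copy of the string with every carriage return (0x0D) deleted.
# strip_cr is an illustrative helper name, not something from the thread.
sub strip_cr {
    my ($text) = @_;
    $text =~ tr/\r//d;
    return $text;
}

# "." matches "\r" but not "\n", so the capture keeps the rogue CR.
if ($line =~ /^(.*)$/) {
    my $field = $1;
    my $clean = strip_cr($field);

    printf "raw length: %d, cleaned length: %d\n",
        length($field), length($clean);

    # With the CR removed, surrounding text prints on one line as intended.
    print "before $clean after\n";
}
```

The in-place one-liner above remains the simpler fix when the file itself can be repaired; cleaning the value in the script is only useful when the input has to stay untouched.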

Posted by Lawrence Statton on January 15, 2008, 12:22 pm

Posted by smallpond on January 15, 2008, 12:45 pm

Posted by Joe Smith on January 15, 2008, 2:16 pm

Posted by Jürgen Exner on January 15, 2008, 3:53 pm
