|
Posted by Raj on December 12, 2007, 10:27 am
>>
>> [snip]
>>
>> > if I print "$1\n", the file prints just fine. But, if I do something
>> > like print "$1 after\n", the whole output is messed up. If I print
>> > "before $1\n", nothing prints at all. If I print "before $1 after\n",
>> > only after prints.
>>
>> not really sure, but could be a rogue "\r" in $1,
> There is a rogue carriage return (0xd) in the string.
> Is there something I can do to deal with this situation?
Repair the corrupted file:
perl -p -i -e 'tr/\r//d' bad_file
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas
|
|
Posted by RedGrittyBrick on December 12, 2007, 10:47 am
|
|
Posted by Tad J McClellan on December 12, 2007, 10:19 pm
> Raj wrote:
>> I have large text passages containing names of database tables,
>> procedures, packages, variables etc. having the underscore character as
>> a part of the name, e.g. rsp_names_friends_master. I tried
>> "\b[a-zA-Z0-9_]+\b" but it matches all words in the passage.
>
> Similarly "[ab]+" matches "aaa" and "aa" though neither contains "b".
>
> Try "\b[a-zA-Z0-9]+_[a-zA-Z0-9_]+\b"
>
> Or "\b\w+_\w+\b"
Three (six?) useless uses of word boundary in the quotes above...
Every pattern there will behave identically without any \b's.
--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher0cmdat/"
|
|
Posted by RedGrittyBrick on December 13, 2007, 5:11 am
Tad J McClellan wrote:
>> Raj wrote:
>>> I have large text passages containing names of database tables,
>>> procedures, packages, variables etc. having the underscore character as
>>> a part of the name, e.g. rsp_names_friends_master. I tried
>>> "\b[a-zA-Z0-9_]+\b" but it matches all words in the passage.
>> Similarly "[ab]+" matches "aaa" and "aa" though neither contains "b".
>>
>> Try "\b[a-zA-Z0-9]+_[a-zA-Z0-9_]+\b"
>>
>> Or "\b\w+_\w+\b"
>
>
> Three (six?) useless uses of word boundary in the quotes above...
>
> Every pattern there will behave identically without any \b's.
>
>
TFTC
$ perl -e 'print "$_\n" for "_aa-bbb.cc_[d_d]" =~ /\w+/g'
_aa
bbb
cc_
d_d
$ perl -e 'print "$_\n" for "_aa-bbb.cc_[d_d]" =~ /\w+_\w+/g'
d_d
In Perl programs I've written, I don't think I've ever used \b. Perhaps
I should have analyzed the OP's RE completely rather than only
commenting on the primary reason for the problem.
|
|
Posted by Raj on December 12, 2007, 10:54 pm
|