
Regex for "at start of line OR preceded by space".

Subject Author Date
Regex for "at start of line OR preceded by space". Robbie Hatley 04-27-2008
Posted by A. Sinan Unur on April 27, 2008, 7:46 pm
"Robbie Hatley" wrote:

> I needed a regex that says "either at the start of a line, OR
> preceded by some whitespace".

The only difference between this criterion and "preceded by whitespace"
can occur at the beginning of the string. Therefore:

#!/usr/bin/perl

my $x = <<EOSTR;
Test1 Test2
 Test3 Test4 Test5
 Test6
Test7 Test8
Test9
 Test0a

EOSTR

# Print every token that starts the string or follows whitespace.
print "$1\n" while $x =~ /(?:\A|\s+)(\S+)/g ;



--
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/

Posted by dummy on April 28, 2008, 4:06 am
On Sun, 27 Apr 2008 23:46:31 GMT, "A. Sinan Unur" wrote:

>"Robbie Hatley" wrote:
>
>> I needed a regex that says "either at the start of a line, OR
>> preceded by some whitespace".
>
>The only difference between this criterion and "preceded by whitespace"
>can occur at the beginning of the string. Therefore:
>
>#!/usr/bin/perl
>
>my $x = <<EOSTR;
>Test1 Test2
> Test3 Test4 Test5
> Test6
>Test7 Test8
>Test9
> Test0a
>
>EOSTR
>
>print "$1\n" while $x =~ /(?:\A|\s+)(\S+)/g ;

On my XP machine that produces:
Test1
Test2
Test3
Test4
Test5
Test6
Test7
Test8
Test9
Test0a

But this:

use strict; use warnings;
while (<DATA>) {
    print "$1\n" if /^\s*(\S+)(?:\s|$)/;
}
__DATA__
Test1 Test2
Test3 Test4 Test5
Test6
Test7 Test8
Test9
Test0a

Gives:
Test1
Test3
Test6
Test7
Test9
Test0a

which I think is better?

Posted by A. Sinan Unur on April 28, 2008, 4:50 am
dummy@phony.info wrote:

> But this:
>
> use strict; use warnings;
> while (<DATA>) {
> print "$1\n" if /^\s*(\S+)(?:\s|$)/;
> }
> __DATA__
> Test1 Test2
> Test3 Test4 Test5
> Test6
> Test7 Test8
> Test9
> Test0a
>
> Gives:
> Test1
> Test3
> Test6
> Test7
> Test9
> Test0a
>
> which I think is better?

How can that be better?

Read the OP's criterion again:

>>> I needed a regex that says "either at the start of a line, OR
>>> preceded by some whitespace".

Yours misses Test2, Test4, Test5 and Test8 which are all preceded by
whitespace.
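
To make the difference concrete, here is a minimal sketch (not from either poster) that runs both patterns over the same lines. The helper names all_tokens and first_token are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The \A-or-whitespace pattern with /g in list context:
# returns every token on the line.
sub all_tokens {
    my ($line) = @_;
    return $line =~ /(?:\A|\s+)(\S+)/g;
}

# The line-anchored pattern: returns at most one token,
# the first one on the line.
sub first_token {
    my ($line) = @_;
    return $line =~ /^\s*(\S+)(?:\s|$)/ ? ($1) : ();
}

for my $line ("Test1 Test2", " Test3 Test4 Test5") {
    print "all:   @{[ all_tokens($line) ]}\n";
    print "first: @{[ first_token($line) ]}\n";
}
```

Because the second pattern is anchored with "^" and is not applied with /g, it can never reach a token that merely follows whitespace in the middle of a line.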

Sinan


Posted by dummy on April 29, 2008, 1:28 am
On Mon, 28 Apr 2008 08:50:53 GMT, "A. Sinan Unur" wrote:

>Yours misses Test2, Test4, Test5 and Test8 which are all preceded by
>whitespace.
>
>Sinan

OOPS!

Posted by Robbie Hatley on May 3, 2008, 1:46 am

"A. Sinan Unur" wrote:

> "Robbie Hatley" wrote:
>
> > I needed a regex that says "either at the start of a line, OR
> > preceded by some whitespace".
>
> The only difference between this criterion and "preceded by whitespace"
> can occur at the beginning of the string. Therefore:
>
> #!/usr/bin/perl
>
> my $x = <<EOSTR;
> Test1 Test2
> Test3 Test4 Test5
> Test6
> Test7 Test8
> Test9
> Test0a
>
> EOSTR
>
> print "$1\n" while $x =~ /(?:\A|\s+)(\S+)/g ;

Close! But ultimately, no cigar. The preceding whitespace
(if any) must not be part of the match, because this is
actually being used in an s/// command, and I don't
want to strip, alter, or replace the leading whitespace.

Get rid of the "+" and add in a lookbehind, and it works:

$x =~ s/(?:\A|(?<=\s))$Regex1/Prefix$&/g ;

(Can't use "\s+" in a look-behind, because Perl rejects
look-behinds of unbounded width. A single "\s" is
fixed-width, so it is fine.)
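
A minimal sketch of that substitution in action. The thread never shows what $Regex1 holds, so qr/Test\d/ below is a hypothetical stand-in, and the wrapper name add_prefix is made up. The sketch also uses a capture group and $1 where the post uses $&, which should behave the same here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for $Regex1, which the thread never defines.
my $Regex1 = qr/Test\d/;

# Prefix every match that starts the string or directly follows
# whitespace. The look-behind consumes nothing, so the whitespace
# itself is left untouched by the substitution.
sub add_prefix {
    my ($text) = @_;
    $text =~ s/(?:\A|(?<=\s))($Regex1)/Prefix$1/g;
    return $text;
}

print add_prefix("Test1  Test2\n Test3 xTest4\n");
```

With this input the double space and the leading space survive unchanged, and "xTest4" is skipped because its "Test4" is preceded by a non-whitespace character.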

The "^" assertion can also be used instead of "\A":

$x =~ s/(?:^|(?<=\s))$Regex1/Prefix$&/g ;
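
Without the /m modifier, "^" only matches at the start of the string, which is why it can stand in for "\A" here. As a side note that is not from the thread, the same condition can probably also be written as a single negative look-behind, (?<!\S), meaning "not preceded by a non-whitespace character". The sketch below compares the two forms. The pattern qr/Test\d/ and the sub names are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical pattern standing in for the one used in the thread.
my $pattern = qr/Test\d/;

# Form from the post: start of string OR preceded by whitespace.
sub prefix_alternation {
    my ($text) = @_;
    $text =~ s/(?:^|(?<=\s))($pattern)/Prefix$1/g;
    return $text;
}

# Alternative form: NOT preceded by a non-whitespace character.
# At the start of the string there is no preceding character at all,
# so the negative look-behind succeeds there too.
sub prefix_negative {
    my ($text) = @_;
    $text =~ s/(?<!\S)($pattern)/Prefix$1/g;
    return $text;
}

for my $sample ("Test1 Test2", " Test3\nTest4", "xTest5 Test6") {
    my $same = prefix_alternation($sample) eq prefix_negative($sample);
    print $same ? "same\n" : "differ\n";
}
```

Since a newline counts as whitespace, both forms also pick up matches at the start of every later line, with or without /m.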

(Also see Dr. Ruud's post, and my reply to it.)

--
Cheers,
Robbie Hatley
lonewolf aatt well dott com
www dott well dott com slant user slant lonewolf slant


