
Regex for "at start of line OR preceded by space".

Regex for "at start of line OR preceded by space". Robbie Hatley 04-27-2008
Posted by Robbie Hatley on April 27, 2008, 2:17 am

I needed a regex that says "either at the start of a line, OR
preceded by some whitespace".

The whitespace (if any) is not to be part of the match.
That part I know how to do with lookbehind:

(?<=\s)($Regex1)

Start of line is easy too:

^($Regex1)

but when I tried to or them together:

my $Regex2 = qr;

But for some reason, it matches the empty string at the beginning
of every input string. Why is that?

What I finally came up with that works is:

my $Regex2 = qr;

That's pretty messy, tho. Are there easier ways of
doing this that I don't see?

--
Curious,
Robbie Hatley
perl -le 'print "4o6e7o4f0w5llc7m"'
perl -le 'print "0ttp//7ww.7ell.3om/~4onewolf/"'




Posted by Gunnar Hjalmarsson on April 27, 2008, 2:54 am
Robbie Hatley wrote:
> I needed a regex that says "either at the start of a line, OR
> preceded by some whitespace".
>
> The whitespace (if any) is not to be part of the match.
> That part I know how to do with lookbehind:
>
> (?<=\s)($Regex1)
>
> Start of line is easy too:
>
> ^($Regex1)
>
> but when I tried to or them together:
>
> my $Regex2 = qr;
>
> But for some reason, it matches the empty string at the beginning
> of every input string. Why is that?
>
> What I finally came up with that works is:
>
> my $Regex2 = qr;
>
> That's pretty messy, tho. Are there easier ways of
> doing this that I don't see?

It's hard to tell, since you don't show us what's in $Regex1 together
with some sample data.

Assuming that possible whitespace _may_ be part of the match, while you
capture what's matched by the $Regex1 part, you can do:

my $Regex2 = qr;

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Posted by Robbie Hatley on May 3, 2008, 12:12 am
"Gunnar Hjalmarsson" wrote:

> Robbie Hatley wrote:
> > I needed a regex that says "either at the start of a line, OR
> > preceded by some whitespace".
> >
> > The whitespace (if any) is not to be part of the match.
> > That part I know how to do with lookbehind:
> >
> > (?<=\s)($Regex1)
> >
> > Start of line is easy too:
> >
> > ^($Regex1)
> >
> > but when I tried to or them together:
> >
> > my $Regex2 = qr;
> >
> > But for some reason, it matches the empty string at the beginning
> > of every input string. Why is that?
> >
> > What I finally came up with that works is:
> >
> > my $Regex2 = qr;
> >
> > That's pretty messy, tho. Are there easier ways of
> > doing this that I don't see?
>
> It's hard to tell, since you don't show us what's in $Regex1 together
> with some sample data.

My question was generic instead of specific on purpose.
That's what "$Regex1" is. It's a generic regex, not a
specific one. If I'd meant a specific one, I'd have posted
a specific one.

> Assuming that possible whitespace _may_ be part of the match,

In my immediate application, no. But the issue is barely
relevant, if at all.

> while you capture what's matched by the $Regex1 part,
> you can do:
>
> my $Regex2 = qr;

Forces the white space to be at the beginning of the line.
In other words, "whitespace AND start-of-line".
My needs are, "whitespace OR start-of-line".

(I'd use "word boundary", but that can be a symbol, and
I want only instances of the $Regex1 which are preceded by
space or at start of line.)

If $Regex1 is "asdf", I'd like to match the first and
third (but not the second or fourth) asdf in:

asdfyuio qwer uiop %$asdf vbnm asdfijk dkguwy fjuasdf
^^^^                 ^^^^      ^^^^              ^^^^
YES                  NO        YES               NO
(SOL)                (symbols) (space)           (letters)

The messy regex I gave does just that; yours only matches
the first instance. Different logic.
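
A minimal sketch of the requirement described above, run against the same sample line. The patterns in this thread were lost in archiving, so the ones here are assumptions rather than the posters' originals: $Regex1 is taken to be the literal "asdf" from the example, $Regex2 groups the two anchors explicitly, and $Alt is a common idiom not taken from the thread ("not preceded by a non-space character").

```perl
use strict;
use warnings;

# Stand-in for the generic $Regex1, using the "asdf" example from the post.
my $Regex1 = qr/asdf/;

# Assumed pattern: "start of line OR preceded by whitespace", then $Regex1.
my $Regex2 = qr/(?:^|(?<=\s))($Regex1)/;

# Alternative idiom (not from the thread): no non-space character before.
my $Alt = qr/(?<!\S)($Regex1)/;

# Single quotes keep %$asdf from being interpolated.
my $text = 'asdfyuio qwer uiop %$asdf vbnm asdfijk dkguwy fjuasdf';

# Collect the start offset of every capture for each pattern.
my (@offsets, @alt_offsets);
push @offsets,     $-[1] while $text =~ /$Regex2/g;
push @alt_offsets, $-[1] while $text =~ /$Alt/g;

print "Regex2: @offsets\n";      # Regex2: 0 31
print "Alt:    @alt_offsets\n";  # Alt:    0 31
```

Offsets 0 and 31 are the first and third "asdf" in the sample, so the instances after "%$" and inside "fjuasdf" are skipped, as requested.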

--
Cheers,
Robbie Hatley
perl -le 'print "4o6e7o4f0w5llc7m"'
perl -le 'print "0ttp//7ww.7ell.3om/~4onewolf/"'



Posted by Frank Seitz on April 27, 2008, 3:25 am
Robbie Hatley wrote:
>
> my $Regex2 = qr;
>
> But for some reason, it matches the empty string at the beginning
> of every input string. Why is that?

Because | has a low precedence.
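
To illustrate the precedence point, here is a small sketch. The failing pattern did not survive in this archive, so its shape below (a bare ^ on the left of the |) is an assumption based on the description in the question, and "asdf" is only a stand-in for $Regex1.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the generic $Regex1.
my $Regex1 = qr/asdf/;

# Assumed shape of the failing attempt: "|" splits the WHOLE pattern, so it
# reads as "start of line" OR "lookbehind followed by $Regex1".
my $loose = qr/^|(?<=\s)($Regex1)/;

# Grouping the two anchors confines the alternation to the prefix only.
my $tight = qr/(?:^|(?<=\s))($Regex1)/;

my $text = 'qwer asdf';

# The left branch (^ alone) succeeds at offset 0 and matches nothing.
my $loose_match = $text =~ $loose ? $& : undef;

# With grouping, the match has to include $Regex1.
my $tight_match = $text =~ $tight ? $1 : undef;

printf "loose: '%s'\n", $loose_match;   # loose: ''
printf "tight: '%s'\n", $tight_match;   # tight: 'asdf'
```

Because the regex engine tries the leftmost position first and the ^ branch succeeds there with zero width, the ungrouped version never gets as far as $Regex1.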

> What I finally came up with that works is:
>
> my $Regex2 = qr;
>
> That's pretty messy, tho. Are there easier ways of
> doing this that I don't see?

qr

Frank
--
Dipl.-Inform. Frank Seitz; http://www.fseitz.de/
Anwendungen für Ihr Internet und Intranet
Tel: 04103/180301; Fax: -02; Industriestr. 31, 22880 Wedel

Posted by Robbie Hatley on May 3, 2008, 2:07 am

"Frank Seitz" wrote:

> Robbie Hatley wrote:
> >
> > my $Regex2 = qr;
> >
> > But for some reason, it matches the empty string at the beginning
> > of every input string. Why is that?
>
> Because | has a low precedence.

Ah, I wish I'd seen this post before some of the others, but
oh well. Yes, I figured that out a few minutes ago while
replying to Dr. Ruud's post. (As you can tell, I'm still
new at some of the subtleties of Perl REs.)

> > What I finally came up with that works is:
> >
> > my $Regex2 = qr;
> >
> > That's pretty messy, tho. Are there easier ways of
> > doing this that I don't see?
>
> qr

Yes. Or better, qr, so that $1
is the match. Or perhaps qr
which has *NO* backreferences, and use $& instead of $1.
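
The two options mentioned here (capture into $1, or no capture group and $&) might look roughly like the following. The actual patterns are missing from this archive, so both are assumed reconstructions, with "asdf" standing in for $Regex1.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the generic $Regex1.
my $Regex1 = qr/asdf/;
my $text   = 'vbnm asdfijk';

# Variant 1 (assumed): capture group around $Regex1, so the hit is in $1.
my $with_capture = qr/(?:^|(?<=\s))($Regex1)/;
my $captured     = $text =~ $with_capture ? $1 : undef;

# Variant 2 (assumed): no capture group at all; the whole match is in $&.
# The lookbehind is zero-width, so the space is not part of $&.
my $no_capture = qr/(?:^|(?<=\s))$Regex1/;
my $whole      = $text =~ $no_capture ? $& : undef;

print "captured: '$captured'\n";   # captured: 'asdf'
print "whole:    '$whole'\n";      # whole:    'asdf'
```

On older perls, using $& anywhere in a program slows down every regex match; on Perl 5.10 and later, the /p modifier together with ${^MATCH} avoids that cost.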

Anyway, thanks for the help.

--
Cheers,
Robbie Hatley
lonewolf aatt well dott com
www dott well dott com slant user slant lonewolf slant


