Posted by Robbie Hatley on May 3, 2008, 12:12 am
"Gunnar Hjalmarsson" wrote:
> Robbie Hatley wrote:
> > I needed a regex that says "either at the start of a line, OR
> > preceded by some whitespace".
> >
> > The whitespace (if any) is not to be part of the match.
> > That part I know how to do with lookbehind:
> >
> > (?<=\s)($Regex1)
> >
> > Start of line is easy too:
> >
> > ^($Regex1)
> >
> > but when I tried to or them together:
> >
> > my $Regex2 = qr;
> >
> > But for some reason, it matches the empty string at the beginning
> > of every input string. Why is that?
> >
> > What I finally came up with that works is:
> >
> > my $Regex2 = qr;
> >
> > That's pretty messy, tho. Are there easier ways of
> > doing this that I don't see?
>
> It's hard to tell, since you don't show us what's in $Regex1 together
> with some sample data.
My question was generic instead of specific on purpose.
That's what "$Regex1" is. It's a generic regex, not a
specific one. If I'd meant a specific one, I'd have posted
a specific one.
> Assuming that possible whitespace _may_ be part of the match,
In my immediate application, no. But the issue is barely
relevant, if at all.
> while you capture what's matched by the $Regex1 part,
> you can do:
>
> my $Regex2 = qr;
That forces the whitespace to be at the beginning of the line.
In other words, "whitespace AND start-of-line".
My needs are "whitespace OR start-of-line".
(I'd use a word boundary, \b, but that also matches right after
a symbol, and I want only instances of $Regex1 which are preceded
by whitespace or at start of line.)
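A quick sketch of why the word boundary is too permissive, assuming
$Regex1 is the literal "asdf" (the test string here is made up for
illustration):

```perl
use strict;
use warnings;

# Single quotes keep "%$asdf" literal (no variable interpolation).
my $text = '%$asdf';

# \b matches between the non-word character "$" and the word
# character "a", so this succeeds even though no whitespace or
# start-of-line precedes "asdf".
my $hit = ($text =~ /\basdf/) ? 1 : 0;

print "word boundary matched: $hit\n";
```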
If $Regex1 is "asdf", I'd like to match the first and
third (but not the second or fourth) asdf in:
asdfyuio qwer uiop %$asdf vbnm asdfijk dkguwy fjuasdf
^^^^                 ^^^^      ^^^^              ^^^^
YES                  NO        YES               NO
(SOL)              (symbols)  (space)          (letters)
The messy regex I gave does just that; yours only matches
the first instance. Different logic.
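
For reference, a minimal sketch of the "whitespace OR start-of-line"
logic run against the sample line above. It assumes $Regex1 is the
literal "asdf", and it uses the negative lookbehind (?<!\S), which is
a common idiom and not necessarily the regex I posted earlier:

```perl
use strict;
use warnings;

# Assumption: $Regex1 stands in for any pattern; "asdf" is the example.
my $Regex1 = qr/asdf/;

# (?<!\S) is a zero-width negative lookbehind: "not preceded by a
# non-whitespace character". That is true at the very start of the
# string and right after any whitespace (including a newline), and
# the whitespace itself is not part of the match.
my $Regex2 = qr/(?<!\S)($Regex1)/;

# Single quotes keep "%$asdf" literal.
my $text = 'asdfyuio qwer uiop %$asdf vbnm asdfijk dkguwy fjuasdf';

# Collect the starting offset of every captured $Regex1 match.
my @offsets;
while ($text =~ /$Regex2/g) {
    push @offsets, $-[1];
}

print "@offsets\n";   # prints "0 31": the first and third asdf
```

Because both halves of the condition live in one lookbehind, no
alternation is needed, so an empty branch cannot sneak in and match
the empty string.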
--
Cheers,
Robbie Hatley
perl -le 'print "4o6e7o4f0w5llc7m"'
perl -le 'print "0ttp//7ww.7ell.3om/~4onewolf/"'