FAQ 6.17 Why don't word-boundary searches with "\b" work for me?

This message is one of several periodic postings to comp.lang.perl.misc
intended to make it easier for perl programmers to find answers to
common questions. The core of this message represents an excerpt
from the documentation provided with Perl.


6.17: Why don't word-boundary searches with "\b" work for me?

    (contributed by brian d foy)

    Ensure that you know what \b really does: it's the boundary between a
    word character, \w, and something that isn't a word character. That
    thing that isn't a word character might be \W, but it can also be the
    start or end of the string.

    It's not (not!) the boundary between whitespace and non-whitespace, and
    it's not the stuff between words we use to create sentences.

    In regex speak, a word boundary (\b) is a "zero-width assertion",
    meaning that it doesn't represent a character in the string, but a
    condition at a certain position.
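
    One way to see the zero-width behaviour is to substitute a visible
    marker at every position where \b succeeds. This is only a minimal
    sketch; the sample string and the variable names are made up for
    illustration.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen to include an apostrophe and a space.
my $text = "Perl's fun";

# Replace each zero-length \b match with a marker. Nothing is consumed,
# so every original character is still present afterwards.
(my $marked = $text) =~ s/\b/|/g;

print "$marked\n";    # |Perl|'|s| |fun|

# The marked copy is longer only by the number of markers inserted.
my $boundaries = length($marked) - length($text);
print "$boundaries boundaries\n";    # 6 boundaries
```

    Note that the apostrophe and the space each sit between two markers:
    the boundaries are positions next to them, not the characters
    themselves.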

    For the regular expression /\bPerl\b/, there has to be a word boundary
    before the "P" and after the "l". As long as something other than a word
    character (or nothing at all) precedes the "P" and follows the "l", the
    pattern will match. These strings match /\bPerl\b/:

            "Perl"    # no word char before P or after l
            "Perl "   # same as previous (space is not a word char)
            "'Perl'"  # the ' char is not a word char
            "Perl's"  # no word char before P, non-word char after "l"

    These strings do not match /\bPerl\b/.

            "Perl_"   # _ is a word char!
            "Perler"  # no word char before P, but one after l
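
    The two lists above can be checked with a short script. This is just
    a sketch; the array names are arbitrary.

```perl
use strict;
use warnings;

# The six example strings from the lists above.
my @candidates = ("Perl", "Perl ", "'Perl'", "Perl's", "Perl_", "Perler");

# Keep the strings that match, so the result can be inspected afterwards.
my @matched = grep { /\bPerl\b/ } @candidates;

for my $string (@candidates) {
    my $verdict = $string =~ /\bPerl\b/ ? "matches" : "does not match";
    print qq("$string" $verdict\n);
}
```

    Only the first four strings end up in @matched; "Perl_" and "Perler"
    fail because a word character follows the "l".
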

    You don't have to use \b to match words, though. You can also look for
    non-word characters surrounded by word characters. These strings match
    the pattern /\b'\b/:

            "don't"   # the ' char is surrounded by "n" and "t"
            "qep'a'"  # the ' char is surrounded by "p" and "a"

    These strings do not match /\b'\b/:

            "foo'"    # there is no word char after non-word '
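
    As a rough illustration of where this is useful, the same pattern can
    count apostrophes that sit inside words while skipping ones used as
    quote marks. The sample sentence is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical input mixing in-word apostrophes and single quotes.
my $text = "don't stop 'quoted' rock'n'roll";

# In list context, a /g match returns every match; assigning that list
# to () and then to a scalar yields the number of matches.
my $count = () = $text =~ /\b'\b/g;

print "$count apostrophes inside words\n";    # 3 apostrophes inside words
```

    The quote marks around "quoted" are not counted, because each has a
    non-word character (or nothing) on one side.
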

    You can also use the complement of \b, \B, to specify that there should
    not be a word boundary.

    In the pattern /\Bam\B/, there must be a word character before the "a"
    and after the "m". These strings match /\Bam\B/:

            "llama"   # "am" surrounded by word chars
            "Samuel"  # same

    These strings do not match /\Bam\B/:

            "Sam"      # no word boundary before "a", but one after "m"
            "I am Sam" # "am" surrounded by non-word chars
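
    The \B examples can be verified the same way. In this sketch the hash
    %expected simply records the outcomes listed above; its name and
    layout are arbitrary.

```perl
use strict;
use warnings;

# 1 means the string should match /\Bam\B/, 0 means it should not.
my %expected = (
    "llama"    => 1,
    "Samuel"   => 1,
    "Sam"      => 0,
    "I am Sam" => 0,
);

my $failures = 0;
for my $string (sort keys %expected) {
    my $got = $string =~ /\Bam\B/ ? 1 : 0;
    $failures++ if $got != $expected{$string};
    printf "%-10s %s\n", qq("$string"), $got ? "matches" : "does not match";
}

print "all results agree with the lists above\n" if !$failures;
```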


Documents such as this have been called "Answers to Frequently
Asked Questions" or FAQ for short.  They represent an important
part of the Usenet tradition.  They serve to reduce the volume of
redundant traffic on a news group by providing quality answers to
questions that keep coming up.

If you are somehow irritated by seeing these postings, you are free
to ignore them or add the sender to your killfile.  If you find
errors or other problems with these postings please send corrections
or comments to the posting email address or to the maintainers as
directed in the perlfaq manual page.

Note that the FAQ text posted by this server may have been modified
from that distributed in the stable Perl release.  It may have been
edited to reflect the additions, changes and corrections provided
by respondents, reviewers, and critics to previous postings of
these FAQs. The complete text of these FAQs is available on request.

The perlfaq manual page contains the following copyright notice.


    Copyright (c) 1997-2002 Tom Christiansen and Nathan
    Torkington, and other contributors as noted. All rights

This posting is provided in the hope that it will be useful but
does not represent a commitment or contract of any kind on the part
of the contributors, authors or their agents.
