Regex gurus question

I have a string which will always contain a letter followed by
numbers, eg "x12345"

I want to take the numbers and put them in another variable.

So I do this:

$test = "x12345";

$_ = $test;
/(\d+)/;
$justTheNumbers = $1;

print "justTheNumbers is $justTheNumbers\n";

and that works, $justTheNumbers is "12345", which is what I want.

The three-line way I do it seems clumsy to me though.  Is there a
simpler way to do it?

I mean, something that would occur more or less naturally to somebody
more familiar with Perl and Regex?  I suppose there's always some
simpler way to do anything ... it seems like I must be missing
something obvious though.

Joe Cosby
YOU can be more like "Bob" than you are now!

Re: Regex gurus question

Joe Cosby wrote:

> I have a string which will always contain a letter followed by
> numbers, eg "x12345"

s/will/is supposed to/;

Good programmers never say "always" or "never."  hehehe.  Unless you do
only trivial programming, Mr. Murphy is going to pay you several visits.  
You need to get in the habit of making your habitat unattractive to him.

The big problem here (other than the awkwardness of the code) is that if
Mr. Murphy does show up at your door, $justTheNumbers will contain a value
that might *look* OK, but really isn't.  For example, if you got a string
with *no* numbers in it, $justTheNumbers would get the numbers from the
last good string instead.  If you got a string like "x12a345" it would get
"12" which is misleading.

A general rule is that if you're using capturing parentheses, do *not* make
use of the $digit variables unless you've actually tested to make sure the
match succeeded.  Note that a list assignment from a match, as in Gunnar's
first solution, takes care of this; if the match fails, the result will be
undef rather than junk.
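
As a rough illustration of that rule, the sketch below uses a list assignment from the match and then tests the result with defined. The %result hash exists only to collect the outcomes for display and is not part of anyone's solution in this thread.

```perl
use strict;
use warnings;

my %result;
for my $test ("x12345", "xabc") {
    # In list context a successful match returns its captures;
    # a failed match returns an empty list, so the variable stays undef.
    my ($justTheNumbers) = $test =~ /(\d+)/;
    $result{$test} = $justTheNumbers;

    if (defined $justTheNumbers) {
        print "$test: got $justTheNumbers\n";
    }
    else {
        print "$test: no digits found\n";
    }
}
```

Because the variable is freshly assigned on every pass, a failed match cannot leak a value from an earlier string.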

The philosophy of "defensive programming" suggests that you should write

($justTheNumbers) = $test =~ /^[[:alpha:]](\d+)$/
    or die "unexpected format in $test: [$test]";

It may *look* like a lot of extra effort, but scores of programmers have
found that the few extra minutes of coding that such techniques entail
saved them many *hours* of time wasted tracking down subtle bugs.
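
One way to see the defensive version in action is to wrap it in a small sub and feed it good and bad strings. This is only a sketch: extract_digits is a hypothetical helper name, and trapping the die with eval is an assumption about how a caller might want to handle the failure.

```perl
use strict;
use warnings;

# extract_digits is a hypothetical helper, not something from the thread.
sub extract_digits {
    my ($test) = @_;
    # The list assignment yields 0 in boolean context when the match fails,
    # so the "or die" branch fires on any unexpected format.
    my ($digits) = $test =~ /^[[:alpha:]](\d+)$/
        or die "unexpected format: [$test]\n";
    return $digits;
}

for my $test ("x12345", "x12a345", "12345") {
    my $digits = eval { extract_digits($test) };
    if (defined $digits) { print "$test accepted: $digits\n" }
    else                 { print "$test rejected: $@" }
}
```

Only "x12345" is accepted; the other two strings are rejected with a message naming the offending value instead of producing a plausible-looking number.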

I'll admit, though, that Gunnar's second solution (sloppier, but at least
it will result in an empty string if there are no digits) was the first
thing that came to my mind.

Re: Regex gurus question

Eric Bohlman wrote:
[snip]

I guess I'm a sloppy programmer. I would have written it as:

( my $just_digits = $test ) =~ s/\D//g;

I would put the test for correct format, or any other input validation,
immediately after the input is received. Extra validation during
processing then becomes redundant. Having said that, validation of such
things as range will have to be done after this statement. So there is
an exception to every guideline.
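
A possible shape for that split, as a sketch only: read_validated is a hypothetical name for the input step, and the pattern it uses (one letter followed by digits, nothing else) is an assumption taken from the format described in the original question.

```perl
use strict;
use warnings;

# Hypothetical input step: reject anything that is not one letter
# followed by digits, so later code can assume the format.
sub read_validated {
    my ($raw) = @_;
    return $raw =~ /^[[:alpha:]]\d+\z/ ? $raw : undef;
}

my @digits;
for my $raw ("x12345", "x12345a", "x12345 y12345") {
    my $test = read_validated($raw);
    next unless defined $test;      # bad input never reaches processing

    # Processing step: safe only because $test was already validated.
    ( my $just_digits = $test ) =~ s/\D//g;
    push @digits, $just_digits;
}

print "@digits\n";   # prints: 12345
```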

    --- Shawn

Re: Regex gurus question

Shawn Corey wrote:
[snip]

That will work fine if the OP's actual data is the same as the example he
presented ($test = "x12345";). However, if the actual data looks something
like "x12345 y12345 z12345", then it will not produce a correct result.

use Perl;

Re: Regex gurus question

John W. Krahn wrote:

[snip]

But "x12345a" could also create problems. As I said, input validation
would be done before processing. By this point in the program, $test
will have valid data.

    --- Shawn

Re: Regex gurus question

Joe Cosby wrote:
[snip]

You can do it in one step:

     ($justTheNumbers) = $test =~ /(\d+)/;

Or, if the only thing you need to do is cut off the first character:

     $justTheNumbers = substr $test, 1;

Gunnar Hjalmarsson

Re: Regex gurus question

On Thu, 07 Oct 2004 01:25:41 +0200, Gunnar Hjalmarsson wrote:

[snip]

Thanks, much appreciated.  

Joe Cosby
"The ministry of communication is duty-bound to make the use of the
 Internet impossible."    - Taliban leader Mullah Mohammad Omar

Re: Regex gurus question

On Wed, 06 Oct 2004 16:10:18 -0700, Joe Cosby wrote:
[snip]

You could easily write what you have written in only one line:

  ( $justTheNumbers ) = $test =~ m/(\d+)/;

However, why use a regular expression at all?  If you are _sure_ that the
string will always begin with exactly one non-digit character, you could
use 'substr':

  $justTheNumbers = substr( $test, 1 );
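
The two approaches agree only while that assumption holds. The sketch below runs both on one well-formed string and two malformed ones; the %by_substr and %by_regex hashes are illustrative names used to hold the results.

```perl
use strict;
use warnings;

my ( %by_substr, %by_regex );
for my $test ("x12345", "xy12345", "12345") {
    # substr drops the first character, whatever it is.
    $by_substr{$test} = substr( $test, 1 );

    # The regex takes the first run of digits, wherever it starts.
    ( $by_regex{$test} ) = $test =~ m/(\d+)/;

    print "$test: substr=$by_substr{$test} regex=$by_regex{$test}\n";
}
```

For "x12345" both give "12345". For "xy12345" substr keeps the stray "y", and for "12345" it silently drops the leading digit, so substr is the simpler choice only when the format is guaranteed.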

"Software is like sex: It's better when it's free." (Linus Torvalds)
