Cost of qr// vs m//

Here is a piece of code:


use strict;
use warnings;

use Time::HiRes;
use Benchmark;

my @strings =  map { sprintf "%08X\n", rand(0xffffffff); } 1 .. 100;

my $r = qr/some/;

sub compiled {
     /$r/ for (@strings);
}

sub live {
     /some/ for (@strings);
}

my $results = Benchmark::timethese(100000, {
     'compiled'  =>  \&compiled,
     'live'      =>  \&live,
});

Benchmark::cmpthese($results);

Running it gives me:

Benchmark: timing 100000 iterations of compiled, live...
   compiled:  2 wallclock secs ( 2.67 usr +  0.00 sys =  2.67 CPU) @ 37453.18/s (n=100000)
       live:  1 wallclock secs ( 0.90 usr +  0.00 sys =  0.90 CPU) @ 111111.11/s (n=100000)
              Rate compiled     live
compiled  37453/s       --     -66%
live     111111/s     197%       --

On: This is perl 5, version 14, subversion 2 (v5.14.2) built for  

I don't really understand these results: qr// seems to cost more, but I
can't find anything in the perldoc about that.

Am I missing an error in this benchmark?
Does somebody have any information about the overhead I see?

If I had to guess, I would suspect the dereferencing cost of a Regexp ref.
Could that be right?


Re: Cost of qr// vs m//

Adrien BARREAU wrote on 28.03.2014 13:32:

That's done twice ...
    $r for (@strings)
and you will see a speed advantage for the compiled version.

Regards, Horst
<remove S P A M 2x from my email address to get the real one>

Re: Cost of qr// vs m//

That's hardly surprising, given that this code doesn't do a regexp match
at all :-).

Re: Cost of qr// vs m//

Rainer Weikusat wrote on 28.03.2014 14:57:
Ah, sorry, I misunderstood the 'used standalone' in the qr section of
perldoc perlop.

Regards, Horst
<remove S P A M 2x from my email address to get the real one>

Re: Cost of qr// vs m//

This mystery is easily explained by looking at the decompiled/
disassembled internal representation (I've omitted everything except the
actual loop). 'live' becomes

[rw@sable]/tmp#perl -MO=Concise,live
-           <1> null K/1 ->b
a              <|> and(other->7) K/1 ->b
9                 <0> iter s ->a
7                    </> match(/"some.*3"/) v/RTIME ->8
8                    <0> unstack s ->9

In contrast to that, 'compiled' is

-           <1> null K/1 ->i
h              <|> and(other->b) K/1 ->i
g                 <0> iter s ->h
e                    </> match() vK/RTIME ->f
d                       <|> regcomp(other->e) sK/1 ->e
b                          <1> regcreset sK/1 ->c
c                             <0> padsv[$r:601,602] s ->d
f                    <0> unstack s ->g

For the qr'ed case, it actually calls into the top-level regexp compiler
routine (pp_regcomp) on each iteration. When the passed argument contains
a reference to an already compiled regexp, that routine extracts the
compiled regexp from it instead of calling the 'real' regexp compiler.
Judging from the (5.10.1) C code, the compiled regexp is also copied to
'a temporary object' for each match.
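
The difference between the two op trees can also be checked from within a script. The following is only a minimal sketch, assuming a Perl build with PerlIO (needed for walk_output to write into a string); the helper name op_tree is made up for this example. It renders both subs with B::Concise and reports whether a regcomp op shows up:

```perl
use strict;
use warnings;
use B::Concise ();

my @strings = map { sprintf "%08X\n", rand(0xffffffff) } 1 .. 100;
my $r = qr/some/;

my $compiled = sub { /$r/   for @strings };
my $live     = sub { /some/ for @strings };

# Hypothetical helper: render the op tree of a code reference into a
# string instead of printing it to STDOUT.
sub op_tree {
    my ($code) = @_;
    my $out = '';
    B::Concise::walk_output(\$out);            # redirect the walker's output
    B::Concise::compile('-basic', $code)->();  # build the walker and run it
    return $out;
}

my $compiled_tree = op_tree($compiled);
my $live_tree     = op_tree($live);

# Report for each variant whether the tree contains a regcomp op.
printf "%-8s uses regcomp: %s\n",
    $_->[0], ($_->[1] =~ /\bregcomp\b/ ? 'yes' : 'no')
    for [ compiled => $compiled_tree ], [ live => $live_tree ];
```

Printing $compiled_tree and $live_tree in full gives listings of the same kind as the ones above, although op labels and details vary between Perl versions.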

A more interesting result: Adding a

my $other = 'some';

sub interpolated {
    /$other/ for @strings;
}

shows that this is faster (at least for me) as well. Presumably, this
happens because the 'last regex compiled for this op' is cached 'in the
op' and it will be re-used without recompilation if the 'source pattern'
didn't really change. In this case, no 'temporary copy' is made.

Re: Cost of qr// vs m//

On 2014-03-28 14:51, Rainer Weikusat wrote:

With a recent Perl:

perl -Mstrict -wE'
   use Benchmark ":hireswallclock";

   say "\nPerl $]\n";

   my @strings = map { sprintf "%08X\n", rand(0xffffffff); } 1 .. 100;

   my $qr   = qr/some/;
   my $some = "some";

   my $results = Benchmark::timethese( -3, {
     compiled => sub { /$qr/   for @strings },
     literal  => sub { /some/  for @strings },
     interpol => sub { /$some/ for @strings },
   });

   say "";
   Benchmark::cmpthese($results);
'

Perl 5.019006

Benchmark: running compiled, interpol, literal for at least 3 CPU seconds...
   compiled: 3.14617 wallclock secs ( 3.13 usr +  0.00 sys =  3.13 CPU) @ 19945.69/s (n=62430)
   interpol: 3.03571 wallclock secs ( 3.01 usr +  0.00 sys =  3.01 CPU) @ 59009.30/s (n=177618)
    literal: 3.09564 wallclock secs ( 3.09 usr +  0.00 sys =  3.09 CPU) @ 106284.79/s (n=328420)

              Rate compiled interpol  literal
compiled  19946/s       --     -66%     -81%
interpol  59009/s     196%       --     -44%
literal  106285/s     433%      80%       --


Re: Cost of qr// vs m//

I do not see how it explains anything…


With 5.8.8 (the last version for which I bear some responsibility),
the timing is
  qr     2.28
  q     2.19
  inline 1.78
for (I do believe in Benchmark):

  D:\ilya\math>time D:\Programs\win32_utils\perl\bin\perl.exe -wle "$r=qr/some/; /$r/ for 1e5..1e7"
  0.00user 0.07system 0:02.28elapsed 3%CPU (0avgtext+0avgdata 252416maxresident)k
  0inputs+0outputs (1021major+0minor)pagefaults 0swaps

  D:\ilya\math>time D:\Programs\win32_utils\perl\bin\perl.exe -wle "$r= q/some/; /$r/ for 1e5..1e7"
  0.01user 0.12system 0:02.19elapsed 6%CPU (0avgtext+0avgdata 252416maxresident)k
  0inputs+0outputs (1021major+0minor)pagefaults 0swaps

  D:\ilya\math>time D:\Programs\win32_utils\perl\bin\perl.exe -wle "           /some/ for 1e5..1e7"
  0.00user 0.09system 0:01.78elapsed 5%CPU (0avgtext+0avgdata 252416maxresident)k
  0inputs+0outputs (1021major+0minor)pagefaults 0swaps

*This* is reasonable, and matches what I intended with qr//.  The
behaviour observed with newer versions MUST be a bug.


Re: Cost of qr// vs m//

It's much better than the current behaviour, but reusing the precompiled
regex is still *slower* than just recompiling the regex every time. I
thought the intention was to be faster - ideally almost as fast as an
inline regexp.

(Of course /some/ is an atypically simple regex, the results may be
different with a more complex regex).
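
To try this with something less trivial, a sketch along the following lines could be used. The pattern is a hypothetical "more complex" regex picked only for illustration, and the sub names are made up; no particular outcome is implied. Each variant returns its match count so that the three can be checked against each other before the timings are compared:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Same kind of input as in the original post: 100 random hex strings.
my @strings = map { sprintf "%08X\n", rand(0xffffffff) } 1 .. 100;

# Hypothetical "more complex" pattern, as a string and as a qr// object.
my $str = '^(?:[0-9A-F]{2})+?(?:AB|CD|EF)[0-9]+$';
my $qr  = qr/^(?:[0-9A-F]{2})+?(?:AB|CD|EF)[0-9]+$/;

# Each variant returns the number of matching strings.
sub compiled { scalar grep { /$qr/ } @strings }
sub interpol { scalar grep { /$str/ } @strings }
sub literal  { scalar grep { /^(?:[0-9A-F]{2})+?(?:AB|CD|EF)[0-9]+$/ } @strings }

# Run each variant for at least one CPU second and print the comparison.
cmpthese(-1, {
    compiled => \&compiled,
    interpol => \&interpol,
    literal  => \&literal,
});
```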


   _  | Peter J. Holzer    | Curse of electronic word processing:
|_|_) |                    | you keep polishing at your text until
| |   |         | the parts of the sentence no longer
__/   | | fits together. -- Ralph Babel

Re: Cost of qr// vs m//

There is no recompilation entering the equation at all.  At most,
there is a check that the string did not change from the last time,
AND THEN a reuse of previously compiled expression.  
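
One way to probe this is to run the same match op once with a pattern string that never changes and once with a string that differs from the previous iteration. This is only a sketch: the array and sub names are made up, and no timings are claimed. If the description above holds, only the second variant should pay for a recompilation on each iteration:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @strings = map { sprintf "%08X\n", rand(0xffffffff) } 1 .. 100;

# Two pattern sources per variant; the names are made up for this sketch.
my @unchanged = ('some', 'some');   # same string on every iteration
my @changing  = ('some', 'same');   # differs from the previous iteration

# Match each string against the pattern source selected by the loop
# index and return the number of matches. Both variants go through the
# same match op, so only the pattern strings differ.
sub run_with {
    my ($pats) = @_;
    my $hits = 0;
    for my $i (0 .. $#strings) {
        my $p = $pats->[$i % 2];
        $hits++ if $strings[$i] =~ /$p/;
    }
    return $hits;
}

cmpthese(-1, {
    unchanged => sub { run_with(\@unchanged) },
    changing  => sub { run_with(\@changing) },
});
```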

No, the intent was to be “as fast as possible” — but not faster.  The
problem (IIRC) is the extra redirections needed for “foolproofing”.  I did not
want people expecting that
  my $rx = qr(foo);    # Actually, *my* implementation was using: my $rx = study /foo/;  
would make something accessible in $$rx (so that one may change
internal implementation of RExes without breaking people’s XSUBs — as
v5.20 did with COW).  So I made $$rx undef, and “hid” the
compiled form in a magic slot of $$rx.

These extra redirections make things slower, but made it much easier
to manipulate this feature through p5p police.  Theoretically, with
Perls of today, this may be changed.

Yes, the slowdown is really tiny — usually I fight for every cache
miss, but here I thought it is going to be hidden by the noise.


BTW, looking at timing other people did with newer Perls, somebody
thought that malloc()+free() is a free lunch.  Even with “my”
malloc(), the cost of malloc()+free() is about the same as dispatch of
3 Perl opcodes.  (And, with code as trivial as the one discussed, opcode
dispatch should be the principal contribution to the runtime.)

Hope this helps,