Optimisation for tight loops...

Re: Optimisation for tight loops...

Mostly because it's not worth it. Loop unrolling is a technique that
works well for real CPUs because jumps are expensive (or rather, they
were expensive - I'm not sure if loop unrolling is still worthwhile on
modern CPUs), but that doesn't apply to typical bytecode interpreters.

Here's a simple example. A loop which fills an array with a constant
value:

#!/usr/bin/perl

use warnings;
use strict;

use Benchmark qw(:hireswallclock timethese cmpthese);

my @a;

my $results = timethese(-3, {
        normal => sub {
            for (0 .. 999_999) {
                $a[$_] = 0;
            }
        },
        unroll4 => sub {
            for (0 .. 249_999) {
                $a[$_ * 4 + 0] = 0;
                $a[$_ * 4 + 1] = 0;
                $a[$_ * 4 + 2] = 0;
                $a[$_ * 4 + 3] = 0;
            }
        },
        unroll4c => sub {
            my $c = 0;
            for (0 .. 249_999) {
                $a[$c++] = 0;
                $a[$c++] = 0;
                $a[$c++] = 0;
                $a[$c++] = 0;
            }
        },
    }
);

cmpthese($results);
__END__

           Rate  unroll4 unroll4c   normal
unroll4  4.04/s       --     -33%     -37%
unroll4c 6.00/s      49%       --      -6%
normal   6.37/s      58%       6%       --

The result is not surprising: the unrolled versions are slower.
The extra computations cost more than the lower loop overhead saves,
because in the Perl interpreter each of them is a separate op that has
to be dispatched. In machine code, "unroll4" could use special
addressing modes instead of the explicit multiplication and addition, or
the computations could be performed in parallel with the memory accesses.
The same applies to "unroll4c".
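
As a rough illustration of the same point (this is not part of the benchmark above): in an interpreter, what tends to help is executing fewer ops, for example by letting a builtin do the whole job in C. The sketch below assumes a smaller array of 100_000 elements so that it runs quickly, and the helper names fill_loop and fill_repeat are made up for this example. No timings are claimed here; run it to see what your perl does.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $N = 100_000;    # assumption: smaller than the original test

# Explicit loop: the interpreter runs one assignment per element.
sub fill_loop {
    my ($n) = @_;
    my @a;
    $a[$_] = 0 for 0 .. $n - 1;
    return \@a;
}

# List repetition: the x operator builds the whole list in one op.
sub fill_repeat {
    my ($n) = @_;
    my @a = (0) x $n;
    return \@a;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    loop   => sub { fill_loop($N) },
    repeat => sub { fill_repeat($N) },
});
```

Both subs return a reference to a fresh array, so unlike the original test they also pay for allocating it each time; that seemed a fair price for keeping the two variants easy to compare.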

Most of the time the values don't change underneath, and
in many cases the compiler could detect this. But such analysis
is expensive. When the program is compiled on every run, that cost is
hard to amortize, even if there is a noticeable speedup, which doesn't
seem to be the case for loop unrolling.

There are other techniques which are cheaper and gain more, for example JIT.

hp

Re: Optimisation for tight loops...

Krishna Chaitanya wrote:

If by "tight loop" one means the sort of thing
you need for pixel-by-pixel image processing,
FFT or real time sound generation, yeah, perl is too slow.

Otherwise - not an issue.

BugBear

Re: Optimisation for tight loops...

Most "gurus" would say that you would be better served by profiling your
code to see where any real bottlenecks are, than by listening to random
uninformed guesswork.
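
A minimal sketch of what "measure first" can look like, assuming nothing more than the core Time::HiRes module. The helper name timed and the two workloads (filling and sorting an array) are invented for illustration; a real profiler such as Devel::NYTProf gives far more detail than this.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run a code ref and return elapsed wall-clock seconds.
sub timed {
    my ($code) = @_;
    my $t0 = [gettimeofday];
    $code->();
    return tv_interval($t0);    # interval from $t0 until now
}

my @data;

# Time two made-up phases of a program separately.
my %elapsed = (
    fill => timed(sub { @data = map { $_ % 97 } 0 .. 199_999 }),
    sort => timed(sub { @data = sort { $a <=> $b } @data }),
);

# Report each phase so the slower one is obvious.
printf "%-5s %.4f s\n", $_, $elapsed{$_} for sort keys %elapsed;
```

Whichever phase dominates is the one worth looking at; the other can be left alone no matter how "tight" its loop looks.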

sherm--

--
Sherm Pendley
<http://camelbones.sourceforge.net>
Cocoa Developer