Optimisation for tight loops...

Re: Optimisation for tight loops...

Mostly because it's not worth it. Loop unrolling is a technique that
works well for real CPUs because jumps are expensive (or rather, they
were expensive - I'm not sure if loop unrolling is still worthwhile on
modern CPUs), but that doesn't apply to typical bytecode interpreters.

Here's a simple example: a loop which fills an array with a constant.


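use warnings;
use strict;

use Benchmark qw(:hireswallclock timethese cmpthese);

my @a;

# Run each variant for at least 3 CPU seconds.
my $results = timethese(-3, {
    # Plain loop: one assignment per iteration.
    normal => sub {
        for (0 .. 999_999) {
            $a[$_] = 0;
        }
    },
    # Unrolled four times, index computed from the loop variable.
    unroll4 => sub {
        for (0 .. 249_999) {
            $a[$_ * 4 + 0] = 0;
            $a[$_ * 4 + 1] = 0;
            $a[$_ * 4 + 2] = 0;
            $a[$_ * 4 + 3] = 0;
        }
    },
    # Unrolled four times, index kept in a separate counter.
    unroll4c => sub {
        my $c = 0;
        for (0 .. 249_999) {
            $a[$c++] = 0;
            $a[$c++] = 0;
            $a[$c++] = 0;
            $a[$c++] = 0;
        }
    },
});

# Print the comparison table.
cmpthese($results);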


           Rate  unroll4 unroll4c   normal
unroll4  4.04/s       --     -33%     -37%
unroll4c 6.00/s      49%       --      -6%
normal   6.37/s      58%       6%       --

The result is not surprising: the unrolled versions are slower.
The extra index computations cost more than the reduced loop overhead saves.
In machine code, special addressing modes could be used in "unroll4"
instead of the explicit multiplication and addition, or the computations
could be performed in parallel to the memory accesses. Similarly in
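
As a rough illustration of the same point (in an interpreter, what counts is how many operations get executed per element), here is a minimal sketch comparing the explicit loop with Perl's built-in repetition operator, which builds the list inside the interpreter. The names fill_loop and fill_repeat and the array size are my own choices for illustration, not part of the benchmark above, and the numbers will vary with Perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $n = 100_000;    # assumed size; smaller than the 1_000_000 used above

# Explicit loop: the interpreter runs Perl-level code for every element.
sub fill_loop {
    my @a;
    $a[$_] = 0 for 0 .. $n - 1;
    return \@a;
}

# Repetition operator: the list of zeros is built by the built-in "x"
# operator rather than by per-element Perl code.
sub fill_repeat {
    my @a = (0) x $n;
    return \@a;
}

# Run each variant for at least 1 CPU second and print the comparison.
cmpthese(-1, {
    loop   => \&fill_loop,
    repeat => \&fill_repeat,
});
```

Both subs produce the same array, so any difference in the rates comes from how the work is expressed, not from how much data is written.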

Most of the time the values don't change underneath, and
in many cases this could be detected by the compiler. But such analysis
is expensive. When the program is compiled on every run, this is hard to
amortize, even if there is a noticeable speedup, which doesn't seem to be
the case for loop unrolling.

There are other techniques which are cheaper and gain more, for example JIT.


Re: Optimisation for tight loops...

In reply to Krishna Chaitanya:

If by "tight loop" one means the sort of thing
you need for pixel-by-pixel image processing,
FFT or real time sound generation, yeah, perl is too slow.

Otherwise - not an issue.


Re: Optimisation for tight loops...

Most "gurus" would say that you would be better served by profiling your
code to see where any real bottlenecks are, than by listening to random
uninformed guesswork.
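
A minimal sketch of what "measure first" could look like using only the core Time::HiRes module. The helper time_section and the two workloads are hypothetical stand-ins, not anything from this thread; a dedicated profiler such as Devel::NYTProf (from CPAN) would give per-line detail instead of these coarse timings.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run a code ref once, return elapsed wall-clock seconds.
sub time_section {
    my ($code) = @_;
    my $t0 = [gettimeofday];
    $code->();
    return tv_interval($t0);
}

# Made-up workload standing in for the real program's data.
my @data = map { $_ * 2 } 1 .. 200_000;

my %elapsed = (
    sort_numeric => time_section(sub { my @s = sort { $a <=> $b } @data }),
    sum_loop     => time_section(sub { my $sum = 0; $sum += $_ for @data }),
);

# List the sections, slowest first, to see where the time actually goes.
printf "%-12s %.4f s\n", $_, $elapsed{$_}
    for sort { $elapsed{$b} <=> $elapsed{$a} } keys %elapsed;
```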


Sherm Pendley
Cocoa Developer