Fast Factorial Script: 10000! in 1.5sec.

Do you have a question? Post it now! No Registration Necessary.  Now with pictures!

•  Subject
• Author
• Posted on

Speed race: my script computes 10000! in 1.5sec. (No recursion
this time.) Can you make a script that's faster?  :-)

Actually, I wrote 2 versions. The first version is
unlimited-precision; it prints every digit of 10000!
in about 2.0 seconds:

#! /usr/bin/perl
#  /rhe/scripts/math/factorial
use v5.14;
use strict;
use warnings;
use Math::BigInt;
my \$x = \$ARGV[0];
my \$f = Math::BigInt->bfac(\$x);
say "\$x! = \$f";
exit 0;

But the number it prints for 10000! is too big to even look at,
much less comprehend, because it's over 35000 digits. So I wrote
a low-precision version, which only takes about 1.5sec to compute
10000!:

#! /usr/bin/perl
#  /rhe/scripts/math/lpfactorial
use v5.14;
use strict;
use warnings;
use Math::BigInt;
my \$x = \$ARGV[0];
my \$f = Math::BigInt->bfac(\$x);
\$f->bround(5);
my \$s = \$f->bsstr();
say "\$x! = \$s";
exit 0;

I wonder how BigInt's "bfac()" achieves such speed?
I tried iterative multiplication (\$f *= \$_ for 1..\$x)
but it was much slower, about 5.0sec instead of 1.5sec,
so obviously bfac() is doing something different.

And, in case you're wondering,
10000! = about 2.8463E35659

Vastly larger than a googol (1E100).

But vastly smaller than a googolplex
(1E100000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000).

--
Cheers,
Robbie Hatley
Midway City, CA, USA
perl -le 'print "4o6e7o4f0w5llc7m"'
http://www.well.com/user/lonewolf/

gamma function?

-- HASM

Re: Fast Factorial Script: 10000! in 1.5sec.

TXR language (that I made):

Via built-in "n perm k" function (numbe of sequence sof k items chosen from a set of n),
not printing the results:

\$ time ./txr -e '(n-perm-k 10000 10000)'
real    0m0.096s
user    0m0.072s
sys     0m0.020s

C code:

val n_perm_k(val n, val k)
{
return rising_product(plus(minus(n, k), one), n);
}

This rising_product, in turn, naively iterates through the numbers and
multiplies them together; nothing special. The code is in C, but it uses the
high level functions like plus, mul, which dispatch on the type in an
unoptimized, and call into the bignum library.  The bignum library is Michael
Bromberger's MPI, with some hacks here and there by me.

Interpreted code using a naive for loop is quite a bit slower:

\$ time ./txr -e '(for ((i 2) (acc 1)) ((<= i 10000)) ((set acc (* acc i)) (inc i)))'
real    0m0.152s
user    0m0.124s
sys     0m0.028s

This is 32 bit code, on this hardware:

\$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 42
model name      : Intel(R) Core(TM) i5-2310 CPU @ 2.90GHz
stepping        : 7
cpu MHz         : 2862.094
cache size      : 6144 KB

TXR Lisp is completely, utterly unoptimized; evaluation proceeds by
interpretation of the raw abstract syntax tree, represented as nested lists.

So far I haven't really put any effort into making it go fast.

Anything significantly slower has to go out of its way to achieve bad performance.

You can see that a lot of time is chewed up in the interpretation of the for
loop, when you compare a more functional approach. Instead of calling a built-in
function, we can write some functional code. This competes very well against
the C rising_product, in spite of the overheads of range producing a lazy list,
which is consumed inside reduce:

\$ time ./txr -e '[reduce-left * (range 2 10000)]'
real    0m0.099s
user    0m0.072s
sys     0m0.024s

The conversion of binums to text is very inefficient, using a naive algorithm that
divides by 10 repeatedly and takes the remainder; the time jumps to 1.4 seconds.
A big improvement could be made there; there are much faser better ways to turn
bignums into base ten (and other radix) digits.  It's a good idea to separate
this if you want to properly benchmark just the raw calculation:

\$ time ./txr -p '[reduce-left * (range 2 10000)]' > /dev/null
real    0m1.403s
user    0m1.384s
sys     0m0.016s

As far as bignums go, I chose the MPI library for its simplicity and relatively
clean code (plus liberal license compatible with BSD). It's not a feature-rich
or fast library.  The integer square root was quite bad; I rewrote it to make
it faster. The multiplication is basic; it doesn't even use Karatsuba or
anything.

For a faster factorial, we probably don't want to be doing this naive
multiplication.  Basically, we don't want to be accumulating one huge number,
and multiplying it by the next tiny number in the sequence.

Re: Fast Factorial Script: 10000! in 1.5sec.

El 15/04/15 a las 02:05, Robbie Hatley escribiÃ³:

\$ time perl test.gmp
(a very big number)
real    0m0.028s
user    0m0.024s
sys    0m0.004s

No recursive, off course.

--
http://www.telecable.es/personales/gamo/
The generation of random numbers is too important to be left to chance

Re: Fast Factorial Script: 10000! in 1.5sec.

[...]

Another remark I can't help making here: Most loops I encounter in
practice look rather like this and while this can also be written as

sub of {
my \$a;

for (\$a = int(\$_[0] / 2) + 1, my (\$l, \$r, \$o) = (\$a * \$a, 1, 1), my \$c;  \$c = \$l - \$r; \$o += 2, \$r += \$o) {
\$a *= \$c;
}

return \$a;
}

this "isn't necessarily recommended if you're optimizing for
maintainability" ...

Re: Fast Factorial Script: 10000! in 1.5sec.

On 20.04.15 13:30, Rainer Weikusat wrote:

An attempt at rewriting from the point of view of functionist
principles, using LET bindings only (single assignments).
The performance is not as bad as I had thought it would be.

sub odd_fac2
{
my (\$accu, \$left, \$right, \$odd, \$cur);

(\$accu, \$odd, \$right) = (@_ > 1 ? (shift) : int((shift) / 2) + 1,
(shift || 1),
(shift || 1));
\$left = shift || \$accu * \$accu;
if (\$cur = \$left - \$right) {
return odd_fac2(\$accu * \$cur, \$odd + 2, \$right + (\$odd + 2),
\$left);
}

return \$accu;
}

Re: Fast Factorial Script: 10000! in 1.5sec.

[...]

An attempt at rewriting from the point of view of functionist

A trick I learned (by accident, ie, I taught myself, not because anybody
bothered teaching stuff like that) while being confronted with
functional languages for the first time at university Kaiserslautern:
Whenever the 'structure' of the argument list has to change, write a new
function. Considering this (and considering that assignments ought to be
avoided unless as workaround for limitations of Perl and that we should
surely have dynamic scoping available):

sub odd_fac
{
return odd_m_fac(int(\$_[0] / 2) + 1);
}

sub odd_m_fac
{
return odd_r_fac(\$_[0], 1, 3) for \$left = \$_[0] * \$_[0];
}

sub odd_r_fac
{
my (\$accu, \$right, \$odd) = @_;

return 0 unless \$accu;
return \$_ ? \$_ : \$accu for odd_r_fac(\$accu * (\$left - \$right), \$right + \$odd, \$odd + 2);
}

Re: Fast Factorial Script: 10000! in 1.5sec.

[...]

... ops ...

sub odd_m_fac
{
for \$left (\$_[0] * \$_[0]) {
return odd_r_fac(\$_[0], 1, 3)
}
}

more functionalisms (was: Fast Factorial Script: 10000! in 1.5sec.)

[...]

I don't really like this, either, because Georg was handling the return
values in a much more sensible way.

sub odd_r_fac
{
my (\$accu, \$right, \$odd) = @_;

for (\$left - \$right) {
return \$_ ? odd_r_fac(\$accu * \$_, \$right + \$odd, \$odd + 2) : \$accu;
}
}

NB: This purposely uses the foreach block form to emphasize the
simiarity to a (let ((..)) ...).

Re: more functionalisms

These work nicely if you have very large numbers as once you break into
exponentials all is lost.

I calcuated 1000! to the integer value - no exponential.
I used normal integers not double precision that adds memory and more
troubles.

I used an array and multiplied the elements by the next value and kept
each element 9 or less.  Carrying when needed to the next array element
and on and on.  It takes time, but the number comes out.

I realize Perl has some power in large numbers on 32 and 64 bit
machines.  But in 8 bit machines one did fancy tricks.

Martin

On 4/20/2015 4:25 PM, Rainer Weikusat wrote: