Do you have a question? Post it now! No Registration Necessary. Now with pictures!
posted on
May 18, 2010, 8:18 pm
May 18, 2010, 8:18 pm
I recently implemented a C hashing algorithm in Perl and wasted about
8 hours of my time trying to debug it because of a stupid assumption
on my part. When doing a lot of shifts and bitwise xor's it's easy
for 32bit integers to overflow into double precision floating point
values. I recognized this early on and thought I was capturing only
the lowest 32 bits with ($x &= 0xffffffff). That simply does not work!
The way I got around it was with ($x %= 4294967296). In hindsight it's
easy for me now to recognize why the first not only doesn't work but
doesn't even make sense with the mixed number types. I hope this helps
someone else. Happy bittwiddling!
Re: bittwiddling on 32bit machines
Depending on what you're doing, promotion to double *should* safely give
you 53 bits of integer precision, but you may want to consider
rebuilding your perl with Duse64bitint to avoid type conversion all
over the place. The standard FreeBSD build has been built with this
option for a while now, and there is some talk of using it for the
Debian builds of 5.12.
Ben
Re: bittwiddling on 32bit machines
Having int wider than float type is a VERY large can of worms (well,
at least it was the last time I was looking; did things change
recently?). What is reported above (which is, essentially, a bug of
the implementation of `use integer') is a very small can of worms.
Does not look like a fair trade...
Ilya
Re: bittwiddling on 32bit machines
Looks like I might have been not very specific. The can of worms I
meant is with the semantic, not implementation  things "not behaving
as expected" vs things behaving "obviously wrong".
This thread is one example of the "first category"  if one forgets
what happens with `use integer', which is obviously a bug (BTW, is it
reported yet?).
Yours,
Ilya
Re: bittwiddling on 32bit machines
Can you give an example of things not behaving as expected? AFAICS using
long integers makes numeric semantics *more* predictable, by avoiding
conversion to floats for longer.
In any case, this has been an issue ever since perl started building on
64bit platforms with only a 53bit double type.
Not by me. With a perl with 32bit IVs I get
~% perl E'
my $x = (1<<30); say $x;
$x <<= 1; say $x;
$x <<= 1; say $x;'
1073741824
2147483648
0
I presume it's the final '0' you consider a bug, not the negative value?
I'm not sure I agree that's a bug, since perldoc integer says
 Finally, "use integer;" also has an additional affect on the bitwise
 operators. Normally, the operands and results are treated as
 unsigned integers, but with "use integer;" the operands and results
 are signed. This means, among other things, that ~0 is 1, and 2 &
 5 is 6.

 Internally, native integer arithmetic (as provided by your C
 compiler) is used. This means that Perl’s own semantics for
 arithmetic operations may not be preserved.
and a C program the does the same operations on a signed 32bit integer
gives the same results.
Ben
Re: bittwiddling on 32bit machines
Technically there are no "integer types", they are integral types.
"int" is referred to as the machine word (the natural width of the
data registers) which provide for a bit width in the range of
char <= short <= int <= long. All being unsigned.
"signed" is the shortcut here, not unsigned.
"Integers" are defined by the set of whole numbers .
I sure would hate to look at C source code that had
"signed" in it, although some compilers have the option to
convert the default to unsigned (/J on MS), except where
"signed" is denoted.
So, we could agree that C program integral types are either
unsigned, or the default, which is signed (the synonym for the default),
and that signed int is redundant (in the default case), and is would be
more specific as a 32bit long.
sln
Re: bittwiddling on 32bit machines
Integer is what it is, a single whole number from the set of all whole numbers.
There is no type of integer. There are subsets, they are called integral types
in computing science.
If the phrase is loosely used, thats a shame. C/C++ standard and all references
to
it use integral types as classifier of the subsets.
sln
Re: bittwiddling on 32bit machines
I was presuming that such things happens left and right, so I did not
try to remember the problems people were reporting. But I may try to guess...
And: I do not have Perl with 64bit IV AND 64bit NV around, so all
this is just a conjecture: I expect the the following things to break:
rounding by int($n + 0.t);
$x + 0.5 >= $x
etc... (These are just the first things which come to my mind; real
problems are usually less "obvious"...)
For bitwise operations, the negative value "behaves the same" as the
positive one, right? (At least it WAS the reason why I designed the
Perl "numeric system" the way it is...) Then I have no problem with
the negative number. And I definitely have no problem with 0.
I was under assumption that the OP's problem was that shifts were
propagating to floats, and the results (above UV_MAX) converted to
UVs were becoming UV_MAX... But maybe I misread what he wrote...
Yours,
Ilya
Re: bittwiddling on 32bit machines
what is $n here (conceptually)? An integer? Then it doesn't make sense
to round it. A floating point value? Then adding 0.5 doesn't change the
type and whether IVs are 32 bit or 64 bit doesn't matter.
Again, this is already "broken" (i.e., a false assumption) on large
numbers, making IVs 64 bit doesn't change that.
A better example would be where something like
$large_even_integer/2
where the result is exact in integer arithmetic but must be rounded in
FP arithmetic. However, perl (at least since 5.8.8) does the right thing
here: The result is an IV with the exact result. (how does it do this?)
hp
Re: bittwiddling on 32bit machines
The code is in pp_divide in pp.c. Basically, if there is a chance the
division will produce an exact integer result that can't be represented
as a float, the code will
 make both numbers positive
 do the division in unsigned ints
 if the result was exact (checked with an unsignedint multiply)
 return an unsigned int if the result should be positive,
 return a (negated) signed int if the result should be negative
and it fits,
 otherwise cast the int to a float and return the negation of
that (which may be inaccurate, but that can't be helped);
 else
 do a floatingpoint division
Ben
Re: bittwiddling on 32bit machines
Not specific enough.
Because it is already an integer. Rounding it is useless since it is
guarantueed to give you the same value (unless you want to round to the
nearest integer representable as an FP value, then the expression does
what you want).
Please be more specific.
Sorry, I misread the ">=" as ">".
But from what I see, ($x + 0.5 >= $x) actually seems to be true for all
numbers, because the comparison is done on the NVs.
That, however, does lead to a result which may surprise the unwary:
#!/usr/bin/perl
use warnings;
use strict;
no warnings 'portable';
use Devel::Peek;
my $x1 = 18014398509481728.000000; # exact FP value
my $x2 = 18014398509481729; # exact 64 bit int value
if ($x1 >= $x2) {
print "surprise!\n";
}
hp
Re: bittwiddling on 32bit machines
Specific enough for me. What part of
perldoc perlnumber
do you not agree with?
I do not see how a correct answer may be deemed "useless". Do you
want to allow sqrt(1) to be 75 since "it does not make sense to apply
sqrt() to 1"?
So one moved dust to be under a different carpet. This comparison
"looks" true, but only due to another problem...
Yes, this is what I meant by "looking true..."...
Yours,
Ilya
Re: bittwiddling on 32bit machines
Not for me. A number in a program isn't just a number. It represents
something. To determine which operations on that number make sense you
need to know what it represents, how accurate that represention is (and
needs to be), what properties it has. Then you can say whether a given
representation in a progrem (e.g. as a 64bit FP number) is adequate and
whether computing int($n + 0.5) makes sense or not.
Who says I don't agree with it? You were vaguely referring to it in
response to a very specific claim:
I don't know why you referred to it, or which part of perldoc perlnumber
is supposed to proof that my claim is wrong, and I won't guess. Either
you refer to the specific part of perldoc perlnumber which is applicable
here or you let it rest.
Not the answer is useless, but the operation. If I have a variable $x in
my program which at some point contains an integral value, why would I
write
$y = int($x + 0.5);
instead of simply
$y = $x;
at this point? I could just as well write
$y = $x * 1;
or
$y = $x + 0;
or
$y = $x + 1E300  1E300;
which are similarly useless (unless executed specifically for the
sideeffects caused by the representation).
No, but I do want to allow sin(1E100) to return an (more or less) random
number between 1 and +1, because ε(1E100) ≫ 2π. The input to the
function is useless, so I must expect the output to be useless, too.
Similarly ($n + 0.5) is only useful if the result can actually be
represented in an FP number with adequate precision. Perl doesn't save
you from thinking about stuff like that. Actually, you have to think
more than in statically typed languages: A typeless language gives you
more freedom but demands more discipline.
hp
Site Timeline
 » FAQ 9.15 How do I parse a mail header?
 — Next thread in » PERL Discussions
 » Don't understand: when('foo' and/or 'bar')
 — Previous thread in » PERL Discussions
 » s suffix question
 — Newest thread in » PERL Discussions
 » Adblock Testscript problem
 — The site's Newest Thread. Posted in » HTML Markup Language