#### Do you have a question? Post it now! No Registration Necessary. Now with pictures!

**posted on**

- Scott Stark

July 21, 2004, 6:51 pm

way to prevent the same number from being used more than once? All I

can think of is keeping a list of used numbers, and having some kind

of "if used, next" statement, which seems lame. (Theoretically that

could keep going forever, if it randomly kept selecting the same

numbers.) Any thoughts?

thanks

Scott

## Re: Using rand(), how to I avoid repeats?

Scott> I'm using rand() to generate a series of random numbers. Is

Scott> there any way to prevent the same number from being used more

Scott> than once? All I can think of is keeping a list of used

Scott> numbers, and having some kind of "if used, next" statement,

Scott> which seems lame. (Theoretically that could keep going forever,

Scott> if it randomly kept selecting the same numbers.) Any thoughts?

You probably need to deal from a deck then.

use List::Util qw(shuffle); # might need this from CPAN

my @shuffled;

while (time passes) {

@shuffled = shuffle(1..100) unless @shuffled;

my $number = shift @shuffled;

etc

}

print "Just another Perl hacker,"; # the original

--

Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095

Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.

See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!

## Re: Using rand(), how to I avoid repeats?

> I'm using rand() to generate a series of random numbers. Is there any

> way to prevent the same number from being used more than once? All I

> can think of is keeping a list of used numbers, and having some kind

> of "if used, next" statement, which seems lame. (Theoretically that

> could keep going forever, if it randomly kept selecting the same

> numbers.) Any thoughts?

>

> thanks

> Scott

Sounds like you want some variation of a shuffle. Take a look at

Algorithm::Numerical::Shuffle on cpan.

Generate an array of numbers, shuffle it, then shift off how many items

you need.

## Re: Using rand(), how to I avoid repeats?

sstark@us.ibm.com (Scott Stark) wrote in message

> I'm using rand() to generate a series of random numbers. Is there any

> way to prevent the same number from being used more than once? All I

> can think of is keeping a list of used numbers, and having some kind

> of "if used, next" statement, which seems lame. (Theoretically that

> could keep going forever, if it randomly kept selecting the same

> numbers.) Any thoughts?

Keeping a hash of 'used' numbers could indeed be lame in that

performance would degrade as the percentage of numbers used became

higher and higher.

A better approach might be to start with an array (hash?) with an

entry for every number in the acceptable range of random numbers,

where the key and the value are identical to begin with. Here is an

example representation after this array has been created:

$random[1] = 1;

$random[2] = 2;

$random[3] = 3;

#etcetera

$random[16] = 16; #range max.

The first time you choose from a range of 1 to 16 lets say 3 is the

random number. Then you could splice a couple of array slices, so

that the 'higher' half takes up at the spot where you just eliminated

one of your possibilities:

$random[1] = 1;

$random[2] = 2;

#$random[3] = 3;

$random[3] = 4;

$random[4] = 5;

#etcetera

$random[15] = 16;

The next time you will choose a random number from 1 to 15: the new

maximum. If it just happens to return 15, for instance, you would

then get the value specified by that array index: 16.

There are a few gotchas, such as being sure not to try to create more

random numbers than are in the original range, and managing the array

minimum and maximum values as you continue to slice and splice.

Happy coding!

> I'm using rand() to generate a series of random numbers. Is there any

> way to prevent the same number from being used more than once? All I

> can think of is keeping a list of used numbers, and having some kind

> of "if used, next" statement, which seems lame. (Theoretically that

> could keep going forever, if it randomly kept selecting the same

> numbers.) Any thoughts?

Keeping a hash of 'used' numbers could indeed be lame in that

performance would degrade as the percentage of numbers used became

higher and higher.

A better approach might be to start with an array (hash?) with an

entry for every number in the acceptable range of random numbers,

where the key and the value are identical to begin with. Here is an

example representation after this array has been created:

$random[1] = 1;

$random[2] = 2;

$random[3] = 3;

#etcetera

$random[16] = 16; #range max.

The first time you choose from a range of 1 to 16 lets say 3 is the

random number. Then you could splice a couple of array slices, so

that the 'higher' half takes up at the spot where you just eliminated

one of your possibilities:

$random[1] = 1;

$random[2] = 2;

#$random[3] = 3;

$random[3] = 4;

$random[4] = 5;

#etcetera

$random[15] = 16;

The next time you will choose a random number from 1 to 15: the new

maximum. If it just happens to return 15, for instance, you would

then get the value specified by that array index: 16.

There are a few gotchas, such as being sure not to try to create more

random numbers than are in the original range, and managing the array

minimum and maximum values as you continue to slice and splice.

Happy coding!

## Re: Using rand(), how to I avoid repeats?

you may be able to construct a linear congruence psuedo random number

generator with the desired period.

see for instance page 8 of this pdf:

http://delta.cs.cinvestav.mx/~lixo/simulation/Chap07Slides.pdf

or this pdf:

http://www.maths.adelaide.edu.au/pure/dept/hons2002/aglen.pdf

if you don't mind your numbers being very large, you could with very

high probability generate unique numbers by calculating some

cryptographic hash function of an each time incremented value.

both of these methods will result in a predictable sequence of random

numbers.

but may be more practical when the maximum number value you need is

larger

then is practical for use with a shuffled list.

willem

## Re: Using rand(), how to I avoid repeats?

sstark@us.ibm.com (Scott Stark) wrote in message

> I'm using rand() to generate a series of random numbers. Is there any

> way to prevent the same number from being used more than once? All I

> can think of is keeping a list of used numbers, and having some kind

> of "if used, next" statement, which seems lame. (Theoretically that

> could keep going forever, if it randomly kept selecting the same

> numbers.) Any thoughts?

You've got a couple of answers suggesting you use modules that allow

you to "shuffle" an array, and I can't disagree: modules are usually

the way to go as they have been used and tested more than any home

grown solution.

But still I wanted to share with you the home-grown code I've rolled

since my original suggestion. Beware, its a little clunky. It could

stand some perfecting but "works" for me, at least for the dozen or so

times I tested it:

sub uniq

#based on random

my($min, $max, $qty) = @_;

# Assumes that the three arguments are integers themselves!

# Assumes $min < $max

# Assumes $qty <= $max - $min + 1

my @set, @uniq;

for (my $i=0; $i <= $max - $min; $i++) {

$set[$i] = $i;

}

for (my $i=0; $i < $qty; $i++) {

my $rand = int rand($max - $min - $i + 1);

$uniq[$i] = $set[$rand] + $min;

@set = (@set[0..$rand-1], @set[$rand+1..scalar(@set)-1]);

print $uniq[$i], "\n"; #display only, can comment out

}

return @uniq;

}

@uniq

> I'm using rand() to generate a series of random numbers. Is there any

> way to prevent the same number from being used more than once? All I

> can think of is keeping a list of used numbers, and having some kind

> of "if used, next" statement, which seems lame. (Theoretically that

> could keep going forever, if it randomly kept selecting the same

> numbers.) Any thoughts?

You've got a couple of answers suggesting you use modules that allow

you to "shuffle" an array, and I can't disagree: modules are usually

the way to go as they have been used and tested more than any home

grown solution.

But still I wanted to share with you the home-grown code I've rolled

since my original suggestion. Beware, its a little clunky. It could

stand some perfecting but "works" for me, at least for the dozen or so

times I tested it:

sub uniq

___random___int_in ($$$) {#based on random

___int___in in perlfaq4, data manipulationmy($min, $max, $qty) = @_;

# Assumes that the three arguments are integers themselves!

# Assumes $min < $max

# Assumes $qty <= $max - $min + 1

my @set, @uniq;

for (my $i=0; $i <= $max - $min; $i++) {

$set[$i] = $i;

}

for (my $i=0; $i < $qty; $i++) {

my $rand = int rand($max - $min - $i + 1);

$uniq[$i] = $set[$rand] + $min;

@set = (@set[0..$rand-1], @set[$rand+1..scalar(@set)-1]);

print $uniq[$i], "\n"; #display only, can comment out

}

return @uniq;

}

@uniq

___rands = uniq___random___int___in(11,20,5); # for testing## Re: Using rand(), how to I avoid repeats?

> sstark@us.ibm.com (Scott Stark) wrote in message

> > I'm using rand() to generate a series of random numbers. Is there any

> > way to prevent the same number from being used more than once? All I

> > can think of is keeping a list of used numbers, and having some kind

> > of "if used, next" statement, which seems lame. (Theoretically that

> > could keep going forever, if it randomly kept selecting the same

> > numbers.) Any thoughts?

>

> You've got a couple of answers suggesting you use modules that allow

> you to "shuffle" an array, and I can't disagree: modules are usually

> the way to go as they have been used and tested more than any home

> grown solution.

>

> But still I wanted to share with you the home-grown code I've rolled

> since my original suggestion. Beware, its a little clunky. It could

> stand some perfecting but "works" for me, at least for the dozen or so

> times I tested it:

Yes, it does what it should, and yes, it

> sub uniq_random_int_in ($$$) {

^^^^^

Don't give a function a prototype unless there is a distinct advantage.

Normally, perl subs are not prototyped.

> #based on random_int_in in perlfaq4, data manipulation

> my($min, $max, $qty) = @_;

> # Assumes that the three arguments are integers themselves!

> # Assumes $min < $max

> # Assumes $qty <= $max - $min + 1

> my @set, @uniq;

This doesn't declare @uniq. "my (@set, @uniq)" is what you need. You

didn't run this under strict, did you?

> for (my $i=0; $i <= $max - $min; $i++) {

> $set[$i] = $i;

> }

That could be more simply written

@set = 0 .. $max - $min;

But what you really want in @set is the set of possible results, so

you can make that

@set = $min .. $max;

> for (my $i=0; $i < $qty; $i++) {

> my $rand = int rand($max - $min - $i + 1);

The number "$max - $min - $i + 1" is exactly the number of elements

left in @set in the $i-th step. You start with $max - $min + 1 elements

in the 0-th step, and you lose one element each time $i increases. So

my $rand = int rand @set;

does the same thing. The selection process is now even clearer:

Select a random number between 0 and the number of elements left

(exclusive). Pick that element.

> $uniq[$i] = $set[$rand] + $min;

When @set contains the numbers $min .. $max directly, you don't need

the addition. $set[$rand] is the wanted random number.

> @set = (@set[0..$rand-1], @set[$rand+1..scalar(@set)-1]);

^^^^^^

"@set" is already in scalar context, no need for "scalar".

But Perl's splice() function is what you really want here:

splice( @set, $rand, 1);

This returns the value(s) that are spliced out, so you can find

the random index and move the indicated element from @set to @uniq

in one step:

push @uniqe, splice( @set, rand @set, 1);

> print $uniq[$i], "\n"; #display only, can comment out

> }

> return @uniq;

> }

>

> @uniq_rands = uniq_random_int_in(11,20,5); # for testing

Putting it all together, this is how I would write it:

sub uniq

my($min, $max, $qty) = @_;

my @set = ( $min .. $max);

map splice( @set, rand @set, 1), 1 .. $qty;

}

Anno

> > I'm using rand() to generate a series of random numbers. Is there any

> > way to prevent the same number from being used more than once? All I

> > can think of is keeping a list of used numbers, and having some kind

> > of "if used, next" statement, which seems lame. (Theoretically that

> > could keep going forever, if it randomly kept selecting the same

> > numbers.) Any thoughts?

>

> You've got a couple of answers suggesting you use modules that allow

> you to "shuffle" an array, and I can't disagree: modules are usually

> the way to go as they have been used and tested more than any home

> grown solution.

>

> But still I wanted to share with you the home-grown code I've rolled

> since my original suggestion. Beware, its a little clunky. It could

> stand some perfecting but "works" for me, at least for the dozen or so

> times I tested it:

Yes, it does what it should, and yes, it

***is***a little clunky.> sub uniq_random_int_in ($$$) {

^^^^^

Don't give a function a prototype unless there is a distinct advantage.

Normally, perl subs are not prototyped.

> #based on random_int_in in perlfaq4, data manipulation

> my($min, $max, $qty) = @_;

> # Assumes that the three arguments are integers themselves!

> # Assumes $min < $max

> # Assumes $qty <= $max - $min + 1

> my @set, @uniq;

This doesn't declare @uniq. "my (@set, @uniq)" is what you need. You

didn't run this under strict, did you?

> for (my $i=0; $i <= $max - $min; $i++) {

> $set[$i] = $i;

> }

That could be more simply written

@set = 0 .. $max - $min;

But what you really want in @set is the set of possible results, so

you can make that

@set = $min .. $max;

> for (my $i=0; $i < $qty; $i++) {

> my $rand = int rand($max - $min - $i + 1);

The number "$max - $min - $i + 1" is exactly the number of elements

left in @set in the $i-th step. You start with $max - $min + 1 elements

in the 0-th step, and you lose one element each time $i increases. So

my $rand = int rand @set;

does the same thing. The selection process is now even clearer:

Select a random number between 0 and the number of elements left

(exclusive). Pick that element.

> $uniq[$i] = $set[$rand] + $min;

When @set contains the numbers $min .. $max directly, you don't need

the addition. $set[$rand] is the wanted random number.

> @set = (@set[0..$rand-1], @set[$rand+1..scalar(@set)-1]);

^^^^^^

"@set" is already in scalar context, no need for "scalar".

But Perl's splice() function is what you really want here:

splice( @set, $rand, 1);

This returns the value(s) that are spliced out, so you can find

the random index and move the indicated element from @set to @uniq

in one step:

push @uniqe, splice( @set, rand @set, 1);

> print $uniq[$i], "\n"; #display only, can comment out

> }

> return @uniq;

> }

>

> @uniq_rands = uniq_random_int_in(11,20,5); # for testing

Putting it all together, this is how I would write it:

sub uniq

___random___int_in {my($min, $max, $qty) = @_;

my @set = ( $min .. $max);

map splice( @set, rand @set, 1), 1 .. $qty;

}

Anno

## Re: Using rand(), how to I avoid repeats?

> I'm using rand() to generate a series of random numbers. Is there any

> way to prevent the same number from being used more than once?

A lot of ways.

> All I can think of is keeping a list of used numbers, and having some

> kind of "if used, next" statement, which seems lame.

Well, I'd keep a hash of used numbers. And "redo" would likely more sense

than "next". If your numbers are floats, you may have some problem with

the numeric and string representations being nonidentical, but I highly

doubt that will be a problem. It is not lame at all, unless you are

sampling very densely among all possible values. (If you are sampling very

densely, i.e. you will before you are done need a high fraction of all

possible values, then I would just shuffle the list of all of those

possible values.)

> (Theoretically that

> could keep going forever, if it randomly kept selecting the same

> numbers.) Any thoughts?

If your random number generator degenerates thus, then in that case you are

pretty much screwed regardless.

Xho

--

-------------------- http://NewsReader.Com/ --------------------

Usenet Newsgroup Service $9.95/Month 30GB

## Re: Using rand(), how to I avoid repeats?

ctcgag@hotmail.com wrote in message

> sstark@us.ibm.com (Scott Stark) wrote:

> > I'm using rand() to generate a series of random numbers. Is there any

> > way to prevent the same number from being used more than once?

>

> A lot of ways.

>

> > All I can think of is keeping a list of used numbers, and having some

> > kind of "if used, next" statement, which seems lame.

>

> Well, I'd keep a hash of used numbers.

There may be more than one way to do it, but this is definitely worse

than:

1) the above-stated ideas of using a module to shuffle an array

2) The "idea" behind my home rolled solution (the implementation could

definitely be improved)

As the hash of used numbers grows, chances increase that to generate

each random number, rand() will needlessly be called more than once,

maybe even several times. I've experienced this. I wrote a similar

program at my first programming job for a bingo supply company.

Random bingo grids, and as I recall rows and or columns have to add up

to the same number?! Well I kept an array of used combinations and

the darn thing took longer to run than it did to write! Never again

would I do that. Maybe thats why I took such an interest in this

problem.

> sstark@us.ibm.com (Scott Stark) wrote:

> > I'm using rand() to generate a series of random numbers. Is there any

> > way to prevent the same number from being used more than once?

>

> A lot of ways.

>

> > All I can think of is keeping a list of used numbers, and having some

> > kind of "if used, next" statement, which seems lame.

>

> Well, I'd keep a hash of used numbers.

There may be more than one way to do it, but this is definitely worse

than:

1) the above-stated ideas of using a module to shuffle an array

2) The "idea" behind my home rolled solution (the implementation could

definitely be improved)

As the hash of used numbers grows, chances increase that to generate

each random number, rand() will needlessly be called more than once,

maybe even several times. I've experienced this. I wrote a similar

program at my first programming job for a bingo supply company.

Random bingo grids, and as I recall rows and or columns have to add up

to the same number?! Well I kept an array of used combinations and

the darn thing took longer to run than it did to write! Never again

would I do that. Maybe thats why I took such an interest in this

problem.

## Re: Using rand(), how to I avoid repeats?

ctcgag@hotmail.com said:

>sstark@us.ibm.com (Scott Stark) wrote:

>> All I can think of is keeping a list of used numbers, and having some

>> kind of "if used, next" statement, which seems lame.

....

>> (Theoretically that could keep going forever, if it randomly kept

>> selecting the same numbers.) Any thoughts?

>

>If your random number generator degenerates thus, then in that case you

>are pretty much screwed regardless.

Not so -- think of a situation where your number space is f.ex.

1000 numbers, and you don't want repetitions. You'll end up with a

significant and inreasing possibility of retries for each new number.

For the last number, there's only a 1/1000 chance that the random

algorithm comes up with the desired number. I made a small script

to test this; even with range of 3 (create randomised list of numbers

of 1 to 3) I got with short testing 12 "false guesses" in generating

the list. With larger ranges the number of false guesses got significantly

worse -- with just range of 1000, the worst results I got had over 3000000

false guesses. Results with less than 1000000 false guesses seemed to be

rare.

So, for a list of random numbers without repetitions, the shuffled list

algorithms are definitely better.

--

Wolf a.k.a. Juha Laiho Espoo, Finland

(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V

PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++

"...cancel my subscription to the resurrection!" (Jim Morrison)

>sstark@us.ibm.com (Scott Stark) wrote:

>> All I can think of is keeping a list of used numbers, and having some

>> kind of "if used, next" statement, which seems lame.

....

>> (Theoretically that could keep going forever, if it randomly kept

>> selecting the same numbers.) Any thoughts?

>

>If your random number generator degenerates thus, then in that case you

>are pretty much screwed regardless.

Not so -- think of a situation where your number space is f.ex.

1000 numbers, and you don't want repetitions. You'll end up with a

significant and inreasing possibility of retries for each new number.

For the last number, there's only a 1/1000 chance that the random

algorithm comes up with the desired number. I made a small script

to test this; even with range of 3 (create randomised list of numbers

of 1 to 3) I got with short testing 12 "false guesses" in generating

the list. With larger ranges the number of false guesses got significantly

worse -- with just range of 1000, the worst results I got had over 3000000

false guesses. Results with less than 1000000 false guesses seemed to be

rare.

So, for a list of random numbers without repetitions, the shuffled list

algorithms are definitely better.

--

Wolf a.k.a. Juha Laiho Espoo, Finland

(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V

PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++

"...cancel my subscription to the resurrection!" (Jim Morrison)

## Re: Using rand(), how to I avoid repeats?

> ctcgag@hotmail.com said:

> >sstark@us.ibm.com (Scott Stark) wrote:

> >> All I can think of is keeping a list of used numbers, and having some

> >> kind of "if used, next" statement, which seems lame.

> ...

> >> (Theoretically that could keep going forever, if it randomly kept

> >> selecting the same numbers.) Any thoughts?

> >

> >If your random number generator degenerates thus, then in that case you

> >are pretty much screwed regardless.

>

> Not so -- think of a situation where your number space is f.ex.

> 1000 numbers, and you don't want repetitions. You'll end up with a

> significant and inreasing possibility of retries for each new number.

Increasing, but for the most part trivial.

> For the last number, there's only a 1/1000 chance that the random

> algorithm comes up with the desired number. I made a small script

> to test this; even with range of 3 (create randomised list of numbers

> of 1 to 3) I got with short testing 12 "false guesses" in generating

> the list. With larger ranges the number of false guesses got

> significantly worse -- with just range of 1000, the worst results I got

> had over 3000000 false guesses. Results with less than 1000000 false

> guesses seemed to be rare.

You clearly did something wrong. It rarely takes over 15,000 tries to get

all numbers from 1 to 1,000. The odds against needing 1,000,000 tries are

beyond astronomical.

If your random number generator is so defective that you can't get the

numbers you need from it using the hash method, then neither can you trust

its use in your shuffling procedure.

>

> So, for a list of random numbers without repetitions, the shuffled list

> algorithms are definitely better.

That is true as long as it is possible to enumerate all possible numbers

within the domain and you want to sample a large fraction of that domain

(as I said myself in the paragraph you snipped).

If we are talking about general methods, I'd rather have a method that

works in general and is slightly suboptimal in a special case, than a

method that fails in general but works in one special case.

Xho

--

-------------------- http://NewsReader.Com/ --------------------

Usenet Newsgroup Service $9.95/Month 30GB

> >sstark@us.ibm.com (Scott Stark) wrote:

> >> All I can think of is keeping a list of used numbers, and having some

> >> kind of "if used, next" statement, which seems lame.

> ...

> >> (Theoretically that could keep going forever, if it randomly kept

> >> selecting the same numbers.) Any thoughts?

> >

> >If your random number generator degenerates thus, then in that case you

> >are pretty much screwed regardless.

>

> Not so -- think of a situation where your number space is f.ex.

> 1000 numbers, and you don't want repetitions. You'll end up with a

> significant and inreasing possibility of retries for each new number.

Increasing, but for the most part trivial.

> For the last number, there's only a 1/1000 chance that the random

> algorithm comes up with the desired number. I made a small script

> to test this; even with range of 3 (create randomised list of numbers

> of 1 to 3) I got with short testing 12 "false guesses" in generating

> the list. With larger ranges the number of false guesses got

> significantly worse -- with just range of 1000, the worst results I got

> had over 3000000 false guesses. Results with less than 1000000 false

> guesses seemed to be rare.

You clearly did something wrong. It rarely takes over 15,000 tries to get

all numbers from 1 to 1,000. The odds against needing 1,000,000 tries are

beyond astronomical.

If your random number generator is so defective that you can't get the

numbers you need from it using the hash method, then neither can you trust

its use in your shuffling procedure.

>

> So, for a list of random numbers without repetitions, the shuffled list

> algorithms are definitely better.

That is true as long as it is possible to enumerate all possible numbers

within the domain and you want to sample a large fraction of that domain

(as I said myself in the paragraph you snipped).

If we are talking about general methods, I'd rather have a method that

works in general and is slightly suboptimal in a special case, than a

method that fails in general but works in one special case.

Xho

--

-------------------- http://NewsReader.Com/ --------------------

Usenet Newsgroup Service $9.95/Month 30GB

## Re: Using rand(), how to I avoid repeats?

>> with just range of 1000, the worst results I got had over 3000000

>> false guesses. Results with less than 1000000 false guesses seemed to

>> be rare.

>

>You clearly did something wrong. It rarely takes over 15,000 tries to get

>all numbers from 1 to 1,000. The odds against needing 1,000,000 tries are

>beyond astronomical.

Thanks for the correction; my previous test snippet apparently did have

some dire problem -- now I seem to get repetitions in the range of 5000

when generating numbers 1..1000 - so looks like I can agree with your

results.

Apologies for criticizing your method on false grounds.

--

Wolf a.k.a. Juha Laiho Espoo, Finland

(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V

PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++

"...cancel my subscription to the resurrection!" (Jim Morrison)

## Re: Using rand(), how to I avoid repeats?

Juha Laiho wrote:

> Not so -- think of a situation where your number space is f.ex.

> 1000 numbers, and you don't want repetitions. You'll end up with a

> significant and inreasing possibility of retries for each new number.

> For the last number, there's only a 1/1000 chance that the random

> algorithm comes up with the desired number.

If you have 1000 numbers available, and have used 999 of them,

then the last number is

the one and only possible number is not the way to do things.

Use shuffle algorithm instead.

-Joe

> Not so -- think of a situation where your number space is f.ex.

> 1000 numbers, and you don't want repetitions. You'll end up with a

> significant and inreasing possibility of retries for each new number.

> For the last number, there's only a 1/1000 chance that the random

> algorithm comes up with the desired number.

If you have 1000 numbers available, and have used 999 of them,

then the last number is

***NOT***random! Using rand() to returnthe one and only possible number is not the way to do things.

Use shuffle algorithm instead.

-Joe

## Re: Using rand(), how to I avoid repeats?

Joe Smith said to us:

>

> If you have 1000 numbers available, and have used 999 of them,

> then the last number is *NOT* random! Using rand() to return

> the one and only possible number is not the way to do things.

> Use shuffle algorithm instead.

> -Joe

Bowsayge tested both the random algorithm (using a hash to avoid dups)

and the shuffle algorithm, and he found that the random algorithm is

better when there is a large range (relative to the number of results

wanted).

IOW, if he wants to get 1000 random numbers which range from 1 to 1

million, random (with hash) is pretty darn good.

But if he wants to get 1000 random numbers which range from 1 to 1000,

the random algorithm makes seven times as many collisions as valid results,

so the shuffle algorithm greatly outperforms it.

However, shuffle is not perfect, because it requires a lot of memory for

big ranges. What if only 50 unique random numbers are required, but the

range is 1 to 10 million?

This simply reinforces what Bowsayge was taught in school: algorithms

have to be chosen for data sets.

--

bowsayge

bow-say-ge?

>

> If you have 1000 numbers available, and have used 999 of them,

> then the last number is *NOT* random! Using rand() to return

> the one and only possible number is not the way to do things.

> Use shuffle algorithm instead.

> -Joe

Bowsayge tested both the random algorithm (using a hash to avoid dups)

and the shuffle algorithm, and he found that the random algorithm is

better when there is a large range (relative to the number of results

wanted).

IOW, if he wants to get 1000 random numbers which range from 1 to 1

million, random (with hash) is pretty darn good.

But if he wants to get 1000 random numbers which range from 1 to 1000,

the random algorithm makes seven times as many collisions as valid results,

so the shuffle algorithm greatly outperforms it.

However, shuffle is not perfect, because it requires a lot of memory for

big ranges. What if only 50 unique random numbers are required, but the

range is 1 to 10 million?

This simply reinforces what Bowsayge was taught in school: algorithms

have to be chosen for data sets.

--

bowsayge

bow-say-ge?

## Re: Using rand(), how to I avoid repeats?

Scott Stark wrote:

> I'm using rand() to generate a series of random numbers. Is there any

> way to prevent the same number from being used more than once?

@Array = (1 .. 20);

do

{

for (1 .. $#Array)

{ push (@Array, splice (@Array, int (rand (@Array)), 1)); }

print shift @Array, " ";

}

until (scalar (@Array) == 0);

This modification will reduce the number of iterations,

for (1 .. int ($#Array / 2))

Clearly almost any divisor can be used, with less randomness

directly proportionate to increasing divisor value. Only

restriction is the divisor must yield a result of a counting

number with a value of one or greater. A return value of

one will not afford a high degree of randomness.

The scalar length of an array also effects degree of

randomness. For a scalar length of twenty, a divisor

of two returns fairly decent randomness. A divisor

of one returns very good randomness. I would use

a minimum of ten iterations for acceptable randomness.

This modification will force a more even distribution

but will always return the last element first for

the code syntax I display,

{ push (@Array, splice (@Array, int (rand (@Array - $_)), 1)); }

Simply shifting out an element at random, without an

array shuffle, is significantly more efficient. This

is the method I would use for your circumstances,

@Array = (1 .. 20);

for (1 .. scalar (@Array))

{ print splice (@Array, int (rand (@Array)), 1), " "; }

Purl Gurl

> I'm using rand() to generate a series of random numbers. Is there any

> way to prevent the same number from being used more than once?

@Array = (1 .. 20);

do

{

for (1 .. $#Array)

{ push (@Array, splice (@Array, int (rand (@Array)), 1)); }

print shift @Array, " ";

}

until (scalar (@Array) == 0);

This modification will reduce the number of iterations,

for (1 .. int ($#Array / 2))

Clearly almost any divisor can be used, with less randomness

directly proportionate to increasing divisor value. Only

restriction is the divisor must yield a result of a counting

number with a value of one or greater. A return value of

one will not afford a high degree of randomness.

The scalar length of an array also effects degree of

randomness. For a scalar length of twenty, a divisor

of two returns fairly decent randomness. A divisor

of one returns very good randomness. I would use

a minimum of ten iterations for acceptable randomness.

This modification will force a more even distribution

but will always return the last element first for

the code syntax I display,

{ push (@Array, splice (@Array, int (rand (@Array - $_)), 1)); }

Simply shifting out an element at random, without an

array shuffle, is significantly more efficient. This

is the method I would use for your circumstances,

@Array = (1 .. 20);

for (1 .. scalar (@Array))

{ print splice (@Array, int (rand (@Array)), 1), " "; }

Purl Gurl

## Re: Using rand(), how to I avoid repeats?

> I'm using rand() to generate a series of random numbers. Is there any

> way to prevent the same number from being used more than once? All I

> can think of is keeping a list of used numbers, and having some kind

> of "if used, next" statement, which seems lame. (Theoretically that

> could keep going forever, if it randomly kept selecting the same

> numbers.) Any thoughts?

Several solutions have been presented already. Here's one I haven't

seen posted in this thread yet: a "lazy sparse shuffle".

my %int_pool;

use constant MAX

sub get

my $n = keys %int_pool;

my $r = $n + int rand(MAX

my $o = (exists $int

$int

return $o;

}

This method works the like the hash solutions already posted, except

that it has a guaranteed upper bound for execution time, even if the

number space does get densely sampled. On the other hand, it only

works for sample spaces that can be easily enumerated.

--

Ilmari Karonen

If replying by e-mail, please replace ".invalid" with ".net" in address.

> way to prevent the same number from being used more than once? All I

> can think of is keeping a list of used numbers, and having some kind

> of "if used, next" statement, which seems lame. (Theoretically that

> could keep going forever, if it randomly kept selecting the same

> numbers.) Any thoughts?

Several solutions have been presented already. Here's one I haven't

seen posted in this thread yet: a "lazy sparse shuffle".

my %int_pool;

use constant MAX

___UNIQUE___INT => 2******32-1;sub get

___unique___int () {my $n = keys %int_pool;

my $r = $n + int rand(MAX

___UNIQUE___INT+1 - $n);my $o = (exists $int

___pool ? $int___pool : $r);$int

___pool = (exists $int___pool ? $int_pool : $n);return $o;

}

This method works the like the hash solutions already posted, except

that it has a guaranteed upper bound for execution time, even if the

number space does get densely sampled. On the other hand, it only

works for sample spaces that can be easily enumerated.

--

Ilmari Karonen

If replying by e-mail, please replace ".invalid" with ".net" in address.

#### Site Timeline

- » FAQ 1.3 Which version of Perl should I use?
- — Next thread in » PERL Discussions

- » Displaying graphics - beginner programmer
- — Previous thread in » PERL Discussions

- » s suffix question
- — Newest thread in » PERL Discussions

- » Dell Battery Slice LED codes
- — The site's Newest Thread. Posted in » Laptop Computers Forum