Something you need to determine is how 'fair' you want this to be and the performance as the unused space becomes exhausted.
In essence, you want a random box from a N dimensional array. Its not the box itself that is important, but the location of the box that is important.
Under the covers, a 10x20 array is often represented as a 1x200 array with some math behind it to access the right spot. If you were to access [5][13]
, you are actually accessing location [5*20 + 13]
. You can use the same approach to go from a number back to the position. Location 113 goes to the integer devision by 20 and remainder giving 5 r13
.
So now, you don't need to actually store 200 pairs (though that isn't a lot), you just need a bitfield of 200 bits long. Generate a random number within the proper range and mark it as used in the bitfield.
Now, the question of how do you handle it when you've got a collision? This goes to the various hash collision techniques used in hash tables. Some won't work for this application, but its a good read nonetheless.
A simple approach would be once you have one collision, just start incrementing where you are looking at until you find one that wasn't used. The increment could be 1, or any number that is relativity prime to the size of the space.
Ok, I'm at a place I can sit down and write something. Its perl. Shouldn't be too far off of php and is rather straight forward.
#!/usr/bin/perl
my $x = shift @ARGV;
my $y = shift @ARGV;
my $f = shift @ARGV; # fill factor
my $t = $x * $y; # total space
my $v = '';
foreach (1 .. int($t * $f)) {
my $r = int(rand($t)); # random number from 0 .. $t-1
my $yp = int($r / $x); # y' (y prime)
my $xp = int($r % $x); # x' (x prime)
print "Trying $r: $xp $yp...\n";
while(vec($vec, $r, 1)) {
$yp = int($r / $x);
$xp = int($r % $x);
print "\tcollision at $r: $xp $yp\n";
$r += 1;
$r %= $t; # scale $r to within $t
}
$yp = int($r / $x);
$xp = int($r % $x);
vec($vec, $r, 1) = 1; # set the $r th bit
print "\tsettled at $r: $xp $yp\n";
}
Ultimately, the 'settled' values are the ones that you want.
You read two numbers from the command line and assign them to x
and y
. The total search space is t
- this is how big all the possible numbers are. Additionally, read a fill factor as $f
, this should be a value less than 1 and is used to limit the list iteration. Setting a value greater than 1 will present an infinite loop.
I'm filling up the space to the specified value (the foreach (1 .. ($t * $f))
). No, there isn't any error checking on $f
to make sure it is less than or equal to 1, but there should be.
So pick a random number from 0 to $t
- 1. The spot that this represents is $yp
and $xp
- y prime and x prime.
There is some perlish things here, vec works with a bit vector of arbitrary size. There are a number of ways of doing this in a given language, its just rather easy with perl. With Java, one could use a BitSet (this is big enough to hold maxint bits, which could represent a pair of 46340 numbers).
You then test the bit at the $r
th location to see if it has been used. If it has, increment $r
and roll it over if it becomes larger than the total space (so if you have a 10 and 20 (0 .. 199) and you hit 200, it becomes 0).
Once you find an unused bit, set it and output your values.
This is what the output looks like though:
Trying 96: 6 9...
settled at 96: 6 9
Trying 117: 7 11...
collision at 117: 7 11
settled at 118: 8 11
Trying 115: 5 11...
settled at 115: 5 11
Trying 153: 3 15...
settled at 153: 3 15
Trying 90: 0 9...
settled at 90: 0 9
Trying 140: 0 14...
collision at 140: 0 14
collision at 141: 1 14
collision at 142: 2 14
settled at 143: 3 14
Trying 73: 3 7...
settled at 73: 3 7
This is tested on a unix system thus:
% rndpair.pl 10 20 0.5 | grep settled | sort | uniq | wc -l
100
This shows that with 100 numbers there are 100 unique pairs printed out (just look at the 'settled' lines).
There's a fair bit of debugging information in there (you don't really need to keep resetting the value of $yp
and $xp
until the end.
And then there's the question of how fast is it? This is to generate 50% of the available pairs for the available space. Realize that there is some time tied up in the sort and uniq (of possibly some not small bits of text):
% time ./rndpair.pl 500 500 0.5 | grep settled | sort| uniq | wc -l
125000
real 0m3.528s
user 0m3.901s
sys 0m0.045s
Lets kick it up a notch and remove the other applications from the chain.
% time ./rndpair.pl 5000 5000 0.5 > /dev/null
real 0m34.668s
user 0m34.482s
sys 0m0.180s
How much room did that 5k x 5k storage space take? 25,000,000 bits, or about 3 megabytes.
Note that the performance of this drops as the fill factor goes up. For a search space of 6, 80610 (a previous comment if I read it right) this runs quite quickly (note the increasing times as the fill factor goes up):
% time ./rndpair.pl 6 80610 0.5 | grep settled | wc -l
241830
real 0m0.827s
user 0m1.562s
sys 0m0.029s
% time ./rndpair.pl 6 80610 0.75 | grep settled | wc -l
362745
real 0m1.721s
user 0m3.151s
sys 0m0.043s
% time ./rndpair.pl 6 80610 0.9 | grep settled | wc -l
435294
real 0m3.993s
user 0m7.031s
sys 0m0.079s
Best Answer
Sign is on a good track, but his algorithm is wrong. It is not much random. It is actually pretty hard to create random like this. I was playing around with this and everything I tried created obvious patterns when printed in 2D. In the end I manged to create an algorithm that doesn't create any eye-visible patterns. I looked for inspiration in existing random algorithms.
When this algorithm is used to render a 4-shades of gray image, it creates this:![random noise](https://i.stack.imgur.com/xPrdw.png)
For comparison, the Random algorithm creates this pattern:![enter image description here](https://i.stack.imgur.com/ADfeF.png)
And Sign's algorithm too has patterns:![enter image description here](https://i.stack.imgur.com/KQeoZ.png)