Programming
Pearls
December 1984 column I therefore charged ahead and
described it in detail.
Unfortunately, I was wrong; fortunately, Bob Floyd
caught me sleeping. When he studied Gries’s imple-
mentation of Algorithm S in the April 1987 column, he
was bothered by a flaw that is displayed clearly when
M = N = 100. When Size = 99, the set S contains all but
one of the desired integers. The algorithm closes its
eyes and blindly guesses integers until it stumbles over
the right one, which requires 100 random numbers on
the average. That analysis assumes that the random
number generator is truly random; for some non-
random sequences, the algorithm won’t even terminate.
Floyd set out to find an algorithm that uses exactly
one call of
RandInt
for each random number in S. The
structure of Floyd’s algorithm is easy to understand re-
cursively: to generate a 5-element sample from l..lO,
we first generate a 4-element sample from 1..9, and
then add the fifth element. The recursive program is
sketched as Program Fl.
We can appreciate the correctness of Floyd’s algo-
rithm anecdotally. When M = 5 and N = 10, the algo-
rithm first recursively computes in S a 4-element
random sample from 1..9. Next it assigns to T a random
integer in the range l..lO. Of the 10 values that T can
assume, exactly 5 result in inserting 10 into S: the four
values already in S, and the value 10 itself. Thus ele-
ment 10 is inserted into the set with the correct
probability of 5/10. The next section proves that
Algorithm Fl produces each M-element sample with
equal probability.
function SampleCM, N)
if M = 0 then
return the empty set
else
S
:= SampleCM-I,
N-l)
T := RandInt(1, N)
if T is not in S then
insert T in S
else
insert N in S
return S
ALGORITHM Fl. Floyd’s Algorithm-Recursive
Because Algorithm Fl uses a restricted form of recur-
sion. Floyd was able to translate it to an iterative form
by introducing a new variable,
J.
(Problem 8 addresses
recursion removal in more general terms.) The result is
Algorithm F2, which is much more efficient than Algo-
rithm S, yet is almost as succinct. Problems 2 and 3
address the data structures that might be used to
implement the set S.
For those who might scoff that Algorithm F2 is “just
pseudocode,” Program F2 implements Floyd’s algorithm
in the Awk language. The June 1985 column sketched
initialize set S to empty
for J := N - M +
1
to N do
T
:= RandInt(1, J)
if T is not in S then
insert T in S
else
insert J in S
ALGORITHM F2. Floyd’s Algorithm-Iterative
that language, with particular emphasis on its associa-
tive arrays (which are used to implement sets in Pro-
gram F2). Complete with input and output, the program
requires only nine lines of Awk.
BEGIN {
m = ARGVLII; n = ARGV121
for Cj = n-m+l; j i= n; j++) {
t =
1
+ int(j * randO)
if (t in s) s[jl =
1
else s[tl =
1
for (i in s) print i
PROGRAM F2.
Floyd’s Algorithm in Awk
Random Permutations
Some programs that use a random sample require that
the elements of the sample occur in a random order.
Such a sequence is called a random permutation with-
out replacement of M integers from l..N. In testing a
sorting program, for instance, it is important that ran-
domly generated input occur in random order (if the
input were always sorted, for instance, it might not
fully exercise the sort code).
We could use Floyd’s Algorithm F2 to generate a
random sample, then copy it to an array, and finally
shuffle the elements of the array. This code randomly
scrambles the array X[l..M]:
for I
:= M downto 2 do
Swap(X[RandInt(l, I)], X[Il)
This three-step method uses 2M calls to
RandInt.
Floyd’s random permutation generator is very similar
to Algorithm F2; it is shown in Algorithm P, next page.’
To compute an M-element permutation from l..N, it
first computes an (M - I)-element permutation from
l..N - 1; a recursive version of the algorithm ehmi-
nates the variable I. The primary data structure of Al-
gorithm P, though, is a sequence rather than a set.
‘Flovd observes that Aleorithm P can be viewed as a nondeterministic aleo-
rithm and macroexpand~d into a backtracking procedure that generates aTI
permutations of M out of N. For details, see Floyd’s “Nondeterministic Algo-
rithms” in 1.
ACM 14. 4
(Oct. 1967).
636-644.
September 1987 Volume 30 Number 9
Communications
of
the ACM
755