external effects and are thus, for all practical purposes,
impossible to keep aligned in time in a predictable way. If the assumption is made that
random number generation from a single generator will occur, across processors, in a certain
predictable order, then that assumption will quite likely be wrong. A number of techniques
have been developed that guarantee reproducibility in multiprocessor settings and with various types of Monte Carlo problems. We will consider only simple extensions to our previous
discussion of LCGs, but acknowledge that there are many approaches to parallel random
number generation in the literature. The first situation we address involves using LCGs in
a fixed number of MIMD processes, where that number is known at the beginning of a run.
Suppose we know in advance that we will have N independent processes and that we
will need N independent streams of random numbers. Then the best strategy for using an
LCG is to split its period into nonoverlapping segments, each of which will be accessed by
a different process. This amounts to finding N seeds which are known to be far apart on
the cycle produced by the LCG. To find such seeds, first consider (for c = 0) the LCG rule
successively applied:
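$$
\begin{aligned}
X_{n+1} &= a X_n \pmod{m} \\
X_{n+2} &= a X_{n+1} = a^2 X_n \pmod{m} \\
X_{n+3} &= a X_{n+2} = a^3 X_n \pmod{m} \\
&\;\;\vdots \\
X_{n+k} &= a X_{n+k-1} = a^k X_n \pmod{m}
\end{aligned}
$$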
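The repeated substitution above can be checked numerically. The following is a minimal sketch, assuming the Park-Miller "minimal standard" parameters (a = 16807, m = 2^31 - 1), which are chosen only for illustration and do not come from this section. The function names are likewise hypothetical. It compares k single steps of the rule with one multiplication by a^k mod m.

```python
# Illustrative parameters (an assumption, not taken from the text):
# the Park-Miller "minimal standard" multiplicative LCG.
A = 16807
M = 2**31 - 1


def lcg_step(x, a=A, m=M):
    """One application of the c = 0 rule: X_{n+1} = a * X_n (mod m)."""
    return (a * x) % m


def lcg_steps(x, k, a=A, m=M):
    """Apply the rule k times, one step at a time."""
    for _ in range(k):
        x = lcg_step(x, a, m)
    return x


def lcg_jump(x, k, a=A, m=M):
    """Jump k places at once: X_{n+k} = a^k * X_n (mod m)."""
    # Python's built-in three-argument pow computes a^k mod m directly.
    return (pow(a, k, m) * x) % m


x0 = 12345
print(lcg_steps(x0, 1000) == lcg_jump(x0, 1000))  # True
```

The built-in `pow(a, k, m)` is used here only as a convenient reference; it hides the question of how a^k mod m is obtained efficiently for large k.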
Thus we can "leap ahead" k places of the period by multiplying the current seed value by
$a^k \bmod m$. For our purposes, we would like N starting seeds, spaced at roughly $k = P/N$
steps apart, where P is the period of the LCG. Since k is likely to be quite large, it is not
practical to compute $a^k$ one step at a time. Instead we compute an (L + 1)-long array, d,
of the power-of-two powers of a (each reduced mod m):
$$
\begin{aligned}
d_0 &= a \\
d_1 &= d_0^2 \\
d_2 &= d_1^2 \\
&\;\;\vdots \\
d_L &= d_{L-1}^2
\end{aligned}
$$
where $d_L = a^{2^L}$ is the largest power-of-two power of a whose exponent $2^L$ does not exceed k; i.e., L is the
integer part of $\log_2 k$. For example, assume that $k = 91 = 1011011_2$ (very
small, but big enough to show how it works). Then
$$ d = (a,\ a^2,\ a^4,\ a^8,\ a^{16},\ a^{32},\ a^{64}) $$
and since 91 = 64 + 16 + 8 + 2 + 1,
$$ a^{91} = a^{64}\, a^{16}\, a^{8}\, a^{2}\, a^{1} = d_6\, d_4\, d_3\, d_1\, d_0 . $$
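The k = 91 example can be reproduced in a few lines. This is a sketch only: the function names `power_table` and `leap_multiplier` are hypothetical, and the LCG parameters (a = 16807, m = 2^31 - 1) are an illustrative assumption rather than values from the text. It builds the array d by repeated squaring and multiplies together the entries selected by the one-bits of k.

```python
def power_table(a, k, m):
    """Return d with d[i] = a^(2^i) mod m for i = 0..L, L = floor(log2 k).

    Assumes k >= 1.
    """
    L = k.bit_length() - 1
    d = [a % m]
    for _ in range(L):
        d.append((d[-1] * d[-1]) % m)  # d[i] = d[i-1]^2 mod m
    return d


def leap_multiplier(a, k, m):
    """Compute a^k mod m as the product of d[i] over the one-bits i of k."""
    d = power_table(a, k, m)
    result = 1
    for i, d_i in enumerate(d):
        if (k >> i) & 1:          # bit i of k is a one
            result = (result * d_i) % m
    return result


# Illustrative parameters (assumption): Park-Miller minimal standard LCG.
A, M = 16807, 2**31 - 1

d = power_table(A, 91, M)
print(len(d))                                        # 7  (d0 .. d6)
print(leap_multiplier(A, 91, M) == pow(A, 91, M))    # True
```

For k = 91 the one-bits are at positions 0, 1, 3, 4 and 6, so the loop multiplies exactly d0, d1, d3, d4 and d6, matching the factorization above.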
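Thus for any k, $a^k = \prod_i d_i$, where the product runs over all i for which bit i in the base-two representation of k is a one.
Therefore, given the array d, we can leap ahead by k cycle steps with no more than $\log_2 k$ multiplies. Once $a^k$
is computed, the N seeds can be determined by the procedure:
$$
\begin{aligned}
&\text{Choose } \mathrm{seed}_1 \\
\mathrm{seed}_2 &= a^k\, \mathrm{seed}_1 \pmod{m} \\
\mathrm{seed}_3 &= a^k\, \mathrm{seed}_2 \pmod{m} \\
&\;\;\vdots \\
\mathrm{seed}_N &= a^k\, \mathrm{seed}_{N-1} \pmod{m}
\end{aligned}
$$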
With these seeds, each of the N processes will generate random numbers from nearly equally spaced points on the cycle. As long as no process needs more than k random numbers, a
condition easily met for some applications, then no overlap will occur. Everything just said
for MIMD processes applies equally well to SIMD programs, where the number of random
number streams needed is (usually) known at run time.
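The seeding procedure can be sketched as follows. The function name `spaced_seeds` is hypothetical, and the parameters in the usage lines (a = 16807, m = 2^31 - 1, period P = m - 1) are an illustrative assumption, not values given in the text.

```python
def spaced_seeds(seed1, n_streams, a, m, period):
    """Return (seeds, k): n_streams seeds spaced k = period // n_streams
    steps apart on the cycle of X_{n+1} = a * X_n (mod m)."""
    k = period // n_streams
    a_k = pow(a, k, m)                 # leap multiplier a^k mod m
    seeds = [seed1 % m]
    for _ in range(n_streams - 1):
        # seed_{j+1} = a^k * seed_j (mod m)
        seeds.append((a_k * seeds[-1]) % m)
    return seeds, k


# Illustrative parameters (assumption): Park-Miller minimal standard LCG,
# whose period is m - 1 for any nonzero seed.
A, M = 16807, 2**31 - 1
seeds, k = spaced_seeds(12345, 4, A, M, M - 1)
print(k)        # number of values each process may draw before overlap
print(seeds)    # one starting seed per process
```

Each process would then run the ordinary single-step rule from its own seed, and the streams stay disjoint as long as none of them draws more than k values.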
The development of the leap-ahead technique just described assumed that c = 0 in the
LCG rule. For $c \neq 0$, leap-ahead can still be accomplished in a similar way, if one constructs
the $\log_2 k$-long array of partial sums of the form $s_j = \sum_{i=0}^{j} a^i$, where, as before, j is a power
of two. The details are left as an exercise for the reader.
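For readers who want to check their work on that exercise, here is one possible sketch. It is not necessarily the construction the text has in mind: instead of storing the partial sums in an array, it composes the affine map x -> a x + c with itself by repeated doubling, which yields the pair (a^k, c (1 + a + ... + a^(k-1))) mod m. The function names and the parameters in the usage lines (a = 1664525, c = 1013904223, m = 2^32) are illustrative assumptions.

```python
def affine_leap(a, c, k, m):
    """Return (A_k, C_k) such that X_{n+k} = A_k * X_n + C_k (mod m).

    A_k = a^k and C_k = c * (1 + a + ... + a^(k-1)), both reduced mod m.
    """
    A_k, C_k = 1, 0                  # identity map: zero steps
    step_a, step_c = a % m, c % m    # map for a stride of 2^0 = 1 step
    while k > 0:
        if k & 1:
            # Apply the current power-of-two stride after the accumulated map.
            A_k, C_k = (step_a * A_k) % m, (step_a * C_k + step_c) % m
        # Double the stride by applying the stride map twice.
        step_a, step_c = (step_a * step_a) % m, (step_a * step_c + step_c) % m
        k >>= 1
    return A_k, C_k


def affine_steps(x, k, a, c, m):
    """Reference: apply X_{n+1} = a * X_n + c (mod m) one step at a time."""
    for _ in range(k):
        x = (a * x + c) % m
    return x


# Illustrative parameters (assumption), not taken from the text.
a, c, m = 1664525, 1013904223, 2**32
A_k, C_k = affine_leap(a, c, 1000, m)
print((A_k * 42 + C_k) % m == affine_steps(42, 1000, a, c, m))  # True
```

The loop performs about log2 k doublings, so the cost is comparable to the c = 0 case.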
The second and more difficult case to consider is when we do not know at the beginning of
program execution how many processes (generators) we will need. The splitting of processes
in such programs is data driven and in most cases occurs as the result of prior Monte Carlo
trials taken many steps earlier. The problem is to spawn new LCG seeds in a way that is
both reproducible and yields independent new streams.
Here we only mention a generalized approach that works within limits. Further details
can be found in [Frederickson et al., 1984]. Consider an LCG with the property that each X
has two successors, a "left" successor, $X_L$, and a "right" successor, $X_R$. These are generated
as follows:
$$ L(X) = X_L = a_L X + c_L \pmod{m} $$
and
$$ R(X) = X_R = a_R X + c_R \pmod{m} $$
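The two successor rules can be sketched in code as follows. The constants below are placeholders chosen only so that the example runs; they are an assumption and are not claimed to satisfy the selection criterion of Frederickson et al. The function names are hypothetical. In this sketch the "right" rule advances a stream and the "left" rule derives the seed of a spawned stream from the parent's current state.

```python
# Placeholder constants (assumption, for illustration only). They are NOT
# selected according to the criterion in Frederickson et al., 1984.
M = 2**32
A_L, C_L = 1664525, 1013904223
A_R, C_R = 22695477, 1


def left(x):
    """Left successor: L(X) = a_L * X + c_L (mod m)."""
    return (A_L * x + C_L) % M


def right(x):
    """Right successor: R(X) = a_R * X + c_R (mod m)."""
    return (A_R * x + C_R) % M


def stream(seed, n):
    """First n values of the 'right' sequence that follows seed."""
    values, x = [], seed
    for _ in range(n):
        x = right(x)
        values.append(x)
    return values


def spawn(x):
    """Seed for a child stream, computed only from the parent's state x."""
    return left(x)


parent_state = 2024
child_seed = spawn(parent_state)
# The same parent state always yields the same child stream.
print(stream(child_seed, 3) == stream(spawn(parent_state), 3))  # True
```

Because `spawn` depends only on the state held by the spawning process, the child stream does not depend on when or on which processor the split happens.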
Figure 7 shows the action of these operations with respect to a starting seed $X_0$. Taken separately, the $X_L$ and $X_R$ sequences are simple LCGs that traverse the set $\{0, 1, 2, \ldots, m-1\}$
in different order. Alternatively (and this is the way in which these generators are typically
used), the $X_L$ rule produces a pseudorandom leap-ahead for the $X_R$ sequence, thus deterministically producing a seed for a new, spawned subsequence of the "right" cycle. With
such a mechanism, which uses only local information from a process, reproducibility can be
established. Frederickson gives a formula for the selection of the constants in the succession rules that satisfies a particular independence criterion, given some constraints. The
interested reader is referred to [Frederickson et al., 1984] for further enlightenment.

Exercise 8: Vectorization of Cray's LCG

Complete Exercise 6 above to determine the parameters a and m (probably $m = 2^{48}$) of
Cray's ranf(). Develop a vectorized version of ranf() by creating a vector of successive
multiples of the coefficient a. For...
This document was uploaded on 01/28/2014.
 Fall '14
