Sunday, November 26, 2017

Basics of generating random numbers in Julia

Recently there were two questions regarding random number generation in Julia: one in discussion on Discourse and the other on Stack Overflow.
In this post I have tried to collect the basic information related to those two topics.

How does rand() work?

The default random number generator in Julia uses Mersenne Twister algorithm. By default a global instance of this generator is used. It can be directly accessed using variable Base.GLOBAL_RNG. Therefore calling rand() and rand(Base.GLOBAL_RNG) provide identical results.

You can set a seed to random number generator using srand function. There are several options for this but two most popular are:
  1. Calling srand() which will make Julia to use the system entropy to get a seed (you can think of this as random seed);
  2. Calling srand(i) where i is an integer. In this case you will get a reproducible sequence of random numbers.
An important detail of rand() is that it produces a value in the interval [0,1[ and generates exactly 52 bits of randomness - it can produce 2⁵² distinct values. In order to understand why you must know that internally Julia generates a value from the interval [1, 2[ and then subtracts 1 from it. The reason for this approach is that Float64 has 52 bit significand and in range [1,2[ all representable floating point values are equally spaced, see e.g. Wikipedia. This ensures that values generated by rand() are uniformly distributed.

Generating something else than one float from [0, 1[ interval

There are many options here, please refer to the Julia Manual for details. What I want to highlight here are the consequences of the fact that the underlying random number generator produces only 52 bits of randomness. Below we measure the time of obtaining one random sample of Int32 and Int64. Here is the benchmark:

julia> using BenchmarkTools

julia> @btime rand()
  7.464 ns (0 allocations: 0 bytes)
0.5055246442408914

julia> @btime rand(Int32)
  7.464 ns (0 allocations: 0 bytes)
37355051

julia> @btime rand(Int64)
  10.730 ns (0 allocations: 0 bytes)
2952626120715896373

The reason for the difference is that in order to get Int64 we need to sample twice because 64>52. You might ask if this would matter in practical code. Actually it does, because on most machines you run Julia Int is Int64. Here is an example of sampling a value from a range (my machine is 64 bit):

julia> @btime rand(1:10)
  37.789 ns (0 allocations: 0 bytes)
5

julia> @btime Int(rand(Int32(1):Int32(10)))
  26.126 ns (0 allocations: 0 bytes)
4

If you can accept a bit of inaccuracy in the distribution you can get more speed for generating 1:n range if n is small with the following code (I use n=10 as above):

julia> @btime ceil(Int, 10rand())
  15.395 ns (0 allocations: 0 bytes)
6

The discussion why it is inaccurate is presented on Discourse. Here let me give you an extreme example (the proper value of the evaluated expression is 0.5 in both cases):

julia> mean(ceil(Int64, 2^60 * rand()) % Bool for i in 1:10^6)
0.0

julia> mean(rand(1:2^60) % Bool for i in 1:10^6)
0.500582

We know why the first approach fails - Float64 does not have enough precision and we will always get an even number if we multiply such a value by 2⁶⁰.

And this is a less extreme example, but when multiplying by something smaller than 2⁵² (the correct result is 1):

julia> mean(rand(1:(2^50+2^49)) % 3 for i in 1:10^7)
1.0000001

julia> mean(ceil(Int64, (2^50+2^49) * rand()) % 3 for i in 1:10^7)
0.9586021

And again we see a bias (although not as large).

Using multiple random number generators

Sometimes it is enough to have a single source of pseudorandomness. In such cases Base.GLOBAL_RNG is enough. However, in many settings (like the one discussed on Stack Overflow) we need more than one independent random number generator. The usual approach is to generate seeds for consecutive random number generators that ensure that sequences produced by them do not overlap. In the paper Efficient Jump Ahead for F₂-Linear RandomNumber Generators you can find the detailed discussion of the topic.

In Julia you can generate such independent generators using randjump function. You give it a number of generators you require and they are guaranteed to use seeds that correspond to random number generator states separated by 10²⁰ steps.

There is one good practice if you decide use randjump. Call it only once do produce the required generators. Consecutive uses of randjump based on the same random number generator can give surprising results. In order to understand the reason for this approach consider the following example:

julia> srand(1)
MersenneTwister(UInt32[0x00000001], Base.dSFMT.DSFMT_state(Int32[1749029653, 1072851681, 1610647787, 1072862326, 1841712345, 1073426746, -198061126, 1073322060, -156153802, 1073567984  …  1977574422, 1073209915, 278919868, 1072835605, 1290372147, 18858467, 1815133874, -1716870370, 382, 0]), [1.23603, 1.34652, 1.31271, 1.00791, 1.48861, 1.21097, 1.95192, 1.9999, 1.25166, 1.98667  …  1.1489, 1.77623, 1.16774, 1.00214, 1.46738, 1.03244, 1.36938, 1.75271, 1.43027, 1.71293], 382)

julia> rand(100);

julia> x = randjump(Base.GLOBAL_RNG, 2)
2-element Array{MersenneTwister,1}:
 MersenneTwister(UInt32[0x00000001], Base.dSFMT.DSFMT_state(Int32[-423222098, 1072940746, 1823958146, 1073056597, 94617959, 1073021145, 2081944769, 1072701541, -1344696523, 1073205595  …  -1005791883, 1073144418, 24484970, 1073440808, 1370926729, 1336278534, -1527371338, -19485865, 382, 0]), [1.23603, 1.34652, 1.31271, 1.00791, 1.48861, 1.21097, 1.95192, 1.9999, 1.25166, 1.98667  …
  1.1489, 1.77623, 1.16774, 1.00214, 1.46738, 1.03244, 1.36938, 1.75271, 1.43027, 1.71293], 100)
 MersenneTwister(UInt32[0x00000001], Base.dSFMT.DSFMT_state(Int32[2095040610, 1073104849, 93118390, 1073625566, 1967429533, 1073003175, -889983042, 1073278562, 29166119, 1073602896  …  1102161927, 1072737217, -1901932835, 1073559995, 1843649418, -1848103721, -1079494630, -1219397333, 382, 0]), [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], 382)

julia> rand();

julia> y = randjump(Base.GLOBAL_RNG, 2)
2-element Array{MersenneTwister,1}:
 MersenneTwister(UInt32[0x00000001], Base.dSFMT.DSFMT_state(Int32[-423222098, 1072940746, 1823958146, 1073056597, 94617959, 1073021145, 2081944769, 1072701541, -1344696523, 1073205595  …  -1005791883, 1073144418, 24484970, 1073440808, 1370926729, 1336278534, -1527371338, -19485865, 382, 0]), [1.23603, 1.34652, 1.31271, 1.00791, 1.48861, 1.21097, 1.95192, 1.9999, 1.25166, 1.98667  …
  1.1489, 1.77623, 1.16774, 1.00214, 1.46738, 1.03244, 1.36938, 1.75271, 1.43027, 1.71293], 101)
 MersenneTwister(UInt32[0x00000001], Base.dSFMT.DSFMT_state(Int32[2095040610, 1073104849, 93118390, 1073625566, 1967429533, 1073003175, -889983042, 1073278562, 29166119, 1073602896  …  1102161927, 1072737217, -1901932835, 1073559995, 1843649418, -1848103721, -1079494630, -1219397333, 382, 0]), [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], 382)

julia> x[1] === y[1]
true

julia> x[2] === y[2]
false

julia> x[2] == y[2]
true

What do we learn from it:

  • The first element of a vector of random number generators returned by randjump is simply its first argument - we can see it because x[1] === y[1] is true. No copying or jumping ahead is done.
  • Even though we have produced one random sample after creating x and before creating y (thus the state of Base.GLOBAL_RNG was not identical) the second elements of x and y have the same state, which can be seen by testing x[2] == y[2] (although they do not point to the same generator as x[2] === y[2] is false); the reason for this is that rand in Julia generates random numbers in batches of 382 size and calling rand() once in our case did not trigger the generation of the next batch of random numbers. So even though the state of Base.GLOBAL_RNG when generating x and y was not identical the internal state of Mersenne Twister generator was (it can be checked using Base.GLOBAL_RNG.state).
In short - it is safest not to call randjump several times in the same code.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.