Tuesday, September 26, 2017

Bridge hand distribution: simulation vs exact calculation

I have decided to start blogging about the Julia language after having blogged about R for several years.
The objective of this blog is to provide short examples of how Julia can be used to solve computational problems.

To start, I thought about reproducing the code that generates the distribution of high card points (HCP) in a pair of bridge hands, which I implemented in R some time ago (you can find it here).

The reason I chose this example for the first post is that the implementation in Julia is much simpler than in R (it is also faster, but the gains are eaten up by the compilation time required for plotting).

Here is the code:
using StatsBase      # sample function
using Combinatorics  # combinations function
using Plots          # plotting infrastructure

# Code calculating distribution of HCP in hands of a pair of players in bridge
# Ace is worth 4 points, King 3, Queen 2, Jack 1, all other cards (36 in total) count as 0

# Get the distribution using simulation
function sim_hand()
    n = 1_000_000
    deck = [repmat(1:4, 4); fill(0, 36)] # whole deck in HCP
    distr = fill(0, 41)
    for i in 1:n
        hcp = sum(sample(deck, 26, replace=false))
        distr[hcp + 1] += 1
    end
    distr / n
end

# Calculate the distribution exactly
function exact_hand()
    high = repmat(1:4, 4) # cards that have non zero value in HCP
    distr = fill(0, 41)
    distr[1] = binomial(16, 0) * binomial(36, 26)
    for i in 1:16
        low = binomial(36, 26 - i) # ways to sample cards that have zero value in HCP
        for c in combinations(high, i)
            hcp = sum(c)
            distr[hcp + 1] += low
        end
    end
    distr / sum(distr)
end

# Run the computations and plot the results
# semicolons suppress printing of results in REPL if you copy-paste the code
srand(1);
hcp = 0:40;
sim = sim_hand();
exact = exact_hand();
plotly();
plt = plot(hcp, exact, label="exact", xlab="HCP", ylab="probability");
scatter!(plt, hcp, sim, label="simulated")
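
For reference, the exact calculation in exact_hand can be written as a formula. The notation is my own shorthand and is not part of the code: H stands for the 16 high cards and v(c) for the HCP value of card c.

```latex
P(\mathrm{HCP} = h)
  = \frac{1}{\binom{52}{26}}
    \sum_{i=0}^{16} \binom{36}{26-i}\,
    \Bigl|\bigl\{ S \subseteq H : |S| = i,\ \textstyle\sum_{c \in S} v(c) = h \bigr\}\Bigr|,
\qquad
\sum_{i=0}^{16} \binom{16}{i}\binom{36}{26-i} = \binom{52}{26}.
```

The second identity (Vandermonde's identity) is why dividing by sum(distr) at the end of exact_hand gives probabilities: the accumulated counts add up to the number of ways to choose 26 cards out of 52.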
By "simpler" I mean that loops are a natural approach to this problem. The benefits, in my opinion, are (you can compare with the R code here):

  • the code uses much less memory; memory would become a significant issue for the simulation approach in the R implementation if you sampled not 1,000,000 but, e.g., 10,000,000,000 points (and in Julia it would be simple to run such a simulation on multiple cores);
  • the exact calculation code is much simpler to understand because you do not need workarounds to vectorize your calculations.
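
To illustrate the multi-core remark, here is a minimal sketch of how the simulation could be split across threads. It is not the code from this post. It assumes Julia 1.x, where repmat and srand have been replaced by repeat and Random.seed!, and it draws hands by shuffling the deck with the standard library instead of calling sample from StatsBase. The names DECK, sim_counts and sim_hand_threaded are hypothetical.

```julia
using Random        # MersenneTwister, shuffle!
using Base.Threads  # @threads, nthreads

# Whole deck in HCP: four cards each worth 1, 2, 3 and 4, plus 36 cards worth 0.
# `repeat` is the Julia 1.x counterpart of `repmat` (assumption: Julia 1.x).
const DECK = [repeat(1:4, 4); fill(0, 36)]

# Simulate `n` deals with a private random number generator and
# return raw counts per HCP value (index hcp + 1).
function sim_counts(n, seed)
    rng = MersenneTwister(seed)
    deck = copy(DECK)
    counts = zeros(Int, 41)
    for _ in 1:n
        shuffle!(rng, deck)          # first 26 cards form a uniform sample without replacement
        hcp = sum(view(deck, 1:26))  # HCP of the pair of hands
        counts[hcp + 1] += 1
    end
    return counts
end

# Split `n` deals into chunks, run the chunks on the available threads,
# and combine the counts. Memory use is 41 counters per chunk, whatever `n` is.
function sim_hand_threaded(n; nchunks = nthreads())
    sizes = [div(n, nchunks) + (k <= rem(n, nchunks) ? 1 : 0) for k in 1:nchunks]
    partial = Vector{Vector{Int}}(undef, nchunks)
    @threads for k in 1:nchunks
        partial[k] = sim_counts(sizes[k], k)  # chunk index doubles as the seed
    end
    return sum(partial) ./ n
end

sim = sim_hand_threaded(1_000_000)
```

To actually use several cores, Julia has to be started with more than one thread, for example by setting the JULIA_NUM_THREADS environment variable. Each chunk uses its own generator, so the threads never share random state, and the result depends only on n and nchunks.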

And here is the result you should get when you run the code: