The problem is taken from the book *Foundations and Methods of Stochastic Simulation* by Barry Nelson, where it is solved using VBA in Section 4.5.
I will not repeat the whole discussion of the model, but focus on the numerical aspect. We are asked to calculate the value of the expression:

$$\nu = \mathrm{E}\!\left[e^{-rT}\max\big(\bar{X}(T)-K,\,0\big)\right],\qquad \bar{X}(T)=\frac{1}{T}\int_0^T X(t)\,dt,$$

where $X(t)$ is a geometric Brownian motion with drift $r$ and volatility $\sigma$, $T$ is the option expiry, $K$ is the strike price, and $r$ is the risk-free rate. A single pass of the Monte Carlo simulation approximates the path average by a discrete sum:

$$\bar{X}(T)\approx\frac{1}{m}\sum_{i=1}^{m}X(i\Delta),\qquad X(i\Delta)=X\big((i-1)\Delta\big)\exp\!\left(\Big(r-\frac{\sigma^2}{2}\Big)\Delta+\sigma\sqrt{\Delta}\,Z_i\right),$$

where $\Delta = T/m$ and $Z_1,\dots,Z_m$ are independent standard normal draws.
The parameters we will use in the simulation are T = 1, r = 0.05, K = 55, σ = 0.3, m = 1000, and the starting value X(0) = 50. We will run n = 100,000 replications of the Monte Carlo simulation and calculate their average.
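Concretely, the number reported by each implementation below is the sample mean of the discounted payoffs over the replications (this just restates the code that follows in formula form):

$$\hat{\nu} = \frac{1}{n}\sum_{j=1}^{n} e^{-rT}\max\!\left(\frac{1}{m}\sum_{i=1}^{m} X_j(i\Delta) - K,\; 0\right),\qquad n = 100{,}000,$$

where $X_j$ denotes the price path generated in replication $j$.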
I want to compare five implementations of the above problem:
- Julia using loops
- Julia using vectorized code
- Python using loops
- Numba using loops
- NumPy using vectorized code
First come the Julia implementations (running times are given in the comments at the end):
using Random       # for randn!
using Statistics   # for mean

# Single replication: simulate one path of geometric Brownian motion
# step by step and return the discounted payoff.
function v_asian_sample(T, r, K, σ, X₀, m::Integer)
    X = X₀
    x̂ = zero(X)
    Δ = T/m
    for i in 1:m
        X *= exp((r-σ^2/2)*Δ + σ*√Δ*randn())
        x̂ += X
    end
    exp(-r*T)*max(x̂/m - K, 0)
end

# Vectorized single replication: fill a preallocated buffer X with the
# log-increments of the path and update it in place.
function v_asian_sample_vec(T, r, K, σ, X₀, m::Integer, X)
    Δ = T/m
    randn!(X)
    X .*= σ*√Δ
    X .+= (r-σ^2/2)*Δ
    X .= exp.(cumsum!(X, X))
    exp(-r*T)*max(mean(X)*X₀ - K, 0)
end

# Average n replications; the vectorized sampler reuses a single buffer.
function v_asian_loop(T, r, K, σ, X₀, m, n, fun)
    if fun == v_asian_sample
        mean(fun(T, r, K, σ, X₀, m) for i in 1:n)
    else
        X = Vector{Float64}(undef, m)
        mean(fun(T, r, K, σ, X₀, m, X) for i in 1:n)
    end
end

# Promote all numeric arguments to a common type before the hot loop runs.
function v_asian(T, r, K, σ, X₀, m, n, fun)
    v_asian_loop(promote(T, r, K, σ, X₀)..., m, n, fun)
end

# Time each implementation twice; the first run includes compilation.
for f in [v_asian_sample, v_asian_sample_vec], i in 1:2
    println(f)
    @time v_asian(1, 0.05, 55, 0.3, 50, 1000, 100_000, f)
end

# Output:
#
# v_asian_sample
#   1.733317 seconds (69.79 k allocations: 3.802 MiB)
# v_asian_sample
#   1.591367 seconds (8 allocations: 128 bytes)
# v_asian_sample_vec
#   2.861599 seconds (1.29 M allocations: 38.848 MiB, 2.02% gc time)
# v_asian_sample_vec
#   2.344178 seconds (800.01 k allocations: 12.215 MiB, 2.51% gc time)
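As a side note, `@time` measures a single run, and the first call includes compilation. A minimal sketch of a more robust timing approach, assuming the BenchmarkTools.jl package is installed (it was not used for the numbers above):

using BenchmarkTools

# @btime evaluates the expression several times and reports the minimum,
# which filters out compilation and other one-off noise.
@btime v_asian(1, 0.05, 55, 0.3, 50, 1000, 100_000, v_asian_sample)
@btime v_asian(1, 0.05, 55, 0.3, 50, 1000, 100_000, v_asian_sample_vec)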
And here are the Python implementations:
import math
import random
import statistics
import numpy
from timeit import default_timer as timer
from numba import jit

# Single replication in pure Python: simulate the path step by step.
def v_asian_sample(T, r, K, s, X0, m):
    xhat = 0.0
    X = X0
    D = T / m
    for i in range(m):
        X *= math.exp(random.normalvariate((r-s**2/2)*D, s*D**0.5))
        xhat += X
    return math.exp(-r*T)*max(xhat/m - K, 0)

# The same loop compiled with Numba.
@jit
def v_asian_sample_jit(T, r, K, s, X0, m):
    xhat = 0.0
    X = X0
    D = T / m
    for i in range(m):
        X *= math.exp(random.normalvariate((r-s**2/2)*D, s*D**0.5))
        xhat += X
    return math.exp(-r*T)*max(xhat/m - K, 0)

# Vectorized single replication using NumPy.
def v_asian_sample_vec(T, r, K, s, X0, m):
    D = T / m
    X = numpy.random.normal((r-s**2/2)*D, s*D**0.5, m)
    return math.exp(-r*T)*max(numpy.mean(numpy.exp(numpy.cumsum(X)))*X0 - K, 0)

# Average n replications.
def v_asian(T, r, K, s, X0, m, n, fun):
    return statistics.mean([fun(T, r, K, s, X0, m) for i in range(n)])

# Time each implementation twice; the first Numba run includes compilation.
for f in [v_asian_sample, v_asian_sample_jit, v_asian_sample_vec]:
    for i in range(2):
        start = timer()
        v_asian(1.0, 0.05, 55.0, 0.3, 50, 1000, 100000, f)
        duration = timer() - start
        print(f, "\n ", duration, " seconds")

# Output:
#
# <function v_asian_sample at 0x000001BAC05B8F28>
#   218.74880357997608 seconds
# <function v_asian_sample at 0x000001BAC05B8F28>
#   218.45428551167777 seconds
# CPUDispatcher(<function v_asian_sample_jit at 0x000002722635BEA0>)
#   5.729526031990998 seconds
# CPUDispatcher(<function v_asian_sample_jit at 0x000002722635BEA0>)
#   5.397216579782951 seconds
# <function v_asian_sample_vec at 0x000002722636A1E0>
#   7.5086485608902915 seconds
# <function v_asian_sample_vec at 0x000002722636A1E0>
#   7.4985504866390045 seconds
The Julia implementation is a bit fancier, as it promotes the arguments to a common type at run time, while in Python I pass all values as floats (which is simpler). Also, the vectorized Julia code uses in-place updating. Note that each function is timed twice so that the second measurement excludes JIT compilation overhead (this applies both to Julia and to Numba). In general, I tried to write each implementation as natural code in its respective language.
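To illustrate the promotion step (a standard Julia REPL interaction, not part of the original benchmark):

julia> promote(1, 0.05, 55, 0.3, 50)   # Int arguments are converted to Float64
(1.0, 0.05, 55.0, 0.3, 50.0)

This guarantees that the hot loop is compiled for a single concrete element type.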
The key take-aways are the following:
- standard Python is a no-go solution for this problem;
- loops in Julia are fastest;
- somewhat surprisingly, vectorized Julia code is faster than Numba, although the former has to allocate more memory;
- the NumPy implementation is around three times slower than vectorized Julia;
- vectorized Julia code hugely benefits from in-place operations (which avoid memory allocation); however, even without these optimizations it was faster than Numba (see the sketch below).
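For reference, here is a hypothetical sketch of the vectorized sampler without the in-place optimizations; this is my illustration of the variant referred to in the last point, not code from the original benchmark:

# Vectorized single replication without in-place updates: randn, cumsum
# and the broadcasts below each allocate a fresh temporary array per call.
function v_asian_sample_vec_alloc(T, r, K, σ, X₀, m::Integer)
    Δ = T/m
    steps = (r - σ^2/2)*Δ .+ σ*√Δ .* randn(m)  # log-increments of the path
    X = exp.(cumsum(steps))                     # X(iΔ)/X₀ for i = 1:m
    exp(-r*T)*max(mean(X)*X₀ - K, 0)
end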