Saturday, August 11, 2018

ABM speed in Julia

In my last post I have discussed how you can implement a basic ABM model (forest fire) in Julia. I have used the approach of replicating how a model is implemented in NetLogo.

However, I have mentioned there that actually you can make your code run faster if you write it using the features Julia has to offer. Prompted by a recent post by Christopher Rackauckas (http://www.stochasticlifestyle.com/why-numba-and-cython-are-not-substitutes-for-julia/) I had thought to share the examples of possible implementations. The examples are not as advanced as what Chirs presents in his blog, but still show that one can get in Julia a 100x speedup over NetLogo.

If you want to follow the codes below in detail I recommend you to first read my earlier post and the associated codes. You can see there that the fastest timings there are of order of several seconds.

The first implementation is very similar to what we did in the last post in forestfire1.jl code. The only change is that we dynamically build a list of trees to be processed in the next epoch and keep it in newqueue vector. In the next epoch we sequentially visit them. Here is the code:

using Random

function setup(density)
    [rand() < density ? 1 : 0 for x in 1:251, y in 1:251]
end

function go_repeat(density)
    grid = setup(density)
    init_green = count(isequal(1), @view grid[2:end,:])
    queue = [(1, y) for y in 1:size(grid, 2)]
    while true
        newqueue = similar(queue, 0)
        for (x,y) in shuffle!(queue)
            grid[x, y] = 3
            for (dx, dy) in ((0, 1), (0, -1), (1, 0), (-1, 0))
                nx, ny = x + dx, y + dy
                if all((0,0) .< (nx, ny) .≤ size(grid)) && grid[nx, ny] == 1
                    grid[nx, ny] = 2
                    push!(newqueue, (nx, ny))
                end
            end
        end
        if isempty(newqueue)
            return count(isequal(3), @view grid[2:end,:]) / init_green * 100
        end
        queue = newqueue
    end
end

Here are its timings:

julia> @time [go_repeat(0.55) for i in 1:100];
  0.485022 seconds (1.02 M allocations: 107.575 MiB, 5.55% gc time)

julia> @time [go_repeat(0.75) for i in 1:100];
  0.501906 seconds (428.08 k allocations: 303.946 MiB, 9.91% gc time)

And we see that it is significantly faster than what we had.

However, if we agree to abandon the requirement that we want to exactly replicate the process of NetLogo we can go even faster. In this code I use depth first search and recursion to simulate forest fire (so the sequence of trees that catch fire is different). However, thanks to the fact that in Julia function calls are cheap this code runs fester than the previous one:

function setup(density)
    [rand() < density ? 1 : 0 for x in 1:251, y in 1:251]
end

function go(grid, x, y)
    grid[x, y] = 3
    x > 1 && grid[x-1,y] == 1 && go(grid, x-1, y)
    y > 1 && grid[x,y-1] == 1 && go(grid, x, y-1)
    x < size(grid, 1) && grid[x+1,y] == 1 && go(grid, x+1, y)
    y < size(grid, 2) && grid[x,y+1] == 1 && go(grid, x, y+1)
end

function go_repeat(density)
    grid = setup(density)
    init_green = count(isequal(1), @view grid[2:end,:])
    for y in 1:size(grid, 2)
        go(grid, 1, y)
    end
    count(isequal(3), @view grid[2:end,:]) / init_green * 100
end

Incidentally - it is even shorter. The timings of the code are the following:

julia> @time [go_repeat(0.55) for i in 1:100];
  0.305739 seconds (580.47 k allocations: 76.692 MiB, 6.93% gc time)

julia> @time [go_repeat(0.75) for i in 1:100];
  0.257212 seconds (133.33 k allocations: 54.668 MiB, 3.34% gc time)

and we see that it is yet faster (the first timing is longer because we are getting to a point where precompilation time of Julia starts to matter on the second run of the code for density equal to 0.55 it is around 0.15 seconds).

Finally with the release of Julia 1.0 it handles small unions fast, you can read about it here. This means that if we do not have too many types of agents we should be fine just like with one type of agent. Actually the practical limit is that we should not have more than three explicit types in a container to be sure that it runs fast. From my experience this is enough in 95% of cases.

Here you have a minimal modification of forestfire3.jl from my earlier post that run in times of the order of 30 to 90 seconds. The change is that I reduce the number of agents to two and keep nothing to indicate empty place on the grid (so efficiently we have three types of elements in a container). The code is almost identical:

using Random, Statistics

struct TreeGreen
end

struct TreeRed
    x::Int
end

function setup(density)
    Union{Nothing, TreeGreen, TreeRed}[x == 1 ? TreeRed(0) : rand() < density ?
                                       TreeGreen() : nothing for x in 1:251, y in 1:251]
end

function go(grid, tick)
    any(isequal(TreeRed(0)), grid) || return true
    for pos in shuffle!(findall(isequal(TreeRed(0)), grid))
        x, y = pos[1], pos[2]
        for (dx, dy) in ((0, 1), (0, -1), (1, 0), (-1, 0))
            nx, ny = x + dx, y + dy
            if all((0,0) .< (nx, ny) .≤ size(grid)) && grid[nx, ny] isa TreeGreen
                grid[nx, ny] = TreeRed(0)
            end
        end
        grid[pos] = TreeRed(tick)
    end
    return false
end

function go_repeat(density)
    grid = setup(density)
    init_green = count(isequal(TreeGreen()), @view grid[2:end, :])
    tick = 1
    while true
        go(grid, tick) && return count(t -> t isa TreeRed, @view grid[2:end, :]) / init_green * 100
        tick += 1
    end
end

and here are the timings:

julia> @time [go_repeat(0.55) for i in 1:100];
  6.732611 seconds (1.49 M allocations: 137.314 MiB, 0.40% gc time)

julia> @time [go_repeat(0.75) for i in 1:100];
  16.854758 seconds (523.59 k allocations: 312.620 MiB, 0.35% gc time)


They are of course slower than what we had above but still noticeably faster than NetLogo.

In conclusion we see that in Julia you have a great flexibility to adjust the form of the implementation to the problem at hand so that you can maximize the efficiency of the code if needed. Additionally, usually you do not have to pay a huge price of much more complex code. Finally, Julia 1.0 handles small unions efficiently, so you can expect a reasonable performance even in moderately complicated cases (and if you go really crazy with the complexity you can use tricks I have discussed in my last post).