Monday, December 3, 2018

Release notes for the DataFrames.jl package v0.15.0

The DataFrames.jl package is getting closer to its 1.0 release. In order to reach this level of maturity, a number of significant changes are introduced in this release. You should also expect several more major changes in the package in the near future to synchronize it with Julia 1.0.

This post is divided into three sections:
  1. A brief statement of major changes in the package since the last release.
  2. Planned changes in the coming releases.
  3. A more detailed look at the selected changes.
Soon I will update https://github.com/bkamins/Julia-DataFrames-Tutorial to reflect those changes.

A brief statement of major changes in the package since the last release

  • Finished the deprecation period of the makeunique keyword argument; now data frames will throw an error when supplied with duplicate columns unless it is explicitly allowed to auto-generate non-conflicting column names;
  • Major redesign of split-apply-combine methods, leading to cleaner code and improved performance (see below for the details); also, all column mapping functions now accept functions, types or functors to perform transformations (earlier only functions were allowed);
  • If an anonymous function is used in split-apply-combine functions (e.g. by or aggregate), then its auto-generated name is function;
  • Allow comparisons for GroupedDataFrame
  • A decision to treat a data frame as a collection of rows; this mainly affects the names of supported functions; deprecation of length (use size instead), delete! (renamed), merge! (removed), insert! (renamed), head (use first), and tail (use last);
  • Deprecated setindex! with nothing on the right-hand side, which in the past dropped a column;
  • A major review of getindex and view methods for all types defined in the DataFrames.jl package to make them consistent with Julia 1.0 behavior
  • Allowed specifying columns for completecases, dropmissing and dropmissing! (see the short example after this list);
  • Made all major types defined in the DataFrames.jl package immutable to avoid their allocation (they were mostly mutable in the past);
  • Added a convenience method so that copy on a DataFrameRow produces a NamedTuple; this should make using DataFrameRow more convenient, as it is now more commonly encountered because getindex with a single row selected will return a value of this type;
  • Fixed show methods for data frames with csv/tsv target
  • HTML output of the data frame also prints its dimensions
  • Significant improvements in the documentation of the DataFrames.jl package.
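For example, the new column selectors for the missing-value functions can be used as follows (a minimal sketch assuming the dropmissing(df, cols) form mentioned above; the data frame is made up for illustration):

using DataFrames

df_m = DataFrame(a=[1, missing, 3], b=[1, 2, missing])
completecases(df_m, :a) # true, false, true: only column :a is checked
dropmissing(df_m, :a)   # drops only row 2; row 3 stays although :b is missing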

Planned changes in the coming releases

  • further minor cleanup of the split-apply-combine interface (there are some corner cases still left to be fixed);
  • a major review of setindex! (similar to what was done with getindex in this release), in particular to support broadcasting consistent with Julia 1.0; expect breaking (and possibly major) changes here;
  • finishing the deprecation periods for getindex and eachcol.
(this list will probably be longer, but those are the things that are a priority)

A more detailed look at the selected changes

An improved split-apply-combine

The first thing I want to highlight is a new split-apply-combine API with improved performance (a major contribution of @nalimilan).

Consider the following basic setting:

using DataFrames

df = DataFrame(x=categorical(repeat(string.('a':'f') .^ 4, 10^6)),
               y = 1:6*10^6)

Below, I report all timings with @time to get a feel for real workflow delays, but the times are reported after precompilation.
Code that worked under DataFrames 0.14.1 along with its timing is the following:

julia> @time by(df, :x, v -> DataFrame(s=sum(v.y)));
  0.474845 seconds (70.90 k allocations: 296.545 MiB, 31.18% gc time)

Now under DataFrames 0.15.0 it is:

julia> @time by(df, :x, v -> DataFrame(s=sum(v[:,:y])));
  0.234375 seconds (6.34 k allocations: 229.213 MiB, 35.98% gc time)


julia> @time by(df, :x, s = :y=>sum);
  0.114782 seconds (332 allocations: 137.347 MiB, 14.21% gc time)
Observe that there are two levels of speedup:
  • even under the old API we get a 2x speedup due to better handling of grouping;
  • if we use the new type-stable API, not only is the code shorter, but it is even faster.
Now let us dig into the options the new API provides. I will show them all by example (I omit the old API with a function passed; it still works unchanged):
by(df, :x, s = :y=>sum, p = :y=>maximum) # one or more keyword arguments
by(df, :x, :y=>sum, :y=>maximum) # one or more positional arguments
by(:y=>sum, df, :x) # a Pair as the first argument
by((s = :y=>sum, p = :y=>maximum), df, :x) # a NamedTuple of Pairs
by((:y=>sum, :y=>maximum), df, :x) # a Tuple of Pairs
by([:y=>sum, :y=>maximum], df, :x) # a vector of Pairs
Now, if you use a Pair, a Tuple or a vector option (i.e. anything other than keyword arguments or a NamedTuple), you can return a NamedTuple instead of a DataFrame to give names to the columns. This is faster, especially when there are many small groups, e.g.:
by(df, :x, x->(a=1, b=sum(x[:, :y])))
by(df, :x, :y => x->(a=1, b=sum(x))) # faster with a column selector
instead of the old:
by(df, :x, x->DataFrame(a=1, b=sum(x[:, :y])))

EDIT:
You can pass more than one column in this way. Then the columns are passed as a named tuple, e.g.

julia> using Statistics

julia> df = DataFrame(x = repeat(1:2, 3), a=1:6, b=1:6);

julia> by(df, :x, str = (:a, :b) => string)
2×2 DataFrame
│ Row │ x     │ str                            │
│     │ Int64 │ String                         │
├─────┼───────┼────────────────────────────────┤
│ 1   │ 1     │ (a = [1, 3, 5], b = [1, 3, 5]) │
│ 2   │ 2     │ (a = [2, 4, 6], b = [2, 4, 6]) │

julia> by(df, :x, cor = (:a, :b) => x->cor(x...))
2×2 DataFrame
│ Row │ x     │ cor     │
│     │ Int64 │ Float64 │
├─────┼───────┼─────────┤
│ 1   │ 1     │ 1.0     │
│ 2   │ 2     │ 1.0     │

A more flexible eachrow and eachcol

Now the eachrow and eachcol functions return a value that is a read-only subtype of AbstractVector. This allows users to flexibly use all getindex mechanics from Base on these return values. For example:
julia> using DataFrames

julia> df = DataFrame(x=1:5, y='a':'e')
5×2 DataFrame
│ Row │ x     │ y    │
│     │ Int64 │ Char │
├─────┼───────┼──────┤
│ 1   │ 1     │ 'a'  │
│ 2   │ 2     │ 'b'  │
│ 3   │ 3     │ 'c'  │
│ 4   │ 4     │ 'd'  │
│ 5   │ 5     │ 'e'  │

julia> er = eachrow(df)
5-element DataFrames.DataFrameRows{DataFrame}:
 DataFrameRow (row 1)
x  1
y  a
 DataFrameRow (row 2)
x  2
y  b
 DataFrameRow (row 3)
x  3
y  c
 DataFrameRow (row 4)
x  4
y  d
 DataFrameRow (row 5)
x  5
y  e

julia> ec = eachcol(df)
┌ Warning: In the future eachcol will have names argument set to false by default
│   caller = top-level scope at none:0
└ @ Core none:0
2-element DataFrames.DataFrameColumns{DataFrame,Pair{Symbol,AbstractArray{T,1} where T}}:
┌ Warning: Indexing into a return value of eachcol will return a pair of column name and column value
│   caller = _getindex at abstractarray.jl:928 [inlined]
└ @ Core .\abstractarray.jl:928
┌ Warning: Indexing into a return value of eachcol will return a pair of column name and column value
│   caller = _getindex at abstractarray.jl:928 [inlined]
└ @ Core .\abstractarray.jl:928
 ┌ Warning: Indexing into a return value of eachcol will return a pair of column name and column value
│   caller = _getindex at abstractarray.jl:928 [inlined]
└ @ Core .\abstractarray.jl:928
[1, 2, 3, 4, 5]
 ['a', 'b', 'c', 'd', 'e']
And now you can index into them like this (essentially any indexing Base allows for an AbstractVector):
julia> ec[end]
┌ Warning: Indexing into a return value of eachcol will return a pair of column name and column value
│   caller = top-level scope at none:0
└ @ Core none:0
5-element Array{Char,1}:
 'a'
 'b'
 'c'
 'd'
 'e'

julia> er[1:3]
3-element Array{DataFrameRow{DataFrame},1}:
 DataFrameRow (row 1)
x  1
y  a
 DataFrameRow (row 2)
x  2
y  b
 DataFrameRow (row 3)
x  3
y  c
You will notice numerous warnings when using eachcol. They will be removed in the next release of the DataFrames.jl package and are due to two reasons:
  • there are now two variants of eachcol: one returning plain columns (called by eachcol(df, false)), the other returning pairs of column name and column value (called by eachcol(df, true)); in the past calling eachcol(df) defaulted to the true option; in the future it will default to false to be consistent with https://github.com/JuliaLang/julia/pull/29749 (see the short illustration below);
  • getting values out of an eachcol result was inconsistent in the past: whether the column name was included depended on whether we indexed into it or iterated over it; in the future it will always return a Pair in all cases.
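Using the df defined above, the two variants can be illustrated as follows (a short sketch of the documented target behavior; as the warnings show, the indexing semantics are still transitioning):

for col in eachcol(df, false)
    println(col)               # plain column vectors: [1, ..., 5] and ['a', ..., 'e']
end

for (name, col) in eachcol(df, true)
    println(name, " => ", col) # column name paired with the column vector
end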

Consistent getindex and view methods

There was a major redesign of how getindex and view work for all types that the DataFrames.jl package defines. Now they are as consistent with Base as possible. The lengthy details are outlined in https://juliadata.github.io/DataFrames.jl/latest/lib/indexing.html. Here are the key highlights of the new rules:
  • using @view on getindex will always consistently return a view containing the same values as getindex would return (in the past this was not the case);
  • selecting a single row with an integer from a data frame will return a DataFrameRow (it was a DataFrame in the past); this was a tough decision because DataFrameRow is a view, so one should be careful when using setindex! on such object, but it is guided by the rule that selecting a single row should drop a dimension like indexing in Base;
  • selecting multiple rows of a data frame will always perform a copy of the columns (this was not consistent earlier; also the behavior follows what Base does); selecting columns without specifying rows returns the underlying vector; so, for example, the difference is that now df[:, cols] performs a copy and df[cols] does not copy the underlying vectors (see the sketch below).
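Here is a small sketch of the target behavior of the last two rules (the end state, not necessarily what v0.15.0 prints today, as some of these operations still go through deprecations):

using DataFrames

df2 = DataFrame(x=1:3)
df2[1, :]             # a DataFrameRow (a view), no longer a one-row DataFrame
df2[:x] === df2.x     # true: selecting a column without rows does not copy
df2[:, :x] === df2.x  # false: selecting all rows copies the column vector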
Currently you will get many deprecation warnings where indexing rules will change. In the next release of the DataFrames.jl package these changes will be made.

Sunday, November 11, 2018

Disabling auto-indentation of code in Julia REPL

With the recent release of Julia 1.0.2 there is still a small annoyance in the Julia REPL on Windows. If you copy-paste a code like this:

function f()
    for i in 1:10
        if i > 5
            println(i)
        end
    end
end

from your editor to your Julia REPL, you get the following result:

julia> function f()
           for i in 1:10
                   if i > 5
                               println(i)
                                       end
                                           end
                                           end
f (generic function with 1 method)

Notice that Julia automatically indents the pasted code; but the code is already indented, so the result does not look nice. This gets really bad when you paste 50 lines of highly nested code.

There is an open PR to fix this issue here, but since it did not get into Julia 1.0.2, I thought I would post the hack I use to disable auto-indentation. Run the following lines in your Julia REPL:

import REPL
REPL.GlobalOptions.auto_indent = false
REPL.LineEdit.options(s::REPL.LineEdit.PromptState) = REPL.GlobalOptions

and now if you copy-paste to Julia REPL the code we have discussed above you get:

julia> function f()
           for i in 1:10
               if i > 5
                   println(i)
               end
           end
       end
f (generic function with 1 method)

and all is formatted as expected.

The solution overwrites the REPL.LineEdit.options method to make sure that we always use REPL.GlobalOptions with auto-indentation disabled. It is not ideal, but I find it good enough till the issue is resolved.

If you want to use this solution by default, you can put the proposed code in your ~/.julia/config/startup.jl file.

Saturday, August 11, 2018

ABM speed in Julia

In my last post I discussed how you can implement a basic ABM (the forest fire model) in Julia. I used the approach of replicating how the model is implemented in NetLogo.

However, I mentioned there that you can actually make your code run faster if you write it using the features Julia has to offer. Prompted by a recent post by Christopher Rackauckas (http://www.stochasticlifestyle.com/why-numba-and-cython-are-not-substitutes-for-julia/), I thought I would share examples of possible implementations. The examples are not as advanced as what Chris presents in his blog, but they still show that one can get a 100x speedup over NetLogo in Julia.

If you want to follow the code below in detail, I recommend that you first read my earlier post and the associated code. You can see there that the fastest timings are on the order of several seconds.

The first implementation is very similar to what we did in the forestfire1.jl code from the last post. The only change is that we dynamically build a list of trees to be processed in the next epoch and keep it in the newqueue vector; in the next epoch we sequentially visit them. Here is the code:

using Random

function setup(density)
    [rand() < density ? 1 : 0 for x in 1:251, y in 1:251]
end

function go_repeat(density)
    grid = setup(density)
    init_green = count(isequal(1), @view grid[2:end,:])
    queue = [(1, y) for y in 1:size(grid, 2)]
    while true
        newqueue = similar(queue, 0)
        for (x,y) in shuffle!(queue)
            grid[x, y] = 3
            for (dx, dy) in ((0, 1), (0, -1), (1, 0), (-1, 0))
                nx, ny = x + dx, y + dy
                if all((0,0) .< (nx, ny) .≤ size(grid)) && grid[nx, ny] == 1
                    grid[nx, ny] = 2
                    push!(newqueue, (nx, ny))
                end
            end
        end
        if isempty(newqueue)
            return count(isequal(3), @view grid[2:end,:]) / init_green * 100
        end
        queue = newqueue
    end
end

Here are its timings:

julia> @time [go_repeat(0.55) for i in 1:100];
  0.485022 seconds (1.02 M allocations: 107.575 MiB, 5.55% gc time)

julia> @time [go_repeat(0.75) for i in 1:100];
  0.501906 seconds (428.08 k allocations: 303.946 MiB, 9.91% gc time)

And we see that it is significantly faster than what we had.

However, if we agree to abandon the requirement that we exactly replicate the NetLogo process, we can go even faster. In this code I use depth-first search and recursion to simulate the forest fire (so the sequence in which trees catch fire is different). Thanks to the fact that function calls are cheap in Julia, this code runs faster than the previous one:

function setup(density)
    [rand() < density ? 1 : 0 for x in 1:251, y in 1:251]
end

function go(grid, x, y)
    grid[x, y] = 3
    x > 1 && grid[x-1,y] == 1 && go(grid, x-1, y)
    y > 1 && grid[x,y-1] == 1 && go(grid, x, y-1)
    x < size(grid, 1) && grid[x+1,y] == 1 && go(grid, x+1, y)
    y < size(grid, 2) && grid[x,y+1] == 1 && go(grid, x, y+1)
end

function go_repeat(density)
    grid = setup(density)
    init_green = count(isequal(1), @view grid[2:end,:])
    for y in 1:size(grid, 2)
        go(grid, 1, y)
    end
    count(isequal(3), @view grid[2:end,:]) / init_green * 100
end

Incidentally - it is even shorter. The timings of the code are the following:

julia> @time [go_repeat(0.55) for i in 1:100];
  0.305739 seconds (580.47 k allocations: 76.692 MiB, 6.93% gc time)

julia> @time [go_repeat(0.75) for i in 1:100];
  0.257212 seconds (133.33 k allocations: 54.668 MiB, 3.34% gc time)

and we see that it is faster still (the first timing is longer because we are getting to the point where Julia's compilation time starts to matter; on a second run of the code for density equal to 0.55 it is around 0.15 seconds).

Finally, since the release of Julia 1.0, small unions are handled fast; you can read about it here. This means that if we do not have too many types of agents we should be fine, just like with one type of agent. In practice the limit is that we should not have more than three explicit types in a container to be sure that it runs fast. From my experience this is enough in 95% of cases.
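As a tiny stand-alone illustration of this effect (indicative only, not a careful benchmark; the container is made up for this example):

v = Union{Int,Nothing}[rand() < 0.5 ? i : nothing for i in 1:10^6]
@time count(x -> x === nothing, v) # the small-union optimization keeps this scan fast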

Here is a minimal modification of forestfire3.jl from my earlier post, which ran in times on the order of 30 to 90 seconds. The change is that I reduce the number of agent types to two and use nothing to indicate an empty place on the grid (so effectively we have three types of elements in the container). The code is almost identical:

using Random, Statistics

struct TreeGreen
end

struct TreeRed
    x::Int
end

function setup(density)
    Union{Nothing, TreeGreen, TreeRed}[x == 1 ? TreeRed(0) : rand() < density ?
                                       TreeGreen() : nothing for x in 1:251, y in 1:251]
end

function go(grid, tick)
    any(isequal(TreeRed(0)), grid) || return true
    for pos in shuffle!(findall(isequal(TreeRed(0)), grid))
        x, y = pos[1], pos[2]
        for (dx, dy) in ((0, 1), (0, -1), (1, 0), (-1, 0))
            nx, ny = x + dx, y + dy
            if all((0,0) .< (nx, ny) .≤ size(grid)) && grid[nx, ny] isa TreeGreen
                grid[nx, ny] = TreeRed(0)
            end
        end
        grid[pos] = TreeRed(tick)
    end
    return false
end

function go_repeat(density)
    grid = setup(density)
    init_green = count(isequal(TreeGreen()), @view grid[2:end, :])
    tick = 1
    while true
        go(grid, tick) && return count(t -> t isa TreeRed, @view grid[2:end, :]) / init_green * 100
        tick += 1
    end
end

and here are the timings:

julia> @time [go_repeat(0.55) for i in 1:100];
  6.732611 seconds (1.49 M allocations: 137.314 MiB, 0.40% gc time)

julia> @time [go_repeat(0.75) for i in 1:100];
  16.854758 seconds (523.59 k allocations: 312.620 MiB, 0.35% gc time)


They are of course slower than what we had above but still noticeably faster than NetLogo.

In conclusion, we see that in Julia you have great flexibility to adjust the form of the implementation to the problem at hand, so that you can maximize the efficiency of the code if needed. Additionally, you usually do not have to pay the huge price of much more complex code. Finally, Julia 1.0 handles small unions efficiently, so you can expect reasonable performance even in moderately complicated cases (and if you go really crazy with complexity you can use the tricks I discussed in my last post).

Tuesday, July 31, 2018

ABC of ABM in Julia

TL;DR: When writing agent-based models in Julia, try to use a single agent type to get good performance. If you definitely need more than one type of agent you can still get good performance, but it requires a somewhat more complex design of your code.

Introduction

In this post I discuss basic approaches to implementing Agent-Based Models (ABMs) in Julia. It covers a fragment of a tutorial that I will be giving with Przemysław Szufel at the Social Simulation Conference 2018, in the workshop Running high performance simulations with Julia programming language on Monday, August 20.

The post summarizes some thoughts about issues raised in a recent discussion on Discourse about Agent Based Modeling in Julia.

While there are many possible approaches to implementing ABMs in Julia, I want to concentrate on basic techniques that can be picked up by someone who is just starting to learn Julia.

Our working example is an implementation of the forest fire model, which is described in detail in the excellent book An Introduction to Agent-Based Modeling by Uri Wilensky and William Rand. We will exactly reproduce the NetLogo implementation of the model. In particular, I will avoid certain possible optimizations of the code, to keep the organization of the logic following the NetLogo implementation.

This post is divided into three sections:
  1. Explaining how NetLogo model works
  2. Implementation in Julia using a single type of agent
  3. Implementation in Julia using several types of agents
All examples in this post should be run under Julia 0.7, currently in beta. I will update the code if something starts to break after Julia 0.7 is released, and that is why in this post the codes are linked as gists (here is a link for the impatient).

Also I assume that you know Julia a bit (during the workshop at the conference all will be explained starting from the basics).

Forest fire model

We have a 251x251 rectangular grid. Initially each cell of the grid is empty or contains a tree.
A tree has three possible states: green, on fire and burnt. Initially all trees are green, and a tree is present in a cell with probability density, which is a parameter of the model.

Now, here is how the model works. In the initial step we set all cells in the first row of the grid to contain trees that are on fire. Next, in each step:
  1. we select all trees that are on fire;
  2. we iterate through them in a random order;
  3. for each tree on fire if it touches a green tree then the green tree is set on fire;
  4. finally the on fire tree changes state to burnt.
Our question is what percentage of trees that are initially present will get burnt in the process.
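Before moving to the implementations, here is how a single epoch of this process could be sketched in Julia (a compact paraphrase of the rules above with my own state encoding; it is not one of the linked implementations):

using Random

# grid values: 0 = empty, 1 = green tree, 2 = tree on fire, 3 = burnt tree
function epoch!(grid)
    onfire = shuffle!(findall(isequal(2), grid))       # rules 1 and 2
    for pos in onfire
        for d in (CartesianIndex(0, 1), CartesianIndex(0, -1),
                  CartesianIndex(1, 0), CartesianIndex(-1, 0))
            n = pos + d
            if checkbounds(Bool, grid, n) && grid[n] == 1
                grid[n] = 2                            # rule 3: ignite a green neighbor
            end
        end
        grid[pos] = 3                                  # rule 4: on fire turns to burnt
    end
    return !isempty(onfire)                            # false once the fire is out
end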

For reference, running 100 replications of this model in NetLogo on my laptop takes around 30 seconds for density=0.55 with all animations and updating disabled, and over one minute for density=0.75 (I do not try, here or below, to do very precise benchmarks, as I want to concentrate on orders of magnitude).

A single type of agent

In this model, implementing it with a single type of agent (a tree) is natural. Such an implementation can be expected to be easily made efficient in Julia, because all containers (vectors, matrices, sets, dictionaries, etc.) will hold a single concrete type. The performance benefits of this are the following:
  1. Julia compiler should be able to infer types of all variables in all functions (we know that we have only one type of agent).
  2. In particular (and this is often crucial) Julia compiler knows what method of a function it should dispatch if some method has the agent as a parameter (e.g. action of the agent).
In this case the power of Julia is that, mostly, when you think about performance you do not care whether the type representing an agent is built into Julia or is your custom type, nor whether it is an immutable or mutable type (there are differences, and probably there are cases when they are significant, but I want to stress the first level of thinking).
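A minimal stand-alone illustration of why the concrete element type of a container matters (indicative only; this is not code from the linked files):

v_any = Any[i for i in 1:10^6] # abstract element type: each access needs dynamic dispatch
v_int = [i for i in 1:10^6]    # concrete Vector{Int}: element type known at compile time
@time sum(v_any)               # markedly slower and allocating
@time sum(v_int)               # fast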

To see this, consider two implementations of the model. The first one uses Int (built-in, immutable) to represent the agent state on the grid; the second uses a custom Tree type (user-defined, mutable, and storing some more information than the Int version):
  1. Version using integers: forestfire1.jl
  2. Version using custom type: forestfire2.jl
As you can see, the model with the Tree type does a bit more work, but essentially the code logic is very similar. Here are the timings of running the codes:

$ julia7 forestfire1.jl
  2.190497 seconds (1.11 M allocations: 112.423 MiB, 0.94% gc time)
  5.586829 seconds (465.93 k allocations: 303.833 MiB, 0.92% gc time)

$ julia7 forestfire2.jl
  2.924750 seconds (7.44 M allocations: 305.734 MiB, 3.42% gc time)
  7.770683 seconds (6.76 M allocations: 495.277 MiB, 6.90% gc time)

The version using integers is a bit faster as expected but they are both significantly (10x) faster than NetLogo and the timings are of the same order of magnitude.

Several types of agents

In this example using one type of agent is natural, but let us test what happens if we force several types of agents into the model. Specifically, we notice that in forestfire2.jl we have a when field that is meaningful only for a burnt (brown) tree. So we decide to use three separate types of agents: TreeGreen, TreeRed and TreeBrown. Additionally, we denote a cell without a tree with nothing.

The implementation of such a model is given in the file forestfire3.jl. The problem with it is that the grid matrix has element type Any (you can test yourself that making all tree types subtypes of some abstract type, or making the element type of the matrix a union, does not change what we get below). Therefore we can expect that it will be much slower. This is confirmed by running the model:

$ julia7 forestfire3.jl
 37.078864 seconds (694.45 M allocations: 20.809 GiB, 4.40% gc time)
 90.306497 seconds (2.04 G allocations: 60.961 GiB, 5.49% gc time)

and we see that we are roughly at the speed level of NetLogo.

The good thing is that Julia allows us to write such code, and in many cases it will be fast enough. Here the code is much slower because the agents perform a lot of very simple actions, so the cost of iteration is much larger than the cost of the actions themselves. If the agents had complex and expensive logic, it could be moved out to a function (a technique called barrier functions; see the sketch below) and the overhead of type instability would not be that significant.
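A minimal sketch of that technique (illustrative types and functions, not code from forestfire3.jl):

# the per-agent logic lives in methods specialized on concrete types
act(t::Int) = t + 1
act(t::Float64) = 2t

function step!(grid::Matrix{Any})
    for i in eachindex(grid)
        # one dynamic dispatch per element happens at this call, but the body
        # of act runs fully type-stable, so expensive logic pays no penalty
        grid[i] = act(grid[i])
    end
end

step!(Any[1 2.0; 3 4.0])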

However, the question is whether we can make the code fast while using agents of heterogeneous types. Here we will consider the simplest possible technique that achieves this (a minimal sketch follows this list). What you essentially do is:
  1. store information about agents in a tuple; I call it trees in the code;
  2. each entry of this tuple is a collection of agents of a single type (in the example we will use a vector, but the choice of collection should be tailored to the needs of the simulation);
  3. you create a single type, I call it TreeID in the code, that allows you to select an appropriate element from the tuple in a type-stable way; in our example it holds two fields:
    • typ identifying the agent type (the number of the slot within the tuple);
    • loc identifying the agent location (the position of the agent within the collection in that slot of the tuple);
  4. the crucial thing is that the trees tuple holding collections of agents of homogeneous type should always be indexed by a number known at compile time (alternatively you could create a struct and select its fields); this ensures that all usages of the trees tuple will allow the compiler to infer the type of the result (in short, what you have to avoid is passing a variable to index the trees tuple; the general pattern is to use a sequence of if-elseif-elseif... statements based on the value of typ in TreeID).
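Here is a minimal sketch of this pattern before we look at the real code (the types are simplified relative to forestfire4.jl; the typ mapping matches the list further below):

struct TreeGreen end # slot 1 of the tuple
struct TreeRed end   # slot 2 of the tuple
struct TreeBrown     # slot 3 of the tuple
    when::Int        # the field that is only meaningful for a burnt tree
end

struct TreeID
    typ::Int # 0 = no tree, 1 = green, 2 = red, 3 = brown
    loc::Int # position of the agent inside the corresponding collection
end

trees = (TreeGreen[], TreeRed[], TreeBrown[])

# the tuple is only ever indexed with literal numbers inside branches on typ,
# so the compiler infers a concrete element type in every branch
function when_burnt(id::TreeID, trees)
    if id.typ == 3
        trees[3][id.loc].when
    else
        nothing # only brown trees carry a burn time
    end
end

push!(trees[3], TreeBrown(7))
when_burnt(TreeID(3, 1), trees) # returns 7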
The code implementing this pattern is given in forestfire4.jl. I have even complicated it a bit on purpose, by adding x and y fields to TreeRed and defining a burn function that has to be called, to show that Julia is able to handle them at compile time. The downside is that the code got a bit more complex. We have the following mapping of typ values in TreeID:
  • 0 means no tree (thus no mapping to trees is needed)
  • 1 means green tree
  • 2 means red tree
  • 3 means brown tree
The crucial question is what the performance of this pattern is. Here is the result of running the code:

$ julia7 forestfire4.jl
  2.403108 seconds (1.04 M allocations: 171.372 MiB, 1.24% gc time)
  6.376856 seconds (505.42 k allocations: 653.171 MiB, 2.38% gc time)

And we see that it is very good.

The crucial benefits of this pattern are the following (by ID-structure I mean an equivalent of TreeID in a general code):
  1. You can iterate over agents in whatever order you want (the ID-structure does not force you to process types of agents in separate batches);
  2. You can perform actions that rely on the type of an agent without having to reach for the agent itself; you can do it at the ID-structure level;
  3. You can use the ID-structure anywhere you want (it can be in an action scheduler, in a representation of the locations of agents in space, in a graph of connections, ...);
  4. If you have methods that should have different implementations depending on the agent type, then by passing them the ID-structure and using the if-elseif-elseif... template inside, you can keep the logic that depends on the tuple-container (or struct-container) structure in only a few places of your code, and most of the time not care about it by working at the ID-structure level.

Sunday, February 25, 2018

A small adventure into Julia macro land

The Julia Manual teaches us, in the Noteworthy differences from Python section, that:
Julia evaluates default values of function arguments every time the method is invoked, unlike in Python where the default values are evaluated only once when the function is defined.

However, sometimes you want a value to be evaluated only once, when the function is defined. Recently a probably obvious fact dawned on me: this can conveniently be achieved using macros. Here is a simple example:

macro intvec()
    println("Hey!")
    Int[]
end

function f(x)
    v = @intvec()
    push!(v, x)
    v
end

When you run this code you can observe that Hey! is printed once (when @intvec is expanded).

Now let us check how the function works. Running:

for i in 1:5
    println(f(i))
end

produces:

[1]
[1, 2]
[1, 2, 3]
[1, 2, 3, 4]
[1, 2, 3, 4, 5]

and we can see that @intvec was not run again (no Hey! is printed). This is natural: macros are expanded only once, before the program is actually executed.

Another small example using comprehensions:

a = [Int[] for i in 1:3]
b = [@intvec() for i in 1:3]
push!(a[1], 1)
push!(b[1], 1)

Now let us compare the contents of a and b:

julia> a
3-element Array{Array{Int64,1},1}:
 [1]
 Int64[]
 Int64[]

julia> b
3-element Array{Array{Int64,1},1}:
 [1]
 [1]
 [1]

And we see that in the case of b each index points to the same array.

One might ask if this is only a special case or if it actually matters in daily Julia usage. A situation where this distinction is important came up recently when writing the documentation of the @threads macro. If you check out the definition of the f_fix function there, you will find:

using Base.Threads # f_fix uses @threads, nthreads and threadid from this module

function f_fix()
    s = repeat(["123", "213", "231"], outer=1000)
    x = similar(s, Int)
    rx = [Regex("1") for i in 1:nthreads()]
    @threads for i in 1:3000
        x[i] = findfirst(rx[threadid()], s[i]).start
    end
    count(v -> v == 1, x)
end

where we use Regex("1") instead of a more natural r"1" exactly because the latter would create only one instance of regex object.

So the question is: what is the benefit of r"1" then? The answer is performance: we have to compile the regex only once. This saves time if a function containing it is called many times, e.g.:

julia> f() = match(r"1", "123")
f (generic function with 2 methods)

julia> g() = match(Regex("1"), "123")
g (generic function with 1 method)

julia> using BenchmarkTools

julia> @benchmark f()
BenchmarkTools.Trial:
  memory estimate:  240 bytes
  allocs estimate:  4
  --------------
  minimum time:     139.043 ns (0.00% GC)
  median time:      143.627 ns (0.00% GC)
  mean time:        170.929 ns (12.91% GC)
  maximum time:     2.854 μs (90.67% GC)
  --------------
  samples:          10000
  evals/sample:     916

julia> @benchmark g()
BenchmarkTools.Trial:
  memory estimate:  496 bytes
  allocs estimate:  9
  --------------
  minimum time:     5.754 μs (0.00% GC)
  median time:      6.687 μs (0.00% GC)
  mean time:        7.313 μs (0.00% GC)
  maximum time:     97.039 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     6

The lesson is typical for Julia: you can squeeze out performance, but there are consequences that you should be aware of.