Julia snippets: A small adventure into Julia macro land

The Julia Manual teaches us that

Julia evaluates default values of function arguments every time the method is invoked, unlike in Python where the default values are evaluated only once when the function is defined.

in the Noteworthy differences from Python section.

However, sometimes you want a value to be evaluated only once, when the function is defined. Recently the probably obvious fact dawned on me that this can be conveniently achieved using macros. Here is a simple example:

macro intvec()
    println("Hey!")
    Int[]
end

function f(x)
    v = @intvec()
    push!(v, x)
    v
end

When you run this code, Hey! is printed exactly once: when the definition of f is processed and @intvec is expanded.

Now let us check how the function works. Running:

for i in 1:5
    println(f(i))
end

produces:

[1]
[1, 2]
[1, 2, 3]
[1, 2, 3, 4]
[1, 2, 3, 4, 5]

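and we can see that @intvec was not run again (no Hey! is printed). This is natural: a macro is expanded only once, when the code containing it is processed, before that code is actually executed. Note that @intvec returns an array rather than an expression, so this single array is embedded in the body of f and reused by every call.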
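To make the difference between a macro that returns a value and one that returns an expression concrete, here is a minimal sketch. The names @shared_vec, @fresh_vec, collect_shared and collect_fresh are hypothetical and introduced only for this illustration; the expression-returning variant follows the suggestion made in the comment under this post.

```julia
# Hypothetical names, used only for illustration.

# Returns an array *value*: that one object is embedded in the expanded code.
macro shared_vec()
    Int[]
end

# Returns an *expression*: `Int[]` is evaluated each time the expanded code runs.
macro fresh_vec()
    :(Int[])
end

function collect_shared(x)
    v = @shared_vec()
    push!(v, x)
    v
end

function collect_fresh(x)
    v = @fresh_vec()
    push!(v, x)
    v
end

s1 = collect_shared(1)
s2 = collect_shared(2)
f1 = collect_fresh(1)
f2 = collect_fresh(2)

println(s1 === s2)  # true: both calls returned the very same array
println(s2)         # [1, 2]
println(f1)         # [1]
println(f2)         # [2]
```

Each textual occurrence of a macro call is expanded separately, so two different functions that both use @shared_vec() would each get their own array.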

Another small example using comprehensions:

a = [Int[] for i in 1:3]
b = [@intvec() for i in 1:3]

push!(a[1], 1)

push!(b[1], 1)

Now let us compare the contents of a and b:

julia> a
3-element Array{Array{Int64,1},1}:
[1]
Int64[]
Int64[]

julia> b
3-element Array{Array{Int64,1},1}:
[1]
[1]
[1]

And we see that in the case of b every index points to the same array.
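The same kind of sharing can be reproduced without macros, which may help to build intuition. The sketch below uses fill, which is not part of the original example and is brought in only as a comparison: fill stores the very same object in every slot, while the comprehension evaluates Int[] once per element. The variable names shared and distinct are assumptions made for this illustration.

```julia
# `fill` evaluates its first argument once and stores that one object everywhere.
shared = fill(Int[], 3)

# The comprehension body is evaluated once per element, giving three arrays.
distinct = [Int[] for _ in 1:3]

push!(shared[1], 1)
push!(distinct[1], 1)

println(shared[1] === shared[2])      # true: the same array in every slot
println(distinct[1] === distinct[2])  # false: independent arrays
```

The === operator tests object identity, so it is a quick way to check whether two slots refer to the same array.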

One might ask whether this is only a special case or whether it actually matters in daily Julia usage. A situation where this distinction is important came up recently when writing the documentation of the @threads macro. If you check out the definition of the f_fix function there, you will find:

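using Base.Threads  # provides @threads, nthreads and threadid

function f_fix()
    s = repeat(["123", "213", "231"], outer=1000)
    x = similar(s, Int)
    # one Regex per thread, so that no instance is shared between threads
    rx = [Regex("1") for i in 1:nthreads()]
    @threads for i in 1:3000
        x[i] = findfirst(rx[threadid()], s[i]).start
    end
    count(v -> v == 1, x)
end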

where we use Regex("1") instead of the more natural r"1" exactly because the latter would create only one instance of the regex object, which all threads would then share.
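A quick way to see this difference is to compare object identity. This is only a sketch; the function names literal_regex and fresh_regex are hypothetical and not taken from the documentation mentioned above.

```julia
# Hypothetical helper names, used only for illustration.
literal_regex() = r"1"       # the Regex is created once, when the r"..." macro is expanded
fresh_regex() = Regex("1")   # a new Regex is created on every call

println(literal_regex() === literal_regex())  # true: one shared object
println(fresh_regex() === fresh_regex())      # false: distinct objects
```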

So the question is: what is the benefit of r"1" then? The answer is performance: we have to compile the regex only once. This saves time if a function containing it is called many times, e.g.:

julia> f() = match(r"1", "123")
f (generic function with 2 methods)

julia> g() = match(Regex("1"), "123")
g (generic function with 1 method)

julia> using BenchmarkTools

julia> @benchmark f()
BenchmarkTools.Trial:
memory estimate: 240 bytes
allocs estimate: 4
--------------
minimum time: 139.043 ns (0.00% GC)
median time: 143.627 ns (0.00% GC)
mean time: 170.929 ns (12.91% GC)
maximum time: 2.854 μs (90.67% GC)
--------------
samples: 10000
evals/sample: 916

julia> @benchmark g()
BenchmarkTools.Trial:
memory estimate: 496 bytes
allocs estimate: 9
--------------
minimum time: 5.754 μs (0.00% GC)
median time: 6.687 μs (0.00% GC)
mean time: 7.313 μs (0.00% GC)
maximum time: 97.039 μs (0.00% GC)
--------------
samples: 10000
evals/sample: 6

The lesson is typical for Julia: you can squeeze out extra performance, but there are consequences that you should be aware of.

2 comments:

FrontRangeGamer, February 26, 2018 at 11:54 PM
The manual says "A macro maps a tuple of arguments to a returned expression...". I found your results a bit confusing until I realized `@intvec` isn't returning an expression. If you use `macro intvec2() quote Int[] end end` you get a distinct array on each invocation. Why do repeated calls to `push!(@intvec,1)` always return `[1]` and not return Vectors of increasing size? I expect the latter from your `b = [@intvec() for i in 1:3]` example.


Sunday, February 25, 2018
