Numba - an overview
Numba is a just-in-time compiler for Python that works best on code that uses NumPy arrays and functions, and loops.
ArviZ includes Numba as an optional dependency, and a number of functions in utils.py are provided for systems on which Numba is installed.
An additional class, arviz.Numba, allows Numba to be disabled and re-enabled on such systems.
A simple example to display the effectiveness of Numba
import numpy as np

data = np.random.randn(1000000)
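The two functions timed in this example could look like the following minimal sketch. It assumes a single-pass sum and sum-of-squares formula and applies numba.njit directly; the function bodies and the import fallback are illustrative assumptions, not necessarily the code that produced the timings shown here.

```python
import numpy as np

try:
    from numba import njit  # Numba is an optional dependency
except ImportError:
    def njit(func):
        """Fallback (assumption): leave the function uncompiled without Numba."""
        return func


def variance(data, ddof=0):
    """Variance from a plain Python loop over the array elements."""
    total, total_sq = 0.0, 0.0
    for x in data:
        total += x
        total_sq += x * x
    n = len(data)
    # E[x^2] - E[x]^2, then rescaled for the requested degrees of freedom.
    var = total_sq / n - (total / n) ** 2
    return var * n / (n - ddof)


# Same body, compiled to machine code on the first call when Numba is present.
variance_jit = njit(variance)
```

The first call to variance_jit includes compilation time, so it is worth calling it once before timing it.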
%timeit variance(data, ddof=1)
140 ms ± 2.59 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit variance_jit(data, ddof=1)
1.03 ms ± 44.3 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
That is roughly 135 times faster (140 ms against 1.03 ms). Let’s compare this to NumPy:
%timeit np.var(data, ddof=1)
1.79 ms ± 124 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In certain scenarios, Numba can even outperform NumPy: here the compiled loop takes 1.03 ms against 1.79 ms for np.var.
Numba within ArviZ
Let’s see Numba’s effect on a few ArviZ functions.
import arviz as az

summary_data = np.random.randn(1000, 100, 10)
school = az.load_arviz_data("centered_eight").posterior["mu"].values
The methods of the Numba class can be used to enable or disable Numba. The attribute numba_flag indicates whether Numba is currently enabled within ArviZ.
Numba.disable_numba()
Numba.numba_flag
False
%timeit ks_summary(summary_data)
57.8 ms ± 1.02 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit ks_summary(school)
462 µs ± 16.8 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Numba.enable_numba()
Numba.numba_flag
True
%timeit ks_summary(summary_data)
7.18 ms ± 359 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit ks_summary(school)
359 µs ± 62.7 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
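Numba has provided a substantial speedup once again: ks_summary runs about 8 times faster on summary_data (57.8 ms down to 7.18 ms), with a more modest gain on the smaller school array (462 µs down to 359 µs).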