https://numba.readthedocs.io/en/stable/user/5minguide.html
The recommended way to use the @jit decorator is to let Numba decide when and how to optimize.
from numba import jit
@jit(nopython=True)
def f(x, y):
return x + y
print(f(1,2)) # return 3
print(f(1.3,2)) # return 3.3
print(f(1j,2)) # return (2+1j)
# Three calls will generate different code paths.
We can tell Numba the function signature we are expecting. No other specialization will be allowed.
from numba import jit, int64
@jit(int64(int64, int64),nopython=True)
def f(x, y):
return x + y
print(f(1,2)) # return 3
print(f(1.3,2)) # return 3 (float -> int)
#print(f(1j,2)) # TypeError
A number of keyword-only arguments can be passed to the @jit decorator.
nopython=True # force 'nopython mode' forceobj=True # force 'object mode' fastmath=True # relax some numerical rigour parallel=True # enables automatic parallelization (no GIL!), only in 'nopython mode' nogil=True # Numba will release the GIL cache=True # using a file-based cache debug=True # for debugging
https://numba.pydata.org/numba-doc/0.11/prange.html
Numba implements the ability to run loops in parallel. The loops body is scheduled in seperate threads, and they execute in a nopython numba context. 'prange' automatically takes care of data privatization and reductions.
# numba3.py
from numba import njit, float64, prange
x = np.arange(1000000, dtype=np.float64)
#@njit(parallel=True) # with parallelization
#@njit('float64(float64[:])') # eager compilation.
@njit
def parallel_sum(arr):
result = 0.0
for i in prange(arr.shape[0]):
result += arr[i]
return result
start = time.time()
parallel_sum(x)
end = time.time()
print("Elapsed (with compilation) = {}".format(end - start))
start = time.time()
parallel_sum(x)
end = time.time()
print("Elapsed (after compilation) = {}".format(end - start))
# Results.
# Without numba = 0.11021208763122559
# Elapsed (with compilation) = 0.05432915687561035
# Elapsed (after compilation) = 0.0010497570037841797
# With parallelization (12 cores).
# Elapsed (with compilation) = 0.1597743034362793
# Elapsed (after compilation) = 0.0004372596740722656