Numba - start

https://numba.pydata.org/
Numba makes Python code fast

https://numba.readthedocs.io/en/stable/user/index.html
User Manual

https://pypi.org/project/numba/
numba 0.61.2, released: Apr 9, 2025 [Py3.10+]

INSTALLING NUMBA


CONDA
conda install numba

PIP (in a virtual environment)
pip install numba   # for Python 3 only

APT (for the entire computer)
Debian packages: python3-numba, numba-doc, and dependences.

Numba is often used as a core package so its dependencies are kept to an absolute minimum, however, extra packages can be installed as follows to provide additional functionality:
(1) 'scipy' - enables support for compiling 'numpy.linalg' functions.
(2) 'colorama' - enables support for color highlighting in backtraces/error messages.
(3) 'pyyaml' - enables configuration of Numba via a YAML config file.
(4) 'icc_rt' - allows the use of the Intel SVML (high performance short vector math library, x86_64 only).

Working in the virtual environment with the latest version of Numba is recommended.

INTRODUCTION

Numba is an open source JIT (just-in-time) compiler that translates a subset of Python and NumPy code into fast machine code. Numba translates Python functions to optimized machine code at runtime using the industry-standard LLVM compiler library.

Numba offers a range of options for parallelizing your code for CPUs and GPUs, often with only minor code changes.


# Checking your installation.
$ python3
Python 3.7.3 (default, Mar 23 2024, 16:12:05) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numba
>>> numba.__version__
'0.42.0'            # Debian 10
>>> quit()
$ numba -s          # information about your system capabilities
System info:
--------------------------------------------------------------------------------
__Time Stamp__
2025-06-07 17:58:19.391019

__Hardware Information__
Machine                                       : x86_64
CPU Name                                      : skylake
Number of accessible CPU cores                : 12
Listed accessible CPUs cores                  : 0-11
CFS restrictions                              : None
CPU Features                                  : 
adx aes avx avx2 bmi bmi2 clflushopt cmov cx16 f16c fma fsgsbase invpcid lzcnt
mmx movbe pclmul popcnt prfchw rdrnd rdseed sahf sgx sse sse2 sse3 sse4.1 sse4.2
ssse3 xsave xsavec xsaveopt xsaves

__OS Information__
Platform                                      : Linux-4.19.0-27-amd64-x86_64-with-debian-10.13
Release                                       : 4.19.0-27-amd64
System Name                                   : Linux
Version                                       : #1 SMP Debian 4.19.316-1 (2024-06-25)
OS specific info                              : debian10.13
glibc info                                    : glibc 2.28

__Python Information__
Python Compiler                               : GCC 8.3.0
Python Implementation                         : CPython
Python Version                                : 3.7.3
Python Locale                                 : pl_PL UTF-8

__LLVM information__
LLVM version                                  : 7.0.1

__CUDA Information__
CUDA driver library cannot be found or no CUDA enabled devices are present.
Error class: <class 'numba.cuda.cudadrv.error.CudaSupportError'>

__ROC Information__
ROC available                                 : False
Error initialising ROC due to                 : No ROC toolchains found.
No HSA Agents found, encountered exception when searching:
Error at driver init: 
NUMBA_HSA_DRIVER /opt/rocm/lib/libhsa-runtime64.so is not a valid file path.  
Note it must be a filepath of the .so/.dll/.dylib or the driver:

__SVML Information__
SVML state, config.USING_SVML                 : False
SVML library found and loaded                 : False
llvmlite using SVML patched LLVM              : False
SVML operational                              : False

__Threading Layer Information__
TBB Threading layer available                 : True
OpenMP Threading layer available              : True
Workqueue Threading layer available           : True

__Numba Environment Variable Information__
None set.

__Conda Information__
Conda not present/not working.
Error was [Errno 2] No such file or directory: 'conda': 'conda'

# numba1.py
from numba import jit
import numpy as np
import time

x = np.arange(1000000, dtype=np.float64).reshape(1000, 1000)
# Numba likes NumPy arrays, Python lists should be avoided.

@jit(nopython=True)   # set "nopython" mode for best performance
def go_fast(arr):   # function is compiled to machine code when called the first time
    trace = 0.0
    for i in range(arr.shape[0]):   # Numba likes loops
        trace += np.tanh(arr[i, i]) # Numba likes NumPy functions
    return arr + trace              # Numba likes NumPy broadcasting

start = time.time()
go_fast(x)
end = time.time()
print("Elapsed (with compilation) = {}".format(end - start))

start = time.time()
go_fast(x)
end = time.time()
print("Elapsed (after compilation) = {}".format(end - start))

# Results.
# Without numba = 0.0017561912536621094
# Elapsed (with compilation) = 0.12379336357116699
# Elapsed (after compilation) = 0.0009722709655761719

# Second version - eager compilation.
from numba import njit, float64

@njit('float64[:,:](float64[:,:])')
def go_fast(arr): ...

OBJECT MODE AND NOPYTHON MODE

The Numba @jit decorator fundamentally operates in two compilation modes, 'nopython' mode and 'object' mode.

The behaviour of the 'nopython' compilation mode is to essentially compile the decorated function so that it will run entirely without the involvement of the Python interpreter. This is the recommended and best-practice way to use the Numba @jit decorator as it leads to the best performance.

Should the compilation in 'nopython' mode fail, Numba can compile using 'object' mode. This is a fall back mode for the @jit decorator if 'nopython=True' is not set. In this mode Numba will identify loops that it can compile and compile those into functions that run in machine code, and it will run the rest of the code in the interpreter. For best performance avoid using this mode!

Speed up varies depending on application but can be one to two orders of magnitude.

Warning. Now (numba 0.55.2) the NumbaDeprecationWarning is raised if this fall-back behaviour is present without 'forceobj=True'. @jit will by default compile in 'nopython' mode and 'object' mode compilation will become opt-in only.

HOW DOES NUMBA WORK?

Numba reads the Python bytecode for a decorated function and combines this with information about the types of the input arguments to the function. It analyzes and optimizes your code, and finally uses the LLVM compiler library to generate a machine code version of your function, tailored to your CPU capabilities. This compiled version is then used every time your function is called.