Apple Silicon notes
merton is built and tested on Apple Silicon Macs (M1/M2/M3/M4) as a
first-class target. Wheels are published for macos-14 arm64 in the
release matrix, so pip install merton installs a prebuilt wheel, with
no local compilation, on Apple Silicon Macs running Python 3.11+.
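The arm64 wheel is only selected when the interpreter itself is a native arm64 build. An x86_64 Python running under Rosetta 2 reports itself as Intel, and pip would then pick the x86_64 wheel instead. The sketch below is one way to check this with the standard library; the helper name is_native_apple_silicon is hypothetical and not part of merton.

```python
import platform
import sysconfig


def is_native_apple_silicon(system=None, machine=None) -> bool:
    """Return True when the interpreter is a native arm64 build on macOS.

    Hypothetical helper, not part of merton. The arguments default to the
    values reported by the running interpreter and exist only to make the
    function easy to test.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


# sysconfig.get_platform() shows the platform string of this interpreter,
# which is what wheel selection is based on.
print(is_native_apple_silicon(), sysconfig.get_platform())
```

If this prints False on an Apple Silicon Mac, the Python in use is probably an Intel build running under translation.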
MLX (Metal GPU)
Install the mlx extra to route panel math to the GPU:
uv pip install "merton[mlx]"
MLX uses a unified-memory model: arrays live in shared memory that the
CPU and GPU both see. There’s no host-to-device copy when handing data
between code paths, unlike the usual discrete-GPU CUDA workflow, where
arrays are copied explicitly between host and device memory. The normal
CDF is computed with mlx.core.erf, so that step runs on the Metal GPU.
import mlx.core as mx
from merton import distance_to_default
A = mx.array([100.0, 200.0, 300.0])
dd = distance_to_default(A, 0.25, 60.0, 0.04, 1.0) # stays on MLX
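For reference, the standard normal CDF can be written in terms of the error function as Φ(x) = ½ (1 + erf(x / √2)), which is why a single erf primitive is enough to keep it on the GPU. The sketch below illustrates that identity. It is not merton's internal implementation, and the names norm_cdf and norm_cdf_mlx are hypothetical. The MLX variant imports mlx.core lazily, so the plain-Python path works without the mlx extra.

```python
import math


def norm_cdf(x, erf=math.erf):
    """Standard normal CDF: Phi(x) = 0.5 * (1 + erf(x / sqrt(2))).

    `erf` is injectable so the same formula can run on floats (math.erf)
    or on MLX arrays (mlx.core.erf).
    """
    return 0.5 * (1.0 + erf(x / math.sqrt(2.0)))


def norm_cdf_mlx(values):
    """Same formula evaluated on an MLX array (assumes merton[mlx] is installed)."""
    import mlx.core as mx  # imported lazily; only needed for this path

    return norm_cdf(mx.array(values), erf=mx.erf)


print(norm_cdf(0.0))  # 0.5, since erf(0) == 0
```

On a machine with MLX installed, norm_cdf_mlx([0.0, 1.0]) would return an MLX array rather than Python floats.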
Free-threaded Python
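Apple Silicon laptops typically have 8 to 16 CPU cores, split between performance and efficiency cores. Combined with free-threaded Python 3.13t / 3.14t, panel calibration can run in parallel across them and scales roughly with the core count, although efficiency cores contribute less than performance cores. See Free-threaded Python (PEP 703).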
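A minimal sketch of what this looks like in practice: detect whether the interpreter is actually running without the GIL, then fan panel chunks out over a thread pool. The function calibrate_chunk is a hypothetical stand-in for real per-chunk work and is not merton's API. On a regular (GIL-enabled) build the same code still runs, but pure-Python work will not execute in parallel.

```python
import os
import sys
import sysconfig
from concurrent.futures import ThreadPoolExecutor


def gil_disabled() -> bool:
    """True only on a free-threaded build that is running with the GIL off."""
    if not sysconfig.get_config_var("Py_GIL_DISABLED"):
        return False  # regular build
    # sys._is_gil_enabled() is available on Python 3.13+.
    check = getattr(sys, "_is_gil_enabled", None)
    return check is not None and not check()


def calibrate_chunk(chunk):
    """Hypothetical stand-in for per-chunk calibration work."""
    return sum(x * x for x in chunk)


def calibrate_panel(values, workers=None):
    """Split `values` into contiguous chunks and process them on a thread pool."""
    workers = workers or os.cpu_count() or 1
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(calibrate_chunk, chunks))


print("GIL disabled:", gil_disabled())
print(calibrate_panel(list(range(10)), workers=3))
```

Threads are used here rather than processes because, with the GIL off, they can share the panel data directly instead of pickling it across process boundaries.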
Native universal2 wheels?
We publish arm64-only Apple Silicon wheels (no universal2). Intel
Macs use the separate macos-13 x86_64 wheel. The reason is wheel size:
a universal2 wheel carries both architectures, which roughly doubles
the bundled Numba cache for little practical benefit, since Apple no
longer sells Intel laptops.