OMAR — Open Multivariate Adaptive Regression
Python package for discovering localised, linear structures in complex, high-dimensional datasets.
What is omar?
omar (Open Multivariate Adaptive Regression) is an implementation of MARS [Friedman, 1991; 1993]. It approximates a high-dimensional function by an additive model of low-dimensional functions, exploiting local lower-dimensional manifolds in the data. The model is an additive expansion over a subset of the complete tensor product of the univariate truncated power spline basis. Because the resulting function space grows exponentially, omar searches it with a heuristic grow-prune strategy.
It is designed to automatically construct accurate, interpretable, and efficient piecewise-linear models of functions with many predictor variables and one response variable.
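To make the "grow" half of the grow-prune idea concrete, here is a minimal sketch of one forward step in a single variable, written with plain NumPy. It is not omar's implementation or API: the function name `forward_step`, the exhaustive knot search, and the use of the residual sum of squares as the selection criterion are all assumptions made for illustration. The pruning step, which would later remove basis functions again, is not shown.

```python
import numpy as np

def forward_step(x, y, design):
    """Hypothetical grow step: try each interior sample value as a knot,
    append the mirrored pair max(x - t, 0), max(t - x, 0) to the design
    matrix, refit by least squares, and keep the knot with the lowest
    residual sum of squares."""
    best_rss, best_knot, best_design = np.inf, None, None
    for t in np.unique(x)[1:-1]:  # skip the two boundary values
        pair = np.column_stack([np.maximum(x - t, 0.0),
                                np.maximum(t - x, 0.0)])
        candidate = np.column_stack([design, pair])
        coef, *_ = np.linalg.lstsq(candidate, y, rcond=None)
        rss = float(np.sum((y - candidate @ coef) ** 2))
        if rss < best_rss:
            best_rss, best_knot, best_design = rss, t, candidate
    return best_rss, best_knot, best_design

# Toy target with a single kink at x = 0.4 (illustrative data only).
x = np.linspace(0.0, 1.0, 101)
y = np.abs(x - 0.4)

design = np.ones((x.size, 1))  # start from the constant basis function
rss, knot, design = forward_step(x, y, design)
print(f"selected knot: {knot:.2f}, residual sum of squares: {rss:.2e}")
```

Repeating such a step, and also allowing products with basis functions already in the model, is what makes the candidate space grow so quickly and why a heuristic search is needed.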
omar is ideal when:
- You want an accurate piecewise-linear approximation of high-dimensional, noisy data.
- You prefer models with interpretable basis functions.
- You need a fast, scalable tool for MARS modeling.
The model has the form:
$$ \hat{f}(x) = \sum_{n} a_n B_n(x) $$
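where each $B_n(x) = \prod_k \max(\pm(x_{v_k} - t_k), 0)$ is a product of piecewise-linear factors, each acting on a single variable $x_{v_k}$ with its root at $t_k$. A single factor is therefore zero on one side of its root and linear on the other.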
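The model form above can be sketched directly in NumPy. The helpers `hinge` and `basis_function`, the variable indices, and the knot values below are hypothetical and chosen only for illustration; they are not part of omar's API. With the basis functions fixed, the coefficients $a_n$ follow from an ordinary least-squares fit.

```python
import numpy as np

def hinge(x, t, sign=1.0):
    """Truncated linear function max(sign * (x - t), 0)."""
    return np.maximum(sign * (x - t), 0.0)

def basis_function(X, factors):
    """Product of hinge factors. `factors` is a list of
    (variable index, root t, sign) tuples; an empty list gives
    the constant basis function 1."""
    B = np.ones(X.shape[0])
    for v, t, s in factors:
        B *= hinge(X[:, v], t, s)
    return B

# Noise-free toy data built from two known basis functions.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = (2.0 * hinge(X[:, 0], 0.5)
     + 3.0 * hinge(X[:, 0], 0.5) * hinge(X[:, 1], 0.3, sign=-1.0))

# Assumed basis: constant, one hinge, one two-variable product.
bases = [
    [],
    [(0, 0.5, +1.0)],
    [(0, 0.5, +1.0), (1, 0.3, -1.0)],
]

# Design matrix with one column per B_n, then least squares for a_n.
design = np.column_stack([basis_function(X, f) for f in bases])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
y_hat = design @ coef
print("fitted coefficients:", np.round(coef, 6))
```

In this sketch the roots are given in advance; finding good variables and roots automatically is the part that the MARS search handles.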
Computational Backends
To enable practical use on modern hardware, omar includes two compute backends:
- Pure Python for accessibility and clarity, accelerated with Numba JIT compilation on the CPU.
- Fortran via `f2py`, with native BLAS/LAPACK routines and OpenMP parallelism.
Installation
The easiest way to get started is to install the prebuilt wheel from PyPI: `pip install omar`.
Citations
- Friedman, J. H. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 19(1), 1–67. [JSTOR](http://www.jstor.org/stable/10.2307/2241837)
- Friedman, J. H. (1993). Fast MARS. Technical Report No. 110, Stanford University.
- Krause, O., & Igel, C. (2015). A More Efficient Rank-one Covariance Matrix Update for Evolution Strategies. FOGA ‘15, 129–136. [DOI](https://doi.org/10.1145/2725494.2725496)
Key highlights
- Modernized implementation of the Multivariate Adaptive Regression Splines (MARS) algorithm
- Improved numerical efficiency, based on modern rank-one update strategies
- Optional Fortran acceleration with OpenMP parallelism for large datasets