A python library for investigating some of the basic / classical elements of Random Matrix Theory, including eigenvalue generation, trimming, unfolding and computation and plotting of some spectral observables.
- empyricalRMT - Random Matrix Theory Tools for Python
- Features
- Examples
- Documentation
- Installation
- Limitations
Comparing actual eigenvalues to theory requires, in most cases, unfolding. This procedure
is often ad hoc, poorly documented, and dramatically impacts conclusions about the nature
of the systems under study. With empyricalRMT
, it is easy to unfold and visualize
different unfolding decisions:
import matplotlib.pyplot as plt
from empyricalRMT.eigenvalues import Eigenvalues
from empyricalRMT.smoother import SmoothMethod
eigs = Eigenvalues.generate(1000, kind="goe")
unfoldings = {
"Exponential": eigs.unfold(smoother=SmoothMethod.Exponential),
"Polynomial": eigs.unfold(smoother=SmoothMethod.Polynomial, degree=5),
"Gompertz": eigs.unfold(smoother=SmoothMethod.Gompertz),
"GOE": eigs.unfold(smoother=SmoothMethod.GOE),
}
N = len(unfoldings)
fig, axes = plt.subplots(ncols=3, nrows=N)
for i, (label, unfolded) in enumerate(unfoldings.items()):
title = f"{label} Unfolding"
unfolded.plot_nnsd(
title=title,
brody=True,
brody_fit="mle",
ensembles=["goe", "poisson"],
fig=fig,
axes=axes[i][0],
)
unfolded.plot_spectral_rigidity(title=title, ensembles=["goe"], fig=fig, axes=axes[i][1])
unfolded.plot_level_variance(title=title, ensembles=["goe"], fig=fig, axes=axes[i][2])
axes[i][0].legend().set_visible(False) if i != 0 else None
axes[i][1].legend().set_visible(False) if i != 0 else None
plt.show()
For a sample of unfolded eigenvalues, computation of the spectral rigidity,
and level number variance
requires some non-trivial Monte-Carlo computations that border on intractable
with plain Python and even NumPy / scikit-learn / scipy. empyricalRMT
uses
Numba and carefully written code to dramatically speed-up and parallelize the
computation of these metrics:
import numpy as np
from timeit import repeat
from empyricalRMT.eigenvalues import Eigenvalues
unfolded = Eigenvalues.generate(5000, kind="goe").unfold(smoother="goe")
L = np.arange(2, 50, 1, dtype=np.float64)
results = repeat(
"unfolded.spectral_rigidity(L)", number=10, globals=dict(unfolded=unfolded, L=L), repeat=10
)
print(f"Mean: {np.mean(results):0.2f}s. Range: [{np.min(results):0.2f}, {np.max(results):0.2f}]")
# Level number variance is far more variable for larger L, so use a smaller range here
L = np.arange(2, 20, 1, dtype=np.float64)
results = repeat(
"unfolded.level_variance(L)", number=10, globals=dict(unfolded=unfolded, L=L), repeat=10
)
print(
f"Mean: {np.mean(results):0.2f}s. Range: [{np.min(results):0.2f}, {np.max(results):0.2f}]"
)
Mean: 3.18s. Range: [2.88, 3.62]
Mean: 3.56s. Range: [3.08, 6.46]
Sample eigenvalues from large GOE matrices (provided they can fit in memory) fast via equivalently distributed tridiagonal matrices:
from empyricalRMT.construct import generate_eigs
eigs = generate_eigs(matsize=30000, kind="goe", log_time=True)
""" Output:
>>> 15:40:39 (Mar10) -- computing eigenvalues...
>>> 15:41:05 (Mar10) -- computed eigenvalues.
"""
E.g. under 30 seconds (Processor: 4-core / 8-threads, Intel(R) Xeon(R) CPU E3-1575M v5 @ 3.00GHz).
Sample a random GOE matrix and investigate some basic properties:
import numpy as np
from empyricalRMT._types import MatrixKind
from empyricalRMT.eigenvalues import Eigenvalues
from empyricalRMT.smoother import SmoothMethod
# generate eigenvalues from a 2000x2000 sample from the Gaussian Orthogonal Ensemble
eigs = Eigenvalues.generate(matsize=2000, kind=MatrixKind.GOE)
# unfold "analytically" using Wigner semi-circle
unfolded = eigs.unfold(smoother=SmoothMethod.GOE)
# visualize core spectral observables and unfolding fit
unfolded.plot_observables(
rigidity_L=np.arange(2, 20, 0.5),
levelvar_L=np.arange(2, 20, 0.5),
title="GOE Spectral Observables - Analytic Unfolding",
)
Plot some classic observables and compare to theory:
import numpy as np
from empyricalRMT._types import MatrixKind
from empyricalRMT.eigenvalues import Eigenvalues
from empyricalRMT.smoother import SmoothMethod
# generate eigenvalues from a 2000x2000 sample from the Gaussian Orthogonal Ensemble
eigs = Eigenvalues.generate(matsize=5000, kind=MatrixKind.GOE)
ensembles = ["poisson", "goe"] # theoretically expected curves to plot
unfolded.plot_nnsd(ensembles=ensembles) # nearest neighbours spacings
unfolded.plot_nnnsd(ensembles=["goe"]) # next-nearest neighbours spacings
unfolded.plot_spectral_rigidity(ensembles=ensembles)
unfolded.plot_level_variance(ensembles=ensembles)
Visually inspect / detect a questionable unfolding fit:
import matplotlib.pyplot as plt
from empyricalRMT.eigenvalues import Eigenvalues
from empyricalRMT.smoother import SmoothMethod
# generate time series data
T = np.random.standard_normal([1000, 250])
eigs = Eigenvalues.from_time_series(T, trim_zeros=False)
exp_unfolded = eigs.unfold(smoother=SmoothMethod.Exponential)
poly_unfolded = eigs.unfold(smoother=SmoothMethod.Polynomial, degree=5)
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, sharex=True, sharey=True)
exp_unfolded.plot_fit(fig=fig, axes=ax1, title="Exponential Unfolding")
poly_unfolded.plot_fit(fig=fig, axes=ax2, title="Polynomial Degree 5 Unfolding")
plt.show()
The source
code
aims to be well documented, and is the sole documentation for the project. If
you are using Python interactively (e.g. in a Jupyter notebook or REPL) then
these comments are quickly available by calling help(rmt_object)
. If youare
using a decent IDE or editor with appropriate extensions (e.g. PyCharm, VSCode,
modern Jupyter) you should be able to see the documentation on mouse hover or
via your editor shortcuts (e.g. Go To Definition).
Mostly, you just want import empyricalRMT.eigenvalues.Eigenvalues
and use
either Eigenvalues.generate()
to get fresh random eigenvalues, or
Eigenvalues(eigs)
for some ArrayLike
eigs
. Trimming and unfolding
methods are available on Eigenvalues
instances, and Eigenvalues.unfold()
gives you an empyricalRMT.unfold.Unfolded
instance, which has methods for
visualization and/or computation of spectral observables.
Otherwise, functional forms of core functions for computing spectral observables
are located in empyricalRMT.observables
, if you just want to pass in raw
NumPy arrays and get e.g. the spectral rigidity or level number variance.
pip install empyricalRMT
In your project using Poetry:
poetry add empyricalRMT && poetry install
git clone https://github.com/stfxecutables/empyricalRMT
cd empyricalRMT
poetry install --with dev
poetry shell
Run tests with pytest
.
Note this library is completely untested on Windows, but might work, if the dependencies also work on Windows. I've tried to use practices that have led to successful portability to Windows in the past, and there is nothing too fancy going on here, but it still very likely something will break on Windows.
This library was and is being developed on a reasonably decent machine (4 cores / 8 threads, 3.00GHz, 64GB RAM) that is currently running Ubuntu 18.04 LTS. Out of convenience, I have used Numba, but I know that this is not the most portable solution available, and may result in some issues.
Default values for most parameters were chosen because they provided reasonably accurate results (e.g. when compared to as predicted by theory) in reasonable amounts of time on this machine. I have tested the library on my old 2015 MacBook, and the defaults seem okay, but it is possible that they may not work well on your machine, especially with regards to memory.
RMT results are theoretical, and many results only hold for real numbers, as N approaches infinity. In practice, it seems that floating-point precision issues (whether in numerical integration, solving of eigenproblems, or in the convergence of some of the stochastic methods used in this library) plus the reality of working with finite matrices means that there are significant limits on the extent to which simulations agree with theory, especially when looking at long-range spectral observables (e.g. spectral rigidity or level number variance for L > 20) and when matrices are small (N < 2000).
Finally, I am just a dabbler in RMT. I have tried to limit myself to implementing only aspects I feel I understand, but I may have made some basic errors in implementation or understanding at some point. If you notice any issues or mistakes, corrections are always warmly welcomed!