11 Jan 2024 - tsp
Last update 11 Jan 2024
6 mins
Fitting models to data is a task that physicists face on a daily basis. lmfit is a very nice and easy to use library for doing this in Python. Upfront - this is not the original way one is supposed to build models using lmfit. lmfit is intended to be a thin wrapper around scipy fitting routines for different fitting methods (by default it uses the Levenberg-Marquardt least squares method - depending on the set of parameters and the type of function one can use any of the many other fitting methods supported by lmfit and scipy) as well as a way to easily build composite functions based on existing model functions. This recipe on the other hand just uses the plain minimizer interface together with a guess and an objective function. It's very similar to the method shown in the minimize documentation but differs from the approach usually used in the rest of the documentation.
Installing lmfit
The easiest way to install lmfit is using FreeBSD's pkg to fetch precompiled binaries:
pkg install math/lmfit
Another way to install is using pip:
pip install lmfit
The idea is simple: You have some data and a model function that depends on a set of parameters. You then want to decide on the choice of parameters so that your model function matches the data in the best possible way. This is one of the most common tasks that arises in many practical problems.
Upfront let’s import all required functions and classes:
import numpy as np
import matplotlib.pyplot as plt
from lmfit import Parameters, fit_report, minimize
The first thing we need is an objective function that we want to fit to our data. This can be a polynomial, a Gaussian function, etc. This function can either be evaluated - then it gets passed the coordinate at which it should be evaluated and the parameters - or it can be fitted, in which case it also receives the data. Let's take a look at how such a simple function can look for a Gaussian:
$$ f(x; c, \mu, \sigma) = c \cdot \frac{1}{\sigma \sqrt{2 \pi}} e^{-\frac{1}{2} \frac{(x-\mu)^2}{\sigma^2}} $$

As one can see this function has three parameters:

- the scaling factor $c$,
- the mean $\mu$ (the position of the peak) and
- the standard deviation $\sigma$ (the width of the peak).

In the implementation below we additionally include a constant offset.
We are going to implement this method in a way that it receives the parameters as a Parameters object as its first argument, the evaluation point as its second argument and, optionally, a third argument that receives the data during fitting:
def modelFunction(pars, x, data = None):
    # First we convert the Parameters object into a dictionary,
    # this will make our code a little bit easier to understand
    vals = pars.valuesdict()

    # For the sake of readability assign our parameters to variables.
    amp = vals["amp"]
    mu = vals["mu"]
    sigma = vals["sigma"]
    offset = vals["offset"]

    # Evaluate the function
    modelValue = amp * (1.0 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mu)**2 / (sigma**2))) + offset

    # If we got data calculate the residual, else return the value
    if data is not None:
        return modelValue - data
    else:
        return modelValue
Now let's assume we got a bunch of data points at the positions data_x with values data_y and we want to fit this Gaussian function. For the sake of this demonstration we're going to generate a bunch of noisy data points ourselves:
real_mu = 3.4
real_sigma = 3.1
real_amp = 7.1
real_offset = -3
real_noise_lvl = 1
data_x = np.linspace(-10, 10, 100)
data_y = real_amp * np.exp(-0.5 * (data_x - real_mu)**2 / (real_sigma**2)) / (np.sqrt(2 * np.pi) * real_sigma) + real_offset
data_y_noisy = data_y + (np.random.rand(data_y.shape[0]) - 0.5) * real_noise_lvl
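Note that np.random.rand draws uniformly distributed noise. If one wants noise that matches the Gaussian error assumption underlying least squares fitting, normally distributed noise is a drop-in alternative - a sketch using the newer NumPy Generator API, with the noise level and seed chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # seeded so the run is reproducible

real_mu = 3.4
real_sigma = 3.1
real_amp = 7.1
real_offset = -3

data_x = np.linspace(-10, 10, 100)
data_y = real_amp * np.exp(-0.5 * (data_x - real_mu)**2 / (real_sigma**2)) / (np.sqrt(2 * np.pi) * real_sigma) + real_offset
# Zero-mean Gaussian noise with standard deviation 0.3 (arbitrary choice)
data_y_noisy = data_y + rng.normal(scale=0.3, size=data_y.shape)
```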
Let's take a short look at how this generated data looks:
fig, ax = plt.subplots()
ax.set_title("Simulated input data")
ax.plot(data_x, data_y, 'g--', label = "Real data")
ax.plot(data_x, data_y_noisy, label = "Noisy input data")
ax.grid()
ax.legend()
plt.show()
Now let's try to fit our function to the noisy data. First we create a Parameters object. For the initial values we make a naive guess of the parameters:
fit_pars = Parameters()
fit_pars.add("amp", value = np.max(data_y_noisy) - np.min(data_y_noisy), vary = True)
fit_pars.add("mu", value = data_x[np.argmax(data_y_noisy)], vary = True)
fit_pars.add("sigma", value = 1, vary = True)
fit_pars.add("offset", value = np.min(data_y_noisy), vary = True)
After initialization we just have to call minimize
:
out = minimize(modelFunction, fit_pars, args=(data_x,), kws = { 'data' : data_y_noisy })
This has already performed the fit. To get a human-readable fit report we can just use fit_report:
print(fit_report(out))
This will yield output like the following:
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 107
    # data points      = 100
    # variables        = 4
    chi-square         = 9.70078655
    reduced chi-square = 0.10104986
    Akaike info crit   = -225.296322
    Bayesian info crit = -214.875641
[[Variables]]
    amp:     9.66997823 +/- 1.66650837 (17.23%) (init = -1.615516)
    mu:      3.53609503 +/- 0.32519657 (9.20%) (init = 2.929293)
    sigma:   3.95359254 +/- 0.50650314 (12.81%) (init = 1)
    offset: -3.09039105 +/- 0.07705836 (2.49%) (init = -3.490874)
[[Correlations]] (unreported correlations are < 0.100)
    C(amp, offset)   = -0.906
    C(amp, sigma)    =  0.844
    C(sigma, offset) = -0.723
One can programmatically access $\chi^2$ via out.chisqr (in this example 9.70078654587556):
print(f"Chi squared: {out.chisqr}")
Now let’s plot the result again:
fig, ax = plt.subplots()
ax.plot(data_x, data_y_noisy, 'x', label = "Noisy input data")
ax.plot(data_x, modelFunction(out.params, data_x), 'r--', label = "Fitted function")
ax.plot(data_x, data_y, 'g', label = "Original function")
ax.grid()
ax.legend()
plt.show()
This method of course involves a lot of boilerplate code, since lmfit also supports simple model building as in the following example:
from lmfit.models import GaussianModel, ConstantModel

model = GaussianModel()
params = model.guess(data_y_noisy, x=data_x)

background = ConstantModel()
params.update(background.guess(data_y_noisy, x=data_x))

model = model + background
out = model.fit(data_y_noisy, params, x=data_x)
## Plot
fig, ax = plt.subplots()
ax.plot(data_x, data_y_noisy, 'x', label = "Original data")
ax.plot(data_x, out.best_fit, 'r', label = "Fitted function")
plt.show()
It gets really handy when one wants to build complex models, do iterative approximations that are hard to handle with other minimization or fitting systems, or perform experimental measurements in between the fitting steps.
The whole code used for this blog article can be found in a GitHub Gist.
This article is tagged: Physics, Programming, Math, Tutorial, Basics, Python, Statistics, Measurement, Gaussian, Mathematics
Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)
This webpage is also available via TOR at http://rh6v563nt2dnxd5h2vhhqkudmyvjaevgiv77c62xflas52d5omtkxuid.onion/