Package ivs :: Package sigproc :: Module fit

Module fit

Fit various functions to timeseries.

Section 1. Radial velocity data

1.1 BD+29.3070

Fit the orbit of the main sequence companion of the sdB+MS system BD+29.3070. The radial velocities are obtained from HERMES spectra using the cross correlation tool of the pipeline.

Necessary imports:

>>> from ivs.inout.ascii import read2array
>>> from ivs.sigproc import funclib
>>> import numpy as np

Read in the data:

>>> data = read2array('/home/jorisv/IVS_python/test_fit/BD+29.3070.dat')
>>> dates, rv, err = data[:,0], data[:,1], data[:,2]

Import the function we want to fit to the data from the function library:

>>> mymodel = funclib.kepler_orbit(type='single')

Setup the starting values of the parameters and the boundaries:

>>> pars = [1000, dates[0], 0.0, 0.0, (max(rv)-min(rv))/2, np.average(rv)]
>>> bounds = [(pars[0]/2, pars[0]*1.5), (data[0][0]-pars[0]/2,data[0][0]+pars[0]/2), (0.0,0.5), (0,2*np.pi), (pars[4]/4,pars[4]*2), (min(rv),max(rv))]
>>> vary = [True, True, True, True, True, True]
>>> mymodel.setup_parameters(values=pars, bounds=bounds, vary=vary)

Fit the model to the data:

>>> result = minimize(dates, rv, mymodel, weights=1/err)

Print the results:

>>> print mymodel.param2str()
         p = 1012.26 +/- 16.57 
        t0 = 2455423.65 +/- 11.27 
         e = 0.16 +/- 0.01 
     omega = 1.72 +/- 0.08 
         k = 6.06 +/- 0.04 
        v0 = 32.23 +/- 0.09

The minimizer already returned errors on the parameters, based on the Levenberg-Marquardt algorithm of scipy. But we can get more robust errors by using the Minimizer.estimate_error method of the minimizer wich uses an F-test to calculate confidence intervals, fx on the period and eccentricity of the orbit:

>>> ci = result.estimate_error(p_names=['p', 'e'], sigmas=[0.25,0.65,0.95])
>>> print confidence2string(ci, accuracy=4)
p                   
                 25.00 %               65.00 %               95.00 %
 -           1006.9878              997.1355              980.7742  
 +           1017.7479             1029.2554             1053.0851  
e                   
                 25.00 %               65.00 %               95.00 %
 -              0.1603                0.1542                0.1433  
 +              0.1667                0.1731                0.1852

Now plot the resulting rv curve over the original curve:

>>> p = pl.figure(1)
>>> result.plot_results()

]include figure]]ivs_sigproc_lmfit_BD+29.3070_1.png]

We can use the Minimizer.plot_confidence_interval method to plot the confidence intervals of two parameters, and show the correlation between them, fx between the period and T0:

>>> p = pl.figure(2)
>>> result.plot_confidence_interval(xname='p',yname='t0', res=30, filled=True)

]include figure]]ivs_sigproc_lmfit_BD+29.3070_2.png]

To get a better idea of how the parameter space behaves we can start the fitting process from different starting values and see how they converge. Fx, we will let the fitting process start from different values for the period and the eccentricity, and then plot where they converge to

Herefore we use the grid_minimize function which has the same input as the normal minimize function and some extra parameters.

>>> fitters, startpars, models, chi2s = fit.grid_minimize(dates, rv, mymodel, weights=1/err, points=200, parameters = ['p','e'], return_all=True)

We started fitting from 200 points randomly distributed in period and eccentricity space, with the boundary values for there parameters as limits.

Based on this output we can use the plot_convergence function to plot to which values each starting point converges.

>>> p = pl.figure(3)
>>> plot_convergence(startpars, models, chi2s, xpar='p', ypar='e')

]include figure]]ivs_sigproc_lmfit_BD+29.3070_3.png]

1.2 LSI+65010

Fit orbit of massive X-ray binary LSI+65010, after Grundstrom 2007:

Necessary imports:

>>> from ivs.catalogs import vizier
>>> from ivs.timeseries import pergrams
>>> from ivs.aux import numpy_ext
>>> import pylab as pl

Read in the data, and remove the outliers:

>>> data,units,comms = vizier.search('J/ApJ/656/431/table2')
>>> times,RV = data['HJD'],data['RV']
>>> keep = RV<-30.
>>> times,RV = times[keep],RV[keep]

Find the best frequency using the Kepler periodogram, and fit an orbit with that frequency using the linear fitting routine.

>>> freqs,ampls = pergrams.kepler(times,RV,fn=0.2)
>>> freq = freqs[np.argmax(ampls)]
>>> pars1 = kepler(times, RV, freq, output_type='new')
>>> print pars1 
[11.581314028141733, 2451060.7517886101, 0.19000000000000003, 1.0069244281466982, 11.915330492005735, -59.178393186003241]

Now we want to improve this fit using the nonlinear optimizers, deriving errors on the parameters on the fly (warning: these errors are not necessarily realistic!). First, we setup the model:

>>> mymodel = funclib.kepler_orbit(type='single')
>>> mymodel.setup_parameters(pars1)
>>> result = minimize(times,RV,mymodel)
>>> pars2,e_pars2 = result.model.get_parameters()
>>> print pars2
[  1.15815058e+01   2.45106077e+06   1.94720600e-01   1.02204827e+00
   1.19264204e+01  -5.91827773e+01]
>>> print mymodel.param2str(accuracy=6)
         p = 11.581506 +/- 0.004104 
        t0 = 2451060.771681 +/- 0.583864 
         e = 0.194721 +/- 0.060980 
     omega = 1.022048 +/- 0.320605 
         k = 11.926420 +/- 0.786787 
        v0 = -59.182777 +/- 0.503345

Evaluate the orbital fits, and make phasediagrams of the fits and the data

>>> myorbit1 = mymodel.evaluate(times,pars1)
>>> myorbit2 = mymodel.evaluate(times,pars2)
>>> period = result.model.parameters['p'].value
>>> phases,phased = evaluate.phasediagram(times,RV,1/period)
>>> phases1,phased1 = evaluate.phasediagram(times,myorbit1,1/period)
>>> phases2,phased2 = evaluate.phasediagram(times,myorbit2,1/period)

Now plot everything:

>>> sa1 = np.argsort(phases1)
>>> sa2 = np.argsort(phases2)
>>> p = pl.figure()
>>> p = pl.subplot(121)
>>> p = pl.plot(freqs,ampls,'k-')
>>> p = pl.xlabel('Frequency [d$^{-1}$]')
>>> p = pl.ylabel('Statistic')
>>> p = pl.subplot(122)
>>> p = pl.plot(phases,phased,'ko')
>>> p = pl.plot(phases1[sa1],phased1[sa1],'r-',lw=2,label='Linear fit')
>>> p = pl.plot(phases2[sa2],phased2[sa2],'b--',lw=2,label='Optimization')
>>> p = pl.xlabel('Phase [$2\pi^{-1}$]')
>>> p = pl.ylabel('Amplitude [km s$^{-1}$]')
>>> p = pl.legend()

]include figure]]ivs_sigproc_fit_01.png]

Section 2. Fitting an absorption line

Here we show how to use 2 gaussians to fit an absorption line with an emission feature in its core:

Setup the two gaussian functions for the fitting process:

>>> gauss1 = funclib.gauss()
>>> pars = [-0.75,1.0,0.1,1]
>>> gauss1.setup_parameters(values=pars)

>>> gauss2 = funclib.gauss()
>>> pars = [0.22,1.0,0.01,0.0]
>>> vary = [True, True, True, False]
>>> gauss2.setup_parameters(values=pars, vary=vary)

Create the model by summing up the gaussians. As we just want to sum the two gaussian, we do not need to specify an expression for combining the two functions:

>>> mymodel = Model(functions=[gauss1, gauss2])

Create some data with noise on it

>>> np.random.seed(1111)
>>> x = np.linspace(0.5,1.5, num=1000)
>>> y = mymodel.evaluate(x)
>>> noise = np.random.normal(0.0, 0.015, size=len(x))
>>> y = y+noise

Change the starting values for the fit parameters:

>>> pars = [-0.70,1.0,0.11,0.95]
>>> gauss1.setup_parameters(values=pars)
>>> pars = [0.27,1.0,0.005,0.0]
>>> vary = [True, True, True, False]
>>> gauss2.setup_parameters(values=pars, vary=vary)

Fit the model to the data >>> result = minimize(x,y, mymodel)

Print the resulting values for the parameters. The errors are very small as the data only has some small normal distributed noise added to it:

>>> print gauss1.param2str(accuracy=6)
         a = -0.750354 +/- 0.001802 
        mu = 0.999949 +/- 0.000207 
     sigma = 0.099597 +/- 0.000267 
         c = 0.999990 +/- 0.000677
>>> print gauss2.param2str(accuracy=6)
         a = 0.216054 +/- 0.004485 
        mu = 1.000047 +/- 0.000226 
     sigma = 0.009815 +/- 0.000250 
         c = 0.000000 +/- 0.000000

Now plot the results:

>>> p = pl.figure(1)
>>> result.plot_results()

]include figure]]ivs_sigproc_lmfit_gaussian.png]

Section 3. Pulsation frequency analysis

Do a frequency analysis of the star HD129929, after Aerts 2004:

Read in the data:

>>> data,units,comms = vizier.search('J/A+A/415/241/table1')
>>> times,signal = data['HJD'],data['Umag']
>>> signal -= signal.mean()

Find the best frequency using the Scargle periodogram, fit an orbit with that frequency and optimize. Then print the results to the screen:

>>> freqs,ampls = pergrams.scargle(times,signal,f0=6.4,fn=7)
>>> freq = freqs[np.argmax(ampls)]
>>> pars1 = sine(times, signal, freq)
>>> e_pars1 = e_sine(times,signal, pars1)
>>> pars2,e_pars2,gain = optimize(times,signal,pars1,'sine')
>>> print pl.mlab.rec2txt(numpy_ext.recarr_join(pars1,e_pars1),precision=6)
      const       ampl       freq       phase    e_const     e_ampl     e_freq    e_phase
   0.000242   0.014795   6.461705   -0.093895   0.000608   0.001319   0.000006   0.089134
>>> print pl.mlab.rec2txt(numpy_ext.recarr_join(pars2,e_pars2),precision=6)
      const       ampl       freq       phase    e_const     e_ampl     e_freq    e_phase
   0.000242   0.014795   6.461705   -0.093895   0.000609   0.001912   0.000000   0.013386

Evaluate the sines, and make phasediagrams of the fits and the data

>>> mysine1 = evaluate.sine(times,pars1)
>>> mysine2 = evaluate.sine(times,pars2)
>>> phases,phased = evaluate.phasediagram(times,signal,pars1['freq'])
>>> phases1,phased1 = evaluate.phasediagram(times,mysine1,pars1['freq'])
>>> phases2,phased2 = evaluate.phasediagram(times,mysine2,pars1['freq'])

Now plot everything:

>>> sa1 = np.argsort(phases1)
>>> sa2 = np.argsort(phases2)
>>> p = pl.figure()
>>> p = pl.subplot(121)
>>> p = pl.plot(freqs,ampls,'k-')
>>> p = pl.xlabel('Frequency [d$^{-1}$]')
>>> p = pl.ylabel('Amplitude [mag]')
>>> p = pl.subplot(122)
>>> p = pl.plot(phases,phased,'ko')
>>> p = pl.plot(phases1[sa1],phased1[sa1],'r-',lw=2)
>>> p = pl.plot(phases2[sa2],phased2[sa2],'b--',lw=2)
>>> p = pl.xlabel('Phase [$2\pi^{-1}$]')
>>> p = pl.ylabel('Amplitude [mag]')

]include figure]]ivs_sigproc_fit_02.png]

Section 4. Exoplanet transit analysis

Find the transits of CoRoT 8b, after Borde 2010.

>>> import urllib
>>> from ivs.inout import ascii
>>> url = urllib.URLopener()
>>> filen,msg = url.retrieve('http://cdsarc.u-strasbg.fr/viz-bin/nph-Plot/Vgraph/txt?J%2fA%2bA%2f520%2fA66%2f.%2flc_white&F=white&P=0&--bitmap-size&800x400')
>>> times,signal = ascii.read2array(filen).T
>>> signal = signal / np.median(signal)
>>> url.close()

Find the best frequency using the Box Least Squares periodogram, fit a transit model with that frequency, optimize and prewhiten.

>>> freqs,ampls = pergrams.box(times,signal,f0=0.16,fn=0.162,df=0.005/times.ptp(),qma=0.05)
>>> freq = freqs[np.argmax(ampls)]
>>> pars = box(times,signal,freq)
>>> pars = box(times,signal,freq,b0=pars['ingress'][0]-0.05,bn=pars['egress'][0]+0.05)
>>> print pl.mlab.rec2txt(pars,precision=6)
       freq      depth    ingress     egress       cont
   0.161018   0.005978   0.782028   0.799229   1.000027

Evaluate the transits, and make phasediagrams of the fits and the data

>>> transit = evaluate.box(times,pars)
>>> phases,phased = evaluate.phasediagram(times,signal,freq)
>>> phases1,phased1 = evaluate.phasediagram(times,transit,freq)

Now plot everything and print the results to the screen:

>>> sa1 = np.argsort(phases1)
>>> p = pl.figure()
>>> p = pl.subplot(121)
>>> p = pl.plot(freqs,ampls,'k-')
>>> p = pl.xlabel('Frequency [d$^{-1}$]')
>>> p = pl.ylabel('Statistic')
>>> p = pl.subplot(122)
>>> p = pl.plot(phases,phased*100,'ko')
>>> p = pl.plot(phases1[sa1],phased1[sa1]*100,'r-',lw=2)
>>> p = pl.xlim(0.70,0.85)
>>> p = pl.xlabel('Phase [$2\pi^{-1}$]')
>>> p = pl.ylabel('Depth [%]')

]include figure]]ivs_sigproc_fit_03.png]

Section 5. Eclipsing binary fit

Splines are not a good way to fit eclipsing binaries, but just for the sake of showing the use of the periodic spline fitting functions, we do it anyway.

We use the data on CU Cnc of Ribas, 2003:

>>> data,units,comms = vizier.search('J/A+A/398/239/table1')
>>> times,signal = data['HJD'],data['Dmag']

Section 6. Blackbody fit

We generate a single black body curve with Teff=10000. and scale=1.:

>>> from ivs.sed import model as sed_model
>>> wave_dense = np.logspace(2.6,6,1000)
>>> flux_dense = sed_model.blackbody((wave_dense,'AA'),10000.)

We simulate 20 datapoints of this model and perturb it a bit:

>>> wave = np.logspace(3,6,20)
>>> flux = sed_model.blackbody((wave,'AA'),10000.)
>>> error = flux/2.
>>> flux += np.random.normal(size=len(wave),scale=error)

Next, we setup a single black body model to fit through the simulated data. Our initial guess is a temperature of 1000K and a scale of 1:

>>> pars = [1000.,1.]
>>> mymodel = funclib.blackbody()
>>> mymodel.setup_parameters(values=pars)

Fitting and evaluating the fit is as easy as:

>>> result = minimize(wave,flux, mymodel,weights=1./error)
>>> myfit = mymodel.evaluate(wave_dense)

This is the result:

>>> print mymodel.param2str()
             T = 9678.90 +/- 394.26 
         scale = 1.14 +/- 0.17

A multiple black body is very similar (we make the errors somewhat smaller for easier fitting):

>>> flux2 = sed_model.blackbody((wave,'AA'),15000.)
>>> flux2+= sed_model.blackbody((wave,'AA'),6000.)*10.
>>> flux2+= sed_model.blackbody((wave,'AA'),3000.)*100.
>>> error2 = flux2/10.
>>> flux2 += np.random.normal(size=len(wave),scale=error2)

The setup now needs three sets of parameters, which we choose to be all equal.

>>> pars = [[1000.,1.],[1000.,1.],[1000.,1.]]
>>> mymodel = funclib.multi_blackbody(n=3)
>>> mymodel.setup_parameters(values=pars)

Fitting and evaluate is again very easy:

>>> result = minimize(wave,flux2, mymodel,weights=1./error2)
>>> myfit2 = result.model.evaluate(wave_dense)

This is the result:

>>> print mymodel.param2str()
           T_0 = 6155.32 +/- 3338.54 
       scale_0 = 9.54 +/- 23.00 
           T_1 = 3134.37 +/- 714.01 
       scale_1 = 93.40 +/- 17.98 
           T_2 = 14696.40 +/- 568.76 
       scale_2 = 1.15 +/- 0.33

And a nice plot of the two cases:

>>> p = pl.figure()
>>> p = pl.subplot(121)
>>> p = pl.plot(wave_dense,flux_dense,'k-')
>>> p = pl.plot(wave_dense,myfit,'r-')
>>> p = pl.errorbar(wave,flux,yerr=error,fmt='bs')
>>> p = pl.gca().set_xscale('log')
>>> p = pl.gca().set_yscale('log')
>>> p = pl.subplot(122)
>>> p = pl.plot(wave_dense,myfit2,'r-')
>>> p = pl.errorbar(wave,flux2,yerr=error2,fmt='bs')
>>> p = pl.gca().set_xscale('log')
>>> p = pl.gca().set_yscale('log')

]include figure]]ivs_sigproc_lmfit_blackbody.png]

Classes

[hide private]

Non-linear improvements

Function
Class to define a function with associated parameters.

Model
Class to create a model using different Functions each with their associated parameters.

Minimizer
A minimizer class on the lmfit Python package, which provides a simple, flexible interface to non-linear least-squares optimization, or curve fitting.

Functions

[hide private]

Linear fit functions

record array

sine(times, signal, freq, sigma=None, constant=True, error=False, t0=0)
Fit a harmonic function.

source code

record array

periodic_spline(times, signal, freq, t0=None, order=20, k=3)
Fit a periodic spline.

source code

record array

kepler(times, signal, freq, sigma=None, wexp=2., e0=0, en=0.99, de=0.01, output_type='old')
Fit a Kepler orbit to a time series.

source code

record array

box(times, signal, freq, b0=0, bn=1, order=50, t0=None)
Fit box shaped transits.

source code

tuple

gauss(x, y, threshold=0.1, constant=False, full_output=False, init_guess_method='analytical', window=None)
Fit a Gaussian profile to data using a polynomial fit.

source code

Error determination

Nx4 array(, Nx3 array)

e_sine(times, signal, parameters, correlation_correction=True, limit=10000)
Compute the errors on the parameters from a sine fit.

source code

Linear improvements

diffcorr(times, signal, parameters, func_name, max_iter=100, tol=1e-6, args=(), full_output=False)
Differential corrections.

source code

Non-linear improvements

residuals(parameters, domain, data, evalfunc, weights, logar, *args)

source code

residuals_single(parameters, domain, data, evalfunc, weights, logar, *args)

source code

optimize(times, signal, parameters, func_name, prep_func=None, minimizer='leastsq', weights=None, logar=False, args=())
Fit a function to data.

source code

minimize(x, y, model, errors=None, weights=None, resfunc=None, engine='leastsq', args=None, kws=None, scale_covar=True, iter_cb=None, verbose=True, **fit_kws)
Basic minimizer function using the Minimizer class, find values for the parameters so that the sum-of-squares of (y-model(x)) is minimized.

source code

Minimizer object or array of [Minimizer, Model, float]

grid_minimize(x, y, model, errors=None, weights=None, resfunc=None, engine='leastsq', args=None, kws=None, scale_covar=True, iter_cb=None, points=100, parameters=None, return_all=False, verbose=True, **fit_kws)
Grid minimizer.

source code

General purpose

float(,list)

get_correlation_factor(residus, full_output=False)
Calculate the correlation facor rho (Schwarzenberg-Czerny, 2003).

source code

Print and plot functions

_calc_length(par, accuracy, field=None)
helper function to calculate the length of the given parameters for parameters2string

source code

_format_field(par, field, maxlen=10, accuracy=2)

source code

parameters2string(parameters, accuracy=2, error='stderr', output='result', **kwargs)
Converts a parameter object to string

source code

correlation2string(parameters, accuracy=3, limit=0.100)
Converts the correlation of different parameters to string

source code

confidence2string(parameters, accuracy=4)
Converts the confidence intervals to string

source code

plot_convergence(startpars, models, chi2s, xpar=None, ypar=None, clim=None)
Plot the convergence path of the results from grid_minimize.

source code

Variables

[hide private]

logger = logging.getLogger('SP.FIT')

Function Details

[hide private]

sine(times, signal, freq, sigma=None, constant=True, error=False, t0=0)

source code

Fit a harmonic function.

This function is of the form

C + \sum_j A_j sin(2pi nu_j (t-t0) + phi_j)

where the presence of the constant term is an option. The error bars on the fitted parameters can also be requested (by Joris De Ridder).

(phase in radians!)

Parameters:

times (numpy array) - time points
signal (numpy array) - observations
freq (numpy array or float) - frequencies of the harmonics
sigma (numpy array) - standard error of observations
constant (boolean) - flag, if not None, also fit a constant
error (boolean) - flag, if not None, also compute the errorbars
t0 (float) - time zero point.

Returns: record array

parameters

periodic_spline(times, signal, freq, t0=None, order=20, k=3)

source code

Fit a periodic spline.

CAUTION: this definition assumes the proposed period is either small compared to the total time range, or there are a lot of points per period.

This definition basically phases all the data and constructs an empirical periodic function through an averaging process per phasebin, and then performing a splinefit through those points.

In order to make the first derivative continuous, we repeat the first point(s) at the end and the end point(s) at the beginning.

The constructed function can than be removed from the original data.

Output is a record array containing the columns 'freq', 'knots', 'coeffs' and 'degree'.

Example usage:

>>> myfreq = 1/20.
>>> times = np.linspace(0,150,10000)
>>> signal = sin(2*pi*myfreq*times+0.32*2*pi) + np.random.normal(size=len(times))
>>> cjs = periodic_spline(times,signal,myfreq,order=30)
>>> trend = evaluate.periodic_spline(times,cjs)
>>> import pylab as pl
>>> p=pl.figure()
>>> p=pl.plot(times,signal,'ko')
>>> p=pl.plot(times,trend,'r-')
>>> p=pl.title("test:tfit:periodic_spline_fit")

Parameters:

times (numpy 1d array) - time points
signal (numpy 1d array) - observation points
freq (float) - frequency of the periodic trend
order (integer) - number of points to use for the spline fit.
k (integer) - order of the spline

Returns: record array

parameters

kepler(times, signal, freq, sigma=None, wexp=2., e0=0, en=0.99, de=0.01, output_type='old')

source code

Fit a Kepler orbit to a time series.

Example usage:

>>> from ivs.timeseries import pergrams

First set the parameters we want to use:

>>> pars = tuple([12.456,23456.,0.37,213/180.*pi,98.76,55.])
>>> pars = np.rec.array([pars],dtype=[('P','f8'),('T0','f8'),('e','f8'),
...                                ('omega','f8'),('K','f8'),('gamma','f8')])

Then generate the signal and add some noise

>>> times = np.linspace(pars[0]['T0'],pars[0]['T0']+5*pars[0]['P'],100)
>>> signalo = evaluate.kepler(times,pars)
>>> signal = signalo + np.random.normal(scale=20.,size=len(times))

Calculate the periodogram:

>>> freqs,ampls = pergrams.kepler(times,signal)
>>> opars = kepler(times,signal,freqs[np.argmax(ampls)])
>>> signalf = evaluate.kepler(times,opars)

And make some plots

>>> import pylab as pl
>>> p = pl.figure()
>>> p = pl.subplot(221)
>>> p = pl.plot(times,signal,'ko')
>>> p = pl.plot(times,signalo,'r-',lw=2)
>>> p = pl.plot(times,signalf,'b--',lw=2)
>>> p = pl.subplot(222)
>>> p = pl.plot(freqs,ampls,'k-')

Parameters:

times (numpy 1d array) - time points
signal (numpy 1d array) - observation points
freq (float) - frequency of kepler orbit

Returns: record array

parameters

box(times, signal, freq, b0=0, bn=1, order=50, t0=None)

source code

Fit box shaped transits.

Parameters:

times (numpy 1d array) - time points
signal (numpy 1d array) - observation points
freq (float) - frequency of transiting signal
b0 (0<float<bn) - minimum start of ingress (in phase)
bn (b0<float<1) - maximum end of egress (in phase)
order (integer) - number of phase bins
t0 (float) - zeropoint of times

Returns: record array

parameters

gauss(x, y, threshold=0.1, constant=False, full_output=False, init_guess_method='analytical', window=None)

source code

Fit a Gaussian profile to data using a polynomial fit.

y = A * exp( -(x-mu)**2 / (2*sigma**2))

ln(y) = ln(A) - (x-mu)**2 / (2*sigma**2) ln(y) = ln(A) - mu**2 / (2*sigma**2) + mu / (sigma**2) * x - x**2 / (2*sigma**2)

then the parameters are given by

p0 = - 1 / (2*sigma**2) p1 = mu / ( sigma**2) p2 = ln(A) - mu**2 / (2*sigma**2)

Note that not all datapoints are used, but only those above a certain values (namely 10% of the maximum value), In this way, we reduce the influence of the continuum and concentrate on the shape of the peak itself.

Afterwards, we perform a non linear least square fit with above parameters as starting values, but only accept it if the CHI2 has improved.

If a constant has to be fitted, the nonlinear options has to be True.

Example: we generate a Lorentzian curve and fit a Gaussian to it:

>>> x = np.linspace(-10,10,1000)
>>> y = evaluate.lorentz(x,[5.,0.,2.]) + np.random.normal(scale=0.1,size=len(x))
>>> pars1,e_pars1 = gauss(x,y)
>>> pars2,e_pars2 = gauss(x,y,constant=True)
>>> y1 = evaluate.gauss(x,pars1)
>>> y2 = evaluate.gauss(x,pars2)
>>> p = pl.figure()
>>> p = pl.plot(x,y,'k-')
>>> p = pl.plot(x,y1,'r-',lw=2)
>>> p = pl.plot(x,y2,'b-',lw=2)

Parameters:

x (numpy array) - x axis data
y (numpy array) - y axis data
nl (boolean) - flag for performing a non linear least squares fit
constant (boolean) - fit also a constant

Returns: tuple

A, mu, sigma (,C)

e_sine(times, signal, parameters, correlation_correction=True, limit=10000)

source code

Compute the errors on the parameters from a sine fit.

Note: errors on the constant are only calculated when the number of datapoints is below 1000. Otherwise, the matrices involved become to huge.

Parameters:

times (numpy array) - time points
signal (numpy array) - observations
parameters (numpy record array) - record array containing the fitted parameters. This should have columns 'ampl','freq' and 'phase', and optionally 'const'.
correlation_correction (boolean) - set to True if you want to correct for correlation effects
limit (integer) - Calculating the error on the constant requires the calculation of a matrix of size Ndata x Nparam, which takes a long time for large datasets. The routines skips the estimation of the error on the constant if the timeseries is longer than limit datapoints

Returns: Nx4 array(, Nx3 array)

errors

minimize(x, y, model, errors=None, weights=None, resfunc=None, engine='leastsq', args=None, kws=None, scale_covar=True, iter_cb=None, verbose=True, **fit_kws)

source code

Basic minimizer function using the Minimizer class, find values for the parameters so that the sum-of-squares of (y-model(x)) is minimized. When the fitting process is completed, the parameters of the Model are updated with the results. If the leastsq engine is used, estimated uncertainties and correlations will be saved to the Model as well. Returns a Minimizer object.

Fitting engines

By default, the Levenberg-Marquardt algorithm is used for fitting. While often criticized, including the fact it finds a local minima, this approach has some distinct advantages. These include being fast, and well-behaved for most curve- fitting needs, and making it easy to estimate uncertainties for and correlations between pairs of fit variables. Alternative fitting algoritms are at least partially implemented, but not all functions will work with them.

leastsq: Levenberg-Marquardt, scipy.optimize.leastsq
anneal: Simulated Annealing, scipy.optimize.anneal
lbfgsb: quasi-Newton optimization, scipy.optimize.fmin_l_bfgs_b

Parameters:

x (numpy array) - the independent data array (x data)
y (numpy array) - the dependent data array (y data)
model - The Model to fit to the data
err - The errors on the y data, same dimentions as y
weights - The weights given to the different y data
resfunc - A function to calculate the residuals, if not provided standard residual function is used.
engine - Which fitting engine to use: 'leastsq', 'anneal', 'lbfgsb'
kws - Extra keyword arguments to be passed to the model
fit_kws - Extra keyword arguments to be passed to the fitter function

Returns:

(Parameter object, Minimizer object)

grid_minimize(x, y, model, errors=None, weights=None, resfunc=None, engine='leastsq', args=None, kws=None, scale_covar=True, iter_cb=None, points=100, parameters=None, return_all=False, verbose=True, **fit_kws)

source code

Grid minimizer. Offers the posibility to start minimizing from a grid of starting parameters defined by the used. The number of starting points can be specified, as well as the parameters that are varried. For each parameter for which the start value should be varied, a minimum and maximum value should be provided when setting up that parameter. The starting values are chosen randomly in the range [min,max]. The other arguments are the same as for the normal minimize function.

If parameters are provided that can not be kicked (starting value can not be varried), they will be removed from the parameters array automaticaly. If no parameters can be kicked, only one minimize will be performed independently from the number of points provided. Pay attention with the vary function of the parameters, even if a parameter has vary = False, it will be kicked by the grid minimizer if it appears in parameters. This parameter will then be fixed at its new starting value.

Parameters:

parameters (array of strings) - The parameters that you want to randomly chose in the fitting process
points (int) - The number of starting points
return_all (Boolean) - if True, the results of all fits are returned, if False, only the best fit is returned.

Returns: Minimizer object or array of [Minimizer, Model, float]

The best minimizer, or all minimizers as [minimizers, newmodels, chisqrs]

get_correlation_factor(residus, full_output=False)

source code

Calculate the correlation facor rho (Schwarzenberg-Czerny, 2003).

Under white noise assumption, the residus are expected to change sign every 2 observations (rho=1). Longer distances, 2*rho, are a sign of correlation.

The errors are then underestimated by a factor 1/sqrt(rho).

Parameters:

residus (numpy array) - residus after the fit
full_output (bool) - if True, the groups of data with same sign will be returned

Returns: float(,list)

rho(,same sign groups)