"""
Fit various functions to timeseries.

Section 1. Radial velocity data
===============================

1.1 BD+29.3070
--------------
Fit the orbit of the main-sequence companion of the sdB+MS system BD+29.3070.
The radial velocities are obtained from HERMES spectra using the cross-correlation
tool of the pipeline.

Necessary imports:

>>> from ivs.inout.ascii import read2array
>>> from ivs.sigproc import funclib
>>> import numpy as np
>>> import pylab as pl

Read in the data:

>>> data = read2array('/home/jorisv/IVS_python/test_fit/BD+29.3070.dat')
>>> dates, rv, err = data[:,0], data[:,1], data[:,2]

Import the function we want to fit to the data from the function library:

>>> mymodel = funclib.kepler_orbit(type='single')

Set up the starting values of the parameters and the boundaries:

>>> pars = [1000, dates[0], 0.0, 0.0, (max(rv)-min(rv))/2, np.average(rv)]
>>> bounds = [(pars[0]/2, pars[0]*1.5), (data[0][0]-pars[0]/2, data[0][0]+pars[0]/2), (0.0, 0.5), (0, 2*np.pi), (pars[4]/4, pars[4]*2), (min(rv), max(rv))]
>>> vary = [True, True, True, True, True, True]
>>> mymodel.setup_parameters(values=pars, bounds=bounds, vary=vary)

Fit the model to the data:

>>> result = minimize(dates, rv, mymodel, weights=1/err)

Print the results:

>>> print mymodel.param2str()
p = 1012.26 +/- 16.57
t0 = 2455423.65 +/- 11.27
e = 0.16 +/- 0.01
omega = 1.72 +/- 0.08
k = 6.06 +/- 0.04
v0 = 32.23 +/- 0.09

The minimizer already returned errors on the parameters, based on the
Levenberg-Marquardt algorithm of scipy. But we can get more robust errors with the
L{Minimizer.estimate_error} method of the minimizer, which uses an F-test to
calculate confidence intervals, e.g. on the period and eccentricity of the orbit:

>>> ci = result.estimate_error(p_names=['p', 'e'], sigmas=[0.25,0.65,0.95])
>>> print confidence2string(ci, accuracy=4)
p
25.00 % 65.00 % 95.00 %
- 1006.9878 997.1355 980.7742
+ 1017.7479 1029.2554 1053.0851
e
25.00 % 65.00 % 95.00 %
- 0.1603 0.1542 0.1433
+ 0.1667 0.1731 0.1852

Now plot the resulting RV curve over the original data:

>>> p = pl.figure(1)
>>> result.plot_results()

]include figure]]ivs_sigproc_lmfit_BD+29.3070_1.png]

We can use the L{Minimizer.plot_confidence_interval} method to plot the confidence intervals of two
parameters, and show the correlation between them, e.g. between the period and T0:

>>> p = pl.figure(2)
>>> result.plot_confidence_interval(xname='p', yname='t0', res=30, filled=True)

]include figure]]ivs_sigproc_lmfit_BD+29.3070_2.png]

To get a better idea of how the parameter space behaves, we can start the fitting process from
different starting values and see where they converge. For example, we will let the fitting process
start from different values for the period and the eccentricity, and then plot where they converge to.

For this we use the L{grid_minimize} function, which takes the same input as the normal minimize
function plus some extra parameters.

>>> fitters, startpars, models, chi2s = fit.grid_minimize(dates, rv, mymodel, weights=1/err, points=200, parameters=['p','e'], return_all=True)

We started fitting from 200 points randomly distributed in period and eccentricity space, with the
boundary values of those parameters as limits.

Based on this output we can use the L{plot_convergence} function to plot to which values each starting point converges.

>>> p = pl.figure(3)
>>> plot_convergence(startpars, models, chi2s, xpar='p', ypar='e')

]include figure]]ivs_sigproc_lmfit_BD+29.3070_3.png]




1.2 LSI+65010
-------------
Fit the orbit of the massive X-ray binary LSI+65010, after Grundstrom 2007:

Necessary imports:

>>> from ivs.catalogs import vizier
>>> from ivs.timeseries import pergrams
>>> from ivs.aux import numpy_ext
>>> import pylab as pl

Read in the data, and remove the outliers:

>>> data,units,comms = vizier.search('J/ApJ/656/431/table2')
>>> times,RV = data['HJD'],data['RV']
>>> keep = RV<-30.
>>> times,RV = times[keep],RV[keep]

Find the best frequency using the Kepler periodogram, and fit an orbit with that
frequency using the linear fitting routine.

>>> freqs,ampls = pergrams.kepler(times,RV,fn=0.2)
>>> freq = freqs[np.argmax(ampls)]
>>> pars1 = kepler(times, RV, freq, output_type='new')
>>> print pars1
[11.581314028141733, 2451060.7517886101, 0.19000000000000003, 1.0069244281466982, 11.915330492005735, -59.178393186003241]

Now we want to improve this fit using the nonlinear optimizers, deriving errors
on the parameters on the fly (B{warning: these errors are not necessarily realistic!}).
First, we set up the model:

>>> mymodel = funclib.kepler_orbit(type='single')
>>> mymodel.setup_parameters(pars1)
>>> result = minimize(times,RV,mymodel)
>>> pars2,e_pars2 = result.model.get_parameters()
>>> print pars2
[  1.15815058e+01   2.45106077e+06   1.94720600e-01   1.02204827e+00
   1.19264204e+01  -5.91827773e+01]
>>> print mymodel.param2str(accuracy=6)
p = 11.581506 +/- 0.004104
t0 = 2451060.771681 +/- 0.583864
e = 0.194721 +/- 0.060980
omega = 1.022048 +/- 0.320605
k = 11.926420 +/- 0.786787
v0 = -59.182777 +/- 0.503345

Evaluate the orbital fits, and make phase diagrams of the fits and the data:

>>> myorbit1 = mymodel.evaluate(times,pars1)
>>> myorbit2 = mymodel.evaluate(times,pars2)
>>> period = result.model.parameters['p'].value
>>> phases,phased = evaluate.phasediagram(times,RV,1/period)
>>> phases1,phased1 = evaluate.phasediagram(times,myorbit1,1/period)
>>> phases2,phased2 = evaluate.phasediagram(times,myorbit2,1/period)

Now plot everything:

>>> sa1 = np.argsort(phases1)
>>> sa2 = np.argsort(phases2)
>>> p = pl.figure()
>>> p = pl.subplot(121)
>>> p = pl.plot(freqs,ampls,'k-')
>>> p = pl.xlabel('Frequency [d$^{-1}$]')
>>> p = pl.ylabel('Statistic')
>>> p = pl.subplot(122)
>>> p = pl.plot(phases,phased,'ko')
>>> p = pl.plot(phases1[sa1],phased1[sa1],'r-',lw=2,label='Linear fit')
>>> p = pl.plot(phases2[sa2],phased2[sa2],'b--',lw=2,label='Optimization')
>>> p = pl.xlabel('Phase [$2\pi^{-1}$]')
>>> p = pl.ylabel('Amplitude [km s$^{-1}$]')
>>> p = pl.legend()

]include figure]]ivs_sigproc_fit_01.png]

Section 2. Fitting an absorption line
=====================================

Here we show how to use two Gaussians to fit an absorption line with an emission feature in its core.

Set up the two Gaussian functions for the fitting process:

>>> gauss1 = funclib.gauss()
>>> pars = [-0.75,1.0,0.1,1]
>>> gauss1.setup_parameters(values=pars)

>>> gauss2 = funclib.gauss()
>>> pars = [0.22,1.0,0.01,0.0]
>>> vary = [True, True, True, False]
>>> gauss2.setup_parameters(values=pars, vary=vary)

Create the model by summing the two Gaussians. As we just want to sum them, we do not need
to specify an expression for combining the two functions:

>>> mymodel = Model(functions=[gauss1, gauss2])
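
If you do want a custom combination, an expression can be supplied instead (a sketch only; as
described in the L{Model} class, the expression receives the list of evaluated functions):

>>> #mymodel = Model(functions=[gauss1, gauss2], expr=lambda funcs: funcs[0] + funcs[1])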

Create some data and add noise to it:

>>> np.random.seed(1111)
>>> x = np.linspace(0.5,1.5, num=1000)
>>> y = mymodel.evaluate(x)
>>> noise = np.random.normal(0.0, 0.015, size=len(x))
>>> y = y+noise

Change the starting values for the fit parameters:

>>> pars = [-0.70,1.0,0.11,0.95]
>>> gauss1.setup_parameters(values=pars)
>>> pars = [0.27,1.0,0.005,0.0]
>>> vary = [True, True, True, False]
>>> gauss2.setup_parameters(values=pars, vary=vary)

Fit the model to the data:

>>> result = minimize(x,y, mymodel)

Print the resulting values for the parameters. The errors are very small as the data only has some
small normally distributed noise added to it:

>>> print gauss1.param2str(accuracy=6)
a = -0.750354 +/- 0.001802
mu = 0.999949 +/- 0.000207
sigma = 0.099597 +/- 0.000267
c = 0.999990 +/- 0.000677
>>> print gauss2.param2str(accuracy=6)
a = 0.216054 +/- 0.004485
mu = 1.000047 +/- 0.000226
sigma = 0.009815 +/- 0.000250
c = 0.000000 +/- 0.000000

Now plot the results:

>>> p = pl.figure(1)
>>> result.plot_results()

]include figure]]ivs_sigproc_lmfit_gaussian.png]

Section 3. Pulsation frequency analysis
=======================================

Do a frequency analysis of the star HD129929, after Aerts 2004:

Read in the data:

>>> data,units,comms = vizier.search('J/A+A/415/241/table1')
>>> times,signal = data['HJD'],data['Umag']
>>> signal -= signal.mean()

Find the best frequency using the Scargle periodogram, fit a sine with that
frequency and optimize. Then print the results to the screen:

>>> freqs,ampls = pergrams.scargle(times,signal,f0=6.4,fn=7)
>>> freq = freqs[np.argmax(ampls)]
>>> pars1 = sine(times, signal, freq)
>>> e_pars1 = e_sine(times,signal, pars1)
>>> pars2,e_pars2,gain = optimize(times,signal,pars1,'sine')
>>> print pl.mlab.rec2txt(numpy_ext.recarr_join(pars1,e_pars1),precision=6)
const ampl freq phase e_const e_ampl e_freq e_phase
0.000242 0.014795 6.461705 -0.093895 0.000608 0.001319 0.000006 0.089134
>>> print pl.mlab.rec2txt(numpy_ext.recarr_join(pars2,e_pars2),precision=6)
const ampl freq phase e_const e_ampl e_freq e_phase
0.000242 0.014795 6.461705 -0.093895 0.000609 0.001912 0.000000 0.013386

Evaluate the sines, and make phase diagrams of the fits and the data:

>>> mysine1 = evaluate.sine(times,pars1)
>>> mysine2 = evaluate.sine(times,pars2)
>>> phases,phased = evaluate.phasediagram(times,signal,pars1['freq'])
>>> phases1,phased1 = evaluate.phasediagram(times,mysine1,pars1['freq'])
>>> phases2,phased2 = evaluate.phasediagram(times,mysine2,pars1['freq'])

Now plot everything:

>>> sa1 = np.argsort(phases1)
>>> sa2 = np.argsort(phases2)
>>> p = pl.figure()
>>> p = pl.subplot(121)
>>> p = pl.plot(freqs,ampls,'k-')
>>> p = pl.xlabel('Frequency [d$^{-1}$]')
>>> p = pl.ylabel('Amplitude [mag]')
>>> p = pl.subplot(122)
>>> p = pl.plot(phases,phased,'ko')
>>> p = pl.plot(phases1[sa1],phased1[sa1],'r-',lw=2)
>>> p = pl.plot(phases2[sa2],phased2[sa2],'b--',lw=2)
>>> p = pl.xlabel('Phase [$2\pi^{-1}$]')
>>> p = pl.ylabel('Amplitude [mag]')

]include figure]]ivs_sigproc_fit_02.png]


Section 4. Exoplanet transit analysis
=====================================

Find the transits of CoRoT 8b, after Borde 2010.

>>> import urllib
>>> from ivs.inout import ascii
>>> url = urllib.URLopener()
>>> filen,msg = url.retrieve('http://cdsarc.u-strasbg.fr/viz-bin/nph-Plot/Vgraph/txt?J%2fA%2bA%2f520%2fA66%2f.%2flc_white&F=white&P=0&--bitmap-size&800x400')
>>> times,signal = ascii.read2array(filen).T
>>> signal = signal / np.median(signal)
>>> url.close()

Find the best frequency using the Box Least Squares periodogram, fit a transit
model with that frequency, optimize and prewhiten.

>>> freqs,ampls = pergrams.box(times,signal,f0=0.16,fn=0.162,df=0.005/times.ptp(),qma=0.05)
>>> freq = freqs[np.argmax(ampls)]
>>> pars = box(times,signal,freq)
>>> pars = box(times,signal,freq,b0=pars['ingress'][0]-0.05,bn=pars['egress'][0]+0.05)
>>> print pl.mlab.rec2txt(pars,precision=6)
freq depth ingress egress cont
0.161018 0.005978 0.782028 0.799229 1.000027

Evaluate the transits, and make phase diagrams of the fits and the data:

>>> transit = evaluate.box(times,pars)
>>> phases,phased = evaluate.phasediagram(times,signal,freq)
>>> phases1,phased1 = evaluate.phasediagram(times,transit,freq)

Now plot everything and print the results to the screen:

>>> sa1 = np.argsort(phases1)
>>> p = pl.figure()
>>> p = pl.subplot(121)
>>> p = pl.plot(freqs,ampls,'k-')
>>> p = pl.xlabel('Frequency [d$^{-1}$]')
>>> p = pl.ylabel('Statistic')
>>> p = pl.subplot(122)
>>> p = pl.plot(phases,phased*100,'ko')
>>> p = pl.plot(phases1[sa1],phased1[sa1]*100,'r-',lw=2)
>>> p = pl.xlim(0.70,0.85)
>>> p = pl.xlabel('Phase [$2\pi^{-1}$]')
>>> p = pl.ylabel('Depth [%]')

]include figure]]ivs_sigproc_fit_03.png]

Section 5. Eclipsing binary fit
===============================

Splines are not a good way to fit eclipsing binaries, but just for the sake of
showing the use of the periodic spline fitting functions, we do it anyway.

We use the data on CU Cnc of Ribas, 2003:

>>> data,units,comms = vizier.search('J/A+A/398/239/table1')
>>> times,signal = data['HJD'],data['Dmag']
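
A periodic spline fit could then proceed along these lines (a sketch only; C{myfreq} is an
assumed, known orbital frequency, and L{periodic_spline} and C{evaluate.periodic_spline} are
used as in their own documentation):

>>> #cjs = periodic_spline(times, signal, myfreq, order=30)
>>> #trend = evaluate.periodic_spline(times, cjs)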


Section 6. Blackbody fit
========================

We generate a single blackbody curve with Teff=10000. and scale=1.:

>>> from ivs.sed import model as sed_model
>>> wave_dense = np.logspace(2.6,6,1000)
>>> flux_dense = sed_model.blackbody((wave_dense,'AA'),10000.)

We simulate 20 datapoints of this model and perturb it a bit:

>>> wave = np.logspace(3,6,20)
>>> flux = sed_model.blackbody((wave,'AA'),10000.)
>>> error = flux/2.
>>> flux += np.random.normal(size=len(wave),scale=error)

Next, we set up a single blackbody model to fit through the simulated data. Our
initial guess is a temperature of 1000 K and a scale of 1:

>>> pars = [1000.,1.]
>>> mymodel = funclib.blackbody()
>>> mymodel.setup_parameters(values=pars)

Fitting and evaluating the fit is as easy as:

>>> result = minimize(wave,flux, mymodel,weights=1./error)
>>> myfit = mymodel.evaluate(wave_dense)

This is the result:

>>> print mymodel.param2str()
T = 9678.90 +/- 394.26
scale = 1.14 +/- 0.17

A multiple-blackbody fit is very similar (we make the errors somewhat smaller for
easier fitting):

>>> flux2 = sed_model.blackbody((wave,'AA'),15000.)
>>> flux2+= sed_model.blackbody((wave,'AA'),6000.)*10.
>>> flux2+= sed_model.blackbody((wave,'AA'),3000.)*100.
>>> error2 = flux2/10.
>>> flux2 += np.random.normal(size=len(wave),scale=error2)

The setup now needs three sets of parameters, which we choose to be all equal:

>>> pars = [[1000.,1.],[1000.,1.],[1000.,1.]]
>>> mymodel = funclib.multi_blackbody(n=3)
>>> mymodel.setup_parameters(values=pars)

Fitting and evaluating is again very easy:

>>> result = minimize(wave,flux2, mymodel,weights=1./error2)
>>> myfit2 = result.model.evaluate(wave_dense)

This is the result:

>>> print mymodel.param2str()
T_0 = 6155.32 +/- 3338.54
scale_0 = 9.54 +/- 23.00
T_1 = 3134.37 +/- 714.01
scale_1 = 93.40 +/- 17.98
T_2 = 14696.40 +/- 568.76
scale_2 = 1.15 +/- 0.33

And a nice plot of the two cases:

>>> p = pl.figure()
>>> p = pl.subplot(121)
>>> p = pl.plot(wave_dense,flux_dense,'k-')
>>> p = pl.plot(wave_dense,myfit,'r-')
>>> p = pl.errorbar(wave,flux,yerr=error,fmt='bs')
>>> p = pl.gca().set_xscale('log')
>>> p = pl.gca().set_yscale('log')
>>> p = pl.subplot(122)
>>> p = pl.plot(wave_dense,myfit2,'r-')
>>> p = pl.errorbar(wave,flux2,yerr=error2,fmt='bs')
>>> p = pl.gca().set_xscale('log')
>>> p = pl.gca().set_yscale('log')

]include figure]]ivs_sigproc_lmfit_blackbody.png]
"""
import time
import logging

import numpy as np
from numpy import pi, cos, sin
import numpy.linalg as la
from scipy.interpolate import splrep
import scipy.optimize

from ivs.aux import progressMeter as progress
from ivs.aux import loggers
from ivs.sigproc import evaluate
from ivs.timeseries import pyKEP

import re
import copy
import pylab as pl
import matplotlib as mpl
from ivs.sigproc import lmfit

logger = logging.getLogger('SP.FIT')



def sine(times, signal, freq, sigma=None, constant=True, error=False, t0=0):
    """
    Fit a harmonic function.

    This function is of the form

    C + \sum_j A_j sin(2pi nu_j (t-t0) + phi_j)

    where the presence of the constant term is an option. The error
    bars on the fitted parameters can also be requested (by Joris De Ridder).

    (phase in radians!)

    @param times: time points
    @type times: numpy array
    @param signal: observations
    @type signal: numpy array
    @param freq: frequencies of the harmonics
    @type freq: numpy array or float
    @keyword sigma: standard error of observations
    @type sigma: numpy array
    @keyword constant: flag, if not None, also fit a constant
    @type constant: boolean
    @keyword error: flag, if not None, also compute the errorbars
    @type error: boolean
    @keyword t0: time zero point.
    @type t0: float
    @return: parameters
    @rtype: record array
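
    Example usage (a sketch only; C{evaluate.sine} accepts the returned record
    array, as elsewhere in this module):

    >>> #times = np.linspace(0, 10, 1000)
    >>> #signal = 0.5 + 2.0*sin(2*pi*3.0*times + 0.3)
    >>> #pars = sine(times, signal, 3.0)
    >>> #myfit = evaluate.sine(times, pars)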
    """
    # shift the time zero point
    times = times - t0

    # ensure the input frequencies are an array
    if not hasattr(freq, '__len__'):
        freq = [freq]
    freq = np.asarray(freq)
    # assume unit weights if no errors are given
    if sigma is None:
        sigma = np.ones_like(signal)
    elif not hasattr(sigma, '__len__'):
        sigma = sigma * np.ones_like(signal)

    Ndata = len(times)
    Nfreq = len(freq)
    if not constant: Nparam = 2*Nfreq
    else:            Nparam = 2*Nfreq + 1

    # build the design matrix (scaled with sigma): a sine and a cosine column
    # per frequency, plus optionally a constant column
    A = np.zeros((Ndata, Nparam))
    for j in range(Nfreq):
        A[:,j] = sin(2*pi*freq[j]*times) * sigma
        A[:,Nfreq+j] = cos(2*pi*freq[j]*times) * sigma
    if constant:
        A[:,2*Nfreq] = np.ones(Ndata) * sigma

    b = signal * sigma

    # solve the linear system in least-squares sense
    fitparam, chisq, rank, s = np.linalg.lstsq(A, b)

    # convert the (sine, cosine) amplitudes to amplitude and phase
    amplitude = np.zeros(Nfreq)
    phase = np.zeros(Nfreq)
    for j in range(Nfreq):
        amplitude[j] = np.sqrt(fitparam[j]**2 + fitparam[Nfreq+j]**2)
        phase[j] = np.arctan2(fitparam[Nfreq+j], fitparam[j])

    # if a constant was fitted, store it in the first record only
    if constant:
        constn = np.zeros(len(amplitude))
        constn[0] = fitparam[2*Nfreq]
        names = ['const','ampl','freq','phase']
        fpars = [constn, amplitude, freq, phase/(2*pi)]
    else:
        names = ['ampl','freq','phase']
        fpars = [amplitude, freq, phase/(2*pi)]

    parameters = np.rec.fromarrays(fpars, names=names)

    logger.debug('SINEFIT: Calculated harmonic fit with %d frequencies through %d datapoints'%(Nfreq,Ndata))

    return parameters

def periodic_spline(times, signal, freq, t0=None, order=20, k=3):
    """
    Fit a periodic spline.

    CAUTION: this definition assumes the proposed period is either
    small compared to the total time range, or that there are a lot of
    points per period.

    This definition basically phases all the data and constructs
    an empirical periodic function through an averaging process
    per phasebin, and then performs a splinefit through those points.

    In order to make the first derivative continuous, we repeat
    the first point(s) at the end and the end point(s) at the beginning.

    The constructed function can then be removed from the original
    data.

    Output is a record array containing the columns 'freq', 'knots', 'coeffs'
    and 'degree'.

    Example usage:

    >>> myfreq = 1/20.
    >>> times = np.linspace(0,150,10000)
    >>> signal = sin(2*pi*myfreq*times+0.32*2*pi) + np.random.normal(size=len(times))
    >>> cjs = periodic_spline(times,signal,myfreq,order=30)
    >>> trend = evaluate.periodic_spline(times,cjs)
    >>> import pylab as pl
    >>> p=pl.figure()
    >>> p=pl.plot(times,signal,'ko')
    >>> p=pl.plot(times,trend,'r-')
    >>> p=pl.title("test:tfit:periodic_spline_fit")

    @param times: time points
    @type times: numpy 1d array
    @param signal: observation points
    @type signal: numpy 1d array
    @param freq: frequency of the periodic trend
    @type freq: float
    @keyword order: number of points to use for the spline fit.
    @type order: integer
    @keyword k: order of the spline
    @type k: integer
    @return: parameters
    @rtype: record array
    """
    # by default, the zero point is the first time point
    if t0 is None:
        t0 = times[0]

    # ensure the input frequencies are an array
    if not hasattr(freq, '__len__'):
        freq = [freq]
    freq = np.asarray(freq)

    N = order*3 + (k-2)
    parameters = np.zeros(len(freq), dtype=[('freq','float32'),
                                            ('knots','%dfloat32'%N),
                                            ('coeffs','%dfloat32'%N),
                                            ('degree','int8')])
    for nf, ifreq in enumerate(freq):
        # get the phased signal
        phases, phased_sig = evaluate.phasediagram(times, signal, ifreq, t0=t0)

        # construct the phase bin centres
        x = np.linspace(0., 1., order)[:-1]
        dx = x[1] - x[0]
        x = x + dx/2

        # compute the average signal per phase bin
        found_phase = []
        found_averg = []
        for xi in x:
            this_bin = ((xi-dx/2.) <= phases) & (phases <= (xi+dx/2.))
            if np.sum(this_bin) == 0:
                logger.warning("PERIODIC SPLINE: some parts of the phase are not covered")
                continue
            found_phase.append(xi)
            found_averg.append(np.average(phased_sig[this_bin]))

        found_phase = np.array(found_phase)
        found_averg = np.array(found_averg)

        # repeat the binned phases to make the first derivative continuous
        found_phase = np.hstack([found_phase-1, found_phase, found_phase+1])
        found_averg = np.hstack([found_averg,  found_averg,  found_averg])

        # fit the spline and record the results
        knots, coeffs, degree = splrep(found_phase, found_averg, k=k)
        parameters['freq'][nf] = ifreq
        parameters['knots'][nf] = knots
        parameters['coeffs'][nf] = coeffs
        parameters['degree'][nf] = degree

    return parameters

def kepler(times, signal, freq, sigma=None, wexp=2., e0=0, en=0.99, de=0.01, output_type='old'):
    """
    Fit a Kepler orbit to a time series.

    Example usage:

    >>> from ivs.timeseries import pergrams

    First set the parameters we want to use:

    >>> pars = tuple([12.456,23456.,0.37,213/180.*pi,98.76,55.])
    >>> pars = np.rec.array([pars],dtype=[('P','f8'),('T0','f8'),('e','f8'),
    ...                                   ('omega','f8'),('K','f8'),('gamma','f8')])

    Then generate the signal and add some noise:

    >>> times = np.linspace(pars[0]['T0'],pars[0]['T0']+5*pars[0]['P'],100)
    >>> signalo = evaluate.kepler(times,pars)
    >>> signal = signalo + np.random.normal(scale=20.,size=len(times))

    Calculate the periodogram:

    >>> freqs,ampls = pergrams.kepler(times,signal)
    >>> opars = kepler(times,signal,freqs[np.argmax(ampls)])
    >>> signalf = evaluate.kepler(times,opars)

    And make some plots:

    >>> import pylab as pl
    >>> p = pl.figure()
    >>> p = pl.subplot(221)
    >>> p = pl.plot(times,signal,'ko')
    >>> p = pl.plot(times,signalo,'r-',lw=2)
    >>> p = pl.plot(times,signalf,'b--',lw=2)
    >>> p = pl.subplot(222)
    >>> p = pl.plot(freqs,ampls,'k-')

    @param times: time points
    @type times: numpy 1d array
    @param signal: observation points
    @type signal: numpy 1d array
    @param freq: frequency of kepler orbit
    @type freq: float
    @return: parameters
    @rtype: record array
    """
    # assume unit weights if no errors are given
    if sigma is None:
        sigma = np.ones_like(times)
    T = times[-1] - times[0]
    x00 = 0.
    x0n = 359.9

    # set up a small frequency grid around the given frequency
    f0 = freq
    fn = freq + 0.05/T
    df = 0.1/T
    maxstep = int((fn-f0)/df + 1)
    f1 = np.zeros(maxstep)
    s1 = np.zeros(maxstep)
    p1 = np.zeros(maxstep)
    l1 = np.zeros(maxstep)
    s2 = np.zeros(maxstep)
    k2 = np.zeros(6)

    # call the low-level Kepler fitting routine
    pyKEP.kepler(times, signal, sigma, f0, fn, df, wexp, e0, en, de, x00, x0n,
                 f1, s1, p1, l1, s2, k2)

    freq, x0, e, w, K, RV0 = k2
    pars = [1/freq, x0/(2*pi*freq) + times[0], e, w, K, RV0]
    if output_type == 'old':
        pars = np.rec.array([tuple(pars)],dtype=[('P','f8'),('T0','f8'),('e','f8'),('omega','f8'),('K','f8'),('gamma','f8')])

    return pars


def box(times, signal, freq, b0=0, bn=1, order=50, t0=None):
    """
    Fit box-shaped transits.

    @param times: time points
    @type times: numpy 1d array
    @param signal: observation points
    @type signal: numpy 1d array
    @param freq: frequency of transiting signal
    @type freq: float
    @param b0: minimum start of ingress (in phase)
    @type b0: 0<float<bn
    @param bn: maximum end of egress (in phase)
    @type bn: b0<float<1
    @param order: number of phase bins
    @type order: integer
    @param t0: zeropoint of times
    @type t0: float
    @return: parameters
    @rtype: record array
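
    Example usage (a sketch only; see Section 4 of the module docstring for a full example):

    >>> #pars = box(times, signal, freq)
    >>> #mytransit = evaluate.box(times, pars)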
    """
    if t0 is None:
        t0 = 0.

    # ensure the input frequencies are an array
    if not hasattr(freq, '__len__'):
        freq = [freq]
    freq = np.asarray(freq)

    parameters = np.rec.fromarrays([np.zeros(len(freq)) for i in range(5)],
                                   dtype=[('freq','f8'),('depth','f8'),
                                          ('ingress','f8'),('egress','f8'),
                                          ('cont','f8')])
    bins = np.linspace(b0, bn, order)

    for fnr, frequency in enumerate(freq):
        parameters[fnr]['freq'] = frequency
        # use two phase diagrams, shifted by half a period, so that a transit
        # wrapping around phase zero is covered as well
        phases1, phased1 = evaluate.phasediagram(times, signal, frequency, t0=t0)
        phases2, phased2 = evaluate.phasediagram(times, signal, frequency, t0=t0+0.5/frequency)
        best_fit = np.inf
        for i, start in enumerate(bins):
            for end in bins[i+1:]:
                transit1 = (start <= phases1) & (phases1 <= end)
                transit2 = (start <= phases2) & (phases2 <= end)
                transit_sig1 = np.median(np.compress( transit1, phased1))
                contini_sig1 = np.median(np.compress(1-transit1, phased1))
                depth1 = contini_sig1 - transit_sig1
                transit_sig2 = np.median(np.compress( transit2, phased2))
                contini_sig2 = np.median(np.compress(1-transit2, phased2))
                depth2 = contini_sig2 - transit_sig2
                # keep the box with the lowest chi-square
                chisquare1 = sum((evaluate.box(times,[contini_sig1,frequency,depth1,start,end])-signal)**2)
                chisquare2 = sum((evaluate.box(times,[contini_sig2,frequency,depth2,start,end])-signal)**2)
                if chisquare1 < best_fit and chisquare1 < chisquare2:
                    parameters[fnr]['depth'] = depth1
                    parameters[fnr]['ingress'] = start
                    parameters[fnr]['egress'] = end
                    parameters[fnr]['cont'] = contini_sig1
                    best_fit = chisquare1
                elif chisquare2 < best_fit and chisquare2 < chisquare1:
                    # translate the shifted phases back to the original zero point
                    start2 = start - 0.5
                    end2 = end - 0.5
                    if start2 < 0: start2 += 1
                    if end2 < 0: end2 += 1
                    parameters[fnr]['depth'] = depth2
                    parameters[fnr]['ingress'] = start2
                    parameters[fnr]['egress'] = end2
                    parameters[fnr]['cont'] = contini_sig2
                    best_fit = chisquare2
    return parameters

def gauss(x, y, threshold=0.1, constant=False, full_output=False,
          init_guess_method='analytical', window=None):
    """
    Fit a Gaussian profile to data using a polynomial fit.

    y = A * exp( -(x-mu)**2 / (2*sigma**2))

    ln(y) = ln(A) - (x-mu)**2 / (2*sigma**2)
    ln(y) = ln(A) - mu**2 / (2*sigma**2) + mu / (sigma**2) * x - x**2 / (2*sigma**2)

    then the parameters are given by

    p0 = - 1 / (2*sigma**2)
    p1 = mu / ( sigma**2)
    p2 = ln(A) - mu**2 / (2*sigma**2)

    Note that not all datapoints are used, but only those above a certain value (namely
    10% of the maximum value by default). In this way, we reduce the influence of the
    continuum and concentrate on the shape of the peak itself.

    Afterwards, we perform a nonlinear least-squares fit with the above parameters
    as starting values, but only accept it if the chi-square has improved.

    If a constant has to be fitted, set C{constant=True}.

    Example: we generate a Lorentzian curve and fit a Gaussian to it:

    >>> x = np.linspace(-10,10,1000)
    >>> y = evaluate.lorentz(x,[5.,0.,2.]) + np.random.normal(scale=0.1,size=len(x))
    >>> pars1,e_pars1 = gauss(x,y)
    >>> pars2,e_pars2 = gauss(x,y,constant=True)
    >>> y1 = evaluate.gauss(x,pars1)
    >>> y2 = evaluate.gauss(x,pars2)
    >>> p = pl.figure()
    >>> p = pl.plot(x,y,'k-')
    >>> p = pl.plot(x,y1,'r-',lw=2)
    >>> p = pl.plot(x,y2,'b-',lw=2)

    @param x: x axis data
    @type x: numpy array
    @param y: y axis data
    @type y: numpy array
    @param threshold: only points with C{y-C} above C{threshold} times the peak height are used in the fit
    @type threshold: float
    @param constant: fit also a constant
    @type constant: boolean
    @param window: only consider data points with x values inside (min, max)
    @type window: tuple
    @rtype: tuple
    @return: A, mu, sigma (,C)
    """
    # estimate the constant as the median of the lowest 5% of the data
    if constant:
        N = len(y)
        C = np.median(np.sort(y)[:max(int(0.05*N),3)])
    else:
        C = 0.

    # only consider data inside the given window
    if window is not None:
        win = (window[0] <= x) & (x <= window[1])
        x, y = x[win], y[win]

    # only fit the points that rise above the threshold
    threshold *= max(y-C)
    use_in_fit = (y-C) >= threshold
    xc = x[use_in_fit]
    yc = y[use_in_fit]

    # analytical initial guess via a parabolic fit to the log of the data
    lnyc = np.log(yc-C)
    p = np.polyfit(xc, lnyc, 2)

    sigma = np.sqrt(-1. / (2.*p[0]))
    mu = sigma**2.*p[1]
    A = np.exp(p[2] + mu**2. / (2.*sigma**2.))

    # if the analytical guess failed (NaNs), fall back to a rough estimate
    if A != A or mu != mu or sigma != sigma:
        logger.error('Initial Gaussian fit failed')
        init_success = False
        A = (yc-C).max()
        mu = xc[yc.argmax()]
        sigma = xc.ptp()/3.
    else:
        init_success = True

    # chi-square of the initial guess
    if constant:
        p0 = evaluate.gauss_preppars(np.asarray([A,mu,sigma,C]))
    else:
        p0 = evaluate.gauss_preppars(np.asarray([A,mu,sigma]))
    yf = evaluate.gauss(xc, p0)
    chi2 = np.sum((yc-yf)**2)

    # refine with a nonlinear least-squares fit, and keep the initial guess
    # if the fit did not improve (gain < 0)
    pars, e_pars, gain = optimize(x, y, p0, 'gauss', minimizer='leastsq')
    if gain < 0:
        pars = p0
    pars['sigma'] = np.abs(pars['sigma'])
    if not full_output:
        return pars, e_pars
    else:
        return pars, e_pars, p0, use_in_fit, init_success




def e_sine(times, signal, parameters, correlation_correction=True, limit=10000):
    """
    Compute the errors on the parameters from a sine fit.

    Note: errors on the constant are only calculated when the number of datapoints
    is below C{limit}. Otherwise, the matrices involved become too huge.

    @param times: time points
    @type times: numpy array
    @param signal: observations
    @type signal: numpy array
    @param parameters: record array containing the fitted parameters. This should
    have columns 'ampl','freq' and 'phase', and optionally 'const'.
    @type parameters: numpy record array
    @param correlation_correction: set to True if you want to correct for correlation effects
    @type correlation_correction: boolean
    @param limit: Calculating the error on the constant requires the calculation
    of a matrix of size Ndata x Nparam, which takes a long time for large datasets.
    The routine skips the estimation of the error on the constant if the timeseries
    is longer than C{limit} datapoints.
    @type limit: integer
    @return: errors
    @rtype: Nx4 array(, Nx3 array)
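
    Example usage (a sketch, building on the L{sine} fit):

    >>> #pars = sine(times, signal, freq)
    >>> #e_pars = e_sine(times, signal, pars)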
    """
    # parse the fitted parameters
    freq = parameters['freq']
    phase = parameters['phase']
    amplitude = parameters['ampl']
    chisq = np.sum((signal - evaluate.sine(times, parameters))**2)

    # parse the number of datapoints and the number of parameters
    Ndata = len(times)
    Nparam = 2*len(parameters)
    Nfreq = len(freq)
    T = times.ptp()

    # was a constant fitted as well?
    constant = False
    if 'const' in parameters.dtype.names:
        constant = True
        Nparam += 1

    errors = []
    names = []

    # if a constant was fitted and the dataset is small enough, compute its
    # error from the covariance matrix of the full design matrix
    if constant and Ndata < limit:
        F = np.zeros((Ndata, Nparam))
        for i in range(Ndata):
            F[i,0:Nfreq] = sin(2*pi*freq*times[i] + phase)
            F[i,Nfreq:2*Nfreq] = amplitude * cos(2*pi*freq*times[i] + phase)
            F[i,2*Nfreq] = 1.0

        covariance = np.linalg.inv(np.dot(F.T, F))
        covariance *= chisq / (Ndata - Nparam)

        # the error on the constant is stored in the first record only
        error_const = np.sqrt(covariance[2*Nfreq,2*Nfreq])
        error_const = np.ones(Nfreq)*error_const
        if len(error_const) > 1:
            error_const[1:] = 0
        errors.append(error_const)
        names.append('e_const')
    elif constant:
        errors.append(np.zeros(Nfreq))
        names.append('e_const')

    # formal 1-sigma errors on amplitude, frequency and phase
    # (cf. Montgomery & O'Donoghue 1999)
    errors_ = np.zeros((Nfreq,3))
    residus = signal + 0.
    for i in range(1, Nfreq+1):
        residus -= evaluate.sine(times, parameters[i-1:i])
        a = parameters[i-1]['ampl']
        sigma_m = np.std(residus)

        errors_[i-1,0] = np.sqrt(2./Ndata) * sigma_m
        errors_[i-1,1] = np.sqrt(6./Ndata) * 1. / (pi*T) * sigma_m / a
        errors_[i-1,2] = np.sqrt(2./Ndata) * sigma_m / a

        # correct for the correlation between the datapoints
        if correlation_correction:
            rho = max(1, get_correlation_factor(residus))
            errors_[i-1,0] *= np.sqrt(rho)
            errors_[i-1,1] *= np.sqrt(rho)
            errors_[i-1,2] *= np.sqrt(rho)

    e_parameters = np.rec.fromarrays(errors+[errors_[:,0],errors_[:,1],errors_[:,2]],
                                     names=names + ['e_ampl','e_freq','e_phase'])

    return e_parameters




def diffcorr(times, signal, parameters, func_name, \
             max_iter=100, tol=1e-6, args=(), full_output=False):
    """
    Differential corrections.

    Iteratively refines the parameters of the function C{func_name} (looked up in the
    L{evaluate} module) by repeatedly solving the linearized least-squares problem for
    the parameter corrections, until the corrections drop below C{tol} or C{max_iter}
    is reached.
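
    Example usage (a sketch only; C{'sine'} refers to the corresponding functions
    in the L{evaluate} module):

    >>> #pars, e_pars, flag = diffcorr(times, signal, pars0, 'sine')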
    """
    # ensure the parameters are in flat-array form
    prep_func = getattr(evaluate, func_name+'_preppars')
    diff_func = getattr(evaluate, func_name+'_diffcorr')
    eval_func = getattr(evaluate, func_name)
    if parameters.dtype.names:
        parameters = prep_func(parameters)

    # iterate until the corrections converge or max_iter is reached
    Deltas = np.zeros((max_iter, len(parameters)))
    params = np.zeros_like(Deltas)
    params[0] = parameters
    counter = 1
    while (counter==1) or (counter>0 and counter<max_iter and np.any(np.abs(Deltas[counter-1])>tol)):
        params[counter] = params[counter-1] + Deltas[counter-1]
        myfit = eval_func(times, params[counter], *args)
        coeff = diff_func(times, params[counter], *args)
        Delta, res, rank, s = la.lstsq(coeff, myfit-signal)
        Deltas[counter] = -Delta
        counter += 1

    # convert the parameters and their corrections back to record arrays
    parameters = prep_func(params[counter-1])
    e_parameters = prep_func(Deltas[counter-1])
    e_parameters.dtype.names = ['e_'+name for name in e_parameters.dtype.names]

    if full_output:
        return parameters, e_parameters, 0, params[:counter], Deltas[:counter]
    else:
        return parameters, e_parameters, 0




def residuals(parameters, domain, data, evalfunc, weights, logar, *args):
    """
    Residuals, possibly weighted and in log10 space, for L{optimize}'s leastsq minimizer.
    """
    fit = evalfunc(domain, parameters, *args)
    if weights is None:
        weights = np.ones_like(data)
    if logar:
        return weights*(np.log10(data)-np.log10(fit))
    else:
        return weights*(data-fit)

def residuals_single(parameters, domain, data, evalfunc, weights, logar, *args):
    """
    Sum of the squared (possibly weighted, possibly log10) residuals, for scalar minimizers.
    """
    fit = evalfunc(domain, parameters, *args)
    if weights is None:
        weights = np.ones_like(data)
    if logar:
        return sum(weights*(np.log10(data)-np.log10(fit))**2)
    else:
        return sum(weights*(data-fit)**2)

def optimize(times, signal, parameters, func_name, prep_func=None,
             minimizer='leastsq', weights=None, logar=False, args=()):
    """
    Fit a function to data.

    The function C{func_name} is looked up in the L{evaluate} module, and the minimizer
    can be any of the C{scipy.optimize} routines (C{'leastsq'} by default). Returns the
    optimized parameters, their errors, and the gain in chi-square (in percent) with
    respect to the initial guess.
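
    Example usage (a sketch, reusing the sine parameters from L{sine} as in
    Section 3 of the module docstring):

    >>> #pars2, e_pars2, gain = optimize(times, signal, pars1, 'sine')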
    """
    # determine the functions for evaluation and parameter preparation
    if prep_func is None and isinstance(func_name, str):
        prepfunc = getattr(evaluate, func_name+'_preppars')
    else:
        prepfunc = None
    if isinstance(func_name, str):
        evalfunc = getattr(evaluate, func_name)
    else:
        evalfunc = func_name
    optifunc = getattr(scipy.optimize, minimizer)

    # if no weights are given, use equal weights
    if weights is None:
        weights = np.ones_like(times)

    # if the parameters come as a record array, flatten them
    if prepfunc is not None and parameters.dtype.names:
        parameters = prepfunc(parameters)
    init_guess = parameters.copy()

    # chi-square of the initial guess
    dof = (len(times)-len(init_guess))
    signalf_init = evalfunc(times, init_guess)
    if logar:
        chisq_init = np.sum((np.log10(signalf_init)-np.log10(signal))**2)
    else:
        chisq_init = np.sum((signalf_init-signal)**2)

    # optimize the parameters
    if minimizer == 'leastsq':
        popt, cov, info, mesg, flag = optifunc(residuals, init_guess,
                 args=(times,signal,evalfunc,weights,logar)+args, full_output=1)
        # calculate the new chi-square, and check if we have improved it
        chisq = np.sum(info['fvec']*info['fvec'])
        if chisq > chisq_init or flag != 1:
            logger.error('Optimization not successful [flag=%d] (%g --> %g)'%(flag,chisq_init,chisq))
            chisq = np.inf

        # derive the errors from the covariance matrix
        if cov is not None:
            errors = np.sqrt(cov.diagonal()) * np.sqrt(chisq/dof)
        else:
            logger.error('Error estimation via optimize not successful')
            errors = np.zeros(len(popt))
    else:
        # use a minimizer that only needs a scalar chi-square
        # (logar is passed along so residuals_single receives all its arguments)
        out = optifunc(residuals_single, init_guess,
                 args=(times,signal,evalfunc,weights,logar), full_output=1, disp=False)
        popt = out[0]
        # calculate the new chi-square, and check if we have improved it
        signalf_update = evalfunc(times, popt)
        chisq = np.sum((signalf_update-signal)**2)
        errors = np.zeros(len(popt))
        if chisq > chisq_init:
            logger.error('Optimization not successful')

    # gain in chi-square: if positive, we improved the initial guess
    gain = (chisq_init-chisq)/chisq_init*100.

    # convert the parameters to record arrays, if possible
    if prepfunc is not None:
        parameters = prepfunc(popt)
        e_parameters = prepfunc(errors)
        e_parameters.dtype.names = ['e_'+name for name in e_parameters.dtype.names]
    else:
        parameters = popt
        e_parameters = errors

    return parameters, e_parameters, gain


class Function(object):
    """
    Class to define a function with associated parameters. The function can be evaluated using the
    L{evaluate} method. Parameters can be added/updated, together with boundaries and expressions,
    and can be held fixed and released by adjusting the vary keyword in L{setup_parameters}.

    The provided function needs to take two arguments. The first one is a list of parameters, the
    second is a list of x-values for which the function will be evaluated. For example, if you want
    to define a quadratic function it will look like:

    >>> func = lambda pars, x: pars[0] + pars[1] * x + pars[2] * x**2
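
    Such a function can then be wrapped as follows (a sketch based on the constructor
    below; the parameter names are free to choose):

    >>> #quadratic = Function(function=func, par_names=['a', 'b', 'c'])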

    Functions can be combined using the L{Model} class, or can be fitted directly to data using the
    L{Minimizer} class.

    To improve the fit, a jacobian can be provided as well. However, some care is necessary when
    defining this function. When using the L{Minimizer} class to fit a Function to data, the
    residual function is defined as::

        residuals = data - Function.evaluate()

    To derive the jacobian you have to take the derivatives of -1 * function. Furthermore the
    derivatives have to be across the rows. For the example quadratic function above, the jacobian
    would look like:

    >>> jacobian = lambda pars, x: np.array([-np.ones(len(x)), -x, -x**2]).T

    If you get bad results, try flipping all signs. The jacobian does not really improve the speed
    of the fit process, but it does help to reach the minimum in a more consistent way (see examples).

    The internal representation of the parameters uses a parameter object of the U{lmfit
    <http://cars9.uchicago.edu/software/python/lmfit/index.html>} package. No knowledge of this
    representation is required as methods for direct interaction with the parameter values and
    other settings are provided. If wanted, the parameters object itself can be obtained with
    the L{parameters} attribute.
    """

    def __init__(self, function=None, par_names=None, jacobian=None, resfunc=None):
        """
        Create a new Function by providing the function of which it consists together with the names
        of each parameter in this function. You can specify the jacobian as well.

        @param function: A function expression
        @type function: function
        @param par_names: The names of each parameter of function
        @type par_names: list of strings
        @param jacobian: A function expression for the jacobian of function.
        @type jacobian: function
        @param resfunc: Function to calculate the residuals. (Not obligatory)
        @type resfunc: function
        """
        self.function = function
        self.par_names = par_names
        self.jacobian = jacobian
        self.resfunc = resfunc

        # create an (empty) parameter set based on the parameter names
        self.parameters = None
        self.setup_parameters()

    def evaluate(self, x, *args, **kwargs):
        """
        Evaluate the function for the given values and optionally a given parameter object.
        If no parameter object is given then the parameter object belonging to the function
        is used.

        >>> #func.evaluate(x, parameters)
        >>> #func.evaluate(x)

        @param x: the independent data for which to evaluate the function
        @type x: numpy array

        @return: Function(x), same size as x
        @rtype: numpy array
        """
        if len(args) == 0:
            # use the parameters belonging to this object
            pars = []
            for name in self.par_names:
                pars.append(self.parameters[name].value)

            return self.function(pars, x, **kwargs)

        if len(args) == 1:
            # use the provided parameters (a parameter object or a plain list)
            if isinstance(args[0], dict):
                pars = []
                for name in self.par_names:
                    pars.append(args[0][name].value)
            else:
                pars = args[0]

            return self.function(pars, x, **kwargs)

    def evaluate_jacobian(self, x, *args):
        """
        Evaluates the jacobian if that function is provided, using the given parameter object.
        If no parameter object is given then the parameter object belonging to the function
        is used.
        """
        if self.jacobian is None:
            return [0.0 for i in self.par_names]

        if len(args) == 0:
            # use the parameters belonging to this object
            pars = []
            for name in self.par_names:
                pars.append(self.parameters[name].value)

            return self.jacobian(pars, x)

        if len(args) == 1:
            # use the provided parameters (a parameter object or a plain list)
            if isinstance(args[0], dict):
                pars = []
                for name in self.par_names:
                    pars.append(args[0][name].value)
            else:
                pars = args[0]

            return self.jacobian(pars, x)

    def setup_parameters(self, value=None, bounds=None, vary=None, expr=None, **kwargs):
        """
        Create or adjust a parameter object based on the parameter names and, if provided,
        the values, bounds, vary and expressions. Basic checking whether the parameter
        boundaries are consistent is performed.

        Example use:

        >>> setup_parameters(values=[v1, v2, ...], bounds=[(b1, b2), (b3, b4), ...], vary=[True, False, ...])

        @param value: The values of the parameters (the keyword 'values' is accepted as well)
        @type value: array
        @param bounds: min and max boundaries on the parameters [(min,max),...]
        @type bounds: array of tuples
        @param vary: Boolean array, true to vary a parameter, false to keep it fixed in the fitting process
        @type vary: array
        @param expr: array of expressions that the parameters have to follow (the keyword 'exprs' is accepted as well)
        @type expr: array
        """
        nrpars = len(self.par_names)
        if value is None:
            value = kwargs['values'] if 'values' in kwargs else [0 for i in range(nrpars)]
        if bounds is None:
            bounds = np.array([[None,None] for i in range(nrpars)])
        else:
            bounds = np.asarray(bounds)
        if vary is None:
            vary = [True for i in range(nrpars)]
        if expr is None:
            expr = kwargs['exprs'] if 'exprs' in kwargs else [None for i in range(nrpars)]

        min = kwargs['min'] if 'min' in kwargs else bounds[:,0]
        max = kwargs['max'] if 'max' in kwargs else bounds[:,1]

        def check_boundaries(min, max, value, name):
            # swap the boundaries when min > max, and stretch them to include the value
            if max is not None and min is not None and max < min:
                min, max = max, min
                logging.warning('Parameter %s: max < min, switched boundaries!'%(name))
            if min is not None and value < min:
                min = value
                logging.warning('Parameter %s: value < min, adjusted min!'%(name))
            if max is not None and value > max:
                max = value
                logging.warning('Parameter %s: value > max, adjusted max!'%(name))
            return min, max

        if self.parameters is None:
            # create a new parameter object
            self.parameters = lmfit.Parameters()
            for i, name in enumerate(self.par_names):
                min_, max_ = check_boundaries(min[i], max[i], value[i], name)
                self.parameters.add(name, value=value[i], vary=vary[i], min=min_, max=max_, expr=expr[i])
        else:
            # update the existing parameter object
            for i, name in enumerate(self.par_names):
                min_, max_ = check_boundaries(min[i], max[i], value[i], name)
                self.parameters[name].value = float(value[i])
                self.parameters[name].user_value = float(value[i])
                self.parameters[name].vary = bool(vary[i]) if vary[i] is not None else True
                self.parameters[name].min = min_
                self.parameters[name].max = max_
                self.parameters[name].expr = expr[i]

    def update_parameter(self, parameter=None, **kwargs):
        """
        Updates a specified parameter. The parameter can be given by name or by index. This
        method also supports the use of min and max keywords to set the lower and upper boundary
        separately.

        Example use:

        >>> update_parameter(parameter='parname', value=10.0, min=5.0, vary=True)
        >>> update_parameter(parameter=2, value=0.15, bounds=(0.10, 0.20))
        """
        # translate the given parameter to the corresponding parameter object
        if type(parameter) == int:
            parameter = self.parameters[self.par_names[parameter]]
        elif type(parameter) == str:
            parameter = self.parameters[parameter]

        # translate the bounds keyword to min and max
        if 'bounds' in kwargs:
            kwargs['min'] = kwargs['bounds'][0]
            kwargs['max'] = kwargs['bounds'][1]

        if 'value' in kwargs:
            kwargs['user_value'] = kwargs['value']

        for key in ['value', 'min', 'max', 'vary', 'expr']:
            if key in kwargs:
                setattr(parameter, key, kwargs[key])

    def get_parameters(self, parameters=None, error='stderr', full_output=False):
        """
        Returns the parameter values together with the errors if they exist. If no
        fitting has been done, or the errors could not be calculated, None is returned
        for the error.

        The parameters of which the settings should be returned can be given in
        I{parameters}. If None is given, all parameters are returned.

        @param parameters: Parameters to be returned or None for all parameters.
        @type parameters: array
        @param error: Which error to return ('stderr', 'mcerr')
        @type error: string
        @param full_output: When True, also vary, the boundaries and expression are returned
        @type full_output: bool

        @return: the parameter values and their errors: value, err, [vary, min, max, expr]
        @rtype: numpy arrays
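
        Example use (a sketch):

        >>> #values, errors = func.get_parameters()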
        """
        if type(parameters) == str:
            parameters = [parameters]

        pnames = parameters if parameters is not None else self.par_names

        # collect the value, error and settings per parameter
        out = []
        for name in pnames:
            par = self.parameters[name]
            out.append([par.value, getattr(par, error), par.vary, par.min,
                        par.max, par.expr])
        out = np.array(out)

        if full_output:
            return out.T
        else:
            return out[:,0], out[:,1]

    def param2str(self, **kwargs):
        """
        Converts the parameter object of this function to an easily printable string, including
        the value, error, boundaries, whether the parameter is varied, and whether in the
        fitting process one of the boundaries was reached.

        The error to be printed can be set with the error keyword. You can choose from the
        standard error: 'stderr', the monte carlo error: 'mcerr', or any of the confidence
        intervals that you have calculated, by coding them like: 'ci###'. E.g. for 95%
        (sigma = 0.95) use 'ci95', for 99.7% (sigma = 0.997) use 'ci997'. Don't put decimal
        signs in the ci!

        The accuracy with which the parameters are printed can be set with the accuracy keyword.
        And the amount of information that is printed is determined by full_output. If False,
        only parameter value and error are printed, if True also boundaries and vary are shown.

        @param accuracy: Number of decimal places to print
        @type accuracy: int
        @param error: Which error type to print ('stderr', 'mcerr' or 'ci###')
        @type error: string
        @param full_output: Amount of information to print
        @type full_output: bool

        @return: parameters in string format
        @rtype: string
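
        Example use (a sketch):

        >>> #print func.param2str(accuracy=4, error='ci95')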
        """
        return parameters2string(self.parameters, **kwargs)

    def correl2str(self, **kwargs):
        """
        Convert the correlations between the different parameters of this function to an
        easily printable string. The correlations are only printed if they are larger than
        a certain provided limit, and the accuracy keyword sets the number of decimals
        to print.

        @param accuracy: number of decimal places to print
        @param limit: minimum correlation value to print

        @return: correlation in string format
        @rtype: string
        """
        return correlation2string(self.parameters, **kwargs)

    def ci2str(self, **kwargs):
        """
        Convert the confidence intervals of the parameters of this function to an easily
        printable string.

        The accuracy with which the CIs should be printed can be set with the accuracy
        keyword.

        @param accuracy: Number of decimal places to print
        @type accuracy: int

        @return: confidence intervals in string format
        @rtype: string
        """
        return confidence2string(self.parameters, **kwargs)

    def __str__(self):
        """ String representation of the Function object """
        pnames = ", ".join(self.par_names)
        name = "<Function: '{:s}' with parameters: {:s}>".format(self.function.__name__, pnames)
        return name


class Model(object):
    """
    Class to create a model using different L{Function}s, each with their associated parameters.
    This Model can then be used to fit data using the L{Minimizer} class. The Model can be
    evaluated using the L{evaluate} method. Parameters can be added/updated, together with
    boundaries and expressions, and can be held fixed or adjusted by changing the vary keyword
    in L{update_parameter}.

    Parameters for the different Functions are combined. To keep track of which parameter is
    which, they get a number appended to the end indicating the function they belong to. For
    example: when a Model is created summing a gaussian function with a quadratic function, the
    gaussian has parameters [a, mu, sigma, c] and the quadratic function has parameters
    [s0, s1, s2]. If the functions are provided in the order [Gaussian, Quadratic], the parameters
    will be renamed to: [a_0, mu_0, sigma_0, c_0] and [s0_1, s1_1, s2_1]. Keep in mind that this
    renaming only happens in the Model object. In the underlying Functions the parameters will
    keep their original names.

    The functions themselves can be combined using a mathematical expression in the constructor.
    If no expression is provided, the output of each function is summed up. Keep in mind that each
    function needs to have the same output shape::

        Model(x) = Function1(x) + Function2(x) + ...

    The provided expression needs to be a function taking an array with the results of the
    Functions in the model as arguments, and having a numpy array as result. This can be done
    with a simple I{lambda} expression or more complicated functions::

        Model(x) = Expr( [Function1(x),Function2(x),...] )
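
    For example, to multiply two functions instead of summing them (a sketch):

    >>> #model = Model(functions=[f1, f2], expr=lambda funcs: funcs[0] * funcs[1])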
1503
1504 The internal representation of the parameters uses a parameter object of the U{lmfit
1505 <http://cars9.uchicago.edu/software/python/lmfit/index.html>} package. No knowledge of this
1506 repersentation is required as methods for direct interaction with the parameter values and
1507 other settings are provided. If wanted, the parameters object itself can be obtained with
1508 the L{parameters} attribute.
1509
1510 At the moment the use of a jacobian is not supported at the Model level as it is not possible
1511 to derive a symbolic jacobian from the provided functions. If you want to use a jacobian you
1512 will have to write a Function yourself in which you can provide a jacobian function.
1513 """
1514
1515 - def __init__(self, functions=None, expr=None, resfunc=None):
1516 """
1517 Create a new Model by providing the functions of which it consists. You can provid an
1518 expression describing how the Functions have to be combined as well. This expression needs
1519 to take the out put of the Fuctions in an array as argument, and provide a new numpy array
1520 as result.
1521
1522 @param functions: A list of L{Function}s
1523 @type functions: list
1524 @param expr: An expression to combine the given functions.
1525 @type expr: function
1526 @param resfunc: A function to calculate the residuals (Not obligatory)
1527 @type resfunc: function
1528 """
1529 self.functions = functions
1530 self.expr = expr
1531 self.resfunc = resfunc
1532 self.jacobian = None
1533 self._par_names = None
1534 self.parameters = None
1535
1536
1537 self.pull_parameters()
1538
1539
1540
1541 - def evaluate(self, x, *args, **kwargs):
1542 """
1543 Evaluate the model for the given values and optional a given parameter object.
1544 If no parameter object is given then the parameter object belonging to the model
1545 is used.
1546
1547 >>> evaluate(x, parameters)
1548 >>> evaluate(x)
1549
1550 @param x: the independant values for which to evaluate the model.
1551 @type x: array
1552
1553 @return: Model(x)
1554 @rtype: numpy array
1555 """
1556 if len(args) == 0:
1557
1558 parameters = self.parameters
1559 elif len(args) == 1:
1560
1561 parameters = args[0]
1562
1563
1564 self.push_parameters(parameters=parameters)
1565
1566
1567 if self.expr == None:
1568 result = np.zeros(len(x))
1569 for function in self.functions:
1570 result += function.evaluate(x, **kwargs)
1571 else:
1572 result = []
1573 for function in self.functions:
1574 result.append(function.evaluate(x, **kwargs))
1575 result = self.expr(result)
1576
1577 return result
1578
1580 """
1581 Not implemented!
1582 """
1583 return [0.0 for p in self.parameters]
1584
1586 """
1587 Not implemented yet, use the L{setup_parameters} method of the Functions
1588 themselfs, or for adjustments of a single parameter use the L{update_parameter}
1589 function
1590
1591 Please provide feedback on redmine on how you would like to use this function!!!
1592 """
1593 if values is None: values = [None for i in self.functions]
1594 if bounds is None: bounds = [None for i in self.functions]
1595 if vary is None: vary = [None for i in self.functions]
1596 if exprs is None: exprs = [None for i in self.functions]
1597
1598 for i,func in enumerate(self.functions):
1599 func.setup_parameters(values=values[i],bounds=bounds[i],vary=vary[i],exprs=exprs[i])
1600
1601 self.pull_parameters()
1602
1604 """
1605 Updates a specified parameter. The parameter can be given by name or by index.
1606 However, you have to be carefull when using the names. The model class changes
1607 the names of the parameters of the underlying functions based on the order in
1608 which the functions are provided (See introduction). This method also supports
1609 the use of kwargs min and max to set the lower
1610 and upper boundary separatly.
1611
1612 Example use:
1613
1614 >>> update_parameter(parameter='parname_0', value=10.0, min=5.0, vary=True)
1615 >>> update_parameter(parameter=2, value=0.15, bounds=(0.10, 0.20))
1616 """
1617
1618 if type(parameter) == int:
1619 parameter = self.parameters[self._par_names[parameter]]
1620 elif type(parameter) == str:
1621 parameter = self.parameters[parameter]
1622
1623
1624 if 'bounds' in kwargs:
1625 kwargs['min'] = kwargs['bounds'][0]
1626 kwargs['max'] = kwargs['bounds'][1]
1627
1628 for key in ['value', 'min', 'max', 'vary', 'expr']:
1629 if key in kwargs:
1630 setattr(parameter, key, kwargs[key])
1631
1632 self.push_parameters()
1633
1634 - def get_parameters(self, parameters=None, error='stderr', full_output=False):
1635 """
1636 Returns the parameter values together with the errors if they exist. If No
1637 fitting has been done, or the errors could not be calculated, None is returned
1638 for the error.
1639
1640 The parameters of which the settings should be returned can be given in
1641 I{parameters}. If None is given, all parameters are returned.
1642
1643 @param parameters: Parameters to be returned or None for all parameters.
1644 @type parameters: array
1645 @param error: Which error to return ('stderr', 'mcerr')
1646 @type error: string
1647 @param full_output: When True, also vary, the boundaries and expression are returned
1648 @type full_output: bool
1649
1650 @return: the parameter values and there errors: value, err, [vary, min, max, expr]
1651 @rtype: numpy arrays
1652 """
1653 if type(parameters) == str:
1654 parameters = [parameters]
1655 pnames = parameters if parameters != None else self.parameters.keys()
1656
1657 out = []
1658 for name in pnames:
1659 par = self.parameters[name]
1660 out.append([par.value,getattr(par, error), par.vary, par.min,
1661 par.max, par.expr])
1662 out = np.array(out)
1663
1664 if full_output:
1665 return out.T
1666 else:
1667 return out[:,0], out[:,1]
1668
1669
1670
1671
1672
1674 """
1675 Converts the parameter object of this model to an easy printable string, including
1676 the value, error, boundaries, if the parameter is varied, and if in the fitting process
1677 on of the boundaries was reached.
1678
1679 The error to be printed can be set with the error keyword. You can chose from the
1680 standard error: 'stderr', the monte carlo error: 'mcerr', or any of the confidence
1681 intervalls that you have calculated by coding them like: 'ci###'. Fx: 95% (sigma = 0.95)
1682 use 'ci95', for 99.7% (sigma = 0.997) use 'ci997'. Don't put decimal signs in the ci!
1683
1684 The accuracy with which the parameters are printed can be set with the accuracy keyword.
1685 And the amount of information that is printed is determined by full_output. If False,
1686 only parameter value and error are printed, if True also boundaries and vary are shown.
1687
1688 @param accuracy: Number of decimal places to print
1689 @type accuracy: int
1690 @param error: Which error type to print ('stderr', 'mcerr' or 'ci###')
1691 @type error: string
1692 @param full_output: Amount of information to print
1693 @type full_output: bool
1694
1695 @return: parameters in string format
1696 @rtype: string
1697 """
1698 return parameters2string(self.parameters, **kwargs)

    def correl2str(self, **kwargs):
        """
        Converts the correlations between the different parameters of this model to an
        easily printable string. The correlations are only printed if they are larger
        than a given limit, and the accuracy keyword sets the number of decimals
        to print.

        @param accuracy: number of decimal places to print
        @param limit: minimum correlation value to print

        @return: correlation in string format
        @rtype: string
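
        Example use (a minimal sketch, assuming a fitted model C{mymodel}):

        >>> print mymodel.correl2str(accuracy=2, limit=0.5)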
1712 """
1713 return correlation2string(self.parameters, **kwargs)
1714
1716 """
1717 Convert the confidence intervals of the parameters of this model to an easy
1718 printable string.
1719
1720 The accuracy with wich the CIs should be printed can be set with the accuracy
1721 keyword.
1722
1723 @param accuracy: Number of decimal places to print
1724 @type accuracy: int
1725
1726 @return: confidence intervals in string format
1727 @rtype: string
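
        Example use (a minimal sketch, assuming confidence intervals were already
        calculated for the fitted model C{mymodel}):

        >>> print mymodel.ci2str(accuracy=4)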
1728 """
1729 return confidence2string(self.parameters, **kwargs)


    @property
    def par_names(self):
        'get par_names'
        return [val for sublist in self._par_names for val in sublist]


1750 """
1751 Pulls the parameter objects from the underlying functions, and combines it to 1 parameter object.
1752 """
1753 functions = self.functions
1754 parameters = []
1755 for func in functions:
1756 parameters.append(func.parameters)
1757
1758
1759 new_params = lmfit.Parameters()
1760 pnames = []
1761 for i, params in enumerate(parameters):
1762 pname = []
1763 for n,par in params.items():
1764 pname.append(n+'_%i'%(i))
1765 new_params.add(n+'_%i'%(i), value=par.value, vary=par.vary, min=par.min, max=par.max, expr=par.expr)
1766 pnames.append(pname)
1767
1768 self._par_names = pnames
1769 self.parameters = new_params

    def push_parameters(self, parameters=None):
        """
        Pushes the parameters in the combined parameter object to the parameter
        objects of the underlying models or functions.
        """
        if parameters is None:
            parameters = self.parameters
        for pnames, function in zip(self._par_names, self.functions):
            old_parameters = function.parameters
            for name in pnames:
                # strip the trailing '_<index>' to recover the original name
                old_name = re.sub('_[0123456789]*$', '', name)
                old_parameters[old_name] = parameters[name]

    def __str__(self):
        """ String representation of the Model object """
        fnames = ", ".join([f.function.__name__ for f in self.functions])
        expr = self.expr if self.expr is not None else "Sum()"
        name = "<Model with functions: [{:s}] combined by: {:s}>"
        return name.format(fnames, expr)


class Minimizer(object):
    """
    A minimizer class built on the U{lmfit <http://cars9.uchicago.edu/software/python/lmfit/index.html>}
    Python package, which provides a simple, flexible interface to non-linear
    least-squares optimization, or curve fitting. The package is built around the
    Levenberg-Marquardt algorithm of scipy, but 2 other minimizers are partly
    supported as well: limited-memory Broyden-Fletcher-Goldfarb-Shanno and simulated
    annealing. For the Levenberg-Marquardt algorithm, the estimated uncertainties and
    correlations between fitted variables are calculated as well.

    This minimizer finds the best parameters to fit a given L{Model} or L{Function} to
    a set of data. You only need to provide the Model and data. Weighted fitting is
    supported.

    This minimizer uses the basic residual function::

        residuals = ( data - model(x) ) * weights

    If a more advanced residual function is required, fx when working with
    multi-dimensional data, the user can specify their own residual function in the
    provided Function or Model, or by setting the I{resfunc} keyword. This residual
    function needs to have the following calling signature::

        resfunc(synth, data, weights=None, errors=None, **kwargs)

    Functions
    =========

    A L{Function} is basically a function together with a list of the parameters that
    it needs. In the internal representation the parameters are represented as a
    Parameters object. However, the user does not need to know how to handle this, and
    can just provide or retrieve the parameter values using arrays. Every fitting
    parameter is an extension of a simple numerical variable with the following
    properties:

        - Parameters can be fixed or floated in the fit.
        - Parameters can be bounded with a minimum and/or maximum value.
        - Parameters can be written as simple mathematical expressions of other
          Parameters, using the U{asteval module <http://newville.github.com/asteval/>}.
          These values will be re-evaluated at each step in the fit, so that the
          expression is satisfied. This gives a simple but flexible approach to
          constraining fit variables.

    Models
    ======

    A L{Model} is a combination of Functions with its own parameter set. The Functions
    are provided as a list, and a gamma function can be given to describe how the
    functions are combined. Methods to handle the parameters in a model are provided,
    but the user is recommended to handle the parameters at Function level, as the
    naming of the parameters changes at Model level.
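
    Example use (a minimal sketch; the L{minimize} function defined below wraps this
    class and is the recommended entry point):

    >>> mini = Minimizer(x, y, mymodel, weights=1./err, engine='leastsq')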
1844 """
1845
1846 - def __init__(self, x, y, model, errors=None, weights=None, resfunc=None,
1847 engine='leastsq', args=None, kws=None, grid_points=1, grid_params=None,
1848 verbose=False, **kwargs):
1849
1850 self.x = x
1851 self.y = y
1852 self.model = model
1853 self.errors = errors
1854 self.weights = weights
1855 self.model_kws = kws
1856 self.fit_kws = kwargs
1857 self.resfunc = model.resfunc
1858 self.engine = engine
1859 self._minimizers = [None]
1860
1861 if weights == None:
1862 self.weights = np.ones(len(y))
1863
1864 if resfunc != None:
1865 self.resfunc = resfunc
1866
1867 params = model.parameters
1868
1869
1870 self._setup_residual_function()
1871 self._setup_jacobian_function()
1872 fcn_args = (self.x, self.y)
1873 fcn_kws = dict(weights=self.weights, errors=self.errors)
1874 if self.model_kws != None:
1875 fcn_kws.update(self.model_kws)
1876
1877
1878 self._prepare_minimizer(fcn_args, fcn_kws, grid_points, grid_params)
1879
1880
1881 self._start_minimize(engine, verbose=verbose, Dfun=self.jacobian)
1882
1883
1884
    def calculate_CI(self, parameters=None, sigmas=[0.654, 0.95, 0.997], maxiter=200,
                     **kwargs):
        """
        Returns the confidence intervals of the given parameters. This function uses
        the F-test method described below to calculate confidence intervals. The
        I{sigmas} parameter describes which confidence levels are required, as
        fractions: sigma=0.654 corresponds with the standard 1 sigma level, 0.95 with
        2 sigma and 0.997 with 3 sigma.

        The output is a dictionary containing, for each parameter, the lower and upper
        boundary of the requested confidence level. If short_output is True, an array
        of tuples is returned instead. When only one parameter is given, and
        short_output is True, only a tuple of the lower and upper boundary is returned.

        The confidence intervals calculated with this function are stored in the
        Model or Function as well.

        F-test
        ======
        The F-test is used to compare the null model, which is the best fit
        found by the minimizer, with an alternate model, where one of the
        parameters is fixed to a specific value. The value is changed until the
        difference between chi2_0 and chi2_f can't be explained by the loss of a
        degree of freedom with a certain confidence.

        M{F = (chi2_f / chi2_0 - 1) * (N-P)/P_fix}

        N is the number of data points, P the number of parameters of the null model.
        P_fix is the number of fixed parameters (or, to be more precise, the difference
        in the number of parameters between the null model and the alternate model).

        This method relies completely on the I{conf_interval} method of the lmfit
        package.

        @param parameters: Names of the parameters to calculate the CIs from (if None,
                           all parameters are used)
        @type parameters: array of strings
        @param sigmas: The probability levels used to calculate the CI
        @type sigmas: array or float

        @return: the confidence intervals.
        @rtype: dict
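
        Example use (a minimal sketch, assuming a fitted C{result} returned by
        L{minimize}):

        >>> ci = result.calculate_CI(parameters=['p', 'e'], sigmas=[0.95])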
1927 """
1928
1929 prob_func = kwargs.pop('prob_func', None)
1930
1931 if parameters == None:
1932 parameters = self.model.par_names
1933 elif type(parameters) == str:
1934 parameters = [parameters]
1935
1936 if not hasattr(sigmas, "__iter__"):
1937 sigmas = [sigmas]
1938 if 'sigma' in kwargs:
1939 sigmas = [kwargs['sigma']]
1940 print 'WARNING: sigma is depricated, use sigmas'
1941
1942
1943
1944
1945
1946 mini = copy.copy(self.minimizer)
1947 backup = copy.deepcopy(self.model.parameters)
1948 ci = lmfit.conf_interval(mini, p_names=parameters, sigmas=sigmas,
1949 maxiter=maxiter, prob_func=prob_func, trace=False,
1950 verbose=False)
1951 self.model.parameters = backup
1952
1953
1954 for key, value in ci.items():
1955 self.model.parameters[key].cierr.update(value)
1956
1957 return ci
1958
    def calculate_CI_2D(self, xpar=None, ypar=None, res=10, limits=None, ctype='prob'):
        """
        Calculates the confidence interval for 2 given parameters. Both the confidence
        interval calculated using the F-test method from the I{estimate_error} method
        and the normal chi squares can be obtained using the I{ctype} keyword.

        The confidence interval is returned as a grid, together with the x and y
        distributions of the parameters: (x-values, y-values, grid).

        @param xpar: The parameter on the x axis
        @param ypar: The parameter on the y axis
        @param res: The resolution of the grid over which the confidence interval
                    is calculated
        @param limits: The upper and lower limits on the parameters for which the
                       confidence interval is calculated. If None, 5 times the stderr
                       is used.
        @param ctype: 'prob' for probability plot (using the F-test), 'chi2' for
                      chi-squares.

        @return: the x values, y values and confidence values
        @rtype: (array, array, 2d array)
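
        Example use (a minimal sketch, assuming a fitted C{result} returned by
        L{minimize}, with parameters named 'p' and 'e'):

        >>> x, y, grid = result.calculate_CI_2D(xpar='p', ypar='e', res=20)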
1977 """
1978
1979 xn = hasattr(res,'__iter__') and res[0] or res
1980 yn = hasattr(res,'__iter__') and res[1] or res
1981
1982 prob_func = None
1983 if ctype == 'chi2':
1984 def prob_func(Ndata, Nparas, new_chi, best_chi, nfix=1.):
1985 return new_chi
1986 old = np.seterr(divide='ignore')
1987 x, y, grid = lmfit.conf_interval2d(self.minimizer, xpar, ypar, xn, yn, limits=limits,
1988 prob_func=prob_func)
1989 np.seterr(divide=old['divide'])
1990
1991 if ctype=='prob':
1992 grid *= 100.
1993
1994 return x, y, grid
1995
    def calculate_MC_error(self, points=100, errors=None, distribution='gauss',
                           short_output=True, verbose=True, **kwargs):
        """
        Use Monte Carlo simulations to estimate the error of each parameter. In this
        approach each datapoint is perturbed by its error, and for each new dataset
        the best fitting parameters are calculated. The MC error of a parameter is its
        standard deviation over all iterations (points).

        The errors supplied to this function will overwrite the errors already stored
        in this Minimizer. If no errors are supplied, the already stored ones will be
        used. For now only symmetric errors are supported.

        Currently all datapoints are considered to have a symmetric gaussian
        distribution, but in future versions more distributions will be supported.

        The MC errors are saved in the Model or Function supplied to this fitter, and
        can be returned as an array (short_output=True), or as a dictionary
        (short_output=False).

        @param points: The number of iterations
        @type points: int
        @param errors: Possible new errors on the input data.
        @type errors: array or float
        @param distribution: Not yet supported
        @type distribution: str
        @param short_output: True if you want an array, False if you want a dictionary
        @type short_output: bool

        @return: The MC errors of all parameters.
        @rtype: array or dict
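
        Example use (a minimal sketch, assuming a fitted C{result} returned by
        L{minimize} and an error array C{err}):

        >>> mcerrors = result.calculate_MC_error(points=100, errors=err)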
2026 """
2027 if errors != None:
2028 self.errors = errors
2029
2030 perturb_args = dict(distribution=distribution)
2031 perturb_args.update(kwargs)
2032 params = np.empty(shape=(points), dtype=object)
2033
2034
2035 y_perturbed = self._perturb_input_data(points, **perturb_args)
2036
2037 if verbose: print "MC simulations ({:.0f} points):".format(points)
2038 if verbose: Pmeter = progress.ProgressMeter(total=points)
2039 for i, y_ in enumerate(y_perturbed):
2040 if verbose: Pmeter.update(1)
2041
2042
2043 pars = copy.deepcopy(self.model.parameters)
2044 fcn_args = (self.x, y_)
2045 fcn_kws = dict(weights=self.weights, errors=self.errors)
2046 if self.model_kws != None:
2047 fcn_kws.update(self.model_kws)
2048 result = lmfit.Minimizer(self.residuals, pars, fcn_args=fcn_args,
2049 fcn_kws=fcn_kws, **self.fit_kws)
2050
2051
2052 result.start_minimize(self.engine, Dfun=self.jacobian)
2053 params[i] = pars
2054
2055 pnames, mcerrors = self._mc_error_from_parameters(params)
2056
2057 if short_output:
2058 return mcerrors
2059 else:
2060 out = dict()
2061 for name, err in zip(pnames, mcerrors):
2062 out[name] = err
2063 return out
2064
2065
2066
2067
2068
    def plot_fit(self, points=1000, axis=0, legend=False, **kwargs):
        """
        Plot the original data together with the best fit results. This method has
        some functionality to plot multi-dimensional data, but is limited to 2D data,
        which it will plot consecutively along the specified axis.

        @param points: Number of points to use when plotting the best fit model.
        @type points: int
        @param axis: In case of multi-dim input, along which axis to split (0 or 1)
        @type axis: int
        @param legend: Display legend on plot
        @type legend: bool
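
        Example use (a minimal sketch, assuming a fitted C{result}):

        >>> p = pl.figure()
        >>> result.plot_fit(points=500, legend=True)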
2081 """
2082
2083
2084 res = np.atleast_2d(self.y - self.model.evaluate(self.x))
2085 x, y = np.atleast_2d(self.x), np.atleast_2d(self.y)
2086 err = np.atleast_2d(self.errors) if self.errors != None else np.zeros_like(self.x)
2087
2088
2089 if axis == 1: x, y, res, err = x.T, y.T, res.T, err.T
2090
2091
2092 xf = np.empty((len(x), points), dtype=float)
2093 for i, x_ in enumerate(x):
2094 xf[i] = np.linspace(np.min(x_), np.max(x_), points)
2095
2096
2097 yf = np.atleast_2d(self.model.evaluate(np.squeeze(xf)))
2098
2099
2100 cmap = cm = pl.get_cmap('spectral')
2101 norm = mpl.colors.Normalize(vmin=-len(x)-1, vmax=len(x)+1)
2102 color = mpl.cm.ScalarMappable(norm=norm, cmap=cmap)
2103
2104
2105 for i, (x_, y_, e_) in enumerate(zip(x, y, err)):
2106 pl.errorbar(x_, y_, yerr=e_, marker='+', ms=5, ls='', color=color.to_rgba(-i-1),
2107 label='data %i'%(i+1))
2108 for i, (xf_, yf_) in enumerate(zip(xf, yf)):
2109 pl.plot(xf_, yf_, ls='-', color=color.to_rgba(i+1), label='fit %i'%(i+1))
2110
2111 if legend:
2112 pl.legend()
2113
2115 """
2116 Plot the residuals of the best fit. This method has some functionality to plot
2117 multi-dimensional data, but is limited to 2D data which it will plot
2118 consecutively allong the specified axis.
2119
2120 @param axis: In case of multi-dim input along which axis to split (0 or 1)
2121 @type axis: int
2122 @param legend: Display legend on plot
2123 @type legend: bool
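
        Example use (a minimal sketch, assuming a fitted C{result}):

        >>> result.plot_residuals()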
2124 """
2125
2126
2127 res = np.atleast_2d(self.y - self.model.evaluate(self.x))
2128 x = np.atleast_2d(self.x)
2129 err = np.atleast_2d(self.errors) if self.errors != None else np.zeros_like(self.x)
2130 if axis == 1: x, res, err = x.T, res.T, err.T
2131
2132
2133 cmap = cm = pl.get_cmap('spectral')
2134 norm = mpl.colors.Normalize(vmin=-len(x)-1, vmax=len(x)+1)
2135 color = mpl.cm.ScalarMappable(norm=norm, cmap=cmap)
2136
2137
2138 for i, (x_, res_, e_) in enumerate(zip(x,res, err)):
2139 pl.errorbar(x_, res_, yerr=e_, marker='+', ms=5, ls='', color=color.to_rgba(-i-1),
2140 label='data %i'%(i+1))
2141 pl.axhline(y=0, ls=':', color='r')
2142
2143 if legend:
2144 pl.legend()
2145
    def plot_results(self, points=1000, axis=0, legend=False, **kwargs):
        """
        Creates a basic plot with the fit results and corresponding residuals. This
        method has some functionality to plot multi-dimensional data, but is so far
        limited to 2D data, which it will plot consecutively along the specified axis.
        Based on the plot_fit and plot_residuals functions.

        @param points: Number of points to use when plotting the best fit model.
        @type points: int
        @param axis: In case of multi-dim input, along which axis to split (0 or 1)
        @type axis: int
        @param legend: Display legend on plot
        @type legend: bool
        @param xlabel: label for the x-axis
        @type xlabel: str
        @param ylabel: label for the y-axis
        @type ylabel: str
        @param title: title of the plot
        @type title: str
        """

        xlabel = kwargs.pop('xlabel', '$X$')
        ylabel = kwargs.pop('ylabel', '$Y$')
        title = kwargs.pop('title', 'Fit Results')

        pl.subplots_adjust(wspace=0.0, hspace=0.0)
        ax = pl.subplot2grid((3,4), (0,0), rowspan=2, colspan=4)
        self.plot_fit(points=points, axis=axis, legend=legend, **kwargs)
        xlim, ylim = pl.xlim(), pl.ylim()
        x, y = xlim[1] - 0.02 * ( xlim[1]-xlim[0] ), ylim[0] + 0.05 * ( ylim[1]-ylim[0] )
        pl.text(x, y, r'$\chi^2 =$ {:0.2f}'.format(self.minimizer.chisqr), ha='right')
        pl.ylabel(ylabel)
        pl.title(title)
        for tick in ax.axes.get_xticklabels():
            tick.set_visible(False)
            tick.set_fontsize(0.0)

        ax = pl.subplot2grid((3,4), (2,0), colspan=4)
        self.plot_residuals(axis=axis, legend=False, **kwargs)
        pl.ylabel('$O-C$')
        pl.xlabel(xlabel)

    def plot_confidence_interval(self, xpar=None, ypar=None, res=10, limits=None,
                                 ctype='prob', filled=True, **kwargs):
        """
        Plot the confidence interval for 2 given parameters. Both the confidence
        interval calculated using the F-test method from the I{estimate_error} method
        and the normal chi squares can be plotted using the I{ctype} keyword. In case
        of chi2, the log10 of the chi squares is plotted to improve the clarity of the
        plot.

        Extra kwargs are passed to C{contourf} or C{contour}.

        @param xpar: The parameter on the x axis
        @param ypar: The parameter on the y axis
        @param res: The resolution of the grid over which the confidence interval
                    is calculated
        @param limits: The upper and lower limits on the parameters for which the
                       confidence interval is calculated. If None, 5 times the stderr
                       is used.
        @param ctype: 'prob' for probability plot (using the F-test), 'chi2' for
                      chi-square plot.
        @param filled: True for filled contour plot, False for normal contour plot
        """

        x, y, grid = self.calculate_CI_2D(xpar=xpar, ypar=ypar, res=res,
                                          limits=limits, ctype=ctype)

        if ctype == 'prob':
            levels = np.linspace(0, 100, 25)
            ticks = [0, 20, 40, 60, 80, 100]
            def func(x, pos=None):
                return "%0.0f"%(x)
            fmtr = mpl.ticker.FuncFormatter(func)
        elif ctype == 'chi2':
            grid = np.log10(grid)
            levels = np.linspace(np.min(grid), np.max(grid), 25)
            ticks = np.round(np.linspace(np.min(grid), np.max(grid), 6), 2)
            def func(x, pos=None):
                return "%0.1f"%(10**x)
            fmtr = mpl.ticker.FuncFormatter(func)

        if filled:
            pl.contourf(x, y, grid, levels, **kwargs)
            cbar = pl.colorbar(fraction=0.08, format=fmtr, ticks=ticks)
            cbar.set_label(ctype == 'prob' and r'Probability' or r'$\chi^2$')
        else:
            cs = pl.contour(x, y, grid, np.linspace(0,100,11), **kwargs)
            cs = pl.contour(x, y, grid, [20,40,60,80,95], **kwargs)
            pl.clabel(cs, inline=1, fontsize=10)
        pl.plot(self.params[xpar].value, self.params[ypar].value, '+r', ms=10, mew=2)
        pl.xlabel(xpar)
        pl.ylabel(ypar)

    def plot_grid_convergence(self, xpar=None, ypar=None, chi2lim=None, chi2scale='log',
                              show_colorbar=True, **kwargs):
        """
        Plot the convergence paths of the results from grid_minimize stored in this
        minimizer. The starting values of the two selected parameters are plotted
        connected to their final best fit values, using a color coding for the
        chisqr value of the fit result.

        @param xpar: The parameter on the x axis
        @param ypar: The parameter on the y axis
        @param chi2lim: Optional limit on the chi2 value (in % of the maximum chi2)
        @param chi2scale: Scale for the chi2 values: 'log' or 'linear'
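
        Example use (a minimal sketch, assuming a C{result} returned by
        L{grid_minimize} with return_all=False):

        >>> result.plot_grid_convergence(xpar='p', ypar='e')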
2253 """
2254
2255
2256 minis, models, chisqrs = self.grid
2257
2258 if chi2lim != None:
2259 selected = np.where(chisqrs <= chi2lim*max(chisqrs))
2260 models = models[selected]
2261 chisqrs = np.abs(chisqrs[selected])
2262 models, chisqrs = models[::-1], chisqrs[::-1]
2263 if chi2scale == 'log':
2264 chisqrs = np.log10(chisqrs)
2265
2266
2267 x1 = np.empty_like(chisqrs)
2268 y1 = np.empty_like(chisqrs)
2269 x2 = np.empty_like(chisqrs)
2270 y2 = np.empty_like(chisqrs)
2271 for i, mod in enumerate(models):
2272 x1[i] = mod.parameters[xpar].user_value
2273 y1[i] = mod.parameters[ypar].user_value
2274 x2[i] = mod.parameters[xpar].value
2275 y2[i] = mod.parameters[ypar].value
2276
2277
2278 jet = cm = pl.get_cmap('jet')
2279 cNorm = mpl.colors.Normalize(vmin=min(chisqrs), vmax=max(chisqrs))
2280 scalarMap = mpl.cm.ScalarMappable(norm=cNorm, cmap=jet)
2281
2282
2283 ax=pl.gca()
2284 for x1_, y1_, x2_, y2_, chi2 in zip(x1,y1,x2,y2, chisqrs):
2285 colorVal = scalarMap.to_rgba(chi2)
2286 ax.add_patch(mpl.patches.FancyArrowPatch((x1_,y1_),(x2_,y2_),
2287 arrowstyle='->',mutation_scale=30, color=colorVal))
2288 scatter = pl.scatter(x2, y2, s=30, c=chisqrs, cmap=mpl.cm.jet, norm=cNorm,
2289 edgecolors=None, lw=0)
2290 pl.plot(x2[-1],y2[-1], '+r', ms=12, mew=3)
2291
2292
2293 if show_colorbar:
2294 if chi2scale == 'log':
2295 def func(x, pos=None):
2296 return "%0.0f"%(10**x)
2297 fmtr = mpl.ticker.FuncFormatter(func)
2298 else:
2299 fmtr = None
2300 pl.colorbar(scatter, fraction=0.08, format=fmtr)
2301
2302 pl.xlim(min([min(x1),min(x2)]), max([max(x1),max(x2)]))
2303 pl.ylim(min([min(y1),min(y2)]), max([max(y1),max(y2)]))
2304 pl.xlabel(xpar)
2305 pl.ylabel(ypar)
2306
2307
2308
2309
    @property
    def minimizer(self):
        'get minimizer'
        return self._minimizers[0]

    @minimizer.setter
    def minimizer(self, val):
        'set minimizer'
        self._minimizers[0] = val

    @property
    def errors(self):
        'get error'
        return self._error

    @errors.setter
    def errors(self, val):
        'set error'
        if val is None:
            self._error = None
        elif np.shape(val) == ():
            # a single error value is broadcast over all data points
            self._error = np.ones_like(self.x) * val
        elif np.shape(val) == np.shape(self.x):
            self._error = val
        else:
            # fall back to the first value when the shapes don't match
            self._error = np.ones_like(self.x) * val[0]

    @property
    def grid(self):
        'get minimizer grid'
        models = np.empty(len(self._minimizers), dtype=Model)
        chisqrs = np.empty(len(self._minimizers), dtype=float)
        for i, mini in enumerate(self._minimizers):
            models[i] = copy.deepcopy(self.model)
            models[i].parameters = mini.params
            chisqrs[i] = mini.chisqr
        return self._minimizers, models, chisqrs

2353 "allow to reach the attributes of the best fitting minimizer directly"
2354 if hasattr(self.minimizer, name):
2355 return getattr(self.minimizer, name)
2356 else:
2357 raise AttributeError
2358
2360 "Internal function to setup the residual function for the minimizer."
2361 if self.resfunc != None:
2362 def residuals(params, x, y, weights=None, errors=None, **kwargs):
2363 synth = self.model.evaluate(x, params, **kwargs)
2364 return self.resfunc(synth, y, weights=weights, errors=errors, **kwargs)
2365 else:
2366 def residuals(params, x, y, weights=None, errors=None, **kwargs):
2367 return ( y - self.model.evaluate(x, params, **kwargs) ) * weights
2368
2369 self.residuals = residuals
2370
2372 "Internal function to setup the jacobian function for the minimizer."
2373 if self.model.jacobian != None:
2374 def jacobian(params, x, y, weights=None, errors=None, **kwargs):
2375 return self.model.evaluate_jacobian(x, params, **kwargs)
2376 self.jacobian = jacobian
2377 else:
2378 self.jacobian = None
2379
    def _prepare_minimizer(self, fcn_args, fcn_kws, grid_points=1, grid_params=None,
                           append=False):
        "Internal function to prepare the minimizer"

        params = self.model.parameters
        grid_params = params.can_kick(pnames=grid_params)
        minimizers = np.empty(grid_points, dtype=Minimizer)

        if grid_points == 1 or len(grid_params) == 0:
            # a single fit, started from the provided parameters
            minimizers[0] = lmfit.Minimizer(self.residuals, params, fcn_args=fcn_args,
                                            fcn_kws=fcn_kws, **self.fit_kws)
        else:
            # a grid of fits, each started from randomly kicked starting values
            for i in range(grid_points):
                params_ = copy.deepcopy(params)
                params_.kick(pnames=grid_params)
                minimizers[i] = lmfit.Minimizer(self.residuals, params_, fcn_args=fcn_args,
                                                fcn_kws=fcn_kws, **self.fit_kws)
        if append:
            self._minimizers.append(minimizers)
        else:
            self._minimizers = minimizers

2405 "Internal function that starts all minimizers one by one"
2406
2407 if len(self._minimizers) <= 1: verbose = False
2408 if verbose: print "Grid Minimizer ({:.0f} points):".format(len(self._minimizers))
2409 if verbose: Pmeter = progress.ProgressMeter(total=len(self._minimizers))
2410
2411
2412 chisqrs = np.empty_like(self._minimizers, dtype=float)
2413 for i, mini in enumerate(self._minimizers):
2414 if verbose: Pmeter.update(1)
2415 mini.start_minimize(engine, **kwargs)
2416 chisqrs[i] = mini.chisqr
2417
2418
2419 inds = chisqrs.argsort()
2420 self._minimizers = self._minimizers[inds]
2421 self.model.parameters = self._minimizers[0].params
2422
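    def _perturb_input_data(self, points, distribution='gauss', **kwargs):
        """
        Internal function that creates a perturbed version of the input data for
        each MC iteration.

        The original body of this method is not available in this copy of the
        source; the implementation below is a minimal reconstruction sketch based
        only on the behaviour described in L{calculate_MC_error} (each datapoint
        is perturbed by a gaussian deviate scaled with its error). It is not the
        author's original code.
        """
        y_perturbed = []
        for i in range(points):
            # draw one gaussian perturbation per datapoint, scaled by its error
            y_perturbed.append(self.y + np.random.normal(size=len(self.y)) * self.errors)
        return y_perturbed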

    def _mc_error_from_parameters(self, params):
        " Use the standard deviation to get the error on a parameter "
        # collect the values of all parameters for each MC iteration
        pnames = params[0].keys()
        errors = np.zeros((len(params), len(pnames)))
        for i, pars in enumerate(params):
            errors[i] = np.array(pars.value)
        errors = np.std(errors, axis=0)

        # save the MC errors in the parameter object
        params = self.model.parameters
        for name, error in zip(pnames, errors):
            params[name].mcerr = error

        return pnames, errors

2455 " String representation of the Minimizer object "
2456 out = []
2457 out.append( ('Model', str(self.model)) )
2458 out.append( ('Data shape', str(np.array(self.x).shape)) )
2459 out.append( ('Errors', 'Provided' if self._error != None else 'Not Provided') )
2460 out.append( ('Weights', 'Provided' if self.weights != None else 'Not Provided') )
2461 out.append( ('Residuals', 'Standard' if self.resfunc == None else 'Custom') )
2462 out.append( ('Jacobian', 'Provided' if self.jacobian != None else 'Not Provided') )
2463 out.append( ('Engine', str(self.engine)) )
2464 out.append( ("Grid points", "{:.0f}".format(len(self._minimizers))) )
2465 temp = "{:<12s}: {:s}\n"
2466 outstr = "<Minimizer object>\n"
2467 for s in out:
2468 outstr += temp.format(*s)
2469 return outstr.rstrip()
2470
2471
2472
def minimize(x, y, model, errors=None, weights=None, resfunc=None, engine='leastsq',
             args=None, kws=None, scale_covar=True, iter_cb=None, verbose=True, **fit_kws):
    """
    Basic minimizer function using the L{Minimizer} class: finds values for the
    parameters so that the sum-of-squares of M{(y-model(x))} is minimized. When the
    fitting process is completed, the parameters of the L{Model} are updated with the
    results. If the I{leastsq} engine is used, estimated uncertainties and correlations
    will be saved to the L{Model} as well. Returns a I{Minimizer} object.

    Fitting engines
    ===============
    By default, the Levenberg-Marquardt algorithm is used for fitting. While often
    criticized, notably because it only finds a local minimum, this approach has some
    distinct advantages. It is fast, well-behaved for most curve-fitting needs, and
    makes it easy to estimate uncertainties for, and correlations between, pairs of
    fit variables. Alternative fitting algorithms are at least partially implemented,
    but not all functions will work with them.

        - leastsq: U{Levenberg-Marquardt
          <http://en.wikipedia.org/wiki/Levenberg-Marquardt_algorithm>},
          U{scipy.optimize.leastsq
          <http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html>}
        - anneal: U{Simulated Annealing <http://en.wikipedia.org/wiki/Simulated_annealing>},
          U{scipy.optimize.anneal
          <http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.anneal.html>}
        - lbfgsb: U{quasi-Newton optimization
          <http://en.wikipedia.org/wiki/Limited-memory_BFGS>},
          U{scipy.optimize.fmin_l_bfgs_b
          <http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_l_bfgs_b.html>}

    @param x: the independent data array (x data)
    @type x: numpy array
    @param y: the dependent data array (y data)
    @type y: numpy array
    @param model: The I{Model} to fit to the data
    @param errors: The errors on the y data, same dimensions as y
    @param weights: The weights given to the different y data
    @param resfunc: A function to calculate the residuals; if not provided, the
                    standard residual function is used.
    @param engine: Which fitting engine to use: 'leastsq', 'anneal', 'lbfgsb'
    @param kws: Extra keyword arguments to be passed to the model
    @param fit_kws: Extra keyword arguments to be passed to the fitter function

    @return: the fitted I{Minimizer} object
    """
    fitter = Minimizer(x, y, model, errors=errors, weights=weights, resfunc=resfunc,
                       engine=engine, args=args, kws=kws, scale_covar=scale_covar,
                       iter_cb=iter_cb, verbose=verbose, **fit_kws)
    if fitter.message and verbose:
        logger.warning(fitter.message)
    return fitter


def grid_minimize(x, y, model, errors=None, weights=None, resfunc=None, engine='leastsq',
                  args=None, kws=None, scale_covar=True, iter_cb=None, points=100,
                  parameters=None, return_all=False, verbose=True, **fit_kws):
    """
    Grid minimizer. Offers the possibility to start minimizing from a grid of starting
    parameters defined by the user. The number of starting points can be specified, as
    well as the parameters that are varied. For each parameter whose starting value
    should be varied, a minimum and maximum value should be provided when setting up
    that parameter. The starting values are chosen randomly in the range [min,max].
    The other arguments are the same as for the normal L{minimize} function.

    If parameters are provided that cannot be kicked (i.e. whose starting value cannot
    be varied), they are removed from the parameters array automatically. If no
    parameters can be kicked, only one minimize is performed, independently of the
    number of points provided. Pay attention to the vary attribute of the parameters:
    even if a parameter has vary = False, it will be kicked by the grid minimizer if
    it appears in parameters. Such a parameter is then fixed at its new starting value.

    @param parameters: The parameters whose starting values are chosen randomly in the
                       fitting process
    @type parameters: array of strings
    @param points: The number of starting points
    @type points: int
    @param return_all: if True, the results of all fits are returned; if False, only
                       the best fit is returned.
    @type return_all: Boolean

    @return: The best minimizer, or all minimizers as [minimizers, newmodels, chisqrs]
    @rtype: Minimizer object or array of [Minimizer, Model, float]
    """

    fitter = Minimizer(x, y, model, errors=errors, weights=weights, resfunc=resfunc,
                       engine=engine, args=args, kws=kws, scale_covar=scale_covar,
                       iter_cb=iter_cb, grid_points=points, grid_params=parameters,
                       verbose=verbose, **fit_kws)
    if fitter.message and verbose:
        logger.warning(fitter.message)

    if return_all:
        return fitter.grid
    else:
        return fitter

2576 """
2577 Calculate the correlation facor rho (Schwarzenberg-Czerny, 2003).
2578
2579 Under white noise assumption, the residus are expected to change sign every
2580 2 observations (rho=1). Longer distances, 2*rho, are a sign of correlation.
2581
2582 The errors are then underestimated by a factor 1/sqrt(rho).
2583
2584 @param residus: residus after the fit
2585 @type residus: numpy array
2586 @param full_output: if True, the groups of data with same sign will be returned
2587 @type full_output: bool
2588 @return: rho(,same sign groups)
2589 @rtype: float(,list)
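
    Example use (a minimal sketch, assuming observed data C{y} and the fitted model
    evaluated as C{yfit}):

    >>> rho = get_correlation_factor(y - yfit)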
2590 """
2591 same_sign_groups = [1]
2592
2593 for i in xrange(1,len(residus)):
2594 if np.sign(residus[i])==np.sign(residus[i-1]):
2595 same_sign_groups[-1] += 1
2596 else:
2597 same_sign_groups.append(0)
2598
2599 rho = np.average(same_sign_groups)
2600
2601 logger.debug("Correlation factor rho = %f, sqrt(rho)=%f"%(rho,np.sqrt(rho)))
2602
2603 if full_output:
2604 return rho, same_sign_groups
2605 else:
2606 return rho
2607
2608
2609
2610
2611
def _calc_length(par, accuracy, field=None):
    """ helper function to calculate the length of the given parameters for parameters2string """
    if field == 'name':
        par = str(par)
    # extra characters needed to print each field (separators, error signs, ...)
    extralen = {'name':1,
                'value':0,
                'user_value':0,
                'init_value':0,
                'stderr':4,
                'mcerr':4,
                'cierr':5,
                'stderrpc':3,
                'mcerrpc':3,
                'cierrpc':3,
                'bounds':14}

    def calculate_length(par):
        try:
            if type(par) == str:
                out = len(par)
            elif par is None or np.isnan(par) or np.isposinf(par):
                out = 3
            elif np.isneginf(par):
                out = 4
            else:
                # digits before the decimal sign, plus sign, decimal sign and accuracy
                old = np.seterr(divide='ignore')
                length = np.floor(np.log10(abs(par))) > 0 and np.floor(np.log10(abs(par))) + 1 or 1
                length = par < 0 and length + 1 or length
                length = length + accuracy + 1
                np.seterr(divide=old['divide'])
                out = int(length)
        except Exception as e:
            logging.warning(
                'Could not calculate length of: %s, type = %s, field = %s\nerror: %s'%(par, type(par), field, e))
            out = 0
        return out

    if type(par) == tuple:
        out = 0
        for p in par:
            out += calculate_length(p)
    else:
        out = calculate_length(par)

    if field is not None and field in extralen:
        return out + extralen[field]
    else:
        return out
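

def _format_field(par, field, maxlen=10, accuracy=2):
    """
    Helper function to format a single field of a parameter as a string for
    parameters2string.

    The original body of this helper is not available in this copy of the source;
    the version below is a minimal reconstruction sketch, based only on how
    parameters2string calls it and on the extra field widths assumed in
    _calc_length (fx a ':' after names and ' +/- ' in front of errors). It is not
    the author's original code.
    """
    value = getattr(par, field)
    if value is None:
        value = np.nan
    fmt = '{{:.{0}f}}'.format(accuracy)
    if field == 'name':
        return str(value) + ':'
    elif field in ('stderr', 'mcerr'):
        return '+/- ' + fmt.format(value)
    elif field in ('stderrpc', 'mcerrpc', 'cierrpc'):
        return '(' + fmt.format(value) + '%)'
    elif isinstance(value, float):
        return fmt.format(value)
    else:
        return str(value)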


def parameters2string(parameters, accuracy=2, error='stderr', output='result', **kwargs):
    """ Converts a parameter object to string """
    out = "Parameters ({:s})\n".format(error)

    # translate the requested output level into a list of fields to print
    if not hasattr(output, '__iter__'):
        if output == 'start':
            output = ['name', 'user_value', 'bounds', 'vary', 'expr']
            out = "Parameters (initial)\n"
        elif output == 'result':
            output = ['name', 'value', error, error + 'pc']
            out = "Parameters ({:s})\n".format(error)
        elif output == 'full':
            output = ['name', 'value', error, error + 'pc', 'bounds', 'vary', 'expr']
            out = "Parameters ({:s})\n".format(error)
        else:
            output = ['name', 'value']
            out = "Parameters \n"

    # calculate the width needed for each field
    maxlen = np.zeros(len(output), dtype=int)
    for i, field in enumerate(output):
        max_value = 0
        for name, par in parameters.items():
            current = _calc_length(getattr(par, field), accuracy, field=field)
            if current > max_value: max_value = current
        maxlen[i] = max_value

    # format all fields to string with the required accuracy
    roundedvals = np.empty(shape=(len(parameters.keys()), len(output)),
                           dtype="|S{:.0f}".format(np.max(maxlen)))
    for i, (name, par) in enumerate(parameters.items()):
        for j, field in enumerate(output):
            roundedvals[i,j] = _format_field(par, field, maxlen[j], accuracy)

    # create the template for one line
    template = ' '
    for max_value in maxlen:
        template += "{{:<{0}s}} ".format(max_value)
    template += '\n'

    # print all parameters
    for line in roundedvals:
        out += template.format(*line)

    return out.rstrip()

def correlation2string(parameters, accuracy=2, limit=0.100):
    """ Converts the correlations between the different parameters to string """
    out = "Correlations (shown if larger than {{:.{0}f}})\n".format(accuracy).format(limit)

    # select all correlations larger than the limit, and sort them from large to small
    cor_name, cor_value = [], []
    for name1, par in parameters.items():
        for name2, corr in par.correl.items():
            n1 = name1 + ' - ' + name2
            n2 = name2 + ' - ' + name1
            if corr >= limit and not n1 in cor_name and not n2 in cor_name:
                cor_name.append(n1)
                cor_value.append(corr)
    cor_name, cor_value = np.array(cor_name, dtype=str), np.array(cor_value, dtype=float)
    ind = cor_value.argsort()
    cor_name, cor_value = cor_name[ind][::-1], cor_value[ind][::-1]

    # calculate the widths needed to print the names and values
    max_name = 0
    max_value = 0
    for name, value in zip(cor_name, cor_value):
        max_name = len(name) if len(name) > max_name else max_name
        current = _calc_length(value, accuracy)
        max_value = current if current > max_value else max_value

    # print the correlations
    template = ' {{name:<{0}s}} = {{value:.{1}f}}\n'.format(max_name, accuracy)
    for name, value in zip(cor_name, cor_value):
        out += template.format(name=name, value=value)

    return out.rstrip()

2768 """ Converts the confidence intervals to string """
2769 out="Confidence Intervals\n"
2770
2771
2772 max_name = 0
2773 max_value = 6
2774 sigmas = []
2775 for name, par in parameters.items():
2776 if len(name) > max_name: max_name = len(name)
2777 for ci in par.cierr.keys():
2778 if not ci in sigmas: sigmas.append(ci)
2779 current = _calc_length(par.cierr[ci][0], accuracy)
2780 if current > max_value: max_value = current
2781 current = _calc_length(par.cierr[ci][1], accuracy)
2782 if current > max_value: max_value = current
2783 sigmas.sort()
2784 sigmas.reverse()
2785
2786
2787 template = "{{:.{0}f}}".format(accuracy)
2788 cis = np.empty(shape=(len(parameters.keys()), 2*len(sigmas)+1),
2789 dtype="|S{:.0f}".format(max_value))
2790 for i, (name, par) in enumerate(parameters.items()):
2791 for j, sigma in enumerate(sigmas):
2792 if sigma in par.cierr.keys():
2793 cis[i,j] = template.format(par.cierr[sigma][0])
2794 cis[i,-j-1] = template.format(par.cierr[sigma][1])
2795 else:
2796 cis[i,j] = '-'*max_value
2797 cis[i,-j-1] = '-'*max_value
2798 for i, (name, par) in enumerate(parameters.items()):
2799 cis[i,len(sigmas)] = template.format(par.value)
2800
2801 template = "{:.2f}"
2802 header = np.empty(shape=2*len(sigmas)+1, dtype="|S{:.0f}".format(max_value))
2803 for i, sigma in enumerate(sigmas):
2804 header[i] = template.format(sigma*100)
2805 header[-i-1] = template.format(sigma*100)
2806 header[len(sigmas)] = template.format(0)
2807
2808
2809 template = "{{:>{0}s}}% ".format(max_value-1) * (2*len(sigmas)+1)
2810 template = " {{:>{0}s}} ".format(max_name) + template +"\n"
2811 out += template.format('', *header)
2812
2813 template = "{{:>{0}s}} ".format(max_value) * (2*len(sigmas)+1)
2814 template = " {{:>{0}s}}: ".format(max_name) + template +"\n"
2815 for name, ci in zip(parameters.keys(), cis):
2816 out += template.format(name, *ci)
2817
2818 return out.rstrip()
2819
def plot_convergence(startpars, models, chi2s, xpar=None, ypar=None, clim=None):
    """
    Plot the convergence paths of the results from grid_minimize.
    """
    # sort the fits from worst to best, so the best fit is plotted on top
    inds = chi2s.argsort()[::-1]
    startpars = startpars[inds]
    models = models[inds]
    chi2s = chi2s[inds]

    # select the fits below the requested chi2 limit
    if clim is not None:
        selected = np.where(chi2s <= clim*max(chi2s))
        startpars = startpars[selected]
        models = models[selected]
        chi2s = chi2s[selected]

    # collect the start and final values of the two parameters
    points = len(startpars)
    x1 = np.zeros(points, dtype=float)
    y1 = np.zeros(points, dtype=float)
    x2 = np.zeros(points, dtype=float)
    y2 = np.zeros(points, dtype=float)
    for i in range(points):
        x1[i] = startpars[i][xpar].value
        y1[i] = startpars[i][ypar].value
        x2[i] = models[i].parameters[xpar].value
        y2[i] = models[i].parameters[ypar].value

    # color code the fits on their chi2 value
    jet = pl.get_cmap('jet')
    cNorm = mpl.colors.Normalize(vmin=0, vmax=max(chi2s))
    scalarMap = mpl.cm.ScalarMappable(norm=cNorm, cmap=jet)

    # draw an arrow from the start to the final value of each fit, and mark the
    # best fit with a red cross
    ax = pl.gca()
    for x1_, y1_, x2_, y2_, chi2 in zip(x1, y1, x2, y2, chi2s):
        colorVal = scalarMap.to_rgba(chi2)
        ax.add_patch(mpl.patches.FancyArrowPatch((x1_, y1_), (x2_, y2_),
                     arrowstyle='->', mutation_scale=30, color=colorVal))
    pl.scatter(x2, y2, s=30, c=chi2s, cmap=mpl.cm.jet, edgecolors=None, lw=0)
    pl.plot(x2[-1], y2[-1], '+r', ms=12, mew=3)
    pl.colorbar(fraction=0.08)
    pl.xlim(min([min(x1), min(x2)]), max([max(x1), max(x2)]))
    pl.ylim(min([min(y1), min(y2)]), max([max(y1), max(y2)]))
    pl.xlabel(xpar)
    pl.ylabel(ypar)


if __name__ == "__main__":
    import doctest
    import pylab as pl
    import sys
    doctest.testmod()
    pl.show()
    sys.exit()

    # The code below is not reached when running the doctests above; it is kept as
    # an extended example of iterative prewhitening with the sine fitting functions.
    from ivs.timeseries import pergrams
    import pylab as pl

    # generate a synthetic signal composed of several sine terms plus noise
    times = np.linspace(0, 150, 5000)
    np.random.seed(10)
    freqs = [1.23, 2.59, 3.89, 4.65]
    freqs += [2*freqs[0], 2*freqs[0]+freqs[2], freqs[3]-freqs[0], freqs[0]+freqs[3]-freqs[2]]
    freqs = np.array(freqs)
    amplitudes = np.sort(np.random.uniform(size=len(freqs), low=1, high=10))[::-1]
    phases = np.random.uniform(size=len(freqs), low=-0.5, high=0.5)
    parins = np.rec.fromarrays([np.zeros(len(freqs)), amplitudes, freqs, phases],
                               names=('const','ampl','freq','phase'))
    signal = evaluate.sine(times, parins)
    signal_ = signal + np.random.normal(size=len(signal), scale=1)

    residus = signal_ + 0.
    frequencies = []

    for i in range(9):
        print "======== STEP %d ======"%(i)

        # find the dominant frequency in the periodogram of the residuals, and
        # refit the data with all frequencies found so far
        pergram = pergrams.scargle(times, residus, threads=2)
        frequency = pergram[0][np.argmax(pergram[1])]
        frequencies.append(frequency)
        parameters = sine(times, signal_, frequencies)
        e_parameters = e_sine(times, signal_, parameters)
        parameters, e_parameters, gain = optimize(times, signal_, parameters, 'sine')
        frequencies = list(parameters['freq'])

        signalf = evaluate.sine(times, parameters)

        # compare with the input parameters when they are available
        if i < len(parins):
            print ''
            for par in ['const', 'ampl', 'freq', 'phase']:
                print par, parins[i][par], parameters[par], e_parameters['e_%s'%(par)]
            print 'S/N', parameters['ampl']/e_parameters['e_ampl']
        else:
            print ''
            for par in ['const', 'ampl', 'freq', 'phase']:
                print par, parameters[par], e_parameters['e_%s'%(par)]
            print 'S/N', parameters['ampl']/e_parameters['e_ampl']

        print ''

        # prewhiten: subtract the current fit from the data
        residus = signal_ - signalf

    # final fit with all the frequencies found
    parameters = sine(times, signal_, frequencies)
    signalf = evaluate.sine(times, parameters)

    pl.subplot(121)
    pl.plot(times, signal_, 'ko')
    pl.plot(times, signal, 'r-')
    pl.plot(times, signalf, 'b-')

    pl.subplot(122)
    pl.plot(*pergram)
    pl.show()