Package ivs :: Package timeseries :: Module freqanalyse

Module freqanalyse

Frequency Analysis Routines.

Here are some examples:

Section 1. Pulsation frequency analysis

Do a frequency analysis of the star HD129929, after Aerts 2004. We reproduce their Fig. 8 here.

Import necessary modules:

>>> from ivs.catalogs import vizier

Read in the data and do an iterative prewhitening frequency analysis:

>>> data,units,comms = vizier.search('J/A+A/415/241/table1')
>>> params = iterative_prewhitening(data.HJD,data.Umag-data.Umag.mean(),f0=6.2,fn=7.2,maxiter=6,          stopcrit=(stopcrit_scargle_snr,4.,6,))
>>> print pl.mlab.rec2txt(params,precision=6)
      const       ampl       freq       phase    e_const     e_ampl     e_freq    e_phase   stopcrit
   0.000310   0.014722   6.461700    0.443450   0.000401   0.001323   0.000006   0.089865   4.950496
   0.000000   0.014866   6.978306   -0.050189   0.000000   0.001224   0.000006   0.082305   5.677022
   0.000000   0.011687   6.449591    0.016647   0.000000   0.001090   0.000007   0.093280   5.747565
   0.000000   0.011546   6.990431   -0.482814   0.000000   0.001001   0.000006   0.086678   6.454955
   0.000000   0.009380   6.590938   -0.382048   0.000000   0.000938   0.000007   0.100016   6.510729
   0.000000   0.007645   6.966174    0.323627   0.000000   0.000863   0.000008   0.112876   5.703801

Plot the results:

>>> p = pl.figure()
>>> p = pl.vlines(params['freq'],0,params['ampl'],color='k',linestyle='-')
>>> p = pl.xlabel('Frequency [c d$^{-1}$]')
>>> p = pl.ylabel('Amplitude [magnitude]')
>>> p = pl.annotate('$\ell=2$',(6.46,0.015),xycoords='data',va='bottom',ha='center',fontsize='large')
>>> p = pl.annotate('$\ell=0$',(6.59,0.010),xycoords='data',va='bottom',ha='center',fontsize='large')
>>> p = pl.annotate('$\ell=1$',(6.99,0.015),xycoords='data',va='bottom',ha='center',fontsize='large')
>>> p = pl.xlim(6.2,7.2)
>>> p = pl.ylim(0,0.018)

Section 2: Line profile variability

We generate some line profiles as Gaussian profiles, with varying average. This as an analogue for purely radial variability.

>>> wave = np.linspace(4950.,5000.,200.)
>>> times = np.linspace(0,1.,100)
>>> A,sigma,mu = 0.5, 5., 4975.
>>> frequency = 1/0.2
>>> wshifts = 10.*np.sin(2*np.pi*frequency*times)
>>> spectra = np.zeros((len(times),len(wave)))
>>> for i,wshift in enumerate(wshifts):
...    spectra[i] = 1.-evaluate.gauss(wave,[A,mu+wshift,sigma])+np.random.normal(size=len(wave),scale=0.01)

Once the line profiles are constructed, we can compute the Fourier transform of these lines:

>>> output = spectrum_2D(times,wave,spectra,f0=frequency/2.,fn=3*frequency,df=0.001,threads=2,method='scargle',subs_av=True,full_output=True)

With this output, we can find retrieve the frequency. We plot the periodograms for each wavelength bin on the left, and on the right the average over all wavelength bins:

>>> p = pl.figure()
>>> p = pl.subplot(121)
>>> p = pl.imshow(output['pergram'][1][::-1],extent=[wave[0],wave[-1],frequency/2,3*frequency],aspect='auto')
>>> p = pl.xlabel('Wavelength (A)')
>>> p = pl.ylabel('Frequency (c/d)')
>>> cbar = pl.colorbar()
>>> cbar.set_label('Amplitude')
>>> p = pl.subplot(122)
>>> p = pl.plot(output['pergram'][1].mean(axis=1),output['pergram'][0],'k-')
>>> p = pl.ylim(frequency/2,3*frequency)
>>> p = pl.xlabel('Amplitude')
>>> p = pl.ylabel('Frequency (c/d)')

]]include figure]]timeseries_freqanalyse_02.png]

We can then fix the frequency and compute all the parameters. However, we now choose not to subtract the average profile, but instead use the GLS periodogram to include the constant

>>> output = spectrum_2D(times,wave,spectra,f0=frequency-0.05,fn=frequency+0.05,df=0.001,threads=2,method='gls',subs_av=False,full_output=True)

>>> p = pl.figure()
>>> p = pl.subplot(221)
>>> p = pl.errorbar(wave,output['pars']['const'],yerr=output['pars']['e_const'],fmt='ko-')
>>> p,q = pl.xlabel('Wavelength (A)'),pl.ylabel('Constant')
>>> p = pl.subplot(222)
>>> p = pl.errorbar(wave,output['pars']['ampl'],yerr=output['pars']['e_ampl'],fmt='ro-')
>>> p,q = pl.xlabel('Wavelength (A)'),pl.ylabel('Amplitude')
>>> p = pl.subplot(223)
>>> p = pl.errorbar(wave,output['pars']['freq'],yerr=output['pars']['e_freq'],fmt='ko-')
>>> p,q = pl.xlabel('Wavelength (A)'),pl.ylabel('Frequency (c/d)')
>>> p = pl.subplot(224)
>>> p = pl.errorbar(wave,output['pars']['phase'],yerr=output['pars']['e_phase'],fmt='ro-')
>>> p,q = pl.xlabel('Wavelength (A)'),pl.ylabel('Phase')

]]include figure]]timeseries_freqanalyse_03.png]

Section 3. Time-frequency analysis

Example usage: we generate some data where the frequency jumps to double its value in the middle of the time series. The first half is thus given by

>>> params = np.rec.fromarrays([[10.],[1.],[0.],[0.]],names=['freq','ampl','const','phase'])
>>> times = np.linspace(0,15,1000)
>>> signal = evaluate.sine(times,params)

And the second half by

>>> params['freq'] = 20.
>>> signal[:len(times)/2] = evaluate.sine(times[:len(times)/2],params)

The time-frequency analysis is calculate via the command

>>> output = time_frequency(times,signal,fn=30.)

And the outcome can be plotted via

>>> p = pl.figure()
>>> p = pl.imshow(output['pergram'][1].T[::-1],aspect='auto',extent=[times[0],times[-1],output['pergram'][0][0],output['pergram'][0][-1]])
>>> p,q = pl.xlabel('Time (d)'),pl.ylabel('Frequency (c/d)')

]]include figure]]timeseries_freqanalyse_04.png]

>>> p = pl.figure()
>>> p = pl.subplot(221)
>>> p = pl.errorbar(output['times'],output['pars']['const'],yerr=output['pars']['e_const'],fmt='ko-')
>>> p,q = pl.xlabel('Time (d)'),pl.ylabel('Constant')
>>> p = pl.subplot(222)
>>> p = pl.errorbar(output['times'],output['pars']['ampl'],yerr=output['pars']['e_ampl'],fmt='ro-')
>>> p,q = pl.xlabel('Time (d)'),pl.ylabel('Amplitude')
>>> p = pl.subplot(223)
>>> p = pl.errorbar(output['times'],output['pars']['freq'],yerr=output['pars']['e_freq'],fmt='ko-')
>>> p,q = pl.xlabel('Time (d)'),pl.ylabel('Frequency (c/d)')
>>> p = pl.subplot(224)
>>> p = pl.errorbar(output['times'],output['pars']['phase'],yerr=output['pars']['e_phase'],fmt='ro-')
>>> p,q = pl.xlabel('Time (d)'),pl.ylabel('Phase')

]]include figure]]timeseries_freqanalyse_05.png]

Author: Pieter Degroote

Functions

[hide private]

(ndarray,ndarray)

autocorrelation(frequencies, power, max_step=1.5, interval=(), threshold=None, method=1)
Compute the autocorrelation.

source code

Convenience functions

record array(, 2x1Darray, 1Darray)

find_frequency(times, signal, method='scargle', model='sine', full_output=False, optimize=0, max_loops=20, scale_region=0.1, scale_df=0.20, model_kwargs=None, correlation_correction=True, prewhiteningorder_snr=False, prewhiteningorder_snr_window=1., **kwargs)
Find one frequency, automatically going to maximum precision and return parameters & error estimates.

source code

rec array(, ndarray)

iterative_prewhitening(times, signal, maxiter=1000, optimize=0, method='scargle', model='sine', full_output=False, stopcrit=None, correlation_correction=True, prewhiteningorder_snr=False, prewhiteningorder_snr_window=1., **kwargs)
Fit one or more functions to a timeseries via iterative prewhitening.

source code

dict

spectrum_2D(x, y, matrix, weights_2d=None, show_progress=False, subs_av=True, full_output=False, **kwargs)
Compute a 2D periodogram.

source code

dict

time_frequency(times, signal, window_width=None, n_windows=100, window='rectangular', detrend=None, **kwargs)
Short Time (Fourier) Transform.

source code

Convenience stop-criteria

stopcrit_scargle_prob(times, signal, modelfunc, allparams, pergram, crit_value)
Stop criterium based on probability.

source code

stopcrit_scargle_snr(times, signal, modelfunc, allparams, pergram, crit_value, width=6.)
Stop criterium based on signal-to-noise ratio.

source code

Variables

[hide private]

logger = logging.getLogger("TS.FREQANAL")

Function Details

[hide private]

find_frequency(times, signal, method='scargle', model='sine', full_output=False, optimize=0, max_loops=20, scale_region=0.1, scale_df=0.20, model_kwargs=None, correlation_correction=True, prewhiteningorder_snr=False, prewhiteningorder_snr_window=1., **kwargs)

source code

Find one frequency, automatically going to maximum precision and return parameters & error estimates.

This routine makes the frequency grid finer until it is finer than the estimated error on the frequency. After that, it will compute (harmonic) parameters and estimate errors.

There is a possibility to escape this optimization by setting scale_df=0 or freqregscale=0.

You can include a nonlinear least square update of the parameters, by setting the keyword optimize=1 (optimization outside loop) or optimize=2 (optimization after each iteration).

The method with which to find the frequency can be set with the keyword method, the model used to fit and optimize should be set with model. Extra keywords for the model functions should go in model_kwargs. If method is a tuple, the first method will be used for the first frequency search only. This could be useful to take advantage of such methods as fasper which do not allow for iterative zoom-ins. By default, the function looks for the highest (or deepest in the case of the pdm method) peak, but instead it is possible to go for the peak having the highest SNR before prewhitening by setting prewhiteningorder_snr to True. In this case, the noise spectrum is calculated using a convolution with a prewhiteningorder_snr_window wide box.

Possible extra keywords: see definition of the used periodogram function.

Warning: the timeseries must be sorted in time and cannot contain the same timepoint twice. Otherwise, a 'ValueError, concatenation problem' can occur.

Example keywords:

'correlation_correction', default=True
'freqregscale', default=0.5: factor for zooming in on frequency
'dfscale', default = 0.25: factor for optimizing frequency resolution

Example usage: We generate a sine signal

>>> times = np.linspace(0,100,1000)
>>> signal = np.sin(2*np.pi*2.5*times) + np.random.normal(size=len(times))

Compute the frequency

>>> parameters, pgram, model = find_frequency(times,signal,full_output=True)

Make a plot:

>>> p = pl.figure()
>>> p = pl.axes([0.1,0.3,0.85,0.65])
>>> p = pl.plot(pgram[0],pgram[1],'k-')
>>> p = pl.xlim(2.2,2.8)
>>> p = pl.ylabel('Amplitude')
>>> p = pl.axes([0.1,0.1,0.85,0.2])
>>> p = pl.plot(pgram[0][:-1],np.diff(pgram[0]),'k-')
>>> p = pl.xlim(2.2,2.8)
>>> p,q = pl.xlabel('Frequency (c/d)'),pl.ylabel('Frequency resolution $\Delta F$')

]]include figure]]timeseries_freqanalyse_06.png]

Returns: record array(, 2x1Darray, 1Darray): parameters and errors(, periodogram, model function)

iterative_prewhitening(times, signal, maxiter=1000, optimize=0, method='scargle', model='sine', full_output=False, stopcrit=None, correlation_correction=True, prewhiteningorder_snr=False, prewhiteningorder_snr_window=1., **kwargs)

source code

Fit one or more functions to a timeseries via iterative prewhitening.

This function will use find_frequency to fit the function parameters to the original signal (including any previously found parameters), optimize the parameters if needed, and remove it from the data to search for a new frequency in the residuals.

It is always the original signal that is used to fit all parameters again; only the (optimized) frequency is remembered from step to step (Vanicek's method).

You best set maxiter to some sensable value, to hard-limit the number of frequencies that will be searched for. You can additionally use a stopcrit and stop looking for frequencies once it is reached. stopcrit should be a tuple; the first argument is the function to call, the other arguments are passed to the function, after the mandatory arguments times,signal, modelfunc,allparams,pergram. See stopcrit_scargle_snr for an example of such a function.

By default, the function looks for the highest (or deepest in the case of the pdm method) peak, but instead it is possible to go for the peak having the highest SNR before prewhitening by setting prewhiteningorder_snr to True. In this case, the noise spectrum is calculated using a convolution with a prewhiteningorder_snr_window wide box. Usage of this is strongly encouraged, especially combined with stopcrit_scargle_snr as stopcrit.

Returns: rec array(, ndarray): parameters, model(, model function)

spectrum_2D(x, y, matrix, weights_2d=None, show_progress=False, subs_av=True, full_output=False, **kwargs)

source code

Compute a 2D periodogram.

Rows (first axis) should be spectra chronologically

x are time points (length N) y are second axis units (e.g. wavelengths) (length NxM)

If the periodogram/wavelength combination has a large range, this script can produce a ValueError or MemoryError. To solve this, you could iterate this function over a subset of wavelength bins yourself, and write the results to a file.

This function also outputs a model, which you can use to subtract from the data (probably together with the average profile, which is also returned) and look further for any frequencies:

   output = spectrum_2D(x,y,matrix,subs_av=True)
   new_matrix = matrix - output['avprof'] - output['model'
   output2 = spectrum_2D(x,y,new_matrix,subs_av=False)

If you want to prewhiten a model, you'd probably want to use the same frequency across the whole line profile. You can hack this by setting f0=frequency and fn=frequency+df with df the size of the frequency step.

Example usage: first we generate some variable line profiles. In this case, this is just a radial velocity variation

>>> times = np.linspace(0,150,100)
>>> wavel = np.r_[4500:4600:1.0]
>>> matrix = np.zeros((len(times),len(wavel)))
>>> for i in xrange(len(times)):
...   central_wave = 5*np.sin(2*np.pi/10*times[i])
...   matrix[i,:] = 1 - 0.5*np.exp( -(wavel-4550-central_wave)**2/10**2)

Once the line profiles are constructed, we can compute the Fourier transform of these lines:

>>> output = spectrum_2D(times,wavel,matrix,method='scargle',model='sine',f0=0.05,fn=0.3,subs_av=True,full_output=True)

With this output, we can find retrieve the frequency. We plot the periodograms for each wavelength bin on the left, and on the right the average over all wavelength bins:

>>> p = pl.figure()
>>> p = pl.subplot(121)
>>> p = pl.imshow(output['pergram'][1][::-1],extent=[wavel[0],wavel[-1],0.05,0.3],aspect='auto')
>>> p = pl.subplot(122)
>>> p = pl.plot(output['pergram'][1].mean(axis=1),output['pergram'][0],'k-')
>>> p = pl.ylim(0.05,0.3)

We can then fix the frequency and compute all the parameters. However, we now choose not to subtract the average profile, but instead use the GLS periodogram to include the constant

>>> output = spectrum_2D(times,wavel,matrix,method='gls',model='sine',f0=0.095,fn=0.105,full_output=True)

>>> p = pl.figure()
>>> p = pl.subplot(221)
>>> p = pl.errorbar(wavel,output['pars']['const'],yerr=output['pars']['e_const'],fmt='ko-')
>>> p = pl.subplot(222)
>>> p = pl.errorbar(wavel,output['pars']['ampl'],yerr=output['pars']['e_ampl'],fmt='ro-')
>>> p = pl.subplot(223)
>>> p = pl.errorbar(wavel,output['pars']['freq'],yerr=output['pars']['e_freq'],fmt='ko-')
>>> p = pl.subplot(224)
>>> p = pl.errorbar(wavel,output['pars']['phase'],yerr=output['pars']['e_phase'],fmt='ro-')

Returns: dict: dict with keys avprof (2D array), pars (rec array), model (1D array), pergram (freqs,2Darray)

time_frequency(times, signal, window_width=None, n_windows=100, window='rectangular', detrend=None, **kwargs)

source code

Short Time (Fourier) Transform.

Slide a window through the timeseries, multiply the timeseries with the window, and perform a Fourier Transform.

It is best to fix explicitly f0, fn and df, to limit the computation time!

Extra kwargs go to find_frequency

Parameters:

n_windows (integer) - number of slices
window_width (float) - width of each slice (defaults to T/20)
detrend (callable) - detrending function, accepting times and signal as args
window (string) - window function to apply

Returns: dict

spectrogram, times used, parameters, errors, points used per slice

Decorators:

@defaults_pergram

autocorrelation(frequencies, power, max_step=1.5, interval=(), threshold=None, method=1)

source code

Compute the autocorrelation.

The formulate to estimate the autocorrelation in the periodogram is

R(k) = 1 / (n * s^2) * \sum_{t=1}^n (X_t - mu) * (X_{t+k} - mu)

where n is the number of points in the interval, mu an estimator for the sample mean and s is an estimator for the standard deviation of the sample.

When computed over a large interval, we have to take into account that we cannot sample more points than we have in the spectrum, so we have to average out over fewer and fewer points, the more we shift the spectrum.

Above formula becomes

R(k) = 1 / ((n-k) * s^2) * \sum_{t=1}^{n-k} (X_t - mu) * (X_{t+k} - mu)

Highs and lows are cut of. The lower cut off value is the sample mean, the higher cut off is a user-defined multiple of the sample mean.

This function can also be used to compute period spacings: just invert the frequency spacing, reverse the order and reinterpolate to be equidistant: Example:

>>> periods = linspace(1./p[0][-1],1./p[0][0],2*len(p[0]))
>>> ampl = interpol.linear_interpolation(1./p[0][::-1],p[1][::-1],periods)
>>> ampl = where(isnan(ampl),hstack([ampl[1:],ampl[:1]]),ampl)

Parameters:

frequencies (numpy 1d array) - frequency array
power (numpy 1d array) - power spectrum
max_step (float) - maximum time shift
interval (tuple of floats) - tuple of frequencies (start, end)
threshold (float) - high cut off value (in units of sample mean)

Returns: (ndarray,ndarray)

domain of autocorrelation and autocorrelation