Module ivs.statistics.linearregression

Class LinearFit

A class that makes it easy to extract the fit coefficients, covariance matrix, predictions, residuals, etc.

Remarks: LinearFit instances can be printed

Instance Methods

  • __init__(self, linearModel, observations) (LinearFit) - Initialises the LinearFit instance
  • observations(self, weighted=False) (ndarray) - Returns the observations
  • regressionCoefficients(self) (ndarray) - Returns the regression coefficients
  • sumSqResiduals(self, weighted=False) (double) - Returns the sum of the squared residuals
  • residualVariance(self, weighted=False) (double) - Estimates the variance of the residuals of the time series
  • covarianceMatrix(self) (ndarray) - Returns the covariance matrix of the fit coefficients
  • correlationMatrix(self) (ndarray) - Returns the correlation matrix of the fit coefficients
  • errorBars(self) (ndarray) - Returns the formal error bars of the fit coefficients
  • confidenceIntervals(self, alpha) (tuple) - Returns the symmetric (1-alpha) confidence interval around the fit coefficients
  • t_values(self) (ndarray) - Returns the formal t-values of the fit coefficients
  • regressorTtest(self, alpha) (ndarray) - Performs a hypothesis T-test on each of the regressors (H0: fit coefficient == 0, H1: fit coefficient != 0)
  • predictions(self, weighted=False) - Returns the predicted (fitted) values
  • residuals(self, weighted=False) - Returns an array with the residuals
  • coefficientOfDetermination(self, weighted=False) (double) - Returns the coefficient of determination
  • BICvalue(self) (double) - Returns the Bayesian Information Criterion value
  • AICvalue(self) (double) - Returns the 2nd order Akaike Information Criterion value
  • Fstatistic(self, weighted=False) (double) - Returns the F-statistic, commonly used to assess the fit
  • FstatisticTest(self, alpha, weighted=False) (boolean) - Performs a hypothesis F-test on the fit
  • summary(self, outputStream=sys.stdout) - Writes some basic results of fitting the model to the observations
  • __str__(self) - Returns the string written by printing the LinearFit object
  • evaluate(self, regressors) (ndarray) - Evaluates the current best fit using regressors evaluated at new covariates

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __subclasshook__

Properties

Inherited from object: __class__

Method Details

__init__(self, linearModel, observations)
(Constructor)

Initialises the LinearFit instance

Parameters:
  • linearModel (LinearModel) - a LinearModel instance
  • observations (ndarray) - the observations. The size of the array must be compatible with the number of observations that the linearModel expects.
Returns: LinearFit
a LinearFit instance
Overrides: object.__init__

observations(self, weighted=False)

Returns the observations

Remarks:

  • If no covariance matrix of the observations was specified for the model, the "decorrelated" observations are identical to the original observations.
Parameters:
  • weighted (boolean) - If False, return the original observations; if True, return the decorrelated observations.
Returns: ndarray
the observations that were used to initialize the LinearFit instance

regressionCoefficients(self)

Returns the regression coefficients.

Returns: ndarray
the fit coefficients

sumSqResiduals(self, weighted=False)

Returns the sum of the squared residuals

Parameters:
  • weighted (boolean) - If False, the unweighted residuals are used; if True, the weighted ones are used.
Returns: double
the sum of the squared residuals

residualVariance(self, weighted=False)

Estimates the variance of the residuals of the time series.

The number of degrees of freedom is used as the normalization.

Parameters:
  • weighted (boolean) - If False, the unweighted residuals are used; if True, the weighted ones are used.
Returns: double
the variance of the residuals
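
The normalization described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: it assumes the number of degrees of freedom is N - M, with N the number of observations and M the number of fit coefficients, and the function name and sample values are hypothetical.

```python
def residual_variance(observations, predictions, n_coefficients):
    """Sum of squared residuals divided by the degrees of freedom.

    Assumption: degrees of freedom = N - M, with N observations and
    M fit coefficients.
    """
    residuals = [y - p for y, p in zip(observations, predictions)]
    sum_sq = sum(r * r for r in residuals)
    dof = len(observations) - n_coefficients
    if dof <= 0:
        raise ValueError("need more observations than fit coefficients")
    return sum_sq / dof


# Hypothetical observations and fitted values, for illustration only.
obs = [1.0, 2.0, 4.0, 7.0]
pred = [1.0, 3.0, 3.0, 7.0]
variance = residual_variance(obs, pred, n_coefficients=2)
```

Here the residuals are 0, -1, 1 and 0, so the sum of squares is 2 and, with 2 degrees of freedom, the estimated variance is 1.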

covarianceMatrix(self)

Returns the covariance matrix of the fit coefficients

Returns: ndarray
the MxM covariance matrix of the fit coefficients with M the number of fit coefficients

correlationMatrix(self)

Returns the correlation matrix of the fit coefficients

Returns: ndarray
the MxM correlation matrix of the fit coefficients with M the number of fit coefficients

errorBars(self)

Returns the formal error bars of the fit coefficients

Returns: ndarray
The square root of the diagonal of the covariance matrix

confidenceIntervals(self, alpha)

Returns the symmetric (1-alpha) confidence interval around the fit coefficients. E.g. if alpha = 0.05, the 95% symmetric confidence interval is returned.

Remarks:

  • The formula used assumes that the noise on the observations is independent, identically distributed, and Gaussian.
Parameters:
  • alpha (double) - significance level in ]0,1[; the confidence level of the interval is 1-alpha
Returns: tuple
the arrays lowerBounds[0..K-1] and upperBounds[0..K-1] which contain respectively the lower and the upper bounds of the symmetric interval around the fit coefficients, where K is the number of fit coefficients.
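
A symmetric interval of this kind is commonly built as coefficient ± critical value × formal error bar. The sketch below assumes a Student's t critical value with N - K degrees of freedom, which the documentation above does not confirm. The function name and the numbers are hypothetical, and the critical value is passed in by hand because the Python standard library has no t-distribution quantile.

```python
def confidence_intervals(coefficients, error_bars, t_critical):
    """Return (lowerBounds, upperBounds) of symmetric intervals.

    Assumption: the half-width of each interval is t_critical times the
    formal error bar of the corresponding coefficient.
    """
    lower = [c - t_critical * e for c, e in zip(coefficients, error_bars)]
    upper = [c + t_critical * e for c, e in zip(coefficients, error_bars)]
    return lower, upper


# Two-sided 95% critical value of Student's t for 8 degrees of freedom,
# taken from standard tables (alpha = 0.05).
T_CRIT_95_DOF8 = 2.306

# Hypothetical fit coefficients and error bars, for illustration only.
lower, upper = confidence_intervals([2.0, 3.0], [0.5, 0.2], T_CRIT_95_DOF8)
```

The returned tuple mirrors the (lowerBounds, upperBounds) layout described above.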

t_values(self)

Returns the formal t-values of the fit coefficients

Returns: ndarray
t-values = fit coefficients divided by their formal error bars
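
The relations between the covariance matrix, the formal error bars, the correlation matrix and the t-values can be sketched as follows. The error-bar and t-value definitions are the ones stated in this documentation; the correlation formula (covariance divided by the product of the two error bars) is the usual definition and is assumed here. Function names and the sample matrix are hypothetical.

```python
import math


def error_bars(covariance):
    """Square roots of the diagonal of an M x M covariance matrix."""
    return [math.sqrt(covariance[i][i]) for i in range(len(covariance))]


def correlation_matrix(covariance):
    """Covariance scaled by the error bars of both coefficients (assumed definition)."""
    errors = error_bars(covariance)
    size = len(errors)
    return [[covariance[i][j] / (errors[i] * errors[j]) for j in range(size)]
            for i in range(size)]


def t_values(coefficients, covariance):
    """Fit coefficients divided by their formal error bars."""
    return [c / e for c, e in zip(coefficients, error_bars(covariance))]


# Hypothetical coefficients and covariance matrix, for illustration only.
coefficients = [2.0, 3.0]
covariance = [[0.25, 0.01],
              [0.01, 0.04]]

errors = error_bars(covariance)
correlation = correlation_matrix(covariance)
tvals = t_values(coefficients, covariance)
```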

regressorTtest(self, alpha)

Performs a hypothesis T-test on each of the regressors

Null hypothesis: H0 : fit coefficient == 0
Alternative hypothesis: H1 : fit coefficient != 0

Remarks:

  • This test assumes that the noise is independent, identically distributed, and Gaussian. It is not robust against violations of this assumption.
Parameters:
  • alpha (double) - significance level of the hypothesis test. In ]0,1[. E.g.: alpha = 0.05
Returns: ndarray
a boolean array of length K with K the number of regressors. "True" if the null hypothesis was rejected for the regressor, "False" otherwise.
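
A two-sided test of this form typically rejects H0 for a regressor when the absolute t-value exceeds a critical value. The sketch below assumes that decision rule. The function name and t-values are hypothetical, and the critical value is supplied by hand from standard tables.

```python
def regressor_t_test(t_values, t_critical):
    """Return True for each regressor whose null hypothesis is rejected.

    Assumption: two-sided test, rejecting H0 when |t| > t_critical.
    """
    return [abs(t) > t_critical for t in t_values]


# Critical value for alpha = 0.05 and 8 degrees of freedom (standard tables).
rejected = regressor_t_test([4.0, -0.7, 15.0], t_critical=2.306)
```

In this example the first and third regressors are flagged as significant, while the second is not.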

predictions(self, weighted=False)

Returns the predicted (fitted) values

It concerns the predictions for the original observations, not the decorrelated ones.

Remarks:

  • If no covariance matrix of the observations was specified for the model, the weighted predictions are identical to the unweighted ones.
Parameters:
  • weighted (boolean) - If True/False, the predictions for the decorrelated/original observations will be used.

residuals(self, weighted=False)

Returns an array with the residuals. Residuals = observations minus the predictions

Remarks:

  • If no covariance matrix of the observations was specified for the model, the weighted residuals are identical to the unweighted ones.
Parameters:
  • weighted (boolean) - If True/False, the residuals for the decorrelated/original observations will be used.

coefficientOfDetermination(self, weighted=False)

Returns the coefficient of determination

The coefficient of determination is defined by 1 - S1 / S2 with

  • S1 the sum of squared residuals: S1 = \sum_{i=1}^N (y_i - \hat{y}_i)^2
  • S2 the sum of squared deviations from the mean: S2 = \sum_{i=1}^N (y_i - \overline{y})^2
  • If there is no intercept term in the model, then S2 is computed as S2 = \sum_{i=1}^N y_i^2

Remarks:

  • If no covariance matrix of the observations was specified for the model, the weighted coeff of determination is identical to the unweighted one.
Parameters:
  • weighted (boolean) - If True/False, the residuals for the decorrelated/original observations will be used.
Returns: double
coefficient of determination
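
The definition 1 - S1 / S2 can be illustrated with a short plain-Python sketch for the unweighted case. The function name, the has_intercept flag and the sample values are hypothetical and are not part of the LinearFit API.

```python
def coefficient_of_determination(observations, predictions, has_intercept=True):
    """Compute 1 - S1 / S2 for unweighted observations."""
    # S1: sum of squared residuals.
    s1 = sum((y - p) ** 2 for y, p in zip(observations, predictions))
    if has_intercept:
        # S2: sum of squared deviations from the mean of the observations.
        mean = sum(observations) / len(observations)
        s2 = sum((y - mean) ** 2 for y in observations)
    else:
        # Without an intercept term, S2 is the plain sum of squares.
        s2 = sum(y ** 2 for y in observations)
    return 1.0 - s1 / s2


# Hypothetical observations and fitted values, for illustration only.
obs = [1.0, 2.0, 4.0, 7.0]
pred = [1.0, 3.0, 3.0, 7.0]
r2 = coefficient_of_determination(obs, pred)
r2_no_intercept = coefficient_of_determination(obs, pred, has_intercept=False)
```

For these values S1 = 2, S2 = 21 with an intercept and S2 = 70 without one.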

BICvalue(self)

Returns the Bayesian Information Criterion value.

Remarks:

  • Also called the Schwarz Information Criterion (SIC)
  • Gaussian noise is assumed, with unknown variance sigma^2
  • Constant terms in the log-likelihood are omitted

TODO: make a weighted version

Returns: double
BIC value

AICvalue(self)

Returns the 2nd order Akaike Information Criterion value

Remarks:

  • Gaussian noise is assumed, with unknown variance sigma^2
  • Constant terms in the log-likelihood were omitted
  • If the number of observations equals the number of parameters + 1, then the 2nd order AIC is not defined. In this case the method returns NaN.

TODO: make a weighted version

Returns: double
AIC value
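
For orientation, the textbook forms of both criteria for Gaussian noise with unknown variance, with constant terms dropped, are sketched below. These formulas are an assumption: the documentation does not state the exact expressions the library uses, and in particular whether the noise variance is counted among the parameters. Function names are hypothetical.

```python
import math


def bic(sum_sq_residuals, n_obs, n_params):
    """Textbook BIC: N * ln(S1 / N) + K * ln(N), constants omitted."""
    return n_obs * math.log(sum_sq_residuals / n_obs) + n_params * math.log(n_obs)


def aicc(sum_sq_residuals, n_obs, n_params):
    """Textbook 2nd order AIC: N * ln(S1 / N) + 2K + 2K(K + 1) / (N - K - 1)."""
    if n_obs - n_params - 1 == 0:
        # The correction term is undefined when N equals K + 1.
        return float("nan")
    aic = n_obs * math.log(sum_sq_residuals / n_obs) + 2 * n_params
    return aic + 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)


# Hypothetical sum of squared residuals, sample size and parameter count.
bic_value = bic(10.0, n_obs=10, n_params=2)
aicc_value = aicc(10.0, n_obs=10, n_params=2)
undefined = aicc(1.0, n_obs=3, n_params=2)
```

The last call shows the undefined case mentioned in the remarks, where the correction term would divide by zero.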

Fstatistic(self, weighted=False)

Returns the F-statistic, commonly used to assess the fit.

Remarks:

  • If no covariance matrix of the observations was specified for the model, the weighted F-statistic is identical to the unweighted one.
Parameters:
  • weighted (boolean) - If True/False, the residuals for the decorrelated/original observations will be used.
Returns: double
the F-statistic
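
One common form of the overall F-statistic, for a model that includes an intercept term, expresses it through the coefficient of determination R^2. The sketch below assumes that form, with N observations and M fit coefficients; the library may compute it differently, for example for models without an intercept. The function name and inputs are hypothetical.

```python
def f_statistic(r_squared, n_obs, n_coefficients):
    """F = (R^2 / (M - 1)) / ((1 - R^2) / (N - M)).

    Assumption: the model contains an intercept, so M - 1 regressors are
    tested against N - M residual degrees of freedom.
    """
    numerator = r_squared / (n_coefficients - 1)
    denominator = (1.0 - r_squared) / (n_obs - n_coefficients)
    return numerator / denominator


# Hypothetical values: R^2 = 0.9 from 12 observations and 2 coefficients.
f_value = f_statistic(0.9, n_obs=12, n_coefficients=2)
```

With these inputs the statistic is approximately 90, which would then be compared with the critical value of the F-distribution at the chosen significance level.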

FstatisticTest(self, alpha, weighted=False)

Performs a hypothesis F-test on the fit

Null hypothesis: H0 : all fit coefficients == 0
Alternative hypothesis: H1 : at least one fit coefficient != 0

Stated otherwise (with R^2 the coefficient of determination):
Null hypothesis: H0 : R^2 == 0
Alternative hypothesis: H1 : R^2 != 0

Remarks:

  • If no covariance matrix of the observations was specified for the model, the weighted F-test is identical to the unweighted one.
Parameters:
  • alpha (double) - significance level of the hypothesis test. In ]0,1[.
  • weighted (boolean) - If True/False, the weighted/not-weighted F-statistic will be used.
Returns: boolean
"True" if null hypothesis was rejected, "False" otherwise

summary(self, outputStream=sys.stdout)

Writes some basic results of fitting the model to the observations.

Parameters:
  • outputStream (stream class. The .write() method is used.) - defaults to the standard output console, but can be replaced by another open stream, such as a file stream.
Returns:
nothing

__str__(self)
(Informal representation operator)

Returns the string written by printing the LinearFit object

Overrides: object.__str__

evaluate(self, regressors)

Evaluates the current best fit using regressors evaluated at new covariates.

Remark:

  • The new regressor functions should be exactly the same ones as you used to define the linear model. They should only be evaluated in new covariates. This is not checked for!

Example:

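>>> from numpy import array, linspace, exp, ones, ones_like
>>> from ivs.statistics.linearregression import LinearModel  # assuming LinearModel lives in this module
>>> noise = array([0.44, -0.48, 0.26, -2.00, -0.93, 2.21, -0.57, -2.04, -1.09, 1.53])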
>>> x = linspace(0, 5, 10)
>>> obs = 2.0 + 3.0 * exp(x) + noise
>>> myModel = LinearModel([ones(10), exp(x)], ["1", "exp(x)"])
>>> print(myModel)
Model: y = a_0 + a_1 * exp(x)
Expected number of observations: 10
>>> myFit = myModel.fitData(obs)
>>> xnew = linspace(-5.0, +5.0, 20)
>>> y = myFit.evaluate([ones_like(xnew), exp(xnew)])
>>> print(y)
[ 1.53989966 1.55393018 1.57767944 1.61787944 1.68592536
  1.80110565 1.99606954 2.32608192 2.8846888 3.83023405
  5.43074394 8.13990238 12.72565316 20.48788288 33.62688959
  55.86708392 93.51271836 157.23490405 265.09646647 447.67207215]
Parameters:
  • regressors (either a list or an ndarray) - either a list of equally-sized numpy arrays with the regressors evaluated in the new covariates: [f_0(xnew),f_1(xnew),f_2(xnew),...], or an N x M design matrix (numpy array) where these regressor arrays are column-stacked, with N the number of data points and M the number of regressors.
Returns: ndarray
the linear model evaluated in the new regressors
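
Conceptually, evaluating a linear model at new covariates amounts to summing each regressor, evaluated at the new points, multiplied by its fit coefficient. The plain-Python sketch below illustrates this under that assumption and is not the library's implementation. The coefficients [2.0, 3.0] are hypothetical (they are the noise-free values from the example above, not fitted ones), and the function name is only illustrative.

```python
import math


def evaluate(coefficients, regressors):
    """Evaluate sum_j a_j * f_j(x_i) for every new data point i.

    `regressors` is a list of equally sized sequences, one per regressor,
    each already evaluated at the new covariates.
    """
    if len(coefficients) != len(regressors):
        raise ValueError("one regressor is needed per fit coefficient")
    n_points = len(regressors[0])
    return [sum(a * f[i] for a, f in zip(coefficients, regressors))
            for i in range(n_points)]


# The same regressor functions as in the model (1 and exp(x)), evaluated
# at new covariates.
xnew = [-1.0, 0.0, 1.0]
new_regressors = [[1.0 for _ in xnew], [math.exp(x) for x in xnew]]
y_new = evaluate([2.0, 3.0], new_regressors)
```

As the remark above warns, nothing in such a computation can check that the new regressors are the same functions that defined the model; only their number and length can be validated.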