Package ivs :: Package statistics :: Module linearregression :: Class LinearFit

Class LinearFit

A class that makes it easy to extract the fit coefficients, covariance matrix, predictions, residuals, etc.

Remarks: LinearFit instances can be printed

Instance Methods

LinearFit

__init__(self, linearModel, observations)
Initialises the LinearFit instance

ndarray

observations(self, weighted=False)
Returns the observations

ndarray

regressionCoefficients(self)
Returns the regression coefficients.

double

sumSqResiduals(self, weighted=False)
Returns the sum of the squared residuals

double

residualVariance(self, weighted=False)
Estimates the variance of the residuals of the time series.

ndarray

covarianceMatrix(self)
Returns the covariance matrix of the fit coefficients

ndarray

correlationMatrix(self)
Returns the correlation matrix of the fit coefficients

ndarray

errorBars(self)
Returns the formal error bars of the fit coefficients

tuple

confidenceIntervals(self, alpha)
Returns the symmetric (1-alpha) confidence interval around the fit coefficients.

ndarray

t_values(self)
Returns the formal t-values of the fit coefficients

ndarray

regressorTtest(self, alpha)
Performs a hypothesis T-test on each of the regressors (H0: fit coefficient == 0; H1: fit coefficient != 0).

predictions(self, weighted=False)
Returns the predicted (fitted) values

residuals(self, weighted=False)
Returns an array with the residuals.

double

coefficientOfDetermination(self, weighted=False)
Returns the coefficient of determination

double

BICvalue(self)
Returns the Bayesian Information Criterion value.

double

AICvalue(self)
Returns the 2nd order Akaike Information Criterion value

double

Fstatistic(self, weighted=False)
Returns the F-statistic, commonly used to assess the fit.

boolean

FstatisticTest(self, alpha, weighted=False)
Performs a hypothesis F-test on the fit

summary(self, outputStream=sys.stdout)
Writes some basic results of fitting the model to the observations.

__str__(self)
Returns the string written by printing the LinearFit object

ndarray

evaluate(self, regressors)
Evaluates the current best fit using the regressors evaluated at new covariates.

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __subclasshook__

Properties

Inherited from object: __class__

Method Details

__init__(self, linearModel, observations)
(Constructor)

Initialises the LinearFit instance

Parameters:

linearModel (LinearModel) - a LinearModel instance
observations (ndarray) - the observations. The size of the array must be compatible with the number of observations that the linearModel expects.

Returns: LinearFit

a LinearFit instance

Overrides: object.__init__

observations(self, weighted=False)

Returns the observations

Remarks:

If no covariance matrix of the observations was specified for the model, the "decorrelated" observations are identical to the original observations.

Parameters:

weighted (boolean) - If False, return the original observations; if True, return the decorrelated observations.

Returns: ndarray

the observations that were used to initialize the LinearFit instance
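
As an illustration of what "decorrelated" observations can mean, the sketch below whitens a set of observations with a known covariance matrix using its Cholesky factor. This is an assumption about a common technique, not a statement of how LinearFit does it internally; the function name decorrelate and the numbers are hypothetical.

```python
import numpy as np

def decorrelate(observations, covariance):
    """Whiten observations: solve L y' = y, where covariance = L L^T (hypothetical helper)."""
    lower = np.linalg.cholesky(covariance)
    return np.linalg.solve(lower, observations)

# Hypothetical observations and a symmetric positive-definite covariance matrix.
obs = np.array([1.0, 2.0, 4.0])
cov = np.array([[2.0, 0.5, 0.0],
                [0.5, 1.0, 0.3],
                [0.0, 0.3, 1.5]])

weighted_obs = decorrelate(obs, cov)

# With an identity covariance matrix nothing changes, in line with the remark above.
same_obs = decorrelate(obs, np.eye(3))
print(weighted_obs)
print(same_obs)
```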

regressionCoefficients(self)

Returns the regression coefficients.

Returns: ndarray: the fit coefficients

sumSqResiduals(self, weighted=False)

Returns the sum of the squared residuals

Parameters:

weighted (boolean) - If False, the unweighted residuals are used; if True, the weighted ones are used.

Returns: double

the sum of the squared residuals

residualVariance(self, weighted=False)

Estimates the variance of the residuals of the time series.

The sum of squared residuals is normalized by the number of degrees of freedom.

Parameters:

weighted (boolean) - If False, the unweighted residuals are used; if True, the weighted ones are used.

Returns: double

the variance of the residuals
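
The normalization by the degrees of freedom can be sketched with plain NumPy. The snippet assumes that the degrees of freedom equal the number of observations minus the number of fit coefficients, and it uses numpy.linalg.lstsq as a stand-in for the fit; it does not call LinearFit itself.

```python
import numpy as np

# Same synthetic data as in the evaluate() example of this page.
noise = np.array([0.44, -0.48, 0.26, -2.00, -0.93, 2.21, -0.57, -2.04, -1.09, 1.53])
x = np.linspace(0, 5, 10)
obs = 2.0 + 3.0 * np.exp(x) + noise

# Design matrix with one column per regressor: 1 and exp(x).
design = np.column_stack([np.ones_like(x), np.exp(x)])
coeffs = np.linalg.lstsq(design, obs, rcond=None)[0]

residuals = obs - design @ coeffs
n_obs, n_coeffs = design.shape

sum_sq_residuals = np.sum(residuals ** 2)
# Assumed definition: degrees of freedom = observations - fit coefficients.
residual_variance = sum_sq_residuals / (n_obs - n_coeffs)
print(sum_sq_residuals, residual_variance)
```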

covarianceMatrix(self)

Returns the covariance matrix of the fit coefficients

Returns: ndarray: the MxM covariance matrix of the fit coefficients with M the number of fit coefficients

correlationMatrix(self)

Returns the correlation matrix of the fit coefficients

Returns: ndarray: the MxM correlation matrix of the fit coefficients with M the number of fit coefficients

errorBars(self)

Returns the formal error bars of the fit coefficients

Returns: ndarray: The square root of the diagonal of the covariance matrix
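
The relation between the covariance matrix, the formal error bars and the correlation matrix can be sketched as follows. The covariance estimate used here (residual variance times the inverse of X^T X) is the textbook formula for unweighted least squares with independent noise; it is an assumption that LinearFit follows the same formula, and the variable names are illustrative only.

```python
import numpy as np

# Same synthetic data as in the evaluate() example of this page.
noise = np.array([0.44, -0.48, 0.26, -2.00, -0.93, 2.21, -0.57, -2.04, -1.09, 1.53])
x = np.linspace(0, 5, 10)
obs = 2.0 + 3.0 * np.exp(x) + noise

design = np.column_stack([np.ones_like(x), np.exp(x)])
coeffs = np.linalg.lstsq(design, obs, rcond=None)[0]
residuals = obs - design @ coeffs
n_obs, n_coeffs = design.shape
residual_variance = np.sum(residuals ** 2) / (n_obs - n_coeffs)

# Textbook covariance of the fit coefficients (assumed, unweighted case).
covariance = residual_variance * np.linalg.inv(design.T @ design)

# Formal error bars: square root of the diagonal of the covariance matrix.
error_bars = np.sqrt(np.diag(covariance))

# Correlation matrix: covariance scaled by the product of the error bars.
correlation = covariance / np.outer(error_bars, error_bars)
print(error_bars)
print(correlation)
```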

confidenceIntervals(self, alpha)

Returns the symmetric (1-alpha) confidence interval around the fit coefficients. E.g., if alpha = 0.05, the 95% symmetric confidence interval is returned.

Remarks:

The formula used assumes that the noise on the observations is independent, identically distributed, and Gaussian.

Parameters:

alpha (double) - significance level in ]0,1[; the confidence level of the interval is 1-alpha

Returns: tuple

the arrays lowerBounds[0..K-1] and upperBounds[0..K-1] which contain respectively the lower and the upper bounds of the symmetric interval around the fit coefficients, where K is the number of fit coeffs.

t_values(self)

Returns the formal t-values of the fit coefficients

Returns: ndarray: t-values = fit coefficients divided by their formal error bars

regressorTtest(self, alpha)

Performs a hypothesis T-test on each of the regressors.

Null hypothesis: H0 : fit coefficient == 0
Alternative hypothesis: H1 : fit coefficient != 0

Remarks:

This test assumes that the noise is independent, identically distributed, and Gaussian. It is not robust against violations of this assumption.

Parameters:

alpha (double) - significance level of the hypothesis test. In ]0,1[. E.g.: alpha = 0.05

Returns: ndarray

a boolean array of length K with K the number of regressors. "True" if the null hypothesis was rejected for the regressor, "False" otherwise.

predictions(self, weighted=False)

Returns the predicted (fitted) values

By default (weighted=False), these are the predictions for the original observations, not the decorrelated ones.

Remarks:

If no covariance matrix of the observations was specified for the model, the weighted predictions are identical to the unweighted ones.

Parameters:

weighted (boolean) - If True/False, the predictions for the decorrelated/original observations will be used.

residuals(self, weighted=False)

Returns an array with the residuals. Residuals = observations minus the predictions

Remarks:

If no covariance matrix of the observations was specified for the model, the weighted residuals are identical to the unweighted ones.

Parameters:

weighted (boolean) - If True/False, the residuals for the decorrelated/original observations will be used.

coefficientOfDetermination(self, weighted=False)

Returns the coefficient of determination

The coefficient of determination is defined by R^2 = 1 - S1 / S2, with

S1 the sum of squared residuals: S1 = \sum_{i=1}^N (y_i - \hat{y}_i)^2
S2 the sum of squared deviations from the mean: S2 = \sum_{i=1}^N (y_i - \overline{y})^2
If there is no intercept term in the model, then S2 is computed as S2 = \sum_{i=1}^N y_i^2

Remarks:

If no covariance matrix of the observations was specified for the model, the weighted coeff of determination is identical to the unweighted one.

Parameters:

weighted (boolean) - If True/False, the residuals for the decorrelated/original observations will be used.

Returns: double

coefficient of determination
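
The definition above translates directly into a few lines of NumPy. This is a sketch of the unweighted case only; the function name and the has_intercept flag are hypothetical and not part of the LinearFit interface.

```python
import numpy as np

def coefficient_of_determination(observations, predictions, has_intercept=True):
    """R^2 = 1 - S1 / S2, following the definition given above (hypothetical helper)."""
    s1 = np.sum((observations - predictions) ** 2)
    if has_intercept:
        # Sum of squared deviations from the mean.
        s2 = np.sum((observations - np.mean(observations)) ** 2)
    else:
        # No intercept term: S2 is the plain sum of squares.
        s2 = np.sum(observations ** 2)
    return 1.0 - s1 / s2

# Same synthetic data as in the evaluate() example of this page.
noise = np.array([0.44, -0.48, 0.26, -2.00, -0.93, 2.21, -0.57, -2.04, -1.09, 1.53])
x = np.linspace(0, 5, 10)
obs = 2.0 + 3.0 * np.exp(x) + noise

design = np.column_stack([np.ones_like(x), np.exp(x)])
coeffs = np.linalg.lstsq(design, obs, rcond=None)[0]
predictions = design @ coeffs

r_squared = coefficient_of_determination(obs, predictions)
print(r_squared)
```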

BICvalue(self)

Returns the Bayesian Information Criterion value.

Remarks:

Also called the Schwarz Information Criterion (SIC)
Gaussian noise is assumed, with unknown variance sigma^2
Constant terms in the log-likelihood are omitted

TODO: make a weighted version

Returns: double: BIC value

AICvalue(self)

Returns the 2nd order Akaike Information Criterion value

Remarks:

Gaussian noise is assumed, with unknown variance sigma^2
Constant terms in the log-likelihood were omitted
If the number of observations equals the number of parameters + 1 then the 2nd order AIC is not defined. In this case the method gives a nan.

TODO: make a weighted version

Returns: double: AIC value
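
For orientation, one common form of the BIC and of the 2nd order AIC for Gaussian noise with unknown variance, with constant log-likelihood terms dropped, is sketched below. It is an assumption that these match the expressions used by BICvalue() and AICvalue(). In particular, whether the parameter count includes the noise variance is a convention this page does not state. The function names are hypothetical.

```python
import math

def bic_value(sum_sq_residuals, n_obs, n_params):
    """BIC = N ln(SSR / N) + k ln(N), constants omitted (assumed form)."""
    return n_obs * math.log(sum_sq_residuals / n_obs) + n_params * math.log(n_obs)

def aicc_value(sum_sq_residuals, n_obs, n_params):
    """2nd order AIC = N ln(SSR / N) + 2k + 2k(k+1) / (N - k - 1) (assumed form)."""
    denominator = n_obs - n_params - 1
    if denominator <= 0:
        # The correction term is undefined when N = k + 1 (and for fewer observations).
        return float("nan")
    aic = n_obs * math.log(sum_sq_residuals / n_obs) + 2 * n_params
    return aic + 2 * n_params * (n_params + 1) / denominator

print(bic_value(10.0, 10, 2))
print(aicc_value(10.0, 10, 2))
print(aicc_value(10.0, 3, 2))   # N = k + 1, so the result is nan
```

Only differences in these values between models fitted to the same observations are meaningful, since constant terms have been dropped.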

Fstatistic(self, weighted=False)

Returns the F-statistic, commonly used to assess the fit.

Remarks:

If no covariance matrix of the observations was specified for the model, the weighted F-statistic is identical to the unweighted one.

Parameters:

weighted (boolean) - If True/False, the residuals for the decorrelated/original observations will be used.

Returns: double

the F-statistic
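
A textbook version of the overall F-statistic for a model with an intercept is sketched below: the explained sum of squares per regressor (excluding the intercept) divided by the residual sum of squares per degree of freedom. It is an assumption that Fstatistic() uses this exact form; the snippet covers the unweighted case only.

```python
import numpy as np

# Same synthetic data as in the evaluate() example of this page.
noise = np.array([0.44, -0.48, 0.26, -2.00, -0.93, 2.21, -0.57, -2.04, -1.09, 1.53])
x = np.linspace(0, 5, 10)
obs = 2.0 + 3.0 * np.exp(x) + noise

design = np.column_stack([np.ones_like(x), np.exp(x)])
coeffs = np.linalg.lstsq(design, obs, rcond=None)[0]
predictions = design @ coeffs
n_obs, n_coeffs = design.shape

s1 = np.sum((obs - predictions) ** 2)       # residual sum of squares
s2 = np.sum((obs - np.mean(obs)) ** 2)      # total sum of squares w.r.t. the mean

# Assumed form for a model that contains an intercept term.
f_statistic = ((s2 - s1) / (n_coeffs - 1)) / (s1 / (n_obs - n_coeffs))
print(f_statistic)
```

The same number follows from the coefficient of determination as (R^2 / (K - 1)) / ((1 - R^2) / (N - K)), with K the number of fit coefficients and N the number of observations.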

FstatisticTest(self, alpha, weighted=False)

Performs a hypothesis F-test on the fit

Null hypothesis: H0 : all fit coefficients == 0
Alternative hypothesis: H1 : at least one fit coefficient != 0

Stated otherwise (with R^2 the coefficient of determination):
Null hypothesis: H0 : R^2 == 0
Alternative hypothesis: H1 : R^2 != 0

Remarks:

If no covariance matrix of the observations was specified for the model, the weighted F-test is identical to the unweighted one.

Parameters:

alpha (double) - significance level of the hypothesis test. In ]0,1[.
weighted (boolean) - If True/False, the weighted/not-weighted F-statistic will be used.

Returns: boolean

"True" if null hypothesis was rejected, "False" otherwise

summary(self, outputStream=sys.stdout)

Writes some basic results of fitting the model to the observations.

Parameters:

outputStream (stream object; its .write() method is used) - defaults to the standard output console, but can be replaced by another open stream, such as a file stream.

Returns:

nothing

__str__(self)
(Informal representation operator)

Returns the string written by printing the LinearFit object

Overrides: object.__str__

evaluate(self, regressors)

Evaluates the current best fit using the regressors evaluated at new covariates.

Remark:

The new regressor functions should be exactly the same ones as you used to define the linear model. They should only be evaluated in new covariates. This is not checked for!

Example:

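>>> from numpy import array, linspace, exp, ones, ones_like
>>> from ivs.statistics.linearregression import LinearModel
>>> noise = array([0.44, -0.48, 0.26, -2.00, -0.93, 2.21, -0.57, -2.04, -1.09, 1.53])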
>>> x = linspace(0, 5, 10)
>>> obs = 2.0 + 3.0 * exp(x) + noise
>>> myModel = LinearModel([ones(10), exp(x)], ["1", "exp(x)"])
>>> print(myModel)
Model: y = a_0 + a_1 * exp(x)
Expected number of observations: 10

>>> myFit = myModel.fitData(obs)
>>> xnew = linspace(-5.0, +5.0, 20)
>>> y = myFit.evaluate([ones_like(xnew), exp(xnew)])
>>> print(y)
[ 1.53989966 1.55393018 1.57767944 1.61787944 1.68592536
  1.80110565 1.99606954 2.32608192 2.8846888 3.83023405
  5.43074394 8.13990238 12.72565316 20.48788288 33.62688959
  55.86708392 93.51271836 157.23490405 265.09646647 447.67207215]

Parameters:

regressors (either a list or an ndarray) - either a list of equally-sized numpy arrays with the regressors evaluated in the new covariates: [f_0(xnew),f_1(xnew),f_2(xnew),...], or an N x M design matrix (numpy array) where these regressor arrays are column-stacked, with N the number of data points and M the number of regressors.

Returns: ndarray

the linear model evaluated in the new regressors
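
The two accepted input forms describe the same computation, which can be sketched without the library: the evaluated model is the design matrix times the coefficient vector. The snippet uses numpy.linalg.lstsq as a stand-in for the fit and assumes the column-stacked layout (one column per regressor); it does not call evaluate() itself.

```python
import numpy as np

# Fit on the same synthetic data as in the example above.
noise = np.array([0.44, -0.48, 0.26, -2.00, -0.93, 2.21, -0.57, -2.04, -1.09, 1.53])
x = np.linspace(0, 5, 10)
obs = 2.0 + 3.0 * np.exp(x) + noise
design = np.column_stack([np.ones_like(x), np.exp(x)])
coeffs = np.linalg.lstsq(design, obs, rcond=None)[0]

# The same regressor functions, evaluated in the new covariates.
xnew = np.linspace(-5.0, 5.0, 20)
regressor_list = [np.ones_like(xnew), np.exp(xnew)]

# Form 1: a list of regressor arrays, combined term by term.
y_from_list = sum(c * f for c, f in zip(coeffs, regressor_list))

# Form 2: a column-stacked design matrix, one row per data point.
design_new = np.column_stack(regressor_list)
y_from_matrix = design_new @ coeffs

print(design_new.shape)
print(np.allclose(y_from_list, y_from_matrix))
```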
