Package ivs :: Package statistics :: Module pca

Module pca


Principal component analysis


Version: 1.1.02

Date: February 2008

Author: Henning Risvik

Functions
    Preprocessing Methods
 
mean_center(X)
Returns: Mean centered X (always has same dimensions as X)
 
standardization(X)
Returns: Standardized X (always has same dimensions as X)
    NIPALS array help functions
 
get_column(E)
Get an acceptable column-vector of E.
 
get_column_mat(E)
NIPALS matrix help function.
 
vec_inner(v)
Returns: transpose(v) * v (float or int)
 
mat_prod(A, x)
Returns: b of (Ax = b).
 
remove_tp_prod(E, t, p)
Sets E = E - (t*transpose(p)), where E and (t*transpose(p)) are both (m, n)-matrices.
    NIPALS Algorithm
 
nipals_mat(X, PCs, threshold, E_matrices)
PCA by NIPALS using numpy matrix
 
nipals_arr(X, PCs, threshold, E_matrices)
PCA by NIPALS using numpy array
    Principal Component Analysis (using NIPALS)
 
PCA_nipals(X, standardize=True, PCs=10, threshold=0.0001, E_matrices=False)
PCA by NIPALS; returns Scores, Loadings, E.
 
PCA_nipals2(X, standardize=True, PCs=10, threshold=0.0001, E_matrices=False)
PCA by NIPALS; returns Scores, Loadings, E.
 
PCA_nipals_c(X, standardize=True, PCs=10, threshold=0.0001, E_matrices=False)
PCA by NIPALS; returns Scores, Loadings, E.
    Principal Component Analysis (using SVD)
 
PCA_svd(X, standardize=True)
PCA by SVD; returns Scores, Loadings and explained variance. Remake of a method by Oliver Tomic, Ph.D.
    Correlation Loadings
 
CorrelationLoadings(X, Scores)
Get correlation loadings matrix based on Scores (T of PCA) and X (original variables, not mean centered).
Function Details

mean_center(X)

Parameters:
  • X (numpy array) - 2-dimensional matrix of numeric data.
Returns:
Mean centered X (always has same dimensions as X)
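
As an illustration of what mean centering does, here is a minimal sketch. It assumes that variables are stored in columns and that centering is done per column; mean_center_sketch is a hypothetical stand-in, not the module's own code.

```python
import numpy as np

def mean_center_sketch(X):
    """Subtract each column's mean from that column (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    # X.mean(axis=0) has one entry per column and broadcasts over the rows.
    return X - X.mean(axis=0)

X = np.array([[1.0, 10.0],
              [3.0, 20.0],
              [5.0, 30.0]])
Xc = mean_center_sketch(X)
```

The result has the same shape as X, and every column of Xc averages to zero.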

standardization(X)

Parameters:
  • X (numpy array) - 2-dimensional matrix of numeric data.
Returns:
Standardized X (always has same dimensions as X)
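
A common way to standardize is to mean-center each column and divide it by its standard deviation. The sketch below assumes that convention and uses the sample standard deviation (ddof=1); whether the module uses ddof=0 or ddof=1 is not stated here, so the ddof argument and the name standardize_sketch are assumptions.

```python
import numpy as np

def standardize_sketch(X, ddof=1):
    """Center each column and scale it to unit standard deviation.

    Hypothetical helper; ddof=1 (sample standard deviation) is an assumption.
    Columns with zero variance would cause a division by zero.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    return centered / X.std(axis=0, ddof=ddof)

X = np.array([[1.0, 10.0],
              [3.0, 20.0],
              [5.0, 30.0]])
Z = standardize_sketch(X)
```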

get_column(E)

Get an acceptable column-vector of E.

Parameters:
  • E (numpy array) - 2-dimensional matrix of numeric data.
Returns:
a non-zero vector
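
The page does not say how an "acceptable" column is chosen. One plausible rule, sketched below purely as an assumption, is to return the first column that is not entirely zero; first_nonzero_column is a hypothetical name.

```python
import numpy as np

def first_nonzero_column(E):
    """Return a copy of the first column of E that has a non-zero entry.

    Hypothetical selection rule; the module may choose differently.
    """
    E = np.asarray(E, dtype=float)
    for j in range(E.shape[1]):
        col = E[:, j]
        if np.any(col != 0):
            return col.copy()
    raise ValueError("E has no non-zero column")

E = np.array([[0.0, 2.0],
              [0.0, 0.0],
              [0.0, 1.0]])
t = first_nonzero_column(E)  # skips the all-zero first column
```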

get_column_mat(E)

NIPALS matrix help function. Get an acceptable column-vector of E.

Parameters:
  • E (numpy matrix) - 2-dimensional matrix of numeric data.
Returns:
a non-zero vector

vec_inner(v)

Parameters:
  • v (numpy array) - Vector of numeric data.
Returns:
transpose(v) * v (float or int)

mat_prod(A, x)

Parameters:
  • A (numpy array) - 2-dimensional matrix of numeric data.
  • x (numpy array) - Vector of numeric data.
Returns:
b of (Ax = b). Product of: matrix A (m,n) * vector x (n) = vector b (m)
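
For reference, the two products described for vec_inner and mat_prod can be reproduced with plain numpy calls. This is only a sketch of the arithmetic using np.dot, not the module's implementation.

```python
import numpy as np

# Inner product transpose(v) * v: a single scalar.
v = np.array([1.0, 2.0, 3.0])
inner = np.dot(v, v)          # 1 + 4 + 9

# Matrix-vector product: A is (m, n), x has n entries, b has m entries.
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])
x = np.array([1.0, 2.0, 3.0])
b = np.dot(A, x)
```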

remove_tp_prod(E, t, p)

Sets E = E - (t*transpose(p)), where E and (t*transpose(p)) are both (m, n)-matrices.

Parameters:
  • E (numpy array) - 2-dimensional matrix of numeric data.
  • t (numpy array) - Vector of numeric data. Current Scores (of PC_i).
  • p (numpy array) - Vector of numeric data. Current Loading (of PC_i).
Returns:
None
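
Since the function returns None, E is presumably updated in place. A minimal sketch of such an in-place deflation step, assuming t has m entries and p has n entries, could look like this (deflate_in_place is a hypothetical name):

```python
import numpy as np

def deflate_in_place(E, t, p):
    """Subtract the rank-one matrix t * transpose(p) from E, modifying E."""
    # np.outer(t, p) has shape (len(t), len(p)), matching E's (m, n).
    E -= np.outer(t, p)

E = np.array([[2.0, 4.0],
              [1.0, 2.0],
              [3.0, 6.0]])
t = np.array([2.0, 1.0, 3.0])
p = np.array([1.0, 2.0])
result = deflate_in_place(E, t, p)
```

In this example E is exactly t * transpose(p), so after the call E contains only zeros and result is None.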

nipals_mat(X, PCs, threshold, E_matrices)

PCA by NIPALS using numpy matrix

Parameters:
  • X (numpy array) - 2-dimensional matrix of numeric data.
  • PCs (int) - Number of Principal Components.
  • threshold (float) - Convergence check value, used to test whether the difference has converged to zero (e.g. 0.000001).
  • E_matrices (bool) - Whether E-matrices should be retrieved: the returned E is either the E-matrices (for each PC) or explained_var (explained variance for each PC).
Returns:
(Scores, Loadings, E)

nipals_arr(X, PCs, threshold, E_matrices)

PCA by NIPALS using numpy array

Parameters:
  • X (numpy array) - 2-dimensional matrix of numeric data.
  • PCs (int) - Number of Principal Components.
  • threshold (float) - Convergence check value, used to test whether the difference has converged to zero (e.g. 0.000001).
  • E_matrices (bool) - Whether E-matrices should be retrieved: the returned E is either the E-matrices (for each PC) or explained_var (explained variance for each PC).
Returns:
(Scores, Loadings, E)
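
To make the roles of PCs and threshold concrete, here is a minimal sketch of the textbook NIPALS iteration. It is not the module's code: the starting column, the max_iter guard, and the fact that it always returns the final residual matrix (ignoring the E_matrices option) are assumptions made for illustration.

```python
import numpy as np

def nipals_sketch(X, n_pcs, threshold=1e-6, max_iter=500):
    """Textbook NIPALS; X should already be mean-centered (hypothetical helper).

    n_pcs must not exceed the rank of X, otherwise t becomes a zero vector.
    """
    E = np.array(X, dtype=float)              # working copy, deflated per PC
    m, n = E.shape
    scores = np.zeros((m, n_pcs))
    loadings = np.zeros((n, n_pcs))
    for i in range(n_pcs):
        # Start from the column with the largest sum of squares.
        t = E[:, np.argmax((E ** 2).sum(axis=0))].copy()
        for _ in range(max_iter):
            p = E.T @ t / (t @ t)             # project E onto the scores
            p /= np.linalg.norm(p)            # loadings have unit length
            t_new = E @ p                     # updated scores
            diff = t_new - t
            t = t_new
            if diff @ diff < threshold:       # scores stopped changing
                break
        scores[:, i] = t
        loadings[:, i] = p
        E -= np.outer(t, p)                   # remove this PC from E
    return scores, loadings, E

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3)) * np.array([5.0, 2.0, 0.5])
X = X - X.mean(axis=0)
T, P, E = nipals_sketch(X, n_pcs=2)
```

By construction, T times transpose(P) plus the residual E reproduces X, and the columns of P are orthonormal.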

PCA_nipals(X, standardize=True, PCs=10, threshold=0.0001, E_matrices=False)

PCA by NIPALS; returns Scores, Loadings, E.

Parameters:
  • X (numpy array) - 2-dimensional matrix of numeric data.
  • standardize (bool) - Whether X should be standardized or not.
  • PCs (int) - Number of Principal Components.
  • threshold (float) - Convergence check value, used to test whether the difference has converged to zero (e.g. 0.000001).
  • E_matrices (bool) - Whether E-matrices should be retrieved: the returned E is either the E-matrices (for each PC) or explained_var (explained variance for each PC).
Returns:
The result of nipals_mat(X, PCs, threshold, E_matrices)

PCA_nipals2(X, standardize=True, PCs=10, threshold=0.0001, E_matrices=False)

PCA by NIPALS; returns Scores, Loadings, E.

Parameters:
  • X (numpy array) - 2-dimensional matrix of numeric data.
  • standardize (bool) - Whether X should be standardized or not.
  • PCs (int) - Number of Principal Components.
  • threshold (float) - Convergence check value, used to test whether the difference has converged to zero (e.g. 0.000001).
  • E_matrices (bool) - Whether E-matrices should be retrieved: the returned E is either the E-matrices (for each PC) or explained_var (explained variance for each PC).
Returns:
The result of nipals_arr(X, PCs, threshold, E_matrices)

PCA_nipals_c(X, standardize=True, PCs=10, threshold=0.0001, E_matrices=False)

PCA by NIPALS; returns Scores, Loadings, E.

Parameters:
  • X (numpy array) - 2-dimensional matrix of numeric data.
  • standardize (bool) - Whether X should be standardized or not.
  • PCs (int) - Number of Principal Components.
  • threshold (float) - Convergence check value, used to test whether the difference has converged to zero (e.g. 0.000001).
  • E_matrices (bool) - Whether E-matrices should be retrieved: the returned E is either the E-matrices (for each PC) or explained_var (explained variance for each PC).
Returns:
The result of nipals_c(X, PCs, threshold, E_matrices)

PCA_svd(X, standardize=True)

PCA by SVD; returns Scores, Loadings and explained variance. Remake of a method by Oliver Tomic, Ph.D.

Parameters:
  • X (numpy array) - 2-dimensional matrix of numeric data.
  • standardize (bool) - Whether X should be standardized or not.
Returns:
(Scores, Loadings, explained_var)
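
For comparison with the NIPALS variants, PCA can be obtained directly from a singular value decomposition. The sketch below uses np.linalg.svd and is an illustration only: the preprocessing (mean centering, optional scaling with ddof=1) and the definition of explained_var as a fraction of total variance are assumptions, and pca_svd_sketch is a hypothetical name.

```python
import numpy as np

def pca_svd_sketch(X, standardize=True):
    """PCA via SVD of the preprocessed data (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    Xp = X - X.mean(axis=0)                    # always mean-center
    if standardize:
        Xp = Xp / X.std(axis=0, ddof=1)        # assumed scaling convention
    U, s, Vt = np.linalg.svd(Xp, full_matrices=False)
    scores = U * s                             # T = U * diag(s)
    loadings = Vt.T                            # one column per component
    explained_var = s ** 2 / np.sum(s ** 2)    # fraction per component
    return scores, loadings, explained_var

X = np.array([[2.0, 1.0, 0.5],
              [4.0, 3.0, 0.1],
              [6.0, 2.0, 0.9],
              [8.0, 5.0, 0.3],
              [10.0, 4.0, 0.7]])
T, P, ev = pca_svd_sketch(X)
```

The components come out ordered by decreasing explained variance, because np.linalg.svd returns the singular values in descending order.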

CorrelationLoadings(X, Scores)

Get correlation loadings matrix based on Scores (T of PCA) and X (original variables, not mean centered). Remake of a method by Oliver Tomic, Ph.D.

Parameters:
  • X (numpy array) - 2-dimensional matrix of numeric data.
  • Scores (numpy array) - Scores of PCA (T).
Returns:
Returns the correlation loadings matrix
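
Correlation loadings are usually understood as the correlation between each original variable and each score vector. The sketch below computes them with np.corrcoef; the output layout (one row per PC, one column per variable) and the name correlation_loadings_sketch are assumptions, and the module's own layout may differ.

```python
import numpy as np

def correlation_loadings_sketch(X, scores):
    """Correlation of every column of X with every column of scores."""
    X = np.asarray(X, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n_vars, n_pcs = X.shape[1], scores.shape[1]
    out = np.zeros((n_pcs, n_vars))            # assumed layout: PCs x variables
    for i in range(n_pcs):
        for j in range(n_vars):
            # corrcoef of two 1-D arrays gives a 2x2 matrix; [0, 1] is r.
            out[i, j] = np.corrcoef(scores[:, i], X[:, j])[0, 1]
    return out

X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])
# Toy score vector: a scaled, centered copy of the first variable.
scores = ((X[:, 0] - X[:, 0].mean()) * 2.0).reshape(-1, 1)
C = correlation_loadings_sketch(X, scores)
```

Because correlation is invariant to shifting and scaling, X does not need to be mean-centered first, which matches the note that the original variables are passed in.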