statsmodels.regression.linear_model.GLS

class statsmodels.regression.linear_model.GLS(endog, exog, sigma=None, missing='none', hasconst=None, **kwargs)

Generalized Least Squares

Parameters

endog : array_like

A 1-d endogenous response variable. The dependent variable.

exog : array_like

A nobs x k array where nobs is the number of observations and k is the number of regressors. An intercept is not included by default and should be added by the user. See statsmodels.tools.add_constant.

sigma : scalar or array

sigma is the weighting matrix of the error covariance. The default is None, for no scaling. If sigma is a scalar, it is treated as an n x n diagonal matrix with that scalar as the value of each diagonal element. If sigma is an n-length vector, it is treated as a diagonal matrix with the given values on the diagonal. With a diagonal sigma the fit should be the same as WLS with weights 1/sigma.

missing : str

Available options are ‘none’, ‘drop’, and ‘raise’. If ‘none’, no nan checking is done. If ‘drop’, any observations with nans are dropped. If ‘raise’, an error is raised. Default is ‘none’.

hasconst : None or bool

Indicates whether the RHS includes a user-supplied constant. If True, a constant is not checked for and k_constant is set to 1 and all result statistics are calculated as if a constant is present. If False, a constant is not checked for and k_constant is set to 0.

**kwargs

Extra arguments that are used to set model properties when using the formula interface.

See also

WLS

Fit a linear model using Weighted Least Squares.

OLS

Fit a linear model using Ordinary Least Squares.

Notes

If sigma is a function of the data in a way that makes one of the regressors a constant, the current postestimation statistics will not be correct.

Examples

>>> import numpy as np
>>> import statsmodels.api as sm
>>> data = sm.datasets.longley.load(as_pandas=False)
>>> data.exog = sm.add_constant(data.exog)
>>> ols_resid = sm.OLS(data.endog, data.exog).fit().resid
>>> res_fit = sm.OLS(ols_resid[1:], ols_resid[:-1]).fit()
>>> rho = res_fit.params

rho is a consistent estimator of the first-order autocorrelation of the residuals from an OLS fit of the longley data. It is assumed here to be the true rho of the AR(1) process that generated the errors.

>>> from scipy.linalg import toeplitz
>>> order = toeplitz(np.arange(16))
>>> sigma = rho**order

sigma is an n x n matrix of the autocorrelation structure of the data.

>>> gls_model = sm.GLS(data.endog, data.exog, sigma=sigma)
>>> gls_results = gls_model.fit()
>>> print(gls_results.summary())

Attributes

df_model

The model degrees of freedom.

df_resid

The residual degrees of freedom.

pinv_wexog

(ndarray) pinv_wexog is the p x n Moore-Penrose pseudoinverse of wexog.

cholsigmainv

(ndarray) The transpose of the Cholesky decomposition of the pseudoinverse of sigma.

llf

(float) The value of the log-likelihood function of the fitted model.

nobs

(float) The number of observations n.

normalized_cov_params

(ndarray) p x p array \((X^{T}\Sigma^{-1}X)^{-1}\)

results

(RegressionResults instance) A property that returns the RegressionResults instance once the model has been fit.

sigma

(ndarray) sigma is the n x n covariance structure of the error terms.

wexog

(ndarray) Design matrix whitened by cholsigmainv

wendog

(ndarray) Response variable whitened by cholsigmainv

Methods

fit([method, cov_type, cov_kwds, use_t])

Full fit of the model.

fit_regularized([method, alpha, L1_wt, …])

Return a regularized fit to a linear regression model.

from_formula(formula, data[, subset, drop_cols])

Create a Model from a formula and dataframe.

get_distribution(params, scale[, exog, …])

Construct a random number generator for the predictive distribution.

hessian(params)

The Hessian matrix of the model.

hessian_factor(params[, scale, observed])

Compute weights for calculating Hessian.

information(params)

Fisher information matrix of model.

initialize()

Initialize model components.

loglike(params)

Compute the value of the Gaussian log-likelihood function at params.

predict(params[, exog])

Return linear predicted values from a design matrix.

score(params)

Score vector of model.

whiten(x)

GLS whiten method.

Properties

df_model

The model degrees of freedom.

df_resid

The residual degrees of freedom.

endog_names

Names of endogenous variables.

exog_names

Names of exogenous variables.