7.18. Tools

Our tool collection contains some convenience functions for users and functions that were written mainly for internal use.

In addition to this tools directory, several other subpackages have their own tools modules, for example statsmodels.tsa.tsatools.

7.18.1. Module Reference

7.18.1.1. Basic tools (tools)

These are basic and miscellaneous tools. The full import path is statsmodels.tools.tools.

tools.add_constant(data[, prepend, has_constant]) This appends a column of ones to an array if prepend==False.
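To illustrate what adding a constant column means, here is a minimal pure-Python sketch. It is not the statsmodels implementation: the name add_constant_sketch is hypothetical, the data is a plain list of rows rather than an array, and the has_constant handling is left out entirely.

```python
def add_constant_sketch(rows, prepend):
    """Return a copy of `rows` with a constant 1.0 added to every row.

    Hypothetical helper for illustration only; it mimics the idea of
    tools.add_constant on a list of lists, not its actual behavior.
    """
    if prepend:
        # Constant goes in the first column.
        return [[1.0] + list(row) for row in rows]
    # Constant goes in the last column.
    return [list(row) + [1.0] for row in rows]


data = [[2.0, 3.0], [4.0, 5.0]]
appended = add_constant_sketch(data, prepend=False)
prepended = add_constant_sketch(data, prepend=True)
print(appended)
print(prepended)
```

With prepend=False the ones end up as the last column; with prepend=True they end up as the first.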

The next group consists mostly of helper functions that are either not separately tested or insufficiently tested.

tools.categorical(data[, col, dictnames, drop]) Returns a dummy matrix given an array of categorical variables.
tools.ECDF(x[, side]) Return the Empirical CDF of an array as a step function.
tools.clean0(matrix) Erase columns of zeros: can save some time in pseudoinverse.
tools.fullrank(X[, r]) Return a matrix whose column span is the same as X.
tools.isestimable(C, D) True if (Q, P) contrast C is estimable for (N, P) design D
tools.monotone_fn_inverter(fn, x[, vectorized]) Given a monotone function fn (no checking is done to verify monotonicity) and a set of x values, return a linearly interpolated approximation to its inverse from its values on x.
tools.rank(X[, cond]) Return the rank of a matrix X based on its generalized inverse, not the SVD.
tools.recipr(X) Return the reciprocal of an array, setting all entries less than or equal to 0 to 0.
tools.recipr0(X) Return the reciprocal of an array, setting all entries equal to 0 to 0.
tools.unsqueeze(data, axis, oldshape) Unsqueeze a collapsed array
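
The difference between recipr and recipr0 in the list above is easy to miss, so the following sketch spells it out on a plain list. The function names recipr_sketch and recipr0_sketch are hypothetical, and the code only follows the one-line descriptions given here, not the library source.

```python
def recipr_sketch(values):
    """Reciprocal of each entry; entries <= 0 are mapped to 0."""
    return [1.0 / v if v > 0 else 0.0 for v in values]


def recipr0_sketch(values):
    """Reciprocal of each entry; only entries equal to 0 are mapped to 0."""
    return [1.0 / v if v != 0 else 0.0 for v in values]


x = [2.0, 0.0, -4.0]
r = recipr_sketch(x)
r0 = recipr0_sketch(x)
print(r)
print(r0)
```

The two versions agree on positive and zero entries and differ only on negative ones, which the first maps to 0 and the second inverts.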

7.18.1.2. Numerical Differentiation

numdiff.approx_fprime(x, f[, epsilon, args, ...]) Gradient of function, or Jacobian if function f returns 1d array
numdiff.approx_fprime_cs(x, f[, epsilon, ...]) Calculate gradient or Jacobian with complex step derivative approximation
numdiff.approx_hess1(x, f[, epsilon, args, ...]) Calculate Hessian with finite difference derivative approximation
numdiff.approx_hess2(x, f[, epsilon, args, ...]) Calculate Hessian with finite difference derivative approximation
numdiff.approx_hess3(x, f[, epsilon, args, ...]) Calculate Hessian with finite difference derivative approximation
numdiff.approx_hess_cs(x, f[, epsilon, ...]) Calculate Hessian with complex-step derivative approximation
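
The two approximation styles named in this list, finite differences and the complex-step method, can be sketched in a few lines of standard-library Python. This is an illustration under assumptions, not the numdiff code: the function names, the default step sizes, and the restriction to a scalar-valued f taking a list are all choices made for this example. The argument order (x, f) simply mirrors the signatures above.

```python
def forward_diff_gradient(x, f, epsilon=1e-6):
    """Forward-difference gradient: (f(x + h*e_i) - f(x)) / h for each i."""
    f0 = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += epsilon
        grad.append((f(shifted) - f0) / epsilon)
    return grad


def complex_step_gradient(x, f, epsilon=1e-20):
    """Complex-step gradient: Im(f(x + i*h*e_i)) / h for each i.

    Requires f to accept complex inputs and to be built from
    operations that are analytic (no abs, no comparisons).
    """
    grad = []
    for i in range(len(x)):
        shifted = [complex(v) for v in x]
        shifted[i] += 1j * epsilon
        grad.append(f(shifted).imag / epsilon)
    return grad


def f(p):
    # Example objective; exact gradient is [2*p0 + 3*p1, 3*p0].
    return p[0] * p[0] + 3.0 * p[0] * p[1]


point = [1.0, 2.0]
grad_fd = forward_diff_gradient(point, f)
grad_cs = complex_step_gradient(point, f)
print(grad_fd)
print(grad_cs)
```

The complex-step variant involves no subtraction of nearly equal numbers, so it tolerates a far smaller step than the forward difference, at the cost of requiring a function that works on complex arguments.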

7.18.1.3. Measures of fit performance (eval_measures)

The first group of functions in this module are standalone versions of the information criteria aic, bic, and hqic. The functions with the _sigma suffix take the error sum of squares as their argument; those without it take the value of the log-likelihood, llf.
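
As a rough guide to what the log-likelihood versions compute, the sketch below uses the textbook definitions of the three criteria. It assumes that df_modelwc is the number of parameters including the constant and that the module follows these standard formulas; the function names with the _sketch suffix are hypothetical and the input numbers are made up for the example.

```python
import math


def aic_sketch(llf, nobs, df_modelwc):
    """Akaike information criterion: -2*llf + 2*k."""
    return -2.0 * llf + 2.0 * df_modelwc


def bic_sketch(llf, nobs, df_modelwc):
    """Bayesian (Schwarz) information criterion: -2*llf + log(nobs)*k."""
    return -2.0 * llf + math.log(nobs) * df_modelwc


def hqic_sketch(llf, nobs, df_modelwc):
    """Hannan-Quinn information criterion: -2*llf + 2*log(log(nobs))*k."""
    return -2.0 * llf + 2.0 * math.log(math.log(nobs)) * df_modelwc


# Illustrative inputs only: log-likelihood, sample size, parameter count.
llf, nobs, k = -120.5, 100, 3
aic_value = aic_sketch(llf, nobs, k)
bic_value = bic_sketch(llf, nobs, k)
hqic_value = hqic_sketch(llf, nobs, k)
print(aic_value, hqic_value, bic_value)
```

All three share the -2*llf term and differ only in how strongly they penalize the parameter count, which is why nobs is unused in the AIC sketch but kept for a uniform signature.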

The second group of functions are measures of fit or prediction performance, which are mostly one-liners to be used as helper functions. All of them calculate a performance or distance statistic for the difference between two arrays. For example, in a Monte Carlo or cross-validation setting, the first array would hold the estimation results for the different replications or draws, while the second array would hold the true or observed values.

eval_measures.aic(llf, nobs, df_modelwc) Akaike information criterion
eval_measures.aic_sigma(sigma2, nobs, df_modelwc) Akaike information criterion
eval_measures.aicc(llf, nobs, df_modelwc) Akaike information criterion (AIC) with small sample correction
eval_measures.aicc_sigma(sigma2, nobs, ...) Akaike information criterion (AIC) with small sample correction
eval_measures.bic(llf, nobs, df_modelwc) Bayesian information criterion (BIC) or Schwarz criterion
eval_measures.bic_sigma(sigma2, nobs, df_modelwc) Bayesian information criterion (BIC) or Schwarz criterion
eval_measures.hqic(llf, nobs, df_modelwc) Hannan-Quinn information criterion (HQC)
eval_measures.hqic_sigma(sigma2, nobs, ...) Hannan-Quinn information criterion (HQC)
eval_measures.bias(x1, x2[, axis]) bias, mean error
eval_measures.iqr(x1, x2[, axis]) interquartile range of error
eval_measures.maxabs(x1, x2[, axis]) maximum absolute error
eval_measures.meanabs(x1, x2[, axis]) mean absolute error
eval_measures.medianabs(x1, x2[, axis]) median absolute error
eval_measures.medianbias(x1, x2[, axis]) median bias, median error
eval_measures.mse(x1, x2[, axis]) mean squared error
eval_measures.rmse(x1, x2[, axis]) root mean squared error
eval_measures.stde(x1, x2[, ddof, axis]) standard deviation of error
eval_measures.vare(x1, x2[, ddof, axis]) variance of error
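
The error measures in this list all reduce the elementwise difference x1 - x2 to a single number. A minimal sketch on plain lists, assuming that sign convention and ignoring the axis and ddof arguments, could look as follows; the helper names are hypothetical and the data is invented for the example.

```python
import math


def errors(x1, x2):
    """Elementwise difference x1 - x2 (estimates minus true values)."""
    return [a - b for a, b in zip(x1, x2)]


def bias_sketch(x1, x2):
    """Mean error."""
    e = errors(x1, x2)
    return sum(e) / len(e)


def mse_sketch(x1, x2):
    """Mean squared error."""
    e = errors(x1, x2)
    return sum(v * v for v in e) / len(e)


def rmse_sketch(x1, x2):
    """Root mean squared error."""
    return math.sqrt(mse_sketch(x1, x2))


def meanabs_sketch(x1, x2):
    """Mean absolute error."""
    e = errors(x1, x2)
    return sum(abs(v) for v in e) / len(e)


def maxabs_sketch(x1, x2):
    """Maximum absolute error."""
    return max(abs(v) for v in errors(x1, x2))


estimates = [1.0, 2.0, 4.0]   # e.g. results from three replications
truth = [1.5, 2.0, 3.0]       # corresponding true or observed values
bias_value = bias_sketch(estimates, truth)
mse_value = mse_sketch(estimates, truth)
rmse_value = rmse_sketch(estimates, truth)
meanabs_value = meanabs_sketch(estimates, truth)
maxabs_value = maxabs_sketch(estimates, truth)
print(bias_value, mse_value, rmse_value, meanabs_value, maxabs_value)
```

Here the errors are -0.5, 0.0 and 1.0, so the mean absolute error is 0.5 and the maximum absolute error is 1.0.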