7.10. Statistics stats

This section collects various statistical tests and tools. Some can be used independently of any models; others are intended as extensions to the models and model results.

API Warning: The functions and objects in this category are spread out over various modules and might still be moved around. We expect that in the future the statistical tests will return class instances with more informative reporting instead of only the raw numbers.

7.10.1. Residual Diagnostics and Specification Tests

durbin_watson(resids[, axis]) Calculates the Durbin-Watson statistic
jarque_bera(resids[, axis]) Calculate residual skewness, kurtosis, and do the JB test for normality
omni_normtest(resids[, axis]) Omnibus test for normality
acorr_ljungbox(x[, lags, boxpierce]) Ljung-Box test for no autocorrelation
acorr_breush_godfrey(results[, nlags, store]) Breusch-Godfrey Lagrange Multiplier tests for residual autocorrelation
HetGoldfeldQuandt test whether variance is the same in 2 subsamples
het_goldfeldquandt see class docstring
het_breushpagan(resid, exog_het) Breusch-Pagan Lagrange Multiplier test for heteroscedasticity
het_white(resid, exog[, retres]) White’s Lagrange Multiplier Test for Heteroscedasticity
het_arch(resid[, maxlag, autolag, store, ...]) Engle’s Test for Autoregressive Conditional Heteroscedasticity (ARCH)
linear_harvey_collier(res) Harvey Collier test for linearity
linear_rainbow(res[, frac]) Rainbow test for linearity
linear_lm(resid, exog[, func]) Lagrange multiplier test for linearity against functional alternative
breaks_cusumolsresid(olsresidual[, ddof]) cusum test for parameter stability based on ols residuals
breaks_hansen(olsresults) test for model stability, breaks in parameters for ols, Hansen 1992
recursive_olsresiduals(olsresults[, skip, ...]) calculate recursive ols with residuals and cusum test statistic
CompareCox Cox Test for non-nested models
compare_cox Cox Test for non-nested models
CompareJ J-Test for comparing non-nested models
compare_j J-Test for comparing non-nested models
unitroot_adf(x[, maxlag, trendorder, ...])
normal_ad(x[, axis]) Anderson-Darling test for normal distribution unknown mean and variance
kstest_normal(x[, pvalmethod]) Lilliefors test for normality
lillifors(x[, pvalmethod]) Lilliefors test for normality
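
As an illustration of what the first entry in this list computes, the Durbin-Watson statistic is the sum of squared first differences of the residuals divided by their sum of squares. The following is a minimal pure-Python sketch of that textbook formula, not the library's implementation; the name durbin_watson_sketch and the sample data are made up for the example.

```python
def durbin_watson_sketch(resids):
    """Durbin-Watson statistic: squared first differences over sum of squares."""
    resids = list(resids)
    if len(resids) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum((b - a) ** 2 for a, b in zip(resids, resids[1:]))
    denominator = sum(e ** 2 for e in resids)
    return numerator / denominator


# Alternating residuals are negatively autocorrelated: the statistic moves toward 4.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
# Slowly drifting residuals are positively autocorrelated: the statistic moves toward 0.
drifting = [1.0, 1.1, 1.2, 1.1, 1.0, 0.9]

print(durbin_watson_sketch(alternating))
print(durbin_watson_sketch(drifting))
```

Values near 2 indicate no first-order autocorrelation in the residuals.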

7.10.1.25. Outliers and influence measures

OLSInfluence(results) class to calculate outlier and influence measures for OLS result
variance_inflation_factor(exog, exog_idx) variance inflation factor, VIF, for one exogenous variable

See also the notes on regression diagnostics.

7.10.2. Sandwich Robust Covariances

The following functions calculate covariance matrices and standard errors for the parameter estimates that are robust to heteroscedasticity and autocorrelation in the errors. Similar to the methods that are available for the LinearModelResults, these methods are designed for use with OLS.

sandwich_covariance.cov_hac(results[, ...]) heteroscedasticity and autocorrelation robust covariance matrix (Newey-West)
sandwich_covariance.cov_nw_panel(results, ...) Panel HAC robust covariance matrix
sandwich_covariance.cov_nw_groupsum(results, ...) Driscoll and Kraay Panel robust covariance matrix
sandwich_covariance.cov_cluster(results, group) cluster robust covariance matrix
sandwich_covariance.cov_cluster_2groups(...) cluster robust covariance matrix for two groups/clusters
sandwich_covariance.cov_white_simple(results) heteroscedasticity robust covariance matrix (White)

The following are standalone versions of the heteroscedasticity robust standard errors attached to LinearModelResults

sandwich_covariance.cov_hc0(results) See statsmodels.RegressionResults
sandwich_covariance.cov_hc1(results) See statsmodels.RegressionResults
sandwich_covariance.cov_hc2(results) See statsmodels.RegressionResults
sandwich_covariance.cov_hc3(results) See statsmodels.RegressionResults
sandwich_covariance.se_cov(cov) get standard deviation from covariance matrix
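
To make the idea of a heteroscedasticity-robust ("sandwich") covariance concrete, here is a minimal sketch of the White/HC0 estimator for the special case of a single regressor without intercept, where the matrix formula (X'X)^-1 X' diag(e^2) X (X'X)^-1 reduces to scalars. This is an illustration under those simplifying assumptions, not the code behind cov_hc0; the function name and data are hypothetical.

```python
import math


def ols_hc0_one_regressor(x, y):
    """OLS slope for y = beta * x + e, with its HC0 variance and standard error."""
    sxx = sum(xi * xi for xi in x)                       # "bread": X'X
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx    # OLS estimate
    resid = [yi - beta * xi for xi, yi in zip(x, y)]
    # "meat": X' diag(e^2) X, each observation weighted by its squared residual
    meat = sum((xi * ei) ** 2 for xi, ei in zip(x, resid))
    cov_hc0 = meat / sxx ** 2
    # the standard error is the square root of the (here 1x1) covariance
    return beta, cov_hc0, math.sqrt(cov_hc0)


x = [1.0, 2.0, 3.0, 4.0]
y = [1.1, 1.9, 3.2, 3.8]
beta, cov_hc0, se_hc0 = ols_hc0_one_regressor(x, y)
print(beta, cov_hc0, se_hc0)
```

The final square root plays the role that se_cov plays for a full covariance matrix, where the standard errors are the square roots of the diagonal.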

7.10.3. Goodness of Fit Tests and Measures

Some tests for goodness of fit for univariate distributions.

powerdiscrepancy(observed, expected[, ...]) Calculates power discrepancy, a class of goodness-of-fit tests as a measure of discrepancy between observed and expected data.
gof_chisquare_discrete(distfn, arg, rvs, ...) perform chisquare test for random sample of a discrete distribution
gof_binning_discrete(rvs, distfn, arg[, nsupp]) get bins for chisquare type gof tests for a discrete distribution
chisquare_effectsize(probs0, probs1[, ...]) effect size for a chisquare goodness-of-fit test
normal_ad(x[, axis]) Anderson-Darling test for normal distribution unknown mean and variance
kstest_normal(x[, pvalmethod]) Lilliefors test for normality
lillifors(x[, pvalmethod]) Lilliefors test for normality

7.10.4. Non-Parametric Tests

mcnemar(x[, y, exact, correction]) McNemar test
symmetry_bowker(table) Test for symmetry of a (k, k) square contingency table
median_test_ksample(x, groups) chisquare test for equality of median/location
runstest_1samp(x[, cutoff, correction]) use runs test on binary discretized data above/below cutoff
runstest_2samp(x[, y, groups, correction]) Wald-Wolfowitz runstest for two samples
cochrans_q(x) Cochran’s Q test for identical effect of k treatments
Runs(x) class for runs in a binary sequence
sign_test(samp[, mu0]) Signs test.

7.10.5. Interrater Reliability and Agreement

The main function that statsmodels currently has available for interrater agreement measures and tests is Cohen’s Kappa. Fleiss’ Kappa is currently only implemented as a measure, without associated result statistics.

cohens_kappa(table[, weights, ...]) Compute Cohen’s kappa with variance and equal-zero test
fleiss_kappa(table) Fleiss’ kappa multi-rater agreement measure
to_table(data[, bins]) convert raw data with shape (subject, rater) to (rater1, rater2)
aggregate_raters(data[, n_cat]) convert raw data with shape (subject, rater) to (subject, cat_counts)
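
For orientation, unweighted Cohen's kappa compares the observed agreement on the diagonal of a (rater1, rater2) table with the agreement expected from the marginals alone. The sketch below implements only that point estimate in plain Python; it does not reproduce the variance or the equal-zero test, and the function name and table are invented for the example.

```python
def cohens_kappa_sketch(table):
    """Unweighted Cohen's kappa from a square (k, k) table of counts."""
    k = len(table)
    if any(len(row) != k for row in table):
        raise ValueError("table must be square")
    total = sum(sum(row) for row in table)
    row_marg = [sum(row) / total for row in table]
    col_marg = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    p_observed = sum(table[i][i] for i in range(k)) / total
    # agreement expected by chance if the two raters were independent
    p_expected = sum(r * c for r, c in zip(row_marg, col_marg))
    return (p_observed - p_expected) / (1.0 - p_expected)


# rows: category chosen by rater 1, columns: category chosen by rater 2
ratings = [[20, 5],
           [10, 15]]
print(cohens_kappa_sketch(ratings))  # approximately 0.4
```

A value of 1 means perfect agreement and a value of 0 means agreement no better than chance.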

7.10.6. Multiple Tests and Multiple Comparison Procedures

multipletests is a function for p-value correction, which also includes p-value correction based on the false discovery rate (FDR) in fdrcorrection. tukeyhsd performs simultaneous testing for the comparison of (independent) means. These three functions are verified. GroupsStats and MultiComparison are convenience classes for multiple comparisons similar to one-way ANOVA, but are still in development.

multipletests(pvals[, alpha, method, ...]) test results and p-value correction for multiple tests
fdrcorrection0(pvals[, alpha, method, is_sorted]) pvalue correction for false discovery rate
GroupsStats(x[, useranks, uni, intlab]) statistics by groups (another version)
MultiComparison(data, groups[, group_order]) Tests for multiple comparisons
TukeyHSDResults(mc_object, results_table, q_crit) Results from Tukey HSD test, with additional plot methods
pairwise_tukeyhsd(endog, groups[, alpha]) calculate all pairwise comparisons with TukeyHSD confidence intervals
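
As a rough illustration of FDR-based p-value correction, the following sketch implements the Benjamini-Hochberg step-up procedure in plain Python. It is a textbook version written for this example, not the library's code; fdr_bh_sketch and the sample p-values are hypothetical.

```python
def fdr_bh_sketch(pvals, alpha=0.05):
    """Benjamini-Hochberg adjusted p-values and rejection decisions."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])  # indices by ascending p-value
    adjusted = [0.0] * n
    running_min = 1.0  # also caps adjusted p-values at 1
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    reject = [p <= alpha for p in adjusted]
    return reject, adjusted


reject, adjusted = fdr_bh_sketch([0.01, 0.04, 0.03, 0.20])
print(reject)
print(adjusted)
```

Results are returned in the original order of the input, so they can be matched back to the individual hypotheses.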

The following functions are not (yet) public

varcorrection_pairs_unbalanced(nobs_all[, ...]) correction factor for variance with unequal sample sizes for all pairs
varcorrection_pairs_unequal(var_all, ...) return joint variance from samples with unequal variances and unequal
varcorrection_unbalanced(nobs_all[, srange]) correction factor for variance with unequal sample sizes
varcorrection_unequal(var_all, nobs_all, df_all) return joint variance from samples with unequal variances and unequal
StepDown(vals, nobs_all, var_all[, df]) a class for step down methods
catstack(args)
ccols
compare_ordered(vals, alpha) simple ordered sequential comparison of means
distance_st_range(mean_all, nobs_all, var_all) pairwise distance matrix, outsourced from tukeyhsd
ecdf(x) no frills empirical cdf used in fdrcorrection
get_tukeyQcrit(k, df[, alpha]) return critical values for Tukey’s HSD (Q)
homogeneous_subsets(vals, dcrit) recursively check all pairs of vals for minimum distance
line str(object='') -> string
maxzero(x) find all up zero crossings and return the index of the highest
maxzerodown(x) find all up zero crossings and return the index of the highest
mcfdr([nrepl, nobs, ntests, ntrue, mu, ...]) MonteCarlo to test fdrcorrection
qcrit str(object='') -> string
randmvn(rho[, size, standardize]) create random draws from equi-correlated multivariate normal distribution
rankdata(x) rankdata, equivalent to scipy.stats.rankdata
rejectionline(n[, alpha]) reference line for rejection in multiple tests
set_partition(ssli) extract a partition from a list of tuples
set_remove_subs(ssli) remove sets that are subsets of another set from a list of tuples
tiecorrect(xranks) should be equivalent of scipy.stats.tiecorrect

7.10.7. Basic Statistics and t-Tests with frequency weights

Besides basic statistics, like mean, variance, covariance and correlation for data with case weights, the classes here provide one and two sample tests for means. The t-tests have more options than those in scipy.stats, but are more restrictive in the shape of the arrays. Confidence intervals for means are provided based on the same assumptions as the t-tests.

Additionally, tests for equivalence of means are available for one sample and for two, either paired or independent, samples. These tests are based on TOST, two one-sided tests, which have as null hypothesis that the means are not “close” to each other.
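
The TOST logic can be sketched as two one-sided tests whose larger p-value is reported. The example below uses a normal approximation for a single sample so that it runs with the standard library alone; the t-based functions listed in this section use the t distribution instead. The function name and data are assumptions made for the illustration.

```python
import math
from statistics import NormalDist, fmean, stdev


def ztost_one_sample_sketch(x, low, upp):
    """Normal-approximation TOST: is the mean of x inside (low, upp)?"""
    se = stdev(x) / math.sqrt(len(x))
    mean = fmean(x)
    std_normal = NormalDist()
    # H0: mean <= low, rejected when the mean is far above low
    p_low = 1.0 - std_normal.cdf((mean - low) / se)
    # H0: mean >= upp, rejected when the mean is far below upp
    p_upp = std_normal.cdf((mean - upp) / se)
    # equivalence is concluded only if both one-sided tests reject
    return max(p_low, p_upp), p_low, p_upp


sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.9, 10.1]
p_wide, _, _ = ztost_one_sample_sketch(sample, 9.5, 10.5)
p_narrow, _, _ = ztost_one_sample_sketch(sample, 9.99, 10.01)
print(p_wide, p_narrow)
```

A small p-value supports equivalence within the chosen bounds; with very tight bounds the same data can no longer reject the null hypothesis of non-equivalence.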

DescrStatsW(data[, weights, ddof]) descriptive statistics and tests with weights for case weights
CompareMeans(d1, d2) class for two sample comparison
ttest_ind(x1, x2[, alternative, usevar, ...]) ttest independent sample
ttost_ind(x1, x2, low, upp[, usevar, ...]) test of (non-)equivalence for two independent samples
ttost_paired(x1, x2, low, upp[, transform, ...]) test of (non-)equivalence for two dependent, paired samples
ztest(x1[, x2, value, alternative, usevar, ddof]) test for mean based on normal distribution, one or two samples
ztost(x1, low, upp[, x2, usevar, ddof]) Equivalence test based on normal distribution
zconfint(x1[, x2, value, alpha, ...]) confidence interval based on normal distribution z-test
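
To show what case (frequency) weights mean for the basic statistics, the sketch below computes a weighted mean and variance and checks them against the same statistics on the expanded data, in which each value is repeated according to its weight. It is a plain-Python illustration, not the DescrStatsW implementation; the function name and numbers are made up.

```python
import math
import statistics


def weighted_mean_var(data, weights, ddof=0):
    """Mean and variance where weights[i] counts how often data[i] occurs."""
    sum_w = sum(weights)
    mean = sum(w * x for x, w in zip(data, weights)) / sum_w
    sum_sq = sum(w * (x - mean) ** 2 for x, w in zip(data, weights))
    # with frequency weights the effective sample size is the sum of the weights
    return mean, sum_sq / (sum_w - ddof)


values = [1.0, 2.0, 3.0]
counts = [2, 1, 3]
expanded = [1.0, 1.0, 2.0, 3.0, 3.0, 3.0]  # each value repeated counts[i] times

mean_w, var_w = weighted_mean_var(values, counts, ddof=1)
print(mean_w, statistics.fmean(expanded))
print(var_w, statistics.variance(expanded))
```

Both printed pairs agree up to floating-point rounding, which is the defining property of frequency weights.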

weightstats also contains tests and confidence intervals based on summary data

_tconfint_generic(mean, std_mean, dof, ...) generic t-confint to save typing
_tstat_generic(value1, value2, std_diff, ...) generic ttest to save typing
_zconfint_generic(mean, std_mean, alpha, ...) generic normal-confint to save typing
_zstat_generic(value1, value2, std_diff, ...) generic (normal) z-test to save typing
_zstat_generic2(value, std_diff, alternative) generic (normal) z-test to save typing

7.10.8. Power and Sample Size Calculations

The power module currently implements power and sample size calculations for the t-tests, normal based test, F-tests and Chisquare goodness of fit test. The implementation is class based, but the module also provides three shortcut functions, tt_solve_power, tt_ind_solve_power and zt_ind_solve_power to solve for any one of the parameters of the power equations.

TTestIndPower(**kwds) Statistical Power calculations for t-test for two independent samples
TTestPower(**kwds) Statistical Power calculations for one sample or paired sample t-test
GofChisquarePower(**kwds) Statistical Power calculations for one sample chisquare test
NormalIndPower([ddof]) Statistical Power calculations for z-test for two independent samples.
FTestAnovaPower(**kwds) Statistical Power calculations F-test for one factor balanced ANOVA
FTestPower(**kwds) Statistical Power calculations for generic F-test
tt_solve_power solve for any one parameter of the power of a one sample t-test
tt_ind_solve_power solve for any one parameter of the power of a two sample t-test
zt_ind_solve_power solve for any one parameter of the power of a two sample z-test
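
The idea of solving the power equation for one parameter can be illustrated with the normal-based two-sample case. The sketch below uses the textbook formula for the two-sided power of a z-test and a simple search for the sample size; it is an assumption-laden illustration, not the module's solver, and both function names are hypothetical.

```python
import math
from statistics import NormalDist


def normal_ind_power_sketch(effect_size, nobs1, alpha=0.05, ratio=1.0):
    """Two-sided power of a two-sample z-test for a standardized effect size."""
    std_normal = NormalDist()
    nobs2 = nobs1 * ratio
    # noncentrality: effect size scaled by the effective sample size
    nc = effect_size * math.sqrt(nobs1 * nobs2 / (nobs1 + nobs2))
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return std_normal.cdf(nc - z_crit) + std_normal.cdf(-nc - z_crit)


def solve_nobs1_sketch(effect_size, power, alpha=0.05):
    """Smallest per-group sample size (equal groups) reaching the target power."""
    if effect_size == 0 or not 0.0 < power < 1.0:
        raise ValueError("need a nonzero effect size and 0 < power < 1")
    nobs1 = 2
    while normal_ind_power_sketch(effect_size, nobs1, alpha) < power:
        nobs1 += 1
    return nobs1


print(normal_ind_power_sketch(0.5, 64))
print(solve_nobs1_sketch(0.5, 0.8))
```

With an effect size of zero the power equals alpha, which is a quick sanity check on any power function.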

7.10.9. Proportion

Also available are hypothesis tests, confidence intervals and effect sizes for proportions that can be used with NormalIndPower.

proportion_confint(count, nobs[, alpha, method]) confidence interval for a binomial proportion
proportion_effectsize(prop1, prop2[, method]) effect size for a test comparing two proportions
binom_test(count, nobs[, prop, alternative]) Perform a test that the probability of success is p.
binom_test_reject_interval(value, nobs[, ...]) rejection region for binomial test for one sample proportion
binom_tost(count, nobs, low, upp) exact TOST test for one proportion using binomial distribution
binom_tost_reject_interval(low, upp, nobs[, ...]) rejection region for binomial TOST
proportions_ztest(count, nobs[, value, ...]) test for proportions based on normal (z) test
proportions_ztost(count, nobs, low, upp[, ...]) Equivalence test based on normal distribution
proportions_chisquare(count, nobs[, value]) test for proportions based on chisquare test
proportions_chisquare_allpairs(count, nobs) chisquare test of proportions for all pairs of k samples
proportions_chisquare_pairscontrol(count, nobs) chisquare test of proportions for pairs of k samples compared to control
power_binom_tost(low, upp, nobs[, p_alt, alpha])
power_ztost_prop(low, upp, nobs, p_alt[, ...]) Power of proportions equivalence test based on normal distribution
samplesize_confint_proportion(proportion, ...) find sample size to get desired confidence interval length
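
Two of the simplest quantities in this group can be written down directly: the normal-approximation (Wald) confidence interval for a single proportion, and Cohen's h, the arcsine-transformed effect size for comparing two proportions. The sketch below is a stdlib-only illustration of those textbook formulas; whether they match the defaults of the functions listed above is an assumption, and the function names are invented.

```python
import math
from statistics import NormalDist


def proportion_confint_normal_sketch(count, nobs, alpha=0.05):
    """Wald (normal approximation) confidence interval for a binomial proportion."""
    prop = count / nobs
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z_crit * math.sqrt(prop * (1.0 - prop) / nobs)
    return prop - half_width, prop + half_width


def cohens_h_sketch(prop1, prop2):
    """Cohen's h: difference of arcsine-square-root transformed proportions."""
    return 2.0 * (math.asin(math.sqrt(prop1)) - math.asin(math.sqrt(prop2)))


low, upp = proportion_confint_normal_sketch(50, 100)
h = cohens_h_sketch(0.6, 0.5)
print(low, upp)
print(h)
```

An effect size on this scale is the kind of standardized input that a normal-based power calculation expects.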

7.10.10. Moment Helpers

When there are missing values, a correlation or covariance matrix computed from the data may fail to be positive semi-definite. The following three functions can be used to find a correlation or covariance matrix that is positive definite and close to the original matrix.

corr_nearest(corr[, threshold, n_fact]) Find the nearest correlation matrix that is positive semi-definite.
corr_clipped(corr[, threshold]) Find a near correlation matrix that is positive semi-definite
cov_nearest(cov[, method, threshold, ...]) Find the nearest covariance matrix that is positive (semi-) definite

These are utility functions to convert between central and non-central moments, skew, kurtosis and cumulants.

cum2mc(kappa) convert cumulants to central moments
mc2mnc(mc) convert central to non-central moments, uses recursive formula
mc2mvsk(args) convert central moments to mean, variance, skew, kurtosis
mnc2cum(mnc) convert non-central moments to cumulants
mnc2mc(mnc[, wmean]) convert non-central to central moments, uses recursive formula
mnc2mvsk(args) convert non-central moments to mean, variance, skew, kurtosis
mvsk2mc(args) convert mean, variance, skew, kurtosis to central moments
mvsk2mnc(args) convert mean, variance, skew, kurtosis to non-central moments
cov2corr(cov[, return_std]) convert covariance matrix to correlation matrix
corr2cov(corr, std) convert correlation matrix to covariance matrix given standard deviation
se_cov(cov) get standard deviation from covariance matrix
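
The relationship between non-central and central moments can be sketched with the binomial expansion of E[(X - mean)^k]. The code below is a direct (non-recursive) illustration, not the recursive formula the helpers use. Two conventions are assumed here rather than taken from the listing: the mean is kept in the first slot of the central moments, and kurtosis is reported as excess kurtosis. All function names are hypothetical.

```python
from math import comb


def mnc2mc_sketch(mnc):
    """Convert non-central moments [E[X], E[X^2], ...] to central moments."""
    mean = mnc[0]
    raw = [1.0] + list(mnc)  # raw[j] = E[X**j], with E[X**0] = 1
    central = []
    for k in range(1, len(mnc) + 1):
        # E[(X - mean)^k] = sum_j C(k, j) * E[X^j] * (-mean)^(k - j)
        central.append(sum(comb(k, j) * raw[j] * (-mean) ** (k - j)
                           for j in range(k + 1)))
    central[0] = mean  # assumed convention: first slot holds the mean
    return central


def mc2mvsk_sketch(mc):
    """Central moments (mean, mc2, mc3, mc4) to mean, variance, skew, excess kurtosis."""
    mean, mc2, mc3, mc4 = mc
    return mean, mc2, mc3 / mc2 ** 1.5, mc4 / mc2 ** 2 - 3.0


# Exponential distribution with rate 1: the k-th non-central moment is k!
exp_central = mnc2mc_sketch([1.0, 2.0, 6.0, 24.0])
print(exp_central)
print(mc2mvsk_sketch(exp_central))
```

For the unit-rate exponential distribution this recovers the known skewness of 2 and excess kurtosis of 6.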