5.8. Statistical functions (scipy.stats)¶
This module contains a large number of probability distributions as well as a growing library of statistical functions.
Each univariate distribution is an instance of a subclass of rv_continuous (rv_discrete for discrete distributions):
rv_continuous ([momtype, a, b, xtol, ...]) |
A generic continuous random variable class meant for subclassing. |
rv_discrete ([a, b, name, badvalue, ...]) |
A generic discrete random variable class meant for subclassing. |
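As a minimal sketch of the relationship described above, the following checks that two of the distributions listed later on this page (norm and poisson, chosen here only as examples) are instances of the two base classes, and calls a few of the methods they share. The variable names are illustrative.

```python
from scipy import stats

# norm is a continuous distribution object, poisson a discrete one.
is_continuous = isinstance(stats.norm, stats.rv_continuous)
is_discrete = isinstance(stats.poisson, stats.rv_discrete)

# Continuous distributions expose pdf/cdf/ppf; discrete ones expose pmf/cdf/ppf.
cdf_at_zero = stats.norm.cdf(0.0)       # standard normal CDF at 0
median = stats.norm.ppf(0.5)            # inverse CDF at probability 0.5
p_zero_events = stats.poisson.pmf(0, 1.0)  # P(X = 0) for a Poisson with mu = 1

print(is_continuous, is_discrete, cdf_at_zero, median, p_zero_events)
```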
5.8.3. Continuous distributions¶
alpha |
An alpha continuous random variable. |
anglit |
An anglit continuous random variable. |
arcsine |
An arcsine continuous random variable. |
beta |
A beta continuous random variable. |
betaprime |
A beta prime continuous random variable. |
bradford |
A Bradford continuous random variable. |
burr |
A Burr (Type III) continuous random variable. |
cauchy |
A Cauchy continuous random variable. |
chi |
A chi continuous random variable. |
chi2 |
A chi-squared continuous random variable. |
cosine |
A cosine continuous random variable. |
dgamma |
A double gamma continuous random variable. |
dweibull |
A double Weibull continuous random variable. |
erlang |
An Erlang continuous random variable. |
expon |
An exponential continuous random variable. |
exponnorm |
An exponentially modified Normal continuous random variable. |
exponweib |
An exponentiated Weibull continuous random variable. |
exponpow |
An exponential power continuous random variable. |
f |
An F continuous random variable. |
fatiguelife |
A fatigue-life (Birnbaum-Saunders) continuous random variable. |
fisk |
A Fisk continuous random variable. |
foldcauchy |
A folded Cauchy continuous random variable. |
foldnorm |
A folded normal continuous random variable. |
frechet_r |
A Frechet right (or Weibull minimum) continuous random variable. |
frechet_l |
A Frechet left (or Weibull maximum) continuous random variable. |
genlogistic |
A generalized logistic continuous random variable. |
gennorm |
A generalized normal continuous random variable. |
genpareto |
A generalized Pareto continuous random variable. |
genexpon |
A generalized exponential continuous random variable. |
genextreme |
A generalized extreme value continuous random variable. |
gausshyper |
A Gauss hypergeometric continuous random variable. |
gamma |
A gamma continuous random variable. |
gengamma |
A generalized gamma continuous random variable. |
genhalflogistic |
A generalized half-logistic continuous random variable. |
gilbrat |
A Gibrat continuous random variable. |
gompertz |
A Gompertz (or truncated Gumbel) continuous random variable. |
gumbel_r |
A right-skewed Gumbel continuous random variable. |
gumbel_l |
A left-skewed Gumbel continuous random variable. |
halfcauchy |
A half-Cauchy continuous random variable. |
halflogistic |
A half-logistic continuous random variable. |
halfnorm |
A half-normal continuous random variable. |
halfgennorm |
The upper half of a generalized normal continuous random variable. |
hypsecant |
A hyperbolic secant continuous random variable. |
invgamma |
An inverted gamma continuous random variable. |
invgauss |
An inverse Gaussian continuous random variable. |
invweibull |
An inverted Weibull continuous random variable. |
johnsonsb |
A Johnson SB continuous random variable. |
johnsonsu |
A Johnson SU continuous random variable. |
ksone |
General Kolmogorov-Smirnov one-sided test. |
kstwobign |
Kolmogorov-Smirnov two-sided test for large N. |
laplace |
A Laplace continuous random variable. |
levy |
A Levy continuous random variable. |
levy_l |
A left-skewed Levy continuous random variable. |
levy_stable |
A Levy-stable continuous random variable. |
logistic |
A logistic (or Sech-squared) continuous random variable. |
loggamma |
A log gamma continuous random variable. |
loglaplace |
A log-Laplace continuous random variable. |
lognorm |
A lognormal continuous random variable. |
lomax |
A Lomax (Pareto of the second kind) continuous random variable. |
maxwell |
A Maxwell continuous random variable. |
mielke |
A Mielke Beta-Kappa continuous random variable. |
nakagami |
A Nakagami continuous random variable. |
ncx2 |
A non-central chi-squared continuous random variable. |
ncf |
A non-central F distribution continuous random variable. |
nct |
A non-central Student’s T continuous random variable. |
norm |
A normal continuous random variable. |
pareto |
A Pareto continuous random variable. |
pearson3 |
A Pearson type III continuous random variable. |
powerlaw |
A power-function continuous random variable. |
powerlognorm |
A power log-normal continuous random variable. |
powernorm |
A power normal continuous random variable. |
rdist |
An R-distributed continuous random variable. |
reciprocal |
A reciprocal continuous random variable. |
rayleigh |
A Rayleigh continuous random variable. |
rice |
A Rice continuous random variable. |
recipinvgauss |
A reciprocal inverse Gaussian continuous random variable. |
semicircular |
A semicircular continuous random variable. |
t |
A Student’s T continuous random variable. |
triang |
A triangular continuous random variable. |
truncexpon |
A truncated exponential continuous random variable. |
truncnorm |
A truncated normal continuous random variable. |
tukeylambda |
A Tukey-Lambda continuous random variable. |
uniform |
A uniform continuous random variable. |
vonmises |
A Von Mises continuous random variable. |
vonmises_line |
A Von Mises continuous random variable. |
wald |
A Wald continuous random variable. |
weibull_min |
A Frechet right (or Weibull minimum) continuous random variable. |
weibull_max |
A Frechet left (or Weibull maximum) continuous random variable. |
wrapcauchy |
A wrapped Cauchy continuous random variable. |
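The continuous distributions in the table above can be "frozen" by calling them with fixed shape, location and scale parameters. The sketch below uses gamma with an arbitrarily chosen shape a=2 and scale=3; the parameter values and variable names are assumptions for illustration only.

```python
from scipy import stats

# Freeze a gamma distribution: shape a = 2, scale = 3 (loc defaults to 0).
frozen = stats.gamma(a=2.0, scale=3.0)

mean = frozen.mean()   # a * scale
var = frozen.var()     # a * scale**2

# ppf is the inverse of cdf, so a round trip recovers the probability.
x90 = frozen.ppf(0.9)
round_trip = frozen.cdf(x90)

# Draw reproducible random variates.
samples = frozen.rvs(size=1000, random_state=0)

print(mean, var, x90, samples[:3])
```

Freezing keeps the parameters in one place, so later calls to pdf, cdf, ppf or rvs do not need to repeat them.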
5.8.4. Multivariate distributions¶
multivariate_normal |
A multivariate normal random variable. |
matrix_normal |
A matrix normal random variable. |
dirichlet |
A Dirichlet random variable. |
wishart |
A Wishart random variable. |
invwishart |
An inverse Wishart random variable. |
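A short sketch of the multivariate interface, using multivariate_normal with a zero mean and identity covariance (values picked only for illustration). For a 2-D standard normal, the density at the mean is 1/(2π), which makes the result easy to check by hand.

```python
import numpy as np
from scipy import stats

# 2-D normal with zero mean and identity covariance (illustrative values).
mvn = stats.multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2))

# Density at the mean; analytically 1 / (2 * pi) for this case.
density_at_mean = mvn.pdf([0.0, 0.0])

# Five reproducible draws, one row per sample.
draws = mvn.rvs(size=5, random_state=0)

print(density_at_mean, draws.shape)
```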
5.8.5. Discrete distributions¶
bernoulli |
A Bernoulli discrete random variable. |
binom |
A binomial discrete random variable. |
boltzmann |
A Boltzmann (Truncated Discrete Exponential) random variable. |
dlaplace |
A Laplacian discrete random variable. |
geom |
A geometric discrete random variable. |
hypergeom |
A hypergeometric discrete random variable. |
logser |
A Logarithmic (Log-Series, Series) discrete random variable. |
nbinom |
A negative binomial discrete random variable. |
planck |
A Planck discrete exponential random variable. |
poisson |
A Poisson discrete random variable. |
randint |
A uniform discrete random variable. |
skellam |
A Skellam discrete random variable. |
zipf |
A Zipf discrete random variable. |
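Discrete distributions use pmf in place of pdf. The following sketch evaluates a binomial distribution with n=10 trials and success probability p=0.3 (both values are arbitrary choices for the example).

```python
import numpy as np
from scipy import stats

n, p = 10, 0.3              # illustrative parameters
k = np.arange(n + 1)        # every possible number of successes

pmf = stats.binom.pmf(k, n, p)
total = pmf.sum()                        # probabilities sum to 1
p_at_most_3 = stats.binom.cdf(3, n, p)   # P(X <= 3)
mean = stats.binom.mean(n, p)            # n * p

print(total, p_at_most_3, mean)
```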
5.8.6. Statistical functions¶
Several of these functions have a similar version in scipy.stats.mstats that works with masked arrays.
describe (a[, axis, ddof, bias, nan_policy]) |
Computes several descriptive statistics of the passed array. |
gmean (a[, axis, dtype]) |
Compute the geometric mean along the specified axis. |
hmean (a[, axis, dtype]) |
Calculates the harmonic mean along the specified axis. |
kurtosis (a[, axis, fisher, bias, nan_policy]) |
Computes the kurtosis (Fisher or Pearson) of a dataset. |
kurtosistest (a[, axis, nan_policy]) |
Tests whether a dataset has normal kurtosis. |
mode (a[, axis, nan_policy]) |
Returns an array of the modal (most common) value in the passed array. |
moment (a[, moment, axis, nan_policy]) |
Calculates the nth moment about the mean for a sample. |
normaltest (a[, axis, nan_policy]) |
Tests whether a sample differs from a normal distribution. |
skew (a[, axis, bias, nan_policy]) |
Computes the skewness of a data set. |
skewtest (a[, axis, nan_policy]) |
Tests whether the skew is different from the normal distribution. |
kstat (data[, n]) |
Return the nth k-statistic (1<=n<=4 so far). |
kstatvar (data[, n]) |
Returns an unbiased estimator of the variance of the k-statistic. |
tmean (a[, limits, inclusive, axis]) |
Compute the trimmed mean. |
tvar (a[, limits, inclusive, axis, ddof]) |
Compute the trimmed variance. |
tmin (a[, lowerlimit, axis, inclusive, ...]) |
Compute the trimmed minimum. |
tmax (a[, upperlimit, axis, inclusive, ...]) |
Compute the trimmed maximum. |
tstd (a[, limits, inclusive, axis, ddof]) |
Compute the trimmed sample standard deviation. |
tsem (a[, limits, inclusive, axis, ddof]) |
Compute the trimmed standard error of the mean. |
nanmean (*args, **kwds) |
nanmean is deprecated! |
nanstd (*args, **kwds) |
nanstd is deprecated! |
nanmedian (*args, **kwds) |
nanmedian is deprecated! |
variation (a[, axis, nan_policy]) |
Computes the coefficient of variation, the ratio of the biased standard deviation to the mean. |
find_repeats (arr) |
Find repeats and repeat counts. |
trim_mean (a, proportiontocut[, axis]) |
Return mean of array after trimming distribution from both tails. |
cumfreq (a[, numbins, defaultreallimits, weights]) |
Returns a cumulative frequency histogram, using the histogram function. |
histogram2 (*args, **kwds) |
histogram2 is deprecated! |
histogram (*args, **kwds) |
histogram is deprecated! |
itemfreq (a) |
Returns a 2-D array of item frequencies. |
percentileofscore (a, score[, kind]) |
The percentile rank of a score relative to a list of scores. |
scoreatpercentile (a, per[, limit, ...]) |
Calculate the score at a given percentile of the input sequence. |
relfreq (a[, numbins, defaultreallimits, weights]) |
Returns a relative frequency histogram, using the histogram function. |
binned_statistic (x, values[, statistic, ...]) |
Compute a binned statistic for one or more sets of data. |
binned_statistic_2d (x, y, values[, ...]) |
Compute a bidimensional binned statistic for one or more sets of data. |
binned_statistic_dd (sample, values[, ...]) |
Compute a multidimensional binned statistic for a set of data. |
obrientransform (*args) |
Computes the O’Brien transform on input data (any number of arrays). |
signaltonoise (*args, **kwds) |
signaltonoise is deprecated! |
bayes_mvs (data[, alpha]) |
Bayesian confidence intervals for the mean, var, and std. |
mvsdist (data) |
‘Frozen’ distributions for mean, variance, and standard deviation of data. |
sem (a[, axis, ddof, nan_policy]) |
Calculates the standard error of the mean (or standard error of measurement) of the values in the input array. |
zmap (scores, compare[, axis, ddof]) |
Calculates the relative z-scores. |
zscore (a[, axis, ddof]) |
Calculates the z score of each value in the sample, relative to the sample mean and standard deviation. |
sigmaclip (a[, low, high]) |
Iterative sigma-clipping of array elements. |
threshold (*args, **kwds) |
threshold is deprecated! |
trimboth (a, proportiontocut[, axis]) |
Slices off a proportion of items from both ends of an array. |
trim1 (a, proportiontocut[, tail, axis]) |
Slices off a proportion from ONE end of the passed array distribution. |
f_oneway (*args) |
Performs a 1-way ANOVA. |
pearsonr (x, y) |
Calculates a Pearson correlation coefficient and the p-value for testing non-correlation. |
spearmanr (a[, b, axis, nan_policy]) |
Calculates a Spearman rank-order correlation coefficient and the p-value to test for non-correlation. |
pointbiserialr (x, y) |
Calculates a point biserial correlation coefficient and its p-value. |
kendalltau (x, y[, initial_lexsort, nan_policy]) |
Calculates Kendall’s tau, a correlation measure for ordinal data. |
linregress (x[, y]) |
Calculate a linear least-squares regression for two sets of measurements. |
theilslopes (y[, x, alpha]) |
Computes the Theil-Sen estimator for a set of points (x, y). |
f_value (*args, **kwds) |
f_value is deprecated! |
ttest_1samp (a, popmean[, axis, nan_policy]) |
Calculates the T-test for the mean of ONE group of scores. |
ttest_ind (a, b[, axis, equal_var, nan_policy]) |
Calculates the T-test for the means of TWO INDEPENDENT samples of scores. |
ttest_ind_from_stats (mean1, std1, nobs1, ...) |
T-test for means of two independent samples from descriptive statistics. |
ttest_rel (a, b[, axis, nan_policy]) |
Calculates the T-test on TWO RELATED samples of scores, a and b. |
kstest (rvs, cdf[, args, N, alternative, mode]) |
Perform the Kolmogorov-Smirnov test for goodness of fit. |
chisquare (f_obs[, f_exp, ddof, axis]) |
Calculates a one-way chi square test. |
power_divergence (f_obs[, f_exp, ddof, axis, ...]) |
Cressie-Read power divergence statistic and goodness of fit test. |
ks_2samp (data1, data2) |
Computes the Kolmogorov-Smirnov statistic on 2 samples. |
mannwhitneyu (x, y[, use_continuity, alternative]) |
Computes the Mann-Whitney rank test on samples x and y. |
tiecorrect (rankvals) |
Tie correction factor for ties in the Mann-Whitney U and Kruskal-Wallis H tests. |
rankdata (a[, method]) |
Assign ranks to data, dealing with ties appropriately. |
ranksums (x, y) |
Compute the Wilcoxon rank-sum statistic for two samples. |
wilcoxon (x[, y, zero_method, correction]) |
Calculate the Wilcoxon signed-rank test. |
kruskal (*args, **kwargs) |
Compute the Kruskal-Wallis H-test for independent samples. |
friedmanchisquare (*args) |
Computes the Friedman test for repeated measurements. |
combine_pvalues (pvalues[, method, weights]) |
Methods for combining the p-values of independent tests bearing upon the same hypothesis. |
ss (*args, **kwds) |
ss is deprecated! |
square_of_sums (*args, **kwds) |
square_of_sums is deprecated! |
jarque_bera (x) |
Perform the Jarque-Bera goodness of fit test on sample data. |
ansari (x, y) |
Perform the Ansari-Bradley test for equal scale parameters. |
bartlett (*args) |
Perform Bartlett’s test for equal variances. |
levene (*args, **kwds) |
Perform Levene test for equal variances. |
shapiro (x[, a, reta]) |
Perform the Shapiro-Wilk test for normality. |
anderson (x[, dist]) |
Anderson-Darling test for data coming from a particular distribution. |
anderson_ksamp (samples[, midrank]) |
The Anderson-Darling test for k-samples. |
binom_test (x[, n, p, alternative]) |
Perform a test that the probability of success is p. |
fligner (*args, **kwds) |
Perform Fligner-Killeen test for equality of variance. |
median_test (*args, **kwds) |
Mood’s median test. |
mood (x, y[, axis]) |
Perform Mood’s test for equal scale parameters. |
boxcox (x[, lmbda, alpha]) |
Return a positive dataset transformed by a Box-Cox power transformation. |
boxcox_normmax (x[, brack, method]) |
Compute optimal Box-Cox transform parameter for input data. |
boxcox_llf (lmb, data) |
The boxcox log-likelihood function. |
entropy (pk[, qk, base]) |
Calculate the entropy of a distribution for given probability values. |
chisqprob (*args, **kwds) |
chisqprob is deprecated! |
betai (*args, **kwds) |
betai is deprecated! |
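A few of the functions in the table above, applied to a small hand-made data set so the results can be verified by hand. This is only a sketch; the arrays x and y are invented for the example and are not taken from the reference.

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0           # exact linear function of x

# Descriptive statistics: mean 3, unbiased variance 2.5.
summary = stats.describe(x)

# Correlation and least-squares fit; y is exactly linear in x.
r, r_pvalue = stats.pearsonr(x, y)
fit = stats.linregress(x, y)

# Two-sample t-test between x and a copy shifted by 10.
t_stat, t_pvalue = stats.ttest_ind(x, x + 10.0)

# z-scores have zero mean and unit (biased) standard deviation.
z = stats.zscore(x)

print(summary.mean, summary.variance, r, fit.slope, fit.intercept, t_stat)
```

The test functions return the statistic together with its p-value, so both can be unpacked in one assignment.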
5.8.7. Circular statistical functions¶
circmean (samples[, high, low, axis]) |
Compute the circular mean for samples in a range. |
circvar (samples[, high, low, axis]) |
Compute the circular variance for samples assumed to be in a range. |
circstd (samples[, high, low, axis]) |
Compute the circular standard deviation for samples assumed to be in the range [low to high]. |
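The circular functions matter when data wrap around, as angles do. In the sketch below, three compass bearings in degrees (values chosen for illustration) straddle the 0/360 boundary: the arithmetic mean lands far from all of them, while circmean gives the direction they cluster around.

```python
import numpy as np
from scipy import stats

# Bearings in degrees, clustered around 5 degrees across the 0/360 wrap.
angles = np.array([355.0, 5.0, 15.0])

naive_mean = angles.mean()                               # ignores the wrap
circular_mean = stats.circmean(angles, high=360.0, low=0.0)
circular_std = stats.circstd(angles, high=360.0, low=0.0)

print(naive_mean, circular_mean, circular_std)
```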
5.8.8. Contingency table functions¶
chi2_contingency (observed[, correction, lambda_]) |
Chi-square test of independence of variables in a contingency table. |
contingency.expected_freq (observed) |
Compute the expected frequencies from a contingency table. |
contingency.margins (a) |
Return a list of the marginal sums of the array a. |
fisher_exact (table[, alternative]) |
Performs a Fisher exact test on a 2x2 contingency table. |
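A minimal sketch of the contingency table functions on a made-up 2x2 table of counts. It assumes the usual return shapes: chi2_contingency yields the statistic, p-value, degrees of freedom and expected frequencies, and fisher_exact yields the odds ratio and p-value.

```python
import numpy as np
from scipy import stats
from scipy.stats import contingency

# Hypothetical 2x2 table of observed counts (rows: groups, columns: outcomes).
observed = np.array([[8, 2],
                     [1, 5]])

chi2, p_chi2, dof, expected = stats.chi2_contingency(observed)

# The same expected frequencies, computed directly from the margins.
expected_direct = contingency.expected_freq(observed)

# Fisher's exact test is often preferred for small counts like these.
odds_ratio, p_fisher = stats.fisher_exact(observed)

print(chi2, p_chi2, dof, odds_ratio, p_fisher)
```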
5.8.9. Plot-tests¶
ppcc_max (x[, brack, dist]) |
Calculate the shape parameter that maximizes the PPCC. |
ppcc_plot (x, a, b[, dist, plot, N]) |
Calculate and optionally plot probability plot correlation coefficient. |
probplot (x[, sparams, dist, fit, plot]) |
Calculate quantiles for a probability plot, and optionally show the plot. |
boxcox_normplot (x, la, lb[, plot, N]) |
Compute parameters for a Box-Cox normality plot, optionally show it. |
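The plot-test functions can also be used without plotting. As a sketch, probplot below only computes the quantile pairs and the least-squares line for normally distributed samples; the sample size and seed are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(0)      # fixed seed for reproducibility
x = rng.normal(loc=0.0, scale=1.0, size=200)

# No plot argument is given, so nothing is drawn; only values are returned.
(theoretical_q, ordered_x), (slope, intercept, r) = stats.probplot(x, dist="norm")

# r near 1 suggests the sample is consistent with the chosen distribution.
print(slope, intercept, r)
```

Passing a matplotlib module or axes object as the plot argument would draw the same quantities.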
5.8.10. Masked statistics functions¶
- 5.8.10.1. Statistical functions for masked arrays (scipy.stats.mstats)
- 5.8.10.1.1. scipy.stats.mstats.argstoarray
- 5.8.10.1.2. scipy.stats.mstats.betai
- 5.8.10.1.3. scipy.stats.mstats.chisquare
- 5.8.10.1.4. scipy.stats.mstats.count_tied_groups
- 5.8.10.1.5. scipy.stats.mstats.describe
- 5.8.10.1.6. scipy.stats.mstats.f_oneway
- 5.8.10.1.7. scipy.stats.mstats.f_value_wilks_lambda
- 5.8.10.1.8. scipy.stats.mstats.find_repeats
- 5.8.10.1.9. scipy.stats.mstats.friedmanchisquare
- 5.8.10.1.10. scipy.stats.mstats.kendalltau
- 5.8.10.1.11. scipy.stats.mstats.kendalltau_seasonal
- 5.8.10.1.12. scipy.stats.mstats.kruskalwallis
- 5.8.10.1.13. scipy.stats.mstats.ks_twosamp
- 5.8.10.1.14. scipy.stats.mstats.kurtosis
- 5.8.10.1.15. scipy.stats.mstats.kurtosistest
- 5.8.10.1.16. scipy.stats.mstats.linregress
- 5.8.10.1.17. scipy.stats.mstats.mannwhitneyu
- 5.8.10.1.18. scipy.stats.mstats.plotting_positions
- 5.8.10.1.19. scipy.stats.mstats.mode
- 5.8.10.1.20. scipy.stats.mstats.moment
- 5.8.10.1.21. scipy.stats.mstats.mquantiles
- 5.8.10.1.22. scipy.stats.mstats.msign
- 5.8.10.1.23. scipy.stats.mstats.normaltest
- 5.8.10.1.24. scipy.stats.mstats.obrientransform
- 5.8.10.1.25. scipy.stats.mstats.pearsonr
- 5.8.10.1.26. scipy.stats.mstats.pointbiserialr
- 5.8.10.1.27. scipy.stats.mstats.rankdata
- 5.8.10.1.28. scipy.stats.mstats.scoreatpercentile
- 5.8.10.1.29. scipy.stats.mstats.sem
- 5.8.10.1.30. scipy.stats.mstats.signaltonoise
- 5.8.10.1.31. scipy.stats.mstats.skew
- 5.8.10.1.32. scipy.stats.mstats.skewtest
- 5.8.10.1.33. scipy.stats.mstats.spearmanr
- 5.8.10.1.34. scipy.stats.mstats.theilslopes
- 5.8.10.1.35. scipy.stats.mstats.threshold
- 5.8.10.1.36. scipy.stats.mstats.tmax
- 5.8.10.1.37. scipy.stats.mstats.tmean
- 5.8.10.1.38. scipy.stats.mstats.tmin
- 5.8.10.1.39. scipy.stats.mstats.trim
- 5.8.10.1.40. scipy.stats.mstats.trima
- 5.8.10.1.41. scipy.stats.mstats.trimboth
- 5.8.10.1.42. scipy.stats.mstats.trimmed_stde
- 5.8.10.1.43. scipy.stats.mstats.trimr
- 5.8.10.1.44. scipy.stats.mstats.trimtail
- 5.8.10.1.45. scipy.stats.mstats.tsem
- 5.8.10.1.46. scipy.stats.mstats.ttest_onesamp
- 5.8.10.1.47. scipy.stats.mstats.ttest_ind
- 5.8.10.1.48. scipy.stats.mstats.ttest_rel
- 5.8.10.1.49. scipy.stats.mstats.tvar
- 5.8.10.1.50. scipy.stats.mstats.variation
- 5.8.10.1.51. scipy.stats.mstats.winsorize
- 5.8.10.1.52. scipy.stats.mstats.zmap
- 5.8.10.1.53. scipy.stats.mstats.zscore
- 5.8.10.1.54. scipy.stats.mstats.compare_medians_ms
- 5.8.10.1.55. scipy.stats.mstats.gmean
- 5.8.10.1.56. scipy.stats.mstats.hdmedian
- 5.8.10.1.57. scipy.stats.mstats.hdquantiles
- 5.8.10.1.58. scipy.stats.mstats.hdquantiles_sd
- 5.8.10.1.59. scipy.stats.mstats.hmean
- 5.8.10.1.60. scipy.stats.mstats.idealfourths
- 5.8.10.1.61. scipy.stats.mstats.kruskal
- 5.8.10.1.62. scipy.stats.mstats.ks_2samp
- 5.8.10.1.63. scipy.stats.mstats.median_cihs
- 5.8.10.1.64. scipy.stats.mstats.meppf
- 5.8.10.1.65. scipy.stats.mstats.mjci
- 5.8.10.1.66. scipy.stats.mstats.mquantiles_cimj
- 5.8.10.1.67. scipy.stats.mstats.rsh
- 5.8.10.1.68. scipy.stats.mstats.sen_seasonal_slopes
- 5.8.10.1.69. scipy.stats.mstats.trimmed_mean
- 5.8.10.1.70. scipy.stats.mstats.trimmed_mean_ci
- 5.8.10.1.71. scipy.stats.mstats.trimmed_std
- 5.8.10.1.72. scipy.stats.mstats.trimmed_var
- 5.8.10.1.73. scipy.stats.mstats.ttest_1samp
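A brief sketch of how the masked versions behave, assuming a data set in which -999 is a sentinel for a missing measurement (the sentinel and the values are invented for this example). The masked entry is excluded from the statistics.

```python
import numpy as np
from scipy.stats import mstats

# -999.0 marks a missing value; masked_values hides it from computations.
raw = [2.0, 4.0, -999.0, 6.0]
data = np.ma.masked_values(raw, -999.0)

result = mstats.describe(data)

n_valid = int(result.nobs)        # number of unmasked observations
masked_mean = float(result.mean)  # mean over unmasked values only
raw_mean = float(np.mean(raw))    # for contrast: the sentinel distorts this

print(n_valid, masked_mean, raw_mean)
```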
5.8.11. Univariate and multivariate kernel density estimation (scipy.stats.kde)¶
gaussian_kde (dataset[, bw_method]) |
Representation of a kernel-density estimate using Gaussian kernels. |
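A minimal sketch of gaussian_kde on one-dimensional data, using the default bandwidth. The samples are synthetic and the evaluation grid is an arbitrary choice.

```python
import numpy as np
from scipy import stats

rng = np.random.RandomState(0)        # fixed seed for reproducibility
samples = rng.normal(size=500)        # synthetic 1-D data

kde = stats.gaussian_kde(samples)     # default bandwidth selection

grid = np.linspace(-4.0, 4.0, 9)
density = kde(grid)                   # estimated density on the grid

# The estimate is a mixture of Gaussians, so it integrates to 1.
total_mass = kde.integrate_box_1d(-np.inf, np.inf)

print(density, total_mass)
```

The bw_method argument can be used to change the bandwidth when the default smooths too much or too little.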
For many more statistics-related functions, install the R software and the rpy interface package.