7.10.3.3. statsmodels.stats.gof.gof_binning_discrete

statsmodels.stats.gof.gof_binning_discrete(rvs, distfn, arg, nsupp=20)

Get bins for chisquare-type gof tests for a discrete distribution.
Parameters: rvs : array
sample data
distfn : distribution instance
discrete distribution, for example one from scipy.stats
arg : sequence
parameters of distribution
nsupp : integer
number of bins. The algorithm tries to find bins with equal weights. Depending on the distribution, the actual number of bins can be smaller.
Returns: freq : array
empirical frequencies for sample; not normalized, adds up to sample size
expfreq : array
theoretical frequencies according to distribution
histsupp : array
bin boundaries for the histogram (1e-8 is added for numerical robustness)
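The equal-weight idea described under nsupp can be illustrated with a small standalone sketch. This is not the statsmodels implementation; the helper names `poisson_pmf` and `equal_weight_bins`, the Poisson example, and the `kmax` cutoff are assumptions made only for illustration. It walks over the integer support, closes a bin once it holds at least 1/nsupp of the probability mass, and puts whatever is left into an open-ended tail bin.

```python
import math


def poisson_pmf(k, mu):
    """Probability mass of a Poisson(mu) variable at the integer k."""
    return math.exp(-mu) * mu**k / math.factorial(k)


def equal_weight_bins(pmf, nsupp, kmax=1000):
    """Group the integers 0..kmax into bins of roughly equal probability mass.

    Hypothetical helper, not part of statsmodels. Returns (upper_edges, masses),
    where upper_edges[i] is the largest integer in bin i and the last bin is an
    open-ended tail (upper edge = inf).
    """
    target = 1.0 / nsupp
    upper_edges, masses = [], []
    acc = 0.0    # mass collected in the bin currently being filled
    total = 0.0  # mass already assigned to closed bins
    for k in range(kmax + 1):
        acc += pmf(k)
        if acc >= target - 1e-14:
            # the current bin is heavy enough: close it at k
            upper_edges.append(k)
            masses.append(acc)
            total += acc
            acc = 0.0
            if total > 1.0 - target:
                # the remaining mass cannot fill another full bin
                break
    tail = 1.0 - total
    if tail > 1e-12:
        # open-ended tail bin collects the remaining probability
        upper_edges.append(math.inf)
        masses.append(tail)
    return upper_edges, masses


edges, masses = equal_weight_bins(lambda k: poisson_pmf(k, 3.0), nsupp=5)
print(edges)
print(masses)
```

Because a single support point of a discrete distribution can carry more than 1/nsupp of the mass, the sketch ends up with fewer than nsupp bins for this Poisson example, which is the behaviour the nsupp description warns about.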
Notes
The results can be used for a chisquare test:

    (chis, pval) = stats.chisquare(freq, expfreq)
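As a rough, self-contained illustration of that call, the sketch below builds `freq` and `expfreq` by hand for a simulated Poisson sample instead of calling gof_binning_discrete. The sample size, the mean `mu`, the seed, and the hand-picked bins are all assumptions chosen for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu = 3.0                          # assumed Poisson mean
rvs = rng.poisson(mu, size=500)   # simulated sample, not real data

# Hand-picked bins: {0, 1}, {2}, {3}, {4}, {5, 6, ...}.
# Clipping maps 0 -> 1 and everything above 5 -> 5, so each clipped value
# is the label of its bin.
binned = np.clip(rvs, 1, 5)
freq = np.bincount(binned, minlength=6)[1:]

# Theoretical bin probabilities under Poisson(mu); they sum to 1.
probs = np.array([
    stats.poisson.cdf(1, mu),   # P(X <= 1)
    stats.poisson.pmf(2, mu),
    stats.poisson.pmf(3, mu),
    stats.poisson.pmf(4, mu),
    stats.poisson.sf(4, mu),    # P(X >= 5)
])
expfreq = len(rvs) * probs

# freq is not normalized and adds up to the sample size, as does expfreq.
chis, pval = stats.chisquare(freq, expfreq)
print(chis, pval)
```

Note that `stats.chisquare` expects observed and expected frequencies with matching totals, which is why the last bin here covers the whole upper tail.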
Originally written for the scipy.stats test suite; it still needs to be checked for standalone usage. Input checking is insufficient, and the function may not run yet (after copy/paste).
- refactor: maybe a class, check returns, or separate binning from test results
- todo: optimal number of bins? (check easyfit); recommendation in the literature is at least 5 expected observations in each bin
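One possible way to act on the "at least 5 expected observations in each bin" recommendation is to merge adjacent bins after binning. The sketch below is only an assumption about how that could be done; `merge_small_bins` is a hypothetical helper, not a statsmodels function, and the frequencies are made-up illustrative numbers.

```python
def merge_small_bins(freq, expfreq, min_expected=5.0):
    """Merge adjacent bins, left to right, until each expected count
    reaches min_expected. Hypothetical helper for illustration only."""
    out_f, out_e = [], []
    acc_f, acc_e = 0, 0.0
    for f, e in zip(freq, expfreq):
        acc_f += f
        acc_e += e
        if acc_e >= min_expected:
            # the accumulated bin is large enough: emit it
            out_f.append(acc_f)
            out_e.append(acc_e)
            acc_f, acc_e = 0, 0.0
    # fold any leftover tail into the last emitted bin
    if acc_f > 0 or acc_e > 0:
        if out_e:
            out_f[-1] += acc_f
            out_e[-1] += acc_e
        else:
            out_f.append(acc_f)
            out_e.append(acc_e)
    return out_f, out_e


# Made-up observed and expected counts; both add up to 50.
freq = [3, 12, 20, 9, 4, 2]
expfreq = [2.5, 11.0, 19.5, 10.0, 4.5, 2.5]

merged_freq, merged_exp = merge_small_bins(freq, expfreq)
print(merged_freq)
print(merged_exp)
```

Merging keeps the totals of both arrays unchanged, so the merged arrays can still be passed to a chisquare test; the number of bins, and with it the degrees of freedom, is reduced.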