6.3.3. statsmodels.sandbox.distributions.estimators
Estimate distribution parameters by various methods: method of moments or matching quantiles, maximum likelihood estimation based on binned data, and Maximum Product-of-Spacings.
- Warning: I'm still finding cut-and-paste and refactoring errors, e.g. hardcoded variables from outer scope in functions.
- Some results don't seem to make sense for the Pareto case; they look better now after correcting some name errors.
- Initially loosely based on a paper and blog by John D. Cook for quantile matching, including his formula for gamma quantile (ppf) matching (from the paper): http://www.codeproject.com/KB/recipes/ParameterPercentile.aspx http://www.johndcook.com/blog/2010/01/31/parameters-from-percentiles/
- This is what I actually used (in parts): http://www.bepress.com/mdandersonbiostat/paper55/
6.3.3.1. quantile based estimator
Only special cases for the number of parameters are implemented so far. Is there a literature for GMM estimation of distribution parameters? Check.
Found one: Wu/Perloff 2007
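As an illustration of exactly identified quantile matching, here is a minimal sketch that assumes a normal distribution, where two quantiles determine the two parameters in closed form. It is not the module's code and not Cook's gamma formula; the function name `normal_from_two_quantiles` is hypothetical, and only the Python standard library is used. Here `x1` and `x2` are values of the random variable, and `p1` and `p2` are the probabilities they should correspond to.

```python
from statistics import NormalDist


def normal_from_two_quantiles(x1, p1, x2, p2):
    """Solve for (mu, sigma) so that P(X <= x1) = p1 and P(X <= x2) = p2."""
    if not (0.0 < p1 < p2 < 1.0 and x1 < x2):
        raise ValueError("need 0 < p1 < p2 < 1 and x1 < x2")
    std = NormalDist()  # standard normal, used only for its ppf (inv_cdf)
    z1, z2 = std.inv_cdf(p1), std.inv_cdf(p2)
    # For a location-scale family, x = mu + sigma * z, so two equations
    # in two unknowns can be solved directly.
    sigma = (x2 - x1) / (z2 - z1)
    mu = x1 - sigma * z1
    return mu, sigma


# Hypothetical targets: 5% of the mass below 8, 95% below 12.
mu, sigma = normal_from_two_quantiles(8.0, 0.05, 12.0, 0.95)
fitted = NormalDist(mu, sigma)
print(mu, sigma, fitted.cdf(8.0), fitted.cdf(12.0))
```

With more quantiles than parameters the system is overidentified and has no exact solution in general, which is where a GMM-style objective comes in.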
6.3.3.2. binned estimator
- I added this also
- use it for chi-square tests with estimated distribution parameters
- move this to distribution_extras (next to gof tests powerdiscrepancy and continuous) or add to distribution_patch
example: t-distribution
- works with quantiles if they contain tail quantiles
- results with momentcondquant don't look as good as the MLE estimate
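The idea behind maximum likelihood on binned data can be sketched as follows: the probability of each bin is the difference of the distribution function at its edges, and the counts enter a multinomial log-likelihood. This is a minimal standalone sketch, not the module's `fitbinned`. It assumes an exponential distribution with a single scale parameter, uses made-up illustrative counts, and relies on a hypothetical helper `golden_section_min` for the one-dimensional search, so only the standard library is needed.

```python
import math


def expon_sf(x, scale):
    """Survival function 1 - F(x) of an exponential distribution."""
    return math.exp(-x / scale)


def binned_negloglike(scale, freq, binedges):
    """Negative multinomial log-likelihood (up to a constant) for bin counts."""
    nll = 0.0
    for count, lo, hi in zip(freq, binedges[:-1], binedges[1:]):
        # Bin probability F(hi) - F(lo), written with the survival
        # function to avoid cancellation in the upper tail.
        prob = expon_sf(lo, scale) - expon_sf(hi, scale)
        nll -= count * math.log(prob)
    return nll


def golden_section_min(fun, lo, hi, tol=1e-7):
    """Minimize a unimodal function of one variable on [lo, hi]."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - invphi * (b - a)
        d = a + invphi * (b - a)
        if fun(c) < fun(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


# Illustrative (made-up) counts, chosen close to what scale = 2 would give.
binedges = [0.0, 1.0, 2.0, 4.0, math.inf]
freq = [393, 239, 233, 135]

scale_hat = golden_section_min(
    lambda s: binned_negloglike(s, freq, binedges), 0.1, 20.0
)
print(scale_hat)
```

The same cell probabilities are what a chi-square goodness-of-fit test with estimated parameters would compare against the observed counts.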
TODOs
- rearrange and make sure I don't use module globals (as I did initially). DONE
- make two versions: an exactly identified method of moments with fsolve, and a GMM (?) version with fmin, and maybe the special cases of J. D. Cook. Update: maybe the exact (MM) version is not so interesting compared to GMM.
- add semifrozen version of moment and quantile based estimators, e.g. for beta (both loc and scale fixed), or gamma (loc fixed)
- add beta example to the semifrozen MLE, fitfr, code -> added method of moment estimator to _fitstart for beta
- start a list of how well different estimators, especially current mle work for the different distributions
- need general GMM code (with optimal weights ?), looks like a good example for it
- get example for binned data estimation, mailing list a while ago
- any idea when these are better than mle ?
- check language: I use quantile to mean the value of the random variable, not quantile between 0 and 1.
- for GMM: move moment conditions to separate function, so that they can be used for further analysis, e.g. covariance matrix of parameter estimates
- question: Are GMM properties different for matching quantiles with cdf or ppf? Estimate should be the same, but derivatives of moment conditions differ.
- add maximum spacings estimator, Wikipedia, Per Brodtkorb -> basic version Done
- add parameter estimation based on empirical characteristic function (Carrasco/Florens), especially for stable distribution
- provide a model class based on estimating all distributions, and collect all distribution specific information
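The points about separate moment-condition functions and about matching quantiles with cdf versus ppf can be illustrated with a small sketch. It assumes a normal distribution and exact (population) quantiles; the names `momcond_cdf`, `momcond_ppf` and `jacobian_column` are hypothetical and not part of the module. Both sets of conditions vanish at the same parameter values, while their derivatives with respect to the parameters differ, and those derivatives are what would enter a covariance matrix of the estimates.

```python
from statistics import NormalDist


def momcond_cdf(params, xq, pq):
    """CDF form: model probability at each value minus the target probability."""
    mu, sigma = params
    dist = NormalDist(mu, sigma)
    return [dist.cdf(x) - p for x, p in zip(xq, pq)]


def momcond_ppf(params, xq, pq):
    """PPF form: model quantile at each probability minus the observed value."""
    mu, sigma = params
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf(p) - x for x, p in zip(xq, pq)]


def jacobian_column(cond, params, idx, xq, pq, h=1e-6):
    """Central-difference derivative of each condition w.r.t. params[idx]."""
    up = list(params)
    dn = list(params)
    up[idx] += h
    dn[idx] -= h
    return [(a - b) / (2.0 * h)
            for a, b in zip(cond(up, xq, pq), cond(dn, xq, pq))]


true_params = (1.0, 2.0)
pq = [0.1, 0.5, 0.9]
xq = [NormalDist(*true_params).inv_cdf(p) for p in pq]  # exact quantiles

res_cdf = momcond_cdf(true_params, xq, pq)
res_ppf = momcond_ppf(true_params, xq, pq)
dmu_cdf = jacobian_column(momcond_cdf, true_params, 0, xq, pq)
dmu_ppf = jacobian_column(momcond_ppf, true_params, 0, xq, pq)

print(res_cdf, res_ppf)   # both essentially zero at the true parameters
print(dmu_cdf, dmu_ppf)   # derivatives w.r.t. mu are not the same
```

In the overidentified case with sample quantiles, the two forms need not give identical estimates for a given weighting matrix; the sketch only covers the exact case.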
6.3.3.2.1. References
Ximing Wu, Jeffrey M. Perloff, GMM estimation of a maximum entropy distribution with interval data, Journal of Econometrics, Volume 138, Issue 2, ‘Information and Entropy Econometrics’ - A Volume in Honor of Arnold Zellner, June 2007, Pages 532-546, ISSN 0304-4076, DOI: 10.1016/j.jeconom.2006.05.008. http://www.sciencedirect.com/science/article/B6VC0-4K606TK-4/2/78bc07c6245546374490f777a6bdbbcc http://escholarship.org/uc/item/7jf5w1ht (working paper)
Johnson, Kotz, Balakrishnan: Volume 2
Author: josef-pktd
License: BSD
Created: 2010-04-20
Changes: added Maximum Product-of-Spacings 2010-05-12
6.3.3.2.2. Functions
fit_mps(dist, data[, x0]) | Estimate distribution parameters with Maximum Product-of-Spacings
fitbinned(distfn, freq, binedges, start[, fixed]) | estimate parameters of distribution function for binned data using MLE
fitbinnedgmm(distfn, freq, binedges, start) | estimate parameters of distribution function for binned data using GMM
fitquantilesgmm(distfn, x[, start, pquant, ...]) |
gammamomentcond(distfn, params, mom2[, quantile]) | estimate distribution parameters based method of moments (mean,
gammamomentcond2(distfn, params, mom2[, ...]) | estimate distribution parameters based method of moments (mean,
getstartparams(dist, data) | get starting values for estimation of distribution parameters
hess_ndt(fun, pars, args, options) |
logmps(params, xsorted, dist) | calculate negative log of Product-of-Spacings
momentcondquant(distfn, params, mom2[, ...]) | moment conditions for estimating distribution parameters by matching
momentcondunbound(distfn, params, mom2[, ...]) | moment conditions for estimating distribution parameters using method
momentcondunboundls(distfn, params, mom2[, ...]) | moment conditions for estimating loc and scale of a distribution
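For orientation, the Maximum Product-of-Spacings idea behind `fit_mps` and `logmps` can be sketched independently of the module: sort the data, evaluate the distribution function at each point, and choose the parameters that maximize the product of the differences between consecutive values (with 0 and 1 added at the ends). The sketch below is an assumption-laden illustration, not the module's implementation. It uses an exponential distribution with one scale parameter, synthetic data placed at exact quantiles, and hypothetical helpers `neg_log_spacings` and `golden_section_min`.

```python
import math


def neg_log_spacings(scale, xsorted):
    """Negative log of the product of spacings for an exponential(scale)."""
    # Survival function values at 0, the sorted data, and infinity.
    # Spacings F(x_i) - F(x_{i-1}) equal S(x_{i-1}) - S(x_i); the survival
    # form avoids cancellation in the upper tail.
    sf_vals = [1.0] + [math.exp(-x / scale) for x in xsorted] + [0.0]
    # Tied observations would give a zero spacing and log(0); not handled here.
    return -sum(math.log(hi - lo) for hi, lo in zip(sf_vals[:-1], sf_vals[1:]))


def golden_section_min(fun, lo, hi, tol=1e-7):
    """Minimize a unimodal function of one variable on [lo, hi]."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - invphi * (b - a)
        d = a + invphi * (b - a)
        if fun(c) < fun(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 0.5 * (a + b)


# Synthetic data: exact quantiles of an exponential with scale 3, so that
# all spacings are equal at the true scale.
n, true_scale = 20, 3.0
xsorted = [-true_scale * math.log(1.0 - i / (n + 1)) for i in range(1, n + 1)]

scale_mps = golden_section_min(
    lambda s: neg_log_spacings(s, xsorted), 0.1, 50.0
)
print(scale_mps)
```

Because the spacings sum to one, their product is largest when they are all equal, which is why this particular synthetic sample recovers the scale it was generated from.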