6.3.9. statsmodels.sandbox.distributions.mv_normal

Multivariate Normal and t distributions

Created on Sat May 28 15:38:23 2011

@author: Josef Perktold

TODO: * renaming,

  • after adding t distribution, cov doesn’t make sense for Sigma DONE
  • should mean also be renamed to mu, if there will be distributions with mean != mu
  • not sure about corner cases
    • behavior with (almost) singular sigma or transforms
    • df <= 2, is everything correct if variance is not finite or defined ?
  • check to return possibly univariate distribution for marginals or conditional

    distributions, does univariate special case work? seems ok for conditional

  • are all the extra transformation methods useful outside of testing ? - looks like I have some mixup in definitions of standardize, normalize

  • new methods marginal, conditional, ... just added, typos ? - largely tested for MVNormal, not yet for MVT DONE

  • conditional: reusing, vectorizing, should we reuse a projection matrix or allow for a vectorized, conditional_mean similar to OLS.predict

  • add additional things similar to LikelihoodModelResults? quadratic forms, F distribution, and others ???

  • add Delta method for nonlinear functions here, current function is hidden somewhere in miscmodels

  • raise ValueErrors for wrong input shapes, currently only partially checked

  • quantile method (ppf for equal bounds for multiple testing) is missing http://svitsrv25.epfl.ch/R-doc/library/mvtnorm/html/qmvt.html seems to use just a root finder for inversion of cdf

  • normalize has ambiguous definition, and mixing it up in different versions std from sigma or std from cov ? I would like to get what I need for mvt-cdf, or not univariate standard t distribution has scale=1 but std>1 FIXED: add std_sigma, and normalize uses std_sigma

  • more work: bivariate distributions, inherit from multivariate but overwrite some methods for better efficiency, e.g. cdf and expect

I kept the original MVNormal0 class as reference, can be deleted

6.3.9.1. See Also

sandbox/examples/ex_mvelliptical.py

6.3.9.2. Examples

Note, several parts of these examples are random and the numbers will not be (exactly) the same.

>>> import numpy as np
>>> import statsmodels.sandbox.distributions.mv_normal as mvd
>>>
>>> from numpy.testing import assert_array_almost_equal
>>>
>>> cov3 = np.array([[ 1.  ,  0.5 ,  0.75],
...                    [ 0.5 ,  1.5 ,  0.6 ],
...                    [ 0.75,  0.6 ,  2.  ]])
>>> mu = np.array([-1, 0.0, 2.0])

6.3.9.3. multivariate normal distribution

>>> mvn3 = mvd.MVNormal(mu, cov3)
>>> mvn3.rvs(size=3)
array([[-0.08559948, -1.0319881 ,  1.76073533],
       [ 0.30079522,  0.55859618,  4.16538667],
       [-1.36540091, -1.50152847,  3.87571161]])
>>> mvn3.std
array([ 1.        ,  1.22474487,  1.41421356])
>>> a = [0.0, 1.0, 1.5]
>>> mvn3.pdf(a)
0.013867410439318712
>>> mvn3.cdf(a)
0.31163181123730122

Monte Carlo integration

>>> mvn3.expect_mc(lambda x: (x<a).all(-1), size=100000)
0.30958999999999998
>>> mvn3.expect_mc(lambda x: (x<a).all(-1), size=1000000)
0.31197399999999997

6.3.9.4. multivariate t distribution

>>> mvt3 = mvd.MVT(mu, cov3, 4)
>>> mvt3.rvs(size=4)
array([[-0.94185437,  0.3933273 ,  2.40005487],
       [ 0.07563648,  0.06655433,  7.90752238],
       [ 1.06596474,  0.32701158,  2.03482886],
       [ 3.80529746,  7.0192967 ,  8.41899229]])
>>> mvt3.pdf(a)
0.010402959362646937
>>> mvt3.cdf(a)
0.30269483623249821
>>> mvt3.expect_mc(lambda x: (x<a).all(-1), size=1000000)
0.30271199999999998
>>> mvt3.cov
array([[ 2. ,  1. ,  1.5],
       [ 1. ,  3. ,  1.2],
       [ 1.5,  1.2,  4. ]])
>>> mvt3.corr
array([[ 1.        ,  0.40824829,  0.53033009],
       [ 0.40824829,  1.        ,  0.34641016],
       [ 0.53033009,  0.34641016,  1.        ]])

get normalized distribution

>>> mvt3n = mvt3.normalized()
>>> mvt3n.sigma
array([[ 1.        ,  0.40824829,  0.53033009],
       [ 0.40824829,  1.        ,  0.34641016],
       [ 0.53033009,  0.34641016,  1.        ]])
>>> mvt3n.cov
array([[ 2.        ,  0.81649658,  1.06066017],
       [ 0.81649658,  2.        ,  0.69282032],
       [ 1.06066017,  0.69282032,  2.        ]])

What’s currently there?

>>> [i for i in dir(mvn3) if not i[0]=='_']
['affine_transformed', 'cdf', 'cholsigmainv', 'conditional', 'corr', 'cov',
'expect_mc', 'extra_args', 'logdetsigma', 'logpdf', 'marginal', 'mean',
'normalize', 'normalized', 'normalized2', 'nvars', 'pdf', 'rvs', 'sigma',
'sigmainv', 'standardize', 'standardized', 'std', 'std_sigma', 'whiten']
>>> [i for i in dir(mvt3) if not i[0]=='_']
['affine_transformed', 'cdf', 'cholsigmainv', 'corr', 'cov', 'df', 'expect_mc',
'extra_args', 'logdetsigma', 'logpdf', 'marginal', 'mean', 'normalize',
'normalized', 'normalized2', 'nvars', 'pdf', 'rvs', 'sigma', 'sigmainv',
'standardize', 'standardized', 'std', 'std_sigma', 'whiten']

6.3.9.5. Functions

bivariate_normal(x, mu, cov) Bivariate Gaussian distribution for equal shape X, Y.
expect_mc(dist[, func, size]) calculate expected value of function by Monte Carlo integration
expect_mc_bounds(dist[, func, size, lower, ...]) calculate expected value of function by Monte Carlo integration
mvnormcdf(upper, mu, cov[, lower]) multivariate normal cumulative distribution function
mvstdnormcdf(lower, upper, corrcoef, **kwds) standardized multivariate normal cumulative distribution function
mvstdtprob(a, b, R, df[, ieps, quadkwds, ...]) probability of rectangular area of standard t distribution
quad2d([func, lower, upper])

6.3.9.6. Classes

BivariateNormal(mean, cov)
MVElliptical(mean, sigma, *args, **kwds) Base Class for multivariate elliptical distributions, normal and t
MVNormal(mean, sigma, *args, **kwds) Class for Multivariate Normal Distribution
MVNormal0(mean, cov) Class for Multivariate Normal Distribution
MVT(mean, sigma, df) initialize instance