3.4 Expanding Windows
A common alternative to rolling statistics is to use an expanding window, which yields the value of the statistic with all the data available up to that point in time.
Expanding windows follow a similar interface to .rolling, with the .expanding method returning an Expanding object.
As these calculations are a special case of rolling statistics, they are implemented in pandas such that the following two calls are equivalent:
In [1]: df.rolling(window=len(df), min_periods=1).mean()[:5]
Out[1]:
                   A         B         C         D
2000-01-01 -0.218470 -0.061645 -0.723780  0.551225
2000-01-02 -0.467353  0.357114 -0.172157 -0.007968
2000-01-03 -0.731308  0.165367  0.514631 -0.303931
2000-01-04 -1.003844  0.069892  0.877411 -0.204479
2000-01-05 -1.505896  0.107398  0.957243 -0.025130
In [2]: df.expanding(min_periods=1).mean()[:5]
Out[2]:
                   A         B         C         D
2000-01-01 -0.218470 -0.061645 -0.723780  0.551225
2000-01-02 -0.467353  0.357114 -0.172157 -0.007968
2000-01-03 -0.731308  0.165367  0.514631 -0.303931
2000-01-04 -1.003844  0.069892  0.877411 -0.204479
2000-01-05 -1.505896  0.107398  0.957243 -0.025130
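The equivalence can also be checked programmatically. The following is a minimal sketch, assuming a small randomly generated DataFrame stands in for df (the seed, shape, and column names here are illustrative and are not the data shown above):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df: 10 daily rows, 4 columns of random values.
rng = np.random.default_rng(0)
idx = pd.date_range("2000-01-01", periods=10, freq="D")
df = pd.DataFrame(rng.standard_normal((10, 4)), index=idx, columns=list("ABCD"))

# A rolling window as long as the whole frame never drops old rows,
# so it should behave like an expanding window.
rolled = df.rolling(window=len(df), min_periods=1).mean()
expanded = df.expanding(min_periods=1).mean()

# Compare numerically; tiny floating-point differences are tolerated.
same = np.allclose(rolled.to_numpy(), expanded.to_numpy())
print(same)
```

The final row of the expanding mean should also match the ordinary column means of df, since by then the window covers every row.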
Expanding objects provide a similar set of methods to .rolling objects.
3.4.1 Method Summary
Function | Description |
---|---|
count() | Number of non-null observations |
sum() | Sum of values |
mean() | Mean of values |
median() | Arithmetic median of values |
min() | Minimum |
max() | Maximum |
std() | Unbiased standard deviation |
var() | Unbiased variance |
skew() | Unbiased skewness (3rd moment) |
kurt() | Unbiased kurtosis (4th moment) |
quantile() | Sample quantile (value at %) |
apply() | Generic apply |
cov() | Unbiased covariance (binary) |
corr() | Correlation (binary) |
Aside from not having a window parameter, these functions have the same interfaces as their .rolling counterparts. As above, the parameters they all accept are:

- min_periods: threshold of non-null data points to require. Defaults to the minimum needed to compute the statistic. No NaNs will be output once min_periods non-null data points have been seen.
- center: boolean, whether to set the labels at the center (default is False)
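To illustrate how min_periods gates the output, here is a minimal sketch on a hypothetical four-element Series (the values are chosen only for illustration). It assumes that a missing value does not count toward the threshold, as the description of min_periods above states:

```python
import numpy as np
import pandas as pd

# Hypothetical input with one missing value in the second position.
s = pd.Series([1.0, np.nan, 3.0, 4.0])

# Require at least two non-null observations before emitting a result.
out = s.expanding(min_periods=2).mean()
print(out)
```

The first two results are NaN because only one non-null value has been seen by then. From the third position onward the mean is computed over the non-null values seen so far.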
Note
The .rolling and .expanding methods do not return NaN if there are at least min_periods non-null values in the current window. This differs from cumsum, cumprod, cummax, and cummin, which return NaN in the output wherever a NaN is encountered in the input.
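The difference described in the note can be seen side by side. This is a small sketch on a hypothetical three-element Series, comparing an expanding sum with cumsum:

```python
import numpy as np
import pandas as pd

# Hypothetical input with a missing value in the middle.
s = pd.Series([1.0, np.nan, 3.0])

# Expanding sum: the NaN is skipped, and a value is still emitted at that
# position because min_periods=1 non-null observation has already been seen.
exp_sum = s.expanding(min_periods=1).sum()

# Cumulative sum: the output is NaN wherever the input is NaN.
cum_sum = s.cumsum()

print(pd.DataFrame({"expanding_sum": exp_sum, "cumsum": cum_sum}))
```

Both results agree at positions where the input is non-null; they differ only at the position of the missing value.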
An expanding window statistic will be more stable (and less responsive) than its rolling window counterpart as the increasing window size decreases the relative impact of an individual data point. As an example, here is the mean() output for the previous time series dataset:
In [3]: s.plot(style='k--')
Out[3]: <matplotlib.axes._subplots.AxesSubplot at 0x2b35b9e1f250>
In [4]: s.expanding().mean().plot(style='k')
Out[4]: <matplotlib.axes._subplots.AxesSubplot at 0x2b35b9e1f250>
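The stability claim can also be checked numerically without a plot. The sketch below assumes a hypothetical series of 99 zeros followed by a single spike of 10, and a rolling window of 5; both choices are illustrative only:

```python
import numpy as np
import pandas as pd

# Hypothetical series: flat at zero, then one large observation at the end.
spike = pd.Series([0.0] * 99 + [10.0])

# The expanding mean averages the spike over all 100 points seen so far.
expanding_last = spike.expanding().mean().iloc[-1]

# The rolling mean averages the spike over only the last 5 points.
rolling_last = spike.rolling(window=5).mean().iloc[-1]

print(expanding_last, rolling_last)
```

The expanding mean moves far less in response to the spike than the rolling mean does, which is the trade-off described above: more stability at the cost of responsiveness.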