3 Time Stamps vs. Time Spans
Time-stamped data is the most basic type of time series data: it associates values with points in time. For pandas objects, this means indexing by the points in time themselves, each represented as a Timestamp.
In [1]: pd.Timestamp(datetime(2012, 5, 1))
Out[1]: Timestamp('2012-05-01 00:00:00')
In [2]: pd.Timestamp('2012-05-01')
Out[2]: Timestamp('2012-05-01 00:00:00')
In [3]: pd.Timestamp(2012, 5, 1)
Out[3]: Timestamp('2012-05-01 00:00:00')
However, in many cases it is more natural to associate things like change variables with a time span instead. The span represented by Period can be specified explicitly or inferred from the datetime string format. For example:
In [4]: pd.Period('2011-01')
Out[4]: Period('2011-01', 'M')
In [5]: pd.Period('2012-05', freq='D')
Out[5]: Period('2012-05-01', 'D')
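To make the idea of a span concrete, the following minimal sketch inspects the boundaries of a monthly Period through its start_time and end_time attributes. The variable names are illustrative only and are not part of the original examples.

```python
import pandas as pd

# A monthly Period covers the whole of January 2011.
p = pd.Period('2011-01', freq='M')

start = p.start_time  # first instant of the span
end = p.end_time      # last instant of the span

print(start)
print(end)

# A Timestamp inside the month falls between the two boundaries.
ts = pd.Timestamp('2011-01-15')
print(start <= ts <= end)
```

A Timestamp, by contrast, has no extent: it marks a single instant rather than an interval.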
Timestamp and Period can serve as an index. Lists of Timestamp and Period objects are automatically coerced to DatetimeIndex and PeriodIndex, respectively.
In [6]: dates = [pd.Timestamp('2012-05-01'), pd.Timestamp('2012-05-02'), pd.Timestamp('2012-05-03')]
In [7]: ts = pd.Series(np.random.randn(3), dates)
In [8]: type(ts.index)
Out[8]: pandas.tseries.index.DatetimeIndex
In [9]: ts.index
Out[9]: DatetimeIndex(['2012-05-01', '2012-05-02', '2012-05-03'], dtype='datetime64[ns]', freq=None)
In [10]: ts
Out[10]:
2012-05-01 0.469112
2012-05-02 -0.282863
2012-05-03 -1.509059
dtype: float64
In [11]: periods = [pd.Period('2012-01'), pd.Period('2012-02'), pd.Period('2012-03')]
In [12]: ts = pd.Series(np.random.randn(3), periods)
In [13]: type(ts.index)
Out[13]: pandas.tseries.period.PeriodIndex
In [14]: ts.index
Out[14]: PeriodIndex(['2012-01', '2012-02', '2012-03'], dtype='int64', freq='M')
In [15]: ts
Out[15]:
2012-01 -1.135632
2012-02 1.212112
2012-03 -0.173215
Freq: M, dtype: float64
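The module path printed by type() depends on the installed pandas version, so a check with isinstance against the public pd.DatetimeIndex and pd.PeriodIndex classes may be a more portable way to confirm the coercion. The sketch below assumes nothing beyond the public pandas namespace; the variable names are hypothetical.

```python
import pandas as pd

dates = [pd.Timestamp('2012-05-01'), pd.Timestamp('2012-05-02')]
periods = [pd.Period('2012-01'), pd.Period('2012-02')]

# Passing plain Python lists as the index triggers the coercion.
by_stamp = pd.Series([1.0, 2.0], index=dates)
by_span = pd.Series([1.0, 2.0], index=periods)

is_datetime_index = isinstance(by_stamp.index, pd.DatetimeIndex)
is_period_index = isinstance(by_span.index, pd.PeriodIndex)

print(is_datetime_index)
print(is_period_index)
```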
pandas allows you to capture both representations and convert between them. Under the hood, pandas represents timestamps using instances of Timestamp and sequences of timestamps using instances of DatetimeIndex. For regular time spans, pandas uses Period objects for scalar values and PeriodIndex for sequences of spans. Better support for irregular intervals with arbitrary start and end points is forthcoming in future releases.
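As a minimal sketch of converting between the two representations, the example below uses Series.to_period and Series.to_timestamp. The sample values and variable names are made up for illustration.

```python
import pandas as pd

# A Series indexed by timestamps (the index is a DatetimeIndex).
stamped = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.to_datetime(['2012-01-15', '2012-02-15', '2012-03-15']),
)

# Timestamps -> monthly spans: each point is mapped to the month containing it.
spanned = stamped.to_period('M')

# Spans -> timestamps: each span is mapped to its first instant.
restamped = spanned.to_timestamp(how='start')

print(spanned.index)
print(restamped.index)
```

Note that the round trip is not lossless: converting to monthly spans discards the day of the month, so the timestamps that come back mark the start of each month rather than the original dates.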