9 Time Series
pandas provides simple, powerful, and efficient resampling for frequency conversion (e.g., aggregating one-second data into 5-minute bins). This is extremely common in, but not limited to, financial applications. See the Time Series section.
In [1]: rng = pd.date_range('1/1/2012', periods=100, freq='S')
In [2]: ts = pd.Series(np.random.randint(0, 500, len(rng)), index=rng)
In [3]: ts.resample('5Min').sum()
Out[3]:
2012-01-01 26448
Freq: 5T, dtype: int64
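Because the example above uses random values, its sum cannot be checked by hand. The following is a minimal sketch of the same resampling with deterministic data. The variable names and the ten-minute span are assumptions made for illustration. The frequency is given as a `Timedelta` and as the `'5min'` alias, which should be accepted by both older and newer pandas releases.

```python
import numpy as np
import pandas as pd

# Ten minutes of one-per-second observations, each with the value 1,
# so the sum of a bin equals the number of seconds that fall into it.
rng = pd.date_range("2012-01-01", periods=600, freq=pd.Timedelta(seconds=1))
ts = pd.Series(np.ones(len(rng), dtype="int64"), index=rng)

# Downsample to 5-minute bins; by default each bin is labeled by its left edge.
five_min_sum = ts.resample("5min").sum()
five_min_mean = ts.resample("5min").mean()

print(five_min_sum)
```

Each of the two bins covers 300 seconds, so both sums are 300 and both means are 1.0. Swapping `sum()` for another aggregation such as `mean()` or `max()` changes only how the values in each bin are combined, not how the bins are formed.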
Time zone representation:
In [4]: rng = pd.date_range('3/6/2012 00:00', periods=5, freq='D')
In [5]: ts = pd.Series(np.random.randn(len(rng)), rng)
In [6]: ts
Out[6]:
2012-03-06 -0.919854
2012-03-07 -0.042379
2012-03-08 1.247642
2012-03-09 -0.009920
2012-03-10 0.290213
Freq: D, dtype: float64
In [7]: ts_utc = ts.tz_localize('UTC')
In [8]: ts_utc
Out[8]:
2012-03-06 00:00:00+00:00 -0.919854
2012-03-07 00:00:00+00:00 -0.042379
2012-03-08 00:00:00+00:00 1.247642
2012-03-09 00:00:00+00:00 -0.009920
2012-03-10 00:00:00+00:00 0.290213
Freq: D, dtype: float64
Converting to another time zone:
In [9]: ts_utc.tz_convert('US/Eastern')
Out[9]:
2012-03-05 19:00:00-05:00 -0.919854
2012-03-06 19:00:00-05:00 -0.042379
2012-03-07 19:00:00-05:00 1.247642
2012-03-08 19:00:00-05:00 -0.009920
2012-03-09 19:00:00-05:00 0.290213
Freq: D, dtype: float64
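The two calls above do different things. `tz_localize` attaches a zone to naive timestamps, while `tz_convert` re-expresses timestamps that already carry a zone. The sketch below makes the difference explicit with deterministic values. The variable names are illustrative, and the zone is written as `America/New_York`, the canonical tz database name behind `US/Eastern`.

```python
import numpy as np
import pandas as pd

# Five naive daily timestamps (no time zone attached).
rng = pd.date_range("2012-03-06", periods=5, freq="D")
ts = pd.Series(np.arange(len(rng), dtype="float64"), index=rng)

# tz_localize: interpret the naive wall-clock times as UTC.
ts_utc = ts.tz_localize("UTC")

# tz_convert: keep the same instants, show them in another zone.
ts_eastern = ts_utc.tz_convert("America/New_York")

# The labels differ, but both indexes refer to the same moments in time.
same_instants = bool((ts_utc.index == ts_eastern.index).all())

print(ts_eastern.index[0], same_instants)
```

Neither call changes the data values. Only the index labels change, and after `tz_convert` they still identify the same instants.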
Converting between time span representations:
In [10]: rng = pd.date_range('1/1/2012', periods=5, freq='M')
In [11]: ts = pd.Series(np.random.randn(len(rng)), index=rng)
In [12]: ts
Out[12]:
2012-01-31 0.495767
2012-02-29 0.362949
2012-03-31 1.548106
2012-04-30 -1.131345
2012-05-31 -0.089329
Freq: M, dtype: float64
In [13]: ps = ts.to_period()
In [14]: ps
Out[14]:
2012-01 0.495767
2012-02 0.362949
2012-03 1.548106
2012-04 -1.131345
2012-05 -0.089329
Freq: M, dtype: float64
In [15]: ps.to_timestamp()
Out[15]:
2012-01-01 0.495767
2012-02-01 0.362949
2012-03-01 1.548106
2012-04-01 -1.131345
2012-05-01 -0.089329
Freq: MS, dtype: float64
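The round trip above is not lossless. A monthly period stands for the whole month, so converting back to timestamps has to pick one point inside it, and the default is the start. The following sketch, with illustrative names and deterministic values, shows both ends of each period. It builds the index from a `pd.offsets.MonthEnd()` object rather than a string alias; that is an assumption made here so that the example does not depend on version-specific alias spellings.

```python
import numpy as np
import pandas as pd

# Month-end timestamps for January through May 2012.
rng = pd.date_range("2012-01-31", periods=5, freq=pd.offsets.MonthEnd())
ts = pd.Series(np.arange(len(rng), dtype="float64"), index=rng)

# Timestamps -> monthly periods: each label now stands for a whole month.
ps = ts.to_period()

# Periods -> timestamps: the default maps each period to its start.
starts = ps.to_timestamp()

# how="end" maps each period to its final moment instead.
ends = ps.to_timestamp(how="end")

print(ps.index[1], starts.index[1], ends.index[1])
```

For February 2012 the period label is `2012-02`, the start timestamp falls on 2012-02-01, and the end timestamp falls on 2012-02-29, the last day of that leap-year month.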
Converting between period and timestamp enables some convenient arithmetic functions to be used. In the following example, we convert a quarterly frequency with the fiscal year ending in November to 9am on the first day of the month following the quarter end:
In [16]: prng = pd.period_range('1990Q1', '2000Q4', freq='Q-NOV')
In [17]: ts = pd.Series(np.random.randn(len(prng)), prng)
In [18]: ts.index = (prng.asfreq('M', 'e') + 1).asfreq('H', 's') + 9
In [19]: ts.head()
Out[19]:
1990-03-01 09:00 0.337863
1990-06-01 09:00 -0.945867
1990-09-01 09:00 -0.932132
1990-12-01 09:00 1.956030
1991-03-01 09:00 0.017587
Freq: H, dtype: float64
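The chained expression in the example above is easier to follow when traced for a single period. The sketch below does this for the first quarter only. The intermediate variable names are hypothetical, and the hourly frequency is spelled `'h'` on the assumption that the installed pandas accepts it (older releases document it as `'H'`).

```python
import pandas as pd

# First quarter of a fiscal year that ends in November 1990,
# i.e. December 1989 through February 1990.
quarter = pd.Period("1990Q1", freq="Q-NOV")

# Step 1: the month in which the quarter ends.
last_month = quarter.asfreq("M", how="end")

# Step 2: adding an integer shifts a Period by that many units of its
# own frequency, so + 1 here means "one month later".
next_month = last_month + 1

# Step 3: the first hour of that month, then nine hours later.
first_hour = next_month.asfreq("h", how="start")
nine_am = first_hour + 9

print(last_month, next_month, nine_am)
```

The same integer means different things at each step: `+ 1` moves by one month because the period is monthly at that point, and `+ 9` moves by nine hours because the period is hourly by then.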