13 Representing out-of-bounds spans
If you have data that falls outside the Timestamp bounds (see Timestamp limitations), you can use a PeriodIndex and/or a Series of Periods to do computations.
In [1]: span = pd.period_range('1215-01-01', '1381-01-01', freq='D')
In [2]: span
Out[2]:
PeriodIndex(['1215-01-01', '1215-01-02', '1215-01-03', '1215-01-04',
'1215-01-05', '1215-01-06', '1215-01-07', '1215-01-08',
'1215-01-09', '1215-01-10',
...
'1380-12-23', '1380-12-24', '1380-12-25', '1380-12-26',
'1380-12-27', '1380-12-28', '1380-12-29', '1380-12-30',
'1380-12-31', '1381-01-01'],
dtype='int64', length=60632, freq='D')
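As a rough illustration of the computations such a span supports, the sketch below rebuilds the same range and applies a few ordinary PeriodIndex operations. The variable names `days_in_1300` and `shifted` are introduced here for illustration only; the exact repr of results can differ between pandas versions.

```python
import pandas as pd

# Same span as above: daily periods well before the earliest Timestamp.
span = pd.period_range('1215-01-01', '1381-01-01', freq='D')

# Field accessors such as .year work on the whole index at once.
days_in_1300 = span[span.year == 1300]

# Adding an integer shifts every period by that many units of its frequency.
shifted = span + 1

print(len(span))          # number of daily periods in the span
print(len(days_in_1300))  # number of days falling in the year 1300
print(shifted[0])         # the first period moved forward by one day
```

Periods are stored as integer ordinals relative to their frequency, which is why they are not restricted to the nanosecond range of Timestamp.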
To convert from an int64-based YYYYMMDD representation:
In [3]: s = pd.Series([20121231, 20141130, 99991231])
In [4]: s
Out[4]:
0 20121231
1 20141130
2 99991231
dtype: int64
In [5]: def conv(x):
...:     return pd.Period(year=x // 10000, month=x // 100 % 100, day=x % 100, freq='D')
...:
In [6]: s.apply(conv)
Out[6]:
0 2012-12-31
1 2014-11-30
2 9999-12-31
dtype: object
In [7]: s.apply(conv)[2]
Out[7]: Period('9999-12-31', 'D')
These can easily be converted to a PeriodIndex:
In [8]: span = pd.PeriodIndex(s.apply(conv))
In [9]: span
Out[9]: PeriodIndex(['2012-12-31', '2014-11-30', '9999-12-31'], dtype='int64', freq='D')
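One way to check the conversion is to round-trip it: rebuild the YYYYMMDD integers from the PeriodIndex fields and compare them with the input. This is a minimal sketch, not part of the original example; the names `periods` and `back` are illustrative.

```python
import pandas as pd

def conv(x):
    # Split an integer such as 20121231 into year, month and day fields.
    return pd.Period(year=x // 10000, month=x // 100 % 100, day=x % 100, freq='D')

s = pd.Series([20121231, 20141130, 99991231])
periods = pd.PeriodIndex(s.apply(conv))

# Recombine the fields into YYYYMMDD integers, vectorized over the index.
back = (periods.year * 10000 + periods.month * 100 + periods.day).tolist()

print(back)
```

If `back` matches the original values, the integer arithmetic in `conv` split each value into the intended fields, including the year 9999 entry that a nanosecond-resolution Timestamp cannot hold.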