7 DatetimeIndex

One of the main uses for DatetimeIndex is as an index for pandas objects. The DatetimeIndex class contains many timeseries related optimizations:

A large range of dates for various offsets are pre-computed and cached under the hood in order to make generating subsequent date ranges very fast (just have to grab a slice)

Fast shifting using the shift and tshift method on pandas objects

Unioning of overlapping DatetimeIndex objects with the same frequency is very fast (important for fast data alignment)

Quick access to date fields via properties such as year, month, etc.

Regularization functions like snap and very fast asof logic

DatetimeIndex objects has all the basic functionality of regular Index objects and a smorgasbord of advanced timeseries-specific methods for easy frequency processing.

7.1 DatetimeIndex Partial String Indexing

You can pass in dates and strings that parse to dates as indexing parameters:

In [6]: ts['1/31/2011']
Out[6]: 0.46911229990718628

In [7]: ts[datetime(2011, 12, 25):]
Out[7]: 
2011-12-30    1.071804
Freq: BM, dtype: float64

In [8]: ts['10/31/2011':'12/31/2011']
Out[8]: 
2011-10-31   -2.104569
2011-11-30   -0.494929
2011-12-30    1.071804
Freq: BM, dtype: float64

To provide convenience for accessing longer time series, you can also pass in the year or year and month as strings:

In [9]: ts['2011']
Out[9]: 
2011-01-31    0.469112
2011-02-28   -0.282863
2011-03-31   -1.509059
2011-04-29   -1.135632
                ...   
2011-09-30   -0.861849
2011-10-31   -2.104569
2011-11-30   -0.494929
2011-12-30    1.071804
Freq: BM, dtype: float64

In [10]: ts['2011-6']
Out[10]: 
2011-06-30   -0.173215
Freq: BM, dtype: float64

This type of slicing will work on a DataFrame with a DateTimeIndex as well. Since the partial string selection is a form of label slicing, the endpoints will be included. This would include matching times on an included date. Here’s an example:

In [11]: dft = pd.DataFrame(randn(100000,1),
   ....:                    columns=['A'],
   ....:                    index=pd.date_range('20130101',periods=100000,freq='T'))
   ....: 

In [12]: dft
Out[12]: 
                            A
2013-01-01 00:00:00  0.721555
2013-01-01 00:01:00 -0.706771
2013-01-01 00:02:00 -1.039575
2013-01-01 00:03:00  0.271860
...                       ...
2013-03-11 10:36:00 -0.777369
2013-03-11 10:37:00  0.293270
2013-03-11 10:38:00  1.076066
2013-03-11 10:39:00 -1.562855

[100000 rows x 1 columns]

In [13]: dft['2013']
Out[13]: 
                            A
2013-01-01 00:00:00  0.721555
2013-01-01 00:01:00 -0.706771
2013-01-01 00:02:00 -1.039575
2013-01-01 00:03:00  0.271860
...                       ...
2013-03-11 10:36:00 -0.777369
2013-03-11 10:37:00  0.293270
2013-03-11 10:38:00  1.076066
2013-03-11 10:39:00 -1.562855

[100000 rows x 1 columns]

This starts on the very first time in the month, and includes the last date & time for the month

In [14]: dft['2013-1':'2013-2']
Out[14]: 
                            A
2013-01-01 00:00:00  0.721555
2013-01-01 00:01:00 -0.706771
2013-01-01 00:02:00 -1.039575
2013-01-01 00:03:00  0.271860
...                       ...
2013-02-28 23:56:00 -0.969105
2013-02-28 23:57:00  0.367078
2013-02-28 23:58:00  0.069096
2013-02-28 23:59:00  0.096317

[84960 rows x 1 columns]

This specifies a stop time that includes all of the times on the last day

In [15]: dft['2013-1':'2013-2-28']
Out[15]: 
                            A
2013-01-01 00:00:00  0.721555
2013-01-01 00:01:00 -0.706771
2013-01-01 00:02:00 -1.039575
2013-01-01 00:03:00  0.271860
...                       ...
2013-02-28 23:56:00 -0.969105
2013-02-28 23:57:00  0.367078
2013-02-28 23:58:00  0.069096
2013-02-28 23:59:00  0.096317

[84960 rows x 1 columns]

This specifies an exact stop time (and is not the same as the above)

In [16]: dft['2013-1':'2013-2-28 00:00:00']
Out[16]: 
                            A
2013-01-01 00:00:00  0.721555
2013-01-01 00:01:00 -0.706771
2013-01-01 00:02:00 -1.039575
2013-01-01 00:03:00  0.271860
...                       ...
2013-02-27 23:57:00 -0.102232
2013-02-27 23:58:00 -0.186728
2013-02-27 23:59:00  0.694242
2013-02-28 00:00:00  1.604295

[83521 rows x 1 columns]

We are stopping on the included end-point as it is part of the index

In [17]: dft['2013-1-15':'2013-1-15 12:30:00']
Out[17]: 
                            A
2013-01-15 00:00:00  0.276491
2013-01-15 00:01:00 -0.286093
2013-01-15 00:02:00 -0.457773
2013-01-15 00:03:00  0.133433
...                       ...
2013-01-15 12:27:00  1.327543
2013-01-15 12:28:00 -0.160075
2013-01-15 12:29:00  1.295527
2013-01-15 12:30:00 -0.850073

[751 rows x 1 columns]

Warning

The following selection will raise a KeyError; otherwise this selection methodology would be inconsistent with other selection methods in pandas (as this is not a slice, nor does it resolve to one)

dft['2013-1-15 12:30:00']

To select a single row, use .loc

In [18]: dft.loc['2013-1-15 12:30:00']
Out[18]: 
A   -0.850073
Name: 2013-01-15 12:30:00, dtype: float64

New in version 0.18.0.

DatetimeIndex Partial String Indexing also works on DataFrames with a MultiIndex. For example:

In [19]: dft2 = pd.DataFrame(np.random.randn(20, 1),
   ....:                     columns=['A'],
   ....:                     index=pd.MultiIndex.from_product([pd.date_range('20130101',
   ....:                                                                     periods=10,
   ....:                                                                     freq='12H'),
   ....:                                                      ['a', 'b']]))
   ....: 

In [20]: dft2
Out[20]: 
                              A
2013-01-01 00:00:00 a -0.776320
                    b -0.747967
2013-01-01 12:00:00 a -0.034523
                    b -0.201754
...                         ...
2013-01-05 00:00:00 a  0.488744
                    b  0.550788
2013-01-05 12:00:00 a  0.514308
                    b  0.054993

[20 rows x 1 columns]

In [21]: dft2.loc['2013-01-05']
Out[21]: 
                              A
2013-01-05 00:00:00 a  0.488744
                    b  0.550788
2013-01-05 12:00:00 a  0.514308
                    b  0.054993

In [22]: idx = pd.IndexSlice

In [23]: dft2 = dft2.swaplevel(0, 1).sort_index()

In [24]: dft2.loc[idx[:, '2013-01-05'], :]
Out[24]: 
                              A
a 2013-01-05 00:00:00  0.488744
  2013-01-05 12:00:00  0.514308
b 2013-01-05 00:00:00  0.550788
  2013-01-05 12:00:00  0.054993

7.2 Datetime Indexing

Indexing a DateTimeIndex with a partial string depends on the “accuracy” of the period, in other words how specific the interval is in relation to the frequency of the index. In contrast, indexing with datetime objects is exact, because the objects have exact meaning. These also follow the semantics of including both endpoints.

These datetime objects are specific hours, minutes, and seconds even though they were not explicitly specified (they are 0).

In [25]: dft[datetime(2013, 1, 1):datetime(2013,2,28)]
Out[25]: 
                            A
2013-01-01 00:00:00  0.721555
2013-01-01 00:01:00 -0.706771
2013-01-01 00:02:00 -1.039575
2013-01-01 00:03:00  0.271860
...                       ...
2013-02-27 23:57:00 -0.102232
2013-02-27 23:58:00 -0.186728
2013-02-27 23:59:00  0.694242
2013-02-28 00:00:00  1.604295

[83521 rows x 1 columns]

With no defaults.

In [26]: dft[datetime(2013, 1, 1, 10, 12, 0):datetime(2013, 2, 28, 10, 12, 0)]
Out[26]: 
                            A
2013-01-01 10:12:00  0.074192
2013-01-01 10:13:00 -0.249712
2013-01-01 10:14:00  1.162248
2013-01-01 10:15:00 -0.288962
...                       ...
2013-02-28 10:09:00  0.309910
2013-02-28 10:10:00 -0.000479
2013-02-28 10:11:00  1.785966
2013-02-28 10:12:00 -1.393114

[83521 rows x 1 columns]

7.3 Truncating & Fancy Indexing

A truncate convenience function is provided that is equivalent to slicing:

In [27]: ts.truncate(before='10/31/2011', after='12/31/2011')
Out[27]: 
2011-10-31   -2.104569
2011-11-30   -0.494929
2011-12-30    1.071804
Freq: BM, dtype: float64

Even complicated fancy indexing that breaks the DatetimeIndex’s frequency regularity will result in a DatetimeIndex (but frequency is lost):

In [28]: ts[[0, 2, 6]].index
Out[28]: DatetimeIndex(['2011-01-31', '2011-03-31', '2011-07-29'], dtype='datetime64[ns]', freq=None)

7.4 Time/Date Components

There are several time/date properties that one can access from Timestamp or a collection of timestamps like a DateTimeIndex.

Property	Description
year	The year of the datetime
month	The month of the datetime
day	The days of the datetime
hour	The hour of the datetime
minute	The minutes of the datetime
second	The seconds of the datetime
microsecond	The microseconds of the datetime
nanosecond	The nanoseconds of the datetime
date	Returns datetime.date (does not contain timezone information)
time	Returns datetime.time (does not contain timezone information)
dayofyear	The ordinal day of year
weekofyear	The week ordinal of the year
week	The week ordinal of the year
dayofweek	The numer of the day of the week with Monday=0, Sunday=6
weekday	The number of the day of the week with Monday=0, Sunday=6
weekday_name	The name of the day in a week (ex: Friday)
quarter	Quarter of the date: Jan=Mar = 1, Apr-Jun = 2, etc.
days_in_month	The number of days in the month of the datetime
is_month_start	Logical indicating if first day of month (defined by frequency)
is_month_end	Logical indicating if last day of month (defined by frequency)
is_quarter_start	Logical indicating if first day of quarter (defined by frequency)
is_quarter_end	Logical indicating if last day of quarter (defined by frequency)
is_year_start	Logical indicating if first day of year (defined by frequency)
is_year_end	Logical indicating if last day of year (defined by frequency)
is_leap_year	Logical indicating if the date belongs to a leap year

Furthermore, if you have a Series with datetimelike values, then you can access these properties via the .dt accessor, see the docs