9 .dt accessor
Series has an accessor, .dt, that succinctly returns datetime-like properties for the
values of the Series, if it is a datetime/period-like Series.
The result is a Series, indexed like the existing Series.
# datetime
In [1]: s = pd.Series(pd.date_range('20130101 09:10:12', periods=4))
In [2]: s
Out[2]:
0 2013-01-01 09:10:12
1 2013-01-02 09:10:12
2 2013-01-03 09:10:12
3 2013-01-04 09:10:12
dtype: datetime64[ns]
In [3]: s.dt.hour
Out[3]:
0 9
1 9
2 9
3 9
dtype: int64
In [4]: s.dt.second
Out[4]:
0 12
1 12
2 12
3 12
dtype: int64
In [5]: s.dt.day
Out[5]:
0 1
1 2
2 3
3 4
dtype: int64
This enables nice expressions like this:
In [6]: s[s.dt.day==2]
Out[6]:
1 2013-01-02 09:10:12
dtype: datetime64[ns]
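Because each .dt property is an ordinary Series, the resulting boolean masks can be combined like any others. The following is a minimal sketch that reuses the series from above; the names mask and selected are only illustrative.

```python
import pandas as pd

s = pd.Series(pd.date_range('20130101 09:10:12', periods=4))

# Each comparison yields a boolean Series aligned with s;
# combine them with & (and) or | (or), using parentheses.
mask = (s.dt.day >= 2) & (s.dt.day <= 3)

# Keep only the rows where the mask is True.
selected = s[mask]
print(selected)
```

The parentheses matter, since & binds more tightly than the comparison operators in Python.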
You can easily produce tz-aware transformations:
In [7]: stz = s.dt.tz_localize('US/Eastern')
In [8]: stz
Out[8]:
0 2013-01-01 09:10:12-05:00
1 2013-01-02 09:10:12-05:00
2 2013-01-03 09:10:12-05:00
3 2013-01-04 09:10:12-05:00
dtype: datetime64[ns, US/Eastern]
In [9]: stz.dt.tz
Out[9]: <DstTzInfo 'US/Eastern' LMT-1 day, 19:04:00 STD>
You can also chain these types of operations:
In [10]: s.dt.tz_localize('UTC').dt.tz_convert('US/Eastern')
Out[10]:
0 2013-01-01 04:10:12-05:00
1 2013-01-02 04:10:12-05:00
2 2013-01-03 04:10:12-05:00
3 2013-01-04 04:10:12-05:00
dtype: datetime64[ns, US/Eastern]
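The difference between the two calls is easy to miss: tz_localize attaches a time zone to naive values without changing the wall-clock time, while tz_convert keeps the instant and changes the wall-clock time. A small sketch, assuming the 'US/Eastern' zone is available in the installed time zone database:

```python
import pandas as pd

s = pd.Series(pd.date_range('20130101 09:10:12', periods=4))

# Interpret the naive timestamps as UTC; the clock reading stays 09:10:12.
localized = s.dt.tz_localize('UTC')

# Express the same instants in US/Eastern; the clock reading shifts by -5h.
converted = localized.dt.tz_convert('US/Eastern')

print(localized.dt.hour.tolist())
print(converted.dt.hour.tolist())
```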
You can also format datetime values as strings with Series.dt.strftime(), which
supports the same format as the standard strftime().
# DatetimeIndex
In [11]: s = pd.Series(pd.date_range('20130101', periods=4))
In [12]: s
Out[12]:
0 2013-01-01
1 2013-01-02
2 2013-01-03
3 2013-01-04
dtype: datetime64[ns]
In [13]: s.dt.strftime('%Y/%m/%d')
Out[13]:
0 2013/01/01
1 2013/01/02
2 2013/01/03
3 2013/01/04
dtype: object
# PeriodIndex
In [14]: s = pd.Series(pd.period_range('20130101', periods=4))
In [15]: s
Out[15]:
0 2013-01-01
1 2013-01-02
2 2013-01-03
3 2013-01-04
dtype: object
In [16]: s.dt.strftime('%Y/%m/%d')
Out[16]:
0 2013/01/01
1 2013/01/02
2 2013/01/03
3 2013/01/04
dtype: object
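Note that the result of strftime holds plain strings, so datetime properties are no longer available on it. As a rough sketch, the strings can be parsed back with pd.to_datetime using the same format; the names labels and restored are only illustrative.

```python
import pandas as pd

s = pd.Series(pd.date_range('20130101', periods=4))

# Format each value; the elements of the result are Python str objects.
labels = s.dt.strftime('%Y-%m-%d %H:%M')
print(labels.tolist())

# Parse the strings back into datetimes using the same format string.
restored = pd.to_datetime(labels, format='%Y-%m-%d %H:%M')
print((restored == s).all())
```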
The .dt accessor works for period and timedelta dtypes.
# period
In [17]: s = pd.Series(pd.period_range('20130101', periods=4, freq='D'))
In [18]: s
Out[18]:
0 2013-01-01
1 2013-01-02
2 2013-01-03
3 2013-01-04
dtype: object
In [19]: s.dt.year
Out[19]:
0 2013
1 2013
2 2013
3 2013
dtype: int64
In [20]: s.dt.day
Out[20]:
0 1
1 2
2 3
3 4
dtype: int64
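For period data, the accessor also exposes methods as well as properties. A minimal sketch, assuming a pandas version in which .dt.to_timestamp() is available on period Series; by default it returns the start of each period.

```python
import pandas as pd

s = pd.Series(pd.period_range('20130101', periods=4, freq='D'))

# Convert each daily period to the timestamp at which it starts.
starts = s.dt.to_timestamp()
print(starts)

# The result is datetime-like, so datetime properties work on it too.
print(starts.dt.year.tolist())
```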
# timedelta
In [21]: s = pd.Series(pd.timedelta_range('1 day 00:00:05', periods=4, freq='s'))
In [22]: s
Out[22]:
0 1 days 00:00:05
1 1 days 00:00:06
2 1 days 00:00:07
3 1 days 00:00:08
dtype: timedelta64[ns]
In [23]: s.dt.days
Out[23]:
0 1
1 1
2 1
3 1
dtype: int64
In [24]: s.dt.seconds
Out[24]:
0 5
1 6
2 7
3 8
dtype: int64
In [25]: s.dt.components
Out[25]:
   days  hours  minutes  seconds  milliseconds  microseconds  nanoseconds
0     1      0        0        5             0             0            0
1     1      0        0        6             0             0            0
2     1      0        0        7             0             0            0
3     1      0        0        8             0             0            0
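As the outputs above show, .dt.seconds is only the seconds component of each value, not the whole duration. If the full length in seconds is wanted, .dt.total_seconds() can be used instead. A short sketch comparing the two:

```python
import pandas as pd

s = pd.Series(pd.timedelta_range('1 day 00:00:05', periods=4, freq='s'))

# Seconds component only (the part left over after whole days).
component = s.dt.seconds
print(component.tolist())

# Whole duration expressed in seconds, as floats.
total = s.dt.total_seconds()
print(total.tolist())

# The same total rebuilt by hand from the days and seconds components.
manual = s.dt.days * 86400 + s.dt.seconds
print(manual.tolist())
```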
Note
Series.dt will raise a TypeError if you access it on non-datetimelike values.
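One way to guard against this is to wrap the access in a try/except. The sketch below is illustrative only: hour_or_none is a hypothetical helper, not part of pandas, and it catches both TypeError and AttributeError on the assumption that the exact exception type may differ between pandas versions.

```python
import pandas as pd


def hour_or_none(series):
    """Return series.dt.hour, or None if .dt is not usable (hypothetical helper)."""
    try:
        return series.dt.hour
    except (AttributeError, TypeError):
        # Assumption: the exception type raised for non-datetimelike
        # values depends on the pandas version, so catch both.
        return None


dates = pd.Series(pd.date_range('20130101 09:10:12', periods=2))
numbers = pd.Series([1, 2, 3])

print(hour_or_none(dates))
print(hour_or_none(numbers))
```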