>>> from datetime import datetime, timedelta
>>> import numpy as np
>>> import pandas as pd
>>> np.random.seed(123456)
>>> from pandas import *
>>> randn = np.random.randn
>>> randint = np.random.randint
>>> np.set_printoptions(precision=4, suppress=True)
>>> options.display.max_rows=15
>>> import dateutil
>>> import pytz
>>> from dateutil.relativedelta import relativedelta
>>> from pandas.tseries.api import *
>>> from pandas.tseries.offsets import *
4 Time Deltas
Note
Starting in v0.15.0, we introduce a new scalar type Timedelta, which is a subclass of datetime.timedelta and behaves in a similar manner, but allows compatibility with np.timedelta64 types as well as a host of custom representation, parsing, and attributes.
Timedeltas are differences in times, expressed in different units, e.g. days, hours, minutes, seconds. They can be both positive and negative.
4.1 Parsing
You can construct a Timedelta scalar through various arguments:
# strings
In [1]: Timedelta('1 days')
Out[1]: Timedelta('1 days 00:00:00')
In [2]: Timedelta('1 days 00:00:00')
Out[2]: Timedelta('1 days 00:00:00')
In [3]: Timedelta('1 days 2 hours')
Out[3]: Timedelta('1 days 02:00:00')
In [4]: Timedelta('-1 days 2 min 3us')
Out[4]: Timedelta('-2 days +23:57:59.999997')
# like datetime.timedelta
# note: these MUST be specified as keyword arguments
In [5]: Timedelta(days=1, seconds=1)
Out[5]: Timedelta('1 days 00:00:01')
# integers with a unit
In [6]: Timedelta(1, unit='d')
Out[6]: Timedelta('1 days 00:00:00')
# from a timedelta/np.timedelta64
In [7]: Timedelta(timedelta(days=1, seconds=1))
Out[7]: Timedelta('1 days 00:00:01')
In [8]: Timedelta(np.timedelta64(1, 'ms'))
Out[8]: Timedelta('0 days 00:00:00.001000')
# negative Timedeltas have this string repr
# to be more consistent with datetime.timedelta conventions
In [9]: Timedelta('-1us')
Out[9]: Timedelta('-1 days +23:59:59.999999')
# a NaT
In [10]: Timedelta('nan')
Out[10]: NaT
In [11]: Timedelta('nat')
Out[11]: NaT
DateOffsets (Day, Hour, Minute, Second, Milli, Micro, Nano) can also be used in construction.
In [12]: Timedelta(Second(2))
Out[12]: Timedelta('0 days 00:00:02')
Further, operations among the scalars yield another scalar Timedelta.
In [13]: Timedelta(Day(2)) + Timedelta(Second(2)) + Timedelta('00:00:00.000123')
Out[13]: Timedelta('2 days 00:00:02.000123')
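As a quick check of the forms above, the following sketch builds the same 26-hour interval in several ways and confirms that the results compare equal. The variable names are illustrative only and are not part of pandas.

```python
from datetime import timedelta

import numpy as np
import pandas as pd

# The same interval (1 day + 2 hours) built from a string, keyword
# arguments, an integer with a unit, and the stdlib/NumPy equivalents.
from_string = pd.Timedelta('1 days 2 hours')
from_keywords = pd.Timedelta(days=1, hours=2)
from_integer = pd.Timedelta(26, unit='h')
from_stdlib = pd.Timedelta(timedelta(days=1, hours=2))
from_numpy = pd.Timedelta(np.timedelta64(26, 'h'))

# Chained comparison: every construction yields an equal Timedelta.
all_equal = (from_string == from_keywords == from_integer
             == from_stdlib == from_numpy)

# Timedelta subclasses datetime.timedelta, so stdlib code accepts it.
is_stdlib_timedelta = isinstance(from_string, timedelta)
```

Because the scalar is a datetime.timedelta subclass, methods such as total_seconds() are available on it as well.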
4.1.1 to_timedelta
Warning
Prior to 0.15.0, pd.to_timedelta would return a Series for list-like/Series input, and a np.timedelta64 for scalar input. It now returns a TimedeltaIndex for list-like input, a Series for Series input, and a Timedelta for scalar input.
The arguments to pd.to_timedelta are now (arg, unit='ns', box=True); previously they were (arg, box=True, unit='ns'). The new order is more logical.
Using the top-level pd.to_timedelta, you can convert a scalar, array, list, or Series from a recognized timedelta format / value into a Timedelta type. It will construct a Series if the input is a Series, a scalar if the input is scalar-like, and a TimedeltaIndex otherwise.
You can parse a single string to a Timedelta:
In [14]: to_timedelta('1 days 06:05:01.00003')
Out[14]: Timedelta('1 days 06:05:01.000030')
In [15]: to_timedelta('15.5us')
Out[15]: Timedelta('0 days 00:00:00.000015')
or a list/array of strings:
In [16]: to_timedelta(['1 days 06:05:01.00003', '15.5us', 'nan'])
Out[16]: TimedeltaIndex(['1 days 06:05:01.000030', '0 days 00:00:00.000015', NaT], dtype='timedelta64[ns]', freq=None)
The unit keyword argument specifies the unit of the Timedelta:
In [17]: to_timedelta(np.arange(5), unit='s')
Out[17]: TimedeltaIndex(['00:00:00', '00:00:01', '00:00:02', '00:00:03', '00:00:04'], dtype='timedelta64[ns]', freq=None)
In [18]: to_timedelta(np.arange(5), unit='d')
Out[18]: TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'], dtype='timedelta64[ns]', freq=None)
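The mapping from input type to return type described above can be sketched as follows. This is a minimal illustration with made-up inputs; it only inspects the types that come back.

```python
import pandas as pd

# Scalar in -> Timedelta out
scalar = pd.to_timedelta('1 days 06:05:01')

# List-like in -> TimedeltaIndex out (missing values become NaT)
index = pd.to_timedelta(['1 days', pd.NaT, '2 hours'])

# Series in -> Series out
series = pd.to_timedelta(pd.Series(['1 days', '2 hours']))

# Plain numbers are interpreted according to the unit keyword
seconds = pd.to_timedelta([0, 30, 90], unit='s')
```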
4.1.2 Timedelta limitations
Pandas represents Timedeltas in nanosecond resolution using 64-bit integers. As such, the 64-bit integer limits determine the Timedelta limits.
In [19]: pd.Timedelta.min
Out[19]: Timedelta('-106752 days +00:12:43.145224')
In [20]: pd.Timedelta.max
Out[20]: Timedelta('106751 days 23:47:16.854775')
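To make the link to the integer limits concrete, the sketch below compares the nanosecond counts behind Timedelta.min and Timedelta.max with the int64 range from NumPy. It assumes the .value attribute, which exposes the underlying nanosecond count; the smallest int64 itself is not used for Timedelta.min because it is reserved for NaT.

```python
import numpy as np
import pandas as pd

int64_max = np.iinfo(np.int64).max

# .value is the underlying integer count of nanoseconds.
max_ns = pd.Timedelta.max.value
min_ns = pd.Timedelta.min.value

# The representable span is a little over 292 years in each direction.
span_years = pd.Timedelta.max.days / 365.25
```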
4.2 Operations
You can operate on Series/DataFrames and construct timedelta64[ns] Series through subtraction operations on datetime64[ns] Series, or Timestamps.
In [21]: s = Series(date_range('2012-1-1', periods=3, freq='D'))
In [22]: td = Series([ Timedelta(days=i) for i in range(3) ])
In [23]: df = DataFrame(dict(A = s, B = td))
In [24]: df
Out[24]:
A B
0 2012-01-01 0 days
1 2012-01-02 1 days
2 2012-01-03 2 days
In [25]: df['C'] = df['A'] + df['B']
In [26]: df
Out[26]:
A B C
0 2012-01-01 0 days 2012-01-01
1 2012-01-02 1 days 2012-01-03
2 2012-01-03 2 days 2012-01-05
In [27]: df.dtypes
Out[27]:
A datetime64[ns]
B timedelta64[ns]
C datetime64[ns]
dtype: object
In [28]: s - s.max()
Out[28]:
0 -2 days
1 -1 days
2 0 days
dtype: timedelta64[ns]
In [29]: s - datetime(2011, 1, 1, 3, 5)
Out[29]:
0 364 days 20:55:00
1 365 days 20:55:00
2 366 days 20:55:00
dtype: timedelta64[ns]
In [30]: s + timedelta(minutes=5)
Out[30]:
0 2012-01-01 00:05:00
1 2012-01-02 00:05:00
2 2012-01-03 00:05:00
dtype: datetime64[ns]
In [31]: s + Minute(5)
Out[31]:
0 2012-01-01 00:05:00
1 2012-01-02 00:05:00
2 2012-01-03 00:05:00
dtype: datetime64[ns]
In [32]: s + Minute(5) + Milli(5)
Out[32]:
0 2012-01-01 00:05:00.005
1 2012-01-02 00:05:00.005
2 2012-01-03 00:05:00.005
dtype: datetime64[ns]
Operations with scalars from a timedelta64[ns] series:
In [33]: y = s - s[0]
In [34]: y
Out[34]:
0 0 days
1 1 days
2 2 days
dtype: timedelta64[ns]
Series of timedeltas with NaT values are supported:
In [35]: y = s - s.shift()
In [36]: y
Out[36]:
0 NaT
1 1 days
2 1 days
dtype: timedelta64[ns]
Elements can be set to NaT using np.nan analogously to datetimes:
In [37]: y[1] = np.nan
In [38]: y
Out[38]:
0 NaT
1 NaT
2 1 days
dtype: timedelta64[ns]
Operands can also appear in reversed order (a single scalar object on the left, a Series on the right):
In [39]: s.max() - s
Out[39]:
0 2 days
1 1 days
2 0 days
dtype: timedelta64[ns]
In [40]: datetime(2011, 1, 1, 3, 5) - s
Out[40]:
0 -365 days +03:05:00
1 -366 days +03:05:00
2 -367 days +03:05:00
dtype: timedelta64[ns]
In [41]: timedelta(minutes=5) + s
Out[41]:
0 2012-01-01 00:05:00
1 2012-01-02 00:05:00
2 2012-01-03 00:05:00
dtype: datetime64[ns]
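The type rules running through the examples above (datetime minus datetime gives a timedelta, datetime plus timedelta gives a datetime, in either operand order) can be sketched in one self-contained snippet. The reference date is arbitrary and chosen only for illustration.

```python
from datetime import datetime, timedelta

import pandas as pd

s = pd.Series(pd.date_range('2012-01-01', periods=3, freq='D'))
reference = datetime(2011, 12, 31)

# datetime - datetime -> timedelta
elapsed = s - reference

# datetime + timedelta -> datetime, in either operand order
shifted = s + timedelta(minutes=5)
shifted_reversed = timedelta(minutes=5) + s

# Adding the differences back to the reference recovers the original values.
round_trip = reference + elapsed
```

The dtype kind is 'm' for timedelta results and 'M' for datetime results, which gives a compact way to check what an expression produced.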
min, max and the corresponding idxmin, idxmax operations are supported on frames:
In [42]: A = s - Timestamp('20120101') - Timedelta('00:05:05')
In [43]: B = s - Series(date_range('2012-1-2', periods=3, freq='D'))
In [44]: df = DataFrame(dict(A=A, B=B))
In [45]: df
Out[45]:
A B
0 -1 days +23:54:55 -1 days
1 0 days 23:54:55 -1 days
2 1 days 23:54:55 -1 days
In [46]: df.min()
Out[46]:
A -1 days +23:54:55
B -1 days +00:00:00
dtype: timedelta64[ns]
In [47]: df.min(axis=1)
Out[47]:
0 -1 days
1 -1 days
2 -1 days
dtype: timedelta64[ns]
In [48]: df.idxmin()
Out[48]:
A 0
B 0
dtype: int64
In [49]: df.idxmax()
Out[49]:
A 2
B 0
dtype: int64
min, max, idxmin, idxmax operations are supported on Series as well. A scalar result will be a Timedelta.
In [50]: df.min().max()
Out[50]: Timedelta('-1 days +23:54:55')
In [51]: df.min(axis=1).min()
Out[51]: Timedelta('-1 days +00:00:00')
In [52]: df.min().idxmax()
Out[52]: 'A'
In [53]: df.min(axis=1).idxmin()
Out[53]: 0
You can fillna on timedeltas. Integers will be interpreted as seconds. You can pass a timedelta to get a particular value.
In [54]: y.fillna(0)
Out[54]:
0 0 days
1 0 days
2 1 days
dtype: timedelta64[ns]
In [55]: y.fillna(10)
Out[55]:
0 0 days 00:00:10
1 0 days 00:00:10
2 1 days 00:00:00
dtype: timedelta64[ns]
In [56]: y.fillna(Timedelta('-1 days, 00:00:05'))
Out[56]:
0 -1 days +00:00:05
1 -1 days +00:00:05
2 1 days 00:00:00
dtype: timedelta64[ns]
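How a bare integer fill value is interpreted may depend on the pandas version, so a fill value passed as an explicit Timedelta is the less ambiguous choice. The following sketch, with illustrative data, produces a NaT through shift and then fills it.

```python
import pandas as pd

s = pd.Series(pd.date_range('2012-01-01', periods=3, freq='D'))

# The first element has no predecessor, so its difference is NaT.
y = s - s.shift()

# Fill with an explicit Timedelta rather than a bare integer.
filled = y.fillna(pd.Timedelta(seconds=10))
```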
You can also negate, multiply and use abs on Timedeltas:
In [57]: td1 = Timedelta('-1 days 2 hours 3 seconds')
In [58]: td1
Out[58]: Timedelta('-2 days +21:59:57')
In [59]: -1 * td1
Out[59]: Timedelta('1 days 02:00:03')
In [60]: - td1
Out[60]: Timedelta('1 days 02:00:03')
In [61]: abs(td1)
Out[61]: Timedelta('1 days 02:00:03')
4.3 Reductions
Numeric reduction operations for timedelta64[ns] will return Timedelta objects. As usual, NaT values are skipped during evaluation.
In [62]: y2 = Series(to_timedelta(['-1 days +00:00:05', 'nat', '-1 days +00:00:05', '1 days']))
In [63]: y2
Out[63]:
0 -1 days +00:00:05
1 NaT
2 -1 days +00:00:05
3 1 days 00:00:00
dtype: timedelta64[ns]
In [64]: y2.mean()
Out[64]: Timedelta('-1 days +16:00:03.333333')
In [65]: y2.median()
Out[65]: Timedelta('-1 days +00:00:05')
In [66]: y2.quantile(.1)
Out[66]: Timedelta('-1 days +00:00:05')
In [67]: y2.sum()
Out[67]: Timedelta('-1 days +00:00:10')
4.4 Frequency Conversion
New in version 0.13.
Timedelta Series, TimedeltaIndex, and Timedelta scalars can be converted to other ‘frequencies’ by dividing by another timedelta, or by astyping to a specific timedelta type. These operations yield Series and propagate NaT -> nan.
Note that division by the numpy scalar is true division, while astyping is equivalent to floor division.
In [68]: td = Series(date_range('20130101', periods=4)) - \
....: Series(date_range('20121201', periods=4))
....:
In [69]: td[2] += timedelta(minutes=5, seconds=3)
In [70]: td[3] = np.nan
In [71]: td
Out[71]:
0 31 days 00:00:00
1 31 days 00:00:00
2 31 days 00:05:03
3 NaT
dtype: timedelta64[ns]
# to days
In [72]: td / np.timedelta64(1, 'D')
Out[72]:
0 31.000000
1 31.000000
2 31.003507
3 NaN
dtype: float64
In [73]: td.astype('timedelta64[D]')
Out[73]:
0 31.0
1 31.0
2 31.0
3 NaN
dtype: float64
# to seconds
In [74]: td / np.timedelta64(1, 's')
Out[74]:
0 2678400.0
1 2678400.0
2 2678703.0
3 NaN
dtype: float64
In [75]: td.astype('timedelta64[s]')
Out[75]:
0 2678400.0
1 2678400.0
2 2678703.0
3 NaN
dtype: float64
# to months (these are constant months)
In [76]: td / np.timedelta64(1, 'M')
Out[76]:
0 1.018501
1 1.018501
2 1.018617
3 NaN
dtype: float64
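The result of astype on timedelta data may differ between pandas versions, so a sketch that relies only on division can be more portable. Here true division and floor division by a Timedelta stand in for the two conversions discussed above, and .dt.total_seconds() is used as an alternative route to seconds. The data mirrors the first three rows of td.

```python
import pandas as pd

td = pd.Series([
    pd.Timedelta(days=31),
    pd.Timedelta(days=31),
    pd.Timedelta(days=31, minutes=5, seconds=3),
])

one_day = pd.Timedelta(days=1)

days_true = td / one_day     # true division keeps the fractional day
days_floor = td // one_day   # floor division keeps whole days only
seconds = td.dt.total_seconds()
```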
Dividing or multiplying a timedelta64[ns] Series by an integer or integer Series yields another timedelta64[ns] dtype Series.
In [77]: td * -1
Out[77]:
0 -31 days +00:00:00
1 -31 days +00:00:00
2 -32 days +23:54:57
3 NaT
dtype: timedelta64[ns]
In [78]: td * Series([1, 2, 3, 4])
Out[78]:
0 31 days 00:00:00
1 62 days 00:00:00
2 93 days 00:15:09
3 NaT
dtype: timedelta64[ns]
4.5 Attributes
You can access various components of the Timedelta or TimedeltaIndex directly using the attributes days, seconds, microseconds, nanoseconds. These are identical to the values returned by datetime.timedelta, in that, for example, the .seconds attribute represents the number of seconds >= 0 and < 1 day. These are signed according to whether the Timedelta is signed.
These operations can also be directly accessed via the .dt property of the Series.
Note
Note that the attributes are NOT the displayed values of the Timedelta. Use .components to retrieve the displayed values.
For a Series:
In [79]: td.dt.days
Out[79]:
0 31.0
1 31.0
2 31.0
3 NaN
dtype: float64
In [80]: td.dt.seconds
Out[80]:
0 0.0
1 0.0
2 303.0
3 NaN
dtype: float64
You can access the value of the fields for a scalar Timedelta directly.
In [81]: tds = Timedelta('31 days 5 min 3 sec')
In [82]: tds.days
Out[82]: 31
In [83]: tds.seconds
Out[83]: 303
In [84]: (-tds).seconds
Out[84]: 86097
You can use the .components property to access a reduced form of the timedelta. This returns a DataFrame indexed similarly to the Series. These are the displayed values of the Timedelta.
In [85]: td.dt.components
Out[85]:
days hours minutes seconds milliseconds microseconds nanoseconds
0 31.0 0.0 0.0 0.0 0.0 0.0 0.0
1 31.0 0.0 0.0 0.0 0.0 0.0 0.0
2 31.0 0.0 5.0 3.0 0.0 0.0 0.0
3 NaN NaN NaN NaN NaN NaN NaN
In [86]: td.dt.components.seconds
Out[86]:
0 0.0
1 0.0
2 3.0
3 NaN
Name: seconds, dtype: float64
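The difference between the attributes, the displayed components, and the total duration is easiest to see on a negative scalar. The sketch below is illustrative; it uses the same 31 days, 5 minutes and 3 seconds value as tds above.

```python
import pandas as pd

tds = pd.Timedelta(days=31, minutes=5, seconds=3)
neg = -tds

# Attributes follow datetime.timedelta: days carries the sign,
# while seconds always lies in the range [0, 86400).
attrs = (neg.days, neg.seconds)

# components holds the displayed fields of the (positive) value.
fields = tds.components

# total_seconds() gives the whole signed duration as one number.
total = neg.total_seconds()
```

Here -32 days plus 86097 seconds is the same duration as minus (31 days and 303 seconds), which is why the attributes of a negative value can look surprising at first.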
4.6 TimedeltaIndex
New in version 0.15.0.
To generate an index with time deltas, you can use either the TimedeltaIndex or the timedelta_range constructor.
Using TimedeltaIndex you can pass string-like, Timedelta, timedelta, or np.timedelta64 objects. Passing np.nan/pd.NaT/nat will represent missing values.
In [87]: TimedeltaIndex(['1 days', '1 days, 00:00:05',
....: np.timedelta64(2,'D'), timedelta(days=2,seconds=2)])
....:
Out[87]:
TimedeltaIndex(['1 days 00:00:00', '1 days 00:00:05', '2 days 00:00:00',
'2 days 00:00:02'],
dtype='timedelta64[ns]', freq=None)
Similarly to date_range, you can construct regular ranges of a TimedeltaIndex:
In [88]: timedelta_range(start='1 days', periods=5, freq='D')
Out[88]: TimedeltaIndex(['1 days', '2 days', '3 days', '4 days', '5 days'], dtype='timedelta64[ns]', freq='D')
In [89]: timedelta_range(start='1 days', end='2 days', freq='30T')
Out[89]:
TimedeltaIndex(['1 days 00:00:00', '1 days 00:30:00', '1 days 01:00:00',
'1 days 01:30:00', '1 days 02:00:00', '1 days 02:30:00',
'1 days 03:00:00', '1 days 03:30:00', '1 days 04:00:00',
'1 days 04:30:00', '1 days 05:00:00', '1 days 05:30:00',
'1 days 06:00:00', '1 days 06:30:00', '1 days 07:00:00',
'1 days 07:30:00', '1 days 08:00:00', '1 days 08:30:00',
'1 days 09:00:00', '1 days 09:30:00', '1 days 10:00:00',
'1 days 10:30:00', '1 days 11:00:00', '1 days 11:30:00',
'1 days 12:00:00', '1 days 12:30:00', '1 days 13:00:00',
'1 days 13:30:00', '1 days 14:00:00', '1 days 14:30:00',
'1 days 15:00:00', '1 days 15:30:00', '1 days 16:00:00',
'1 days 16:30:00', '1 days 17:00:00', '1 days 17:30:00',
'1 days 18:00:00', '1 days 18:30:00', '1 days 19:00:00',
'1 days 19:30:00', '1 days 20:00:00', '1 days 20:30:00',
'1 days 21:00:00', '1 days 21:30:00', '1 days 22:00:00',
'1 days 22:30:00', '1 days 23:00:00', '1 days 23:30:00',
'2 days 00:00:00'],
dtype='timedelta64[ns]', freq='30T')
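The shape of these ranges can be checked without printing them in full. In this sketch the frequency is written as '30min' rather than '30T', on the assumption that the spelled-out alias is accepted by a wider range of pandas versions.

```python
import pandas as pd

# Both endpoints are included: 24 hours at 30-minute steps gives 49 values.
half_hours = pd.timedelta_range(start='1 days', end='2 days', freq='30min')

# A start plus a number of periods fixes the end implicitly.
five_days = pd.timedelta_range(start='1 days', periods=5, freq='D')

step = half_hours[1] - half_hours[0]
```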
4.6.1 Using the TimedeltaIndex
Similarly to the other datetime-like indices, DatetimeIndex and PeriodIndex, you can use TimedeltaIndex as the index of pandas objects.
In [90]: s = Series(np.arange(100),
....: index=timedelta_range('1 days', periods=100, freq='h'))
....:
In [91]: s
Out[91]:
1 days 00:00:00 0
1 days 01:00:00 1
1 days 02:00:00 2
1 days 03:00:00 3
1 days 04:00:00 4
1 days 05:00:00 5
1 days 06:00:00 6
..
4 days 21:00:00 93
4 days 22:00:00 94
4 days 23:00:00 95
5 days 00:00:00 96
5 days 01:00:00 97
5 days 02:00:00 98
5 days 03:00:00 99
Freq: H, dtype: int64
Selections work similarly, with coercion on string-likes and slices:
In [92]: s['1 day':'2 day']
Out[92]:
1 days 00:00:00 0
1 days 01:00:00 1
1 days 02:00:00 2
1 days 03:00:00 3
1 days 04:00:00 4
1 days 05:00:00 5
1 days 06:00:00 6
..
2 days 17:00:00 41
2 days 18:00:00 42
2 days 19:00:00 43
2 days 20:00:00 44
2 days 21:00:00 45
2 days 22:00:00 46
2 days 23:00:00 47
Freq: H, dtype: int64
In [93]: s['1 day 01:00:00']
Out[93]: 1
In [94]: s[Timedelta('1 day 1h')]
Out[94]: 1
Furthermore you can use partial string selection and the range will be inferred:
In [95]: s['1 day':'1 day 5 hours']
Out[95]:
1 days 00:00:00 0
1 days 01:00:00 1
1 days 02:00:00 2
1 days 03:00:00 3
1 days 04:00:00 4
1 days 05:00:00 5
Freq: H, dtype: int64
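When string coercion is not wanted, the same selections can be written with explicit Timedelta labels, which leaves nothing to be inferred. A minimal sketch, rebuilding the hourly Series used above:

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(100),
              index=pd.timedelta_range('1 days', periods=100, freq='h'))

# Exact label lookup with a Timedelta key
one_value = s.loc[pd.Timedelta(days=1, hours=1)]

# Label-based slices include both endpoints
window = s.loc[pd.Timedelta(days=1):pd.Timedelta(days=1, hours=5)]
```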
4.6.2 Operations
Finally, the combination of TimedeltaIndex with DatetimeIndex allows certain combination operations that are NaT preserving:
In [96]: tdi = TimedeltaIndex(['1 days', pd.NaT, '2 days'])
In [97]: tdi.tolist()
Out[97]: [Timedelta('1 days 00:00:00'), NaT, Timedelta('2 days 00:00:00')]
In [98]: dti = date_range('20130101', periods=3)
In [99]: dti.tolist()
Out[99]:
[Timestamp('2013-01-01 00:00:00', freq='D'),
Timestamp('2013-01-02 00:00:00', freq='D'),
Timestamp('2013-01-03 00:00:00', freq='D')]
In [100]: (dti + tdi).tolist()
Out[100]: [Timestamp('2013-01-02 00:00:00'), NaT, Timestamp('2013-01-05 00:00:00')]
In [101]: (dti - tdi).tolist()
Out[101]: [Timestamp('2012-12-31 00:00:00'), NaT, Timestamp('2013-01-01 00:00:00')]
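The NaT-preserving behaviour can also be checked element by element rather than by reading the printed lists. This sketch repeats the same two indices and inspects individual positions.

```python
import pandas as pd

tdi = pd.TimedeltaIndex(['1 days', pd.NaT, '2 days'])
dti = pd.date_range('2013-01-01', periods=3)

# Element-wise arithmetic; the missing offset stays missing in the result.
added = dti + tdi
subtracted = dti - tdi
```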
4.6.3 Conversions
Similarly to frequency conversion on a Series above, you can convert these indices to yield another Index.
In [102]: tdi / np.timedelta64(1,'s')
Out[102]: Float64Index([86400.0, nan, 172800.0], dtype='float64')
In [103]: tdi.astype('timedelta64[s]')
Out[103]: Float64Index([86400.0, nan, 172800.0], dtype='float64')
Scalar type ops work as well. These can potentially return a different type of index.
# adding a timedelta and a date -> datelike
In [104]: tdi + Timestamp('20130101')
Out[104]: DatetimeIndex(['2013-01-02', 'NaT', '2013-01-03'], dtype='datetime64[ns]', freq=None)
# subtraction of a date and a timedelta -> datelike
# note that trying to subtract a date from a Timedelta will raise an exception
In [105]: (Timestamp('20130101') - tdi).tolist()
Out[105]: [Timestamp('2012-12-31 00:00:00'), NaT, Timestamp('2012-12-30 00:00:00')]
# timedelta + timedelta -> timedelta
In [106]: tdi + Timedelta('10 days')
Out[106]: TimedeltaIndex(['11 days', NaT, '12 days'], dtype='timedelta64[ns]', freq=None)
# division can result in a Timedelta if the divisor is an integer
In [107]: tdi / 2
Out[107]: TimedeltaIndex(['0 days 12:00:00', NaT, '1 days 00:00:00'], dtype='timedelta64[ns]', freq=None)
# or a Float64Index if the divisor is a Timedelta
In [108]: tdi / tdi[0]
Out[108]: Float64Index([1.0, nan, 2.0], dtype='float64')
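Which kind of index comes back depends on the divisor: a number keeps the result as timedeltas, while a timedelta produces plain floats. The exact class name of the float result may differ between pandas versions, so this sketch checks only its dtype.

```python
import numpy as np
import pandas as pd

tdi = pd.TimedeltaIndex(['1 days', pd.NaT, '2 days'])

halved = tdi / 2                            # timedelta / number -> timedelta
ratio = tdi / tdi[0]                        # timedelta / timedelta -> float
as_seconds = tdi / pd.Timedelta(seconds=1)  # same idea, expressed in seconds
```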
4.7 Resampling
Similar to timeseries resampling, we can resample with a TimedeltaIndex.
In [109]: s.resample('D').mean()
Out[109]:
1 days 11.5
2 days 35.5
3 days 59.5
4 days 83.5
5 days 97.5
Freq: D, dtype: float64
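A self-contained version of the resampling call is sketched below. It rebuilds the hourly Series of 100 integers starting at 1 day, so the first four daily bins each hold 24 values and the last bin holds the remaining 4.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(100),
              index=pd.timedelta_range('1 days', periods=100, freq='h'))

# Group the hourly observations into daily bins and average each bin.
daily_mean = s.resample('D').mean()

# Counting per bin shows how many observations fed each average.
daily_count = s.resample('D').count()
```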