14 Time Zone Handling

Pandas provides rich support for working with timestamps in different time zones using the pytz and dateutil libraries. dateutil support is new in 0.14.1 and is currently only supported for fixed offset and tzfile zones. The default library is pytz. Support for dateutil is provided for compatibility with other applications, e.g. if you use dateutil in other Python packages.

14.1 Working with Time Zones

By default, pandas objects are time zone unaware:

In [1]: rng = pd.date_range('3/6/2012 00:00', periods=15, freq='D')

In [2]: rng.tz is None
Out[2]: True

To supply the time zone, you can use the tz keyword to date_range and other functions. Dateutil time zone strings are distinguished from pytz time zones by starting with dateutil/.

  • In pytz you can find a list of common (and less common) time zones using from pytz import common_timezones, all_timezones (see the short example after this list).
  • dateutil uses the OS timezones so there isn’t a fixed list available. For common zones, the names are the same as pytz.
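
For illustration (a small sketch, not part of the interactive session above), the pytz name lists can be inspected directly:

from pytz import common_timezones, all_timezones

# common_timezones is a curated subset of all_timezones
len(common_timezones) < len(all_timezones)
'Europe/London' in common_timezones
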
# pytz
In [3]: rng_pytz = pd.date_range('3/6/2012 00:00', periods=10, freq='D',
   ...:                          tz='Europe/London')
   ...: 

In [4]: rng_pytz.tz
Out[4]: <DstTzInfo 'Europe/London' LMT-1 day, 23:59:00 STD>

# dateutil
In [5]: rng_dateutil = pd.date_range('3/6/2012 00:00', periods=10, freq='D',
   ...:                              tz='dateutil/Europe/London')
   ...: 

In [6]: rng_dateutil.tz
Out[6]: tzfile('/usr/share/zoneinfo/Europe/London')

# dateutil - utc special case
In [7]: rng_utc = pd.date_range('3/6/2012 00:00', periods=10, freq='D',
   ...:                         tz=dateutil.tz.tzutc())
   ...: 

In [8]: rng_utc.tz
Out[8]: tzutc()

Note that the UTC timezone is a special case in dateutil and should be constructed explicitly as an instance of dateutil.tz.tzutc. You can also construct other timezones explicitly first, which gives you more control over which time zone is used:

# pytz
In [9]: tz_pytz = pytz.timezone('Europe/London')

In [10]: rng_pytz = pd.date_range('3/6/2012 00:00', periods=10, freq='D',
   ....:                          tz=tz_pytz)
   ....: 

In [11]: rng_pytz.tz == tz_pytz
Out[11]: True

# dateutil
In [12]: tz_dateutil = dateutil.tz.gettz('Europe/London')

In [13]: rng_dateutil = pd.date_range('3/6/2012 00:00', periods=10, freq='D',
   ....:                              tz=tz_dateutil)
   ....: 

In [14]: rng_dateutil.tz == tz_dateutil
Out[14]: True

Timestamps, like Python’s datetime.datetime object, can be either time zone naive or time zone aware. Naive time series and DatetimeIndex objects can be localized using tz_localize:

In [15]: ts = pd.Series(np.random.randn(len(rng)), rng)

In [16]: ts_utc = ts.tz_localize('UTC')

In [17]: ts_utc
Out[17]: 
2012-03-06 00:00:00+00:00    0.469112
2012-03-07 00:00:00+00:00   -0.282863
2012-03-08 00:00:00+00:00   -1.509059
2012-03-09 00:00:00+00:00   -1.135632
                               ...   
2012-03-17 00:00:00+00:00    1.071804
2012-03-18 00:00:00+00:00    0.721555
2012-03-19 00:00:00+00:00   -0.706771
2012-03-20 00:00:00+00:00   -1.039575
Freq: D, dtype: float64

Again, you can explicitly construct the timezone object first. You can use the tz_convert method to convert tz-aware pandas data to another time zone:

In [18]: ts_utc.tz_convert('US/Eastern')
Out[18]: 
2012-03-05 19:00:00-05:00    0.469112
2012-03-06 19:00:00-05:00   -0.282863
2012-03-07 19:00:00-05:00   -1.509059
2012-03-08 19:00:00-05:00   -1.135632
                               ...   
2012-03-16 20:00:00-04:00    1.071804
2012-03-17 20:00:00-04:00    0.721555
2012-03-18 20:00:00-04:00   -0.706771
2012-03-19 20:00:00-04:00   -1.039575
Freq: D, dtype: float64

Warning

Be wary of conversions between libraries. For some zones pytz and dateutil have different definitions of the zone. This is more of a problem for unusual timezones than for ‘standard’ zones like US/Eastern.
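
As a quick sanity check (a sketch, not part of the session above; 'Europe/London' is only an illustrative choice, not a known problem zone), you can localize the same wall time with both libraries and compare the results; two aware Timestamps compare equal only if they denote the same UTC instant:

import pandas as pd

ts = pd.Timestamp('2012-06-01 12:00')

# True only if pytz and dateutil map this wall time to the same UTC instant
ts.tz_localize('Europe/London') == ts.tz_localize('dateutil/Europe/London')
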

Warning

Be aware that a timezone definition across versions of timezone libraries may not be considered equal. This may cause problems when working with stored data that is localized using one version and operated on with a different version. See here for how to handle such a situation.
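
As a rough sketch of one possible way to refresh stale zone info (the linked section covers the recommended handling; stored below is only a stand-in for a tz-aware Series read back from disk), tz_convert with a zone string re-attaches the zone object from the currently installed library while leaving the underlying UTC values untouched:

import pandas as pd

# stand-in for a tz-aware Series unpickled from an environment with an older tz library
stored = pd.Series(pd.date_range('2013-01-01', periods=3, tz='US/Eastern'))

# re-attach the zone from the currently installed library; only the tzinfo
# object changes, the underlying UTC values do not
refreshed = stored.dt.tz_convert('US/Eastern')
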

Warning

It is incorrect to pass a timezone directly into the datetime.datetime constructor (e.g., datetime.datetime(2011, 1, 1, tzinfo=pytz.timezone('US/Eastern'))). Instead, the datetime needs to be localized using the localize method on the pytz timezone.
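
For example, with pytz (a small sketch, not part of the session above; dateutil zones, by contrast, work with tzinfo= directly):

import pytz
from datetime import datetime

eastern = pytz.timezone('US/Eastern')

# wrong: the constructor attaches the zone's first (LMT) entry rather than EST/EDT
wrong = datetime(2011, 1, 1, tzinfo=eastern)

# right: let pytz choose the correct offset for this date
right = eastern.localize(datetime(2011, 1, 1))
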

Under the hood, all timestamps are stored in UTC. Scalar values from a DatetimeIndex with a time zone will have their fields (day, hour, minute) localized to the time zone. However, timestamps with the same UTC value are still considered to be equal even if they are in different time zones:

In [19]: rng_eastern = rng_utc.tz_convert('US/Eastern')

In [20]: rng_berlin = rng_utc.tz_convert('Europe/Berlin')

In [21]: rng_eastern[5]
Out[21]: Timestamp('2012-03-10 19:00:00-0500', tz='US/Eastern', freq='D')

In [22]: rng_berlin[5]
Out[22]: Timestamp('2012-03-11 01:00:00+0100', tz='Europe/Berlin', freq='D')

In [23]: rng_eastern[5] == rng_berlin[5]
Out[23]: True
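
The localized fields mentioned above can be read off each scalar directly (a small illustration, shown without the session prompts):

# the same UTC instant, viewed in two different zones
rng_eastern[5].hour   # 19
rng_berlin[5].hour    # 1
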

Like Series, DataFrame, and DatetimeIndex, Timestamps can be converted to other time zones using tz_convert:

In [24]: rng_eastern[5]
Out[24]: Timestamp('2012-03-10 19:00:00-0500', tz='US/Eastern', freq='D')

In [25]: rng_berlin[5]
Out[25]: Timestamp('2012-03-11 01:00:00+0100', tz='Europe/Berlin', freq='D')

In [26]: rng_eastern[5].tz_convert('Europe/Berlin')
Out[26]: Timestamp('2012-03-11 01:00:00+0100', tz='Europe/Berlin')

Localization of a Timestamp works just like for a DatetimeIndex or Series:

In [27]: rng[5]
Out[27]: Timestamp('2012-03-11 00:00:00', freq='D')

In [28]: rng[5].tz_localize('Asia/Shanghai')
Out[28]: Timestamp('2012-03-11 00:00:00+0800', tz='Asia/Shanghai')

Operations between Series in different time zones will yield UTC Series, aligning the data on the UTC timestamps:

In [29]: eastern = ts_utc.tz_convert('US/Eastern')

In [30]: berlin = ts_utc.tz_convert('Europe/Berlin')

In [31]: result = eastern + berlin

In [32]: result
Out[32]: 
2012-03-06 00:00:00+00:00    0.938225
2012-03-07 00:00:00+00:00   -0.565727
2012-03-08 00:00:00+00:00   -3.018117
2012-03-09 00:00:00+00:00   -2.271265
                               ...   
2012-03-17 00:00:00+00:00    2.143608
2012-03-18 00:00:00+00:00    1.443110
2012-03-19 00:00:00+00:00   -1.413542
2012-03-20 00:00:00+00:00   -2.079150
Freq: D, dtype: float64

In [33]: result.index
Out[33]: 
DatetimeIndex(['2012-03-06', '2012-03-07', '2012-03-08', '2012-03-09',
               '2012-03-10', '2012-03-11', '2012-03-12', '2012-03-13',
               '2012-03-14', '2012-03-15', '2012-03-16', '2012-03-17',
               '2012-03-18', '2012-03-19', '2012-03-20'],
              dtype='datetime64[ns, UTC]', freq='D')

To remove the time zone from a tz-aware DatetimeIndex, use tz_localize(None) or tz_convert(None). tz_localize(None) removes the time zone but keeps the local time representation. tz_convert(None) removes the time zone after converting to UTC time.

In [34]: didx = pd.DatetimeIndex(start='2014-08-01 09:00', freq='H', periods=10, tz='US/Eastern')

In [35]: didx
Out[35]: 
DatetimeIndex(['2014-08-01 09:00:00-04:00', '2014-08-01 10:00:00-04:00',
               '2014-08-01 11:00:00-04:00', '2014-08-01 12:00:00-04:00',
               '2014-08-01 13:00:00-04:00', '2014-08-01 14:00:00-04:00',
               '2014-08-01 15:00:00-04:00', '2014-08-01 16:00:00-04:00',
               '2014-08-01 17:00:00-04:00', '2014-08-01 18:00:00-04:00'],
              dtype='datetime64[ns, US/Eastern]', freq='H')

In [36]: didx.tz_localize(None)
Out[36]: 
DatetimeIndex(['2014-08-01 09:00:00', '2014-08-01 10:00:00',
               '2014-08-01 11:00:00', '2014-08-01 12:00:00',
               '2014-08-01 13:00:00', '2014-08-01 14:00:00',
               '2014-08-01 15:00:00', '2014-08-01 16:00:00',
               '2014-08-01 17:00:00', '2014-08-01 18:00:00'],
              dtype='datetime64[ns]', freq='H')

In [37]: didx.tz_convert(None)
Out[37]: 
DatetimeIndex(['2014-08-01 13:00:00', '2014-08-01 14:00:00',
               '2014-08-01 15:00:00', '2014-08-01 16:00:00',
               '2014-08-01 17:00:00', '2014-08-01 18:00:00',
               '2014-08-01 19:00:00', '2014-08-01 20:00:00',
               '2014-08-01 21:00:00', '2014-08-01 22:00:00'],
              dtype='datetime64[ns]', freq='H')

# tz_convert(None) is identical to tz_convert('UTC').tz_localize(None)
In [38]: didx.tz_convert('UTC').tz_localize(None)
Out[38]: 
DatetimeIndex(['2014-08-01 13:00:00', '2014-08-01 14:00:00',
               '2014-08-01 15:00:00', '2014-08-01 16:00:00',
               '2014-08-01 17:00:00', '2014-08-01 18:00:00',
               '2014-08-01 19:00:00', '2014-08-01 20:00:00',
               '2014-08-01 21:00:00', '2014-08-01 22:00:00'],
              dtype='datetime64[ns]', freq='H')

14.2 Ambiguous Times when Localizing

In some cases, tz_localize cannot determine the DST and non-DST hours when there are duplicates. This often happens when reading files or database records that simply duplicate the hours. Passing ambiguous='infer' (the infer_dst argument in prior releases) into tz_localize will attempt to determine the right offset. Below, the first example will fail as it contains ambiguous times, while the second will infer the right offset.

In [39]: rng_hourly = pd.DatetimeIndex(['11/06/2011 00:00', '11/06/2011 01:00',
   ....:                                '11/06/2011 01:00', '11/06/2011 02:00',
   ....:                                '11/06/2011 03:00'])
   ....: 

This will fail as there are ambiguous times:

In [2]: rng_hourly.tz_localize('US/Eastern')
AmbiguousTimeError: Cannot infer dst time from Timestamp('2011-11-06 01:00:00'), try using the 'ambiguous' argument

Infer the ambiguous times:

In [40]: rng_hourly_eastern = rng_hourly.tz_localize('US/Eastern', ambiguous='infer')

In [41]: rng_hourly_eastern.tolist()
Out[41]: 
[Timestamp('2011-11-06 00:00:00-0400', tz='US/Eastern'),
 Timestamp('2011-11-06 01:00:00-0400', tz='US/Eastern'),
 Timestamp('2011-11-06 01:00:00-0500', tz='US/Eastern'),
 Timestamp('2011-11-06 02:00:00-0500', tz='US/Eastern'),
 Timestamp('2011-11-06 03:00:00-0500', tz='US/Eastern')]

In addition to ‘infer’, several other arguments are supported. Passing an array-like of bools or 0s/1s, where True represents a DST hour and False a non-DST hour, allows distinguishing more than one DST transition (e.g., if you have multiple records in a database, each with their own DST transition). Passing ‘NaT’ will fill in transition times with not-a-time values. These options are available in the DatetimeIndex constructor as well as in tz_localize.

In [42]: rng_hourly_dst = np.array([1, 1, 0, 0, 0])

In [43]: rng_hourly.tz_localize('US/Eastern', ambiguous=rng_hourly_dst).tolist()
Out[43]: 
[Timestamp('2011-11-06 00:00:00-0400', tz='US/Eastern'),
 Timestamp('2011-11-06 01:00:00-0400', tz='US/Eastern'),
 Timestamp('2011-11-06 01:00:00-0500', tz='US/Eastern'),
 Timestamp('2011-11-06 02:00:00-0500', tz='US/Eastern'),
 Timestamp('2011-11-06 03:00:00-0500', tz='US/Eastern')]

In [44]: rng_hourly.tz_localize('US/Eastern', ambiguous='NaT').tolist()
Out[44]: 
[Timestamp('2011-11-06 00:00:00-0400', tz='US/Eastern'),
 NaT,
 NaT,
 Timestamp('2011-11-06 02:00:00-0500', tz='US/Eastern'),
 Timestamp('2011-11-06 03:00:00-0500', tz='US/Eastern')]

14.3 TZ Aware Dtypes

New in version 0.17.0.

Series/DatetimeIndex with a timezone naive value are represented with a dtype of datetime64[ns].

In [50]: s_naive = pd.Series(pd.date_range('20130101',periods=3))

In [51]: s_naive
Out[51]: 
0   2013-01-01
1   2013-01-02
2   2013-01-03
dtype: datetime64[ns]

Series/DatetimeIndex with a timezone aware value are represented with a dtype of datetime64[ns, tz].

In [52]: s_aware = pd.Series(pd.date_range('20130101',periods=3,tz='US/Eastern'))

In [53]: s_aware
Out[53]: 
0   2013-01-01 00:00:00-05:00
1   2013-01-02 00:00:00-05:00
2   2013-01-03 00:00:00-05:00
dtype: datetime64[ns, US/Eastern]

Both of these Series can be manipulated via the .dt accessor, see here.

For example, to localize and convert a naive stamp to be timezone aware:

In [54]: s_naive.dt.tz_localize('UTC').dt.tz_convert('US/Eastern')
Out[54]: 
0   2012-12-31 19:00:00-05:00
1   2013-01-01 19:00:00-05:00
2   2013-01-02 19:00:00-05:00
dtype: datetime64[ns, US/Eastern]

Furthermore, you can .astype(...) to timezone-aware (and naive) dtypes. This operation is effectively a localize and convert on a naive stamp, and a convert on an aware stamp.

# localize and convert a naive series
In [55]: s_naive.astype('datetime64[ns, US/Eastern]')
Out[55]: 
0   2012-12-31 19:00:00-05:00
1   2013-01-01 19:00:00-05:00
2   2013-01-02 19:00:00-05:00
dtype: datetime64[ns, US/Eastern]

# make an aware tz naive
In [56]: s_aware.astype('datetime64[ns]')
Out[56]: 
0   2013-01-01 05:00:00
1   2013-01-02 05:00:00
2   2013-01-03 05:00:00
dtype: datetime64[ns]

# convert to a new timezone
In [57]: s_aware.astype('datetime64[ns, CET]')
Out[57]: 
0   2013-01-01 06:00:00+01:00
1   2013-01-02 06:00:00+01:00
2   2013-01-03 06:00:00+01:00
dtype: datetime64[ns, CET]

Note

Using the .values accessor on a Series returns a numpy array of the data. These values are converted to UTC, as numpy does not currently support timezones (even though it may print in the local timezone!).

In [58]: s_naive.values
Out[58]: 
array(['2013-01-01T00:00:00.000000000', '2013-01-02T00:00:00.000000000',
       '2013-01-03T00:00:00.000000000'], dtype='datetime64[ns]')

In [59]: s_aware.values
Out[59]: 
array(['2013-01-01T05:00:00.000000000', '2013-01-02T05:00:00.000000000',
       '2013-01-03T05:00:00.000000000'], dtype='datetime64[ns]')

Note also that once converted to a numpy array, these values lose their tz information.

In [60]: pd.Series(s_aware.values)
Out[60]: 
0   2013-01-01 05:00:00
1   2013-01-02 05:00:00
2   2013-01-03 05:00:00
dtype: datetime64[ns]

However, these can be easily converted back to tz-aware values:

In [61]: pd.Series(s_aware.values).dt.tz_localize('UTC').dt.tz_convert('US/Eastern')
Out[61]: 
0   2013-01-01 00:00:00-05:00
1   2013-01-02 00:00:00-05:00
2   2013-01-03 00:00:00-05:00
dtype: datetime64[ns, US/Eastern]