.. ipython:: python
   :suppress:

   import numpy as np
   np.set_printoptions(precision=4, suppress=True)
   import pandas as pd
   pd.options.display.max_rows=8

Miscellaneous indexing gotchas
------------------------------

Reindex versus ix gotchas
~~~~~~~~~~~~~~~~~~~~~~~~~

Many users will find themselves using the ``ix`` indexing capabilities as a
concise means of selecting data from a pandas object:

.. ipython:: python

   df = pd.DataFrame(np.random.randn(6, 4), columns=['one', 'two', 'three', 'four'],
                     index=list('abcdef'))
   df
   df.ix[['b', 'c', 'e']]

This is, of course, completely equivalent *in this case* to using the
``reindex`` method:

.. ipython:: python

   df.reindex(['b', 'c', 'e'])

Some might conclude that ``ix`` and ``reindex`` are 100% equivalent based on
this. This is indeed true **except in the case of integer indexing**. For
example, the above operation could alternatively have been expressed as:

.. ipython:: python

   df.ix[[1, 2, 4]]

If you pass ``[1, 2, 4]`` to ``reindex`` you will get something else entirely:

.. ipython:: python

   df.reindex([1, 2, 4])

So it's important to remember that ``reindex`` is **strict label indexing
only**. This can lead to some potentially surprising results in pathological
cases where an index contains, say, both integers and strings:

.. ipython:: python

   s = pd.Series([1, 2, 3], index=['a', 0, 1])
   s
   s.ix[[0, 1]]
   s.reindex([0, 1])

Because the index in this case does not contain solely integers, ``ix`` falls
back on positional (integer) indexing. By contrast, ``reindex`` only looks for
the values passed in the index, thus finding the integer labels ``0`` and
``1``. While it would be possible to insert some logic to check whether a
passed sequence is entirely contained in the index, that logic would exact a
very high cost in large data sets.

Reindex potentially changes underlying Series dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The use of ``reindex_like`` can potentially change the dtype of a ``Series``.

.. ipython:: python

   series = pd.Series([1, 2, 3])
   x = pd.Series([True])
   x.dtype
   x = pd.Series([True]).reindex_like(series)
   x.dtype

This is because ``reindex_like`` silently inserts ``NaN`` values for labels
that are missing from the original object, and the ``dtype`` changes
accordingly. This can cause some issues when using ``numpy`` ``ufuncs`` such
as ``numpy.logical_and``.

See this old issue for a more detailed discussion.
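
The dtype change described above can be sketched in a standalone script. This is only an illustration, not a recommendation from the original text: the variable names ``mask``, ``widened`` and ``kept`` are made up for this example, and using ``reindex`` with ``fill_value=False`` is one assumed way to keep a boolean dtype when ``False`` is an acceptable value for the missing labels.

```python
import numpy as np
import pandas as pd

series = pd.Series([1, 2, 3])
mask = pd.Series([True])

# reindex_like has to fill labels 1 and 2 with NaN, which a bool
# Series cannot hold, so the result is widened to object dtype.
widened = mask.reindex_like(series)

# Assumed workaround: reindex against the same index, but supply an
# explicit boolean fill value so that no NaN is ever inserted.
kept = mask.reindex(series.index, fill_value=False)

print(mask.dtype, widened.dtype, kept.dtype)

# With a genuine bool dtype the ufunc is a plain element-wise AND.
combined = np.logical_and(kept, series < 3)
print(combined.tolist())
```

Whether ``False`` is the right fill value depends on what a missing label should mean in the surrounding computation, so treat the choice as application-specific.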