.. currentmodule:: pandas .. ipython:: python :suppress: import numpy as np np.random.seed(123456) import pandas as pd pd.options.display.max_rows=8 np.set_printoptions(precision=4, suppress=True) Factorizing values ------------------ To encode 1-d values as an enumerated type use ``factorize``: .. ipython:: python x = pd.Series(['A', 'A', np.nan, 'B', 3.14, np.inf]) x labels, uniques = pd.factorize(x) labels uniques Note that ``factorize`` is similar to ``numpy.unique``, but differs in its handling of NaN: .. note:: The following ``numpy.unique`` will fail under Python 3 with a ``TypeError`` because of an ordering bug. See also `Here `__ .. ipython:: python pd.factorize(x, sort=True) np.unique(x, return_inverse=True)[::-1] .. note:: If you just want to handle one column as a categorical variable (like R's factor), you can use ``df["cat_col"] = pd.Categorical(df["col"])`` or ``df["cat_col"] = df["col"].astype("category")``. For full docs on :class:`~pandas.Categorical`, see the :ref:`Categorical introduction ` and the :ref:`API documentation `. This feature was introduced in version 0.15.