7.7 Tiling

The cut function computes groupings for the values of the input array and is often used to transform continuous variables to discrete or categorical variables:

In [1]: ages = np.array([10, 15, 13, 12, 23, 25, 28, 59, 60])

In [2]: pd.cut(ages, bins=3)
Out[2]: 
[(9.95, 26.667], (9.95, 26.667], (9.95, 26.667], (9.95, 26.667], (9.95, 26.667], (9.95, 26.667], (26.667, 43.333], (43.333, 60], (43.333, 60]]
Categories (3, object): [(9.95, 26.667] < (26.667, 43.333] < (43.333, 60]]

If the bins keyword is an integer, then equal-width bins are formed. Alternatively we can specify custom bin-edges:

In [3]: pd.cut(ages, bins=[0, 18, 35, 70])
Out[3]: 
[(0, 18], (0, 18], (0, 18], (0, 18], (18, 35], (18, 35], (18, 35], (35, 70], (35, 70]]
Categories (3, object): [(0, 18] < (18, 35] < (35, 70]]