.. currentmodule:: pandas .. ipython:: python :suppress: import numpy as np np.random.seed(123456) import pandas as pd pd.options.display.max_rows=8 np.set_printoptions(precision=4, suppress=True) Reshaping by pivoting DataFrame objects --------------------------------------- .. code-block:: python import pandas.util.testing as tm; tm.N = 3 def unpivot(frame): N, K = frame.shape data = {'value' : frame.values.ravel('F'), 'variable' : np.asarray(frame.columns).repeat(N), 'date' : np.tile(np.asarray(frame.index), K)} return pd.DataFrame(data, columns=['date', 'variable', 'value']) df = unpivot(tm.makeTimeDataFrame()) .. ipython:: :suppress: In [1]: import pandas.util.testing as tm; tm.N = 3 In [2]: def unpivot(frame): ...: N, K = frame.shape ...: data = {'value' : frame.values.ravel('F'), ...: 'variable' : np.asarray(frame.columns).repeat(N), ...: 'date' : np.tile(np.asarray(frame.index), K)} ...: columns = ['date', 'variable', 'value'] ...: return pd.DataFrame(data, columns=columns) ...: In [3]: df = unpivot(tm.makeTimeDataFrame()) Data is often stored in CSV files or databases in so-called "stacked" or "record" format: .. ipython:: python df To select out everything for variable ``A`` we could do: .. ipython:: python df[df['variable'] == 'A'] But suppose we wish to do time series operations with the variables. A better representation would be where the ``columns`` are the unique variables and an ``index`` of dates identifies individual observations. To reshape the data into this form, use the ``pivot`` function: .. ipython:: python df df.pivot(index='date', columns='variable', values='value') If the ``values`` argument is omitted, and the input DataFrame has more than one column of values which are not used as column or index inputs to ``pivot``, then the resulting "pivoted" DataFrame will have :ref:`hierarchical columns ` whose topmost level indicates the respective value column: .. ipython:: python df['value2'] = df['value'] * 2 df pivoted = df.pivot('date', 'variable') pivoted You of course can then select subsets from the pivoted DataFrame: .. ipython:: python pivoted['value2'] Note that this returns a view on the underlying data in the case where the data are homogeneously-typed.