3 Panel

Panel is a less commonly used but still important container for 3-dimensional data. The term panel data is derived from econometrics and is partially responsible for the name pandas: pan(el)-da(ta)-s. The names of the 3 axes are intended to lend semantic meaning to operations involving panel data and, in particular, to econometric analysis of panel data. However, for the strict purposes of slicing and dicing a collection of DataFrame objects, you may find the axis names slightly arbitrary:

items: axis 0, each item corresponds to a DataFrame contained inside

major_axis: axis 1, it is the index (rows) of each of the DataFrames

minor_axis: axis 2, it is the columns of each of the DataFrames
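
To make the three axes concrete, here is a minimal sketch that uses a plain NumPy array and ordinary DataFrames, so it does not rely on the Panel class being available. The names `values`, `items`, `major_axis`, `minor_axis` and `frames` are illustrative only and are not part of any pandas API.

```python
import numpy as np
import pandas as pd

# Hypothetical labels for a 2 x 5 x 4 block of data.
items = ['Item1', 'Item2']
major_axis = pd.date_range('1/1/2000', periods=5)
minor_axis = ['A', 'B', 'C', 'D']
values = np.arange(40, dtype=float).reshape(2, 5, 4)

# Axis 0 (items): one DataFrame per item.
# Axis 1 (major_axis): the rows of each DataFrame.
# Axis 2 (minor_axis): the columns of each DataFrame.
frames = {
    item: pd.DataFrame(values[i], index=major_axis, columns=minor_axis)
    for i, item in enumerate(items)
}

print(frames['Item2'].shape)  # (5, 4)
```

Each entry of `frames` plays the role of one item, with the shared row and column labels standing in for the major and minor axes.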

Construction of Panels works much as you would expect:

3.1 From 3D ndarray with optional axis labels

In [1]: wp = pd.Panel(np.random.randn(2, 5, 4), items=['Item1', 'Item2'],
   ...:               major_axis=pd.date_range('1/1/2000', periods=5),
   ...:               minor_axis=['A', 'B', 'C', 'D'])
   ...: 

In [2]: wp
Out[2]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 5 (major_axis) x 4 (minor_axis)
Items axis: Item1 to Item2
Major_axis axis: 2000-01-01 00:00:00 to 2000-01-05 00:00:00
Minor_axis axis: A to D

3.2 From dict of DataFrame objects

In [3]: data = {'Item1' : pd.DataFrame(np.random.randn(4, 3)),
   ...:         'Item2' : pd.DataFrame(np.random.randn(4, 2))}
   ...: 

In [4]: pd.Panel(data)
Out[4]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 4 (major_axis) x 3 (minor_axis)
Items axis: Item1 to Item2
Major_axis axis: 0 to 3
Minor_axis axis: 0 to 2

Note that the values in the dict need only be convertible to DataFrame. Thus, they can be any of the other valid inputs to DataFrame as per above.

One helpful factory method is Panel.from_dict, which takes a dictionary of DataFrames as above, and the following named parameters:

Parameter	Default	Description
intersect	`False`	drops elements whose indices do not align
orient	`items`	use `minor` to use DataFrames’ columns as panel items

For example, compare to the construction above:

In [5]: pd.Panel.from_dict(data, orient='minor')
Out[5]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 3 (items) x 4 (major_axis) x 2 (minor_axis)
Items axis: 0 to 2
Major_axis axis: 0 to 3
Minor_axis axis: Item1 to Item2

The orient parameter is especially useful for mixed-type DataFrames. If you pass a dict of DataFrame objects with mixed-type columns, all of the data will be upcast to dtype=object unless you pass orient='minor':

In [6]: df = pd.DataFrame({'a': ['foo', 'bar', 'baz'],
   ...:                    'b': np.random.randn(3)})
   ...: 

In [7]: df
Out[7]: 
     a       b
0  foo  0.0623
1  bar -0.1104
2  baz -1.1844

In [8]: data = {'item1': df, 'item2': df}

In [9]: panel = pd.Panel.from_dict(data, orient='minor')

In [10]: panel['a']
Out[10]: 
  item1 item2
0   foo   foo
1   bar   bar
2   baz   baz

In [11]: panel['b']
Out[11]: 
    item1   item2
0  0.0623  0.0623
1 -0.1104 -0.1104
2 -1.1844 -1.1844

In [12]: panel['b'].dtypes
Out[12]: 
item1    float64
item2    float64
dtype: object
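
The reason orient='minor' keeps the dtypes intact can be sketched with ordinary DataFrames: regrouping the dict by column puts all values of one column, and therefore one dtype, into the same frame. This is only an illustration of the idea, not how Panel.from_dict is implemented, and `by_column` is a hypothetical name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['foo', 'bar', 'baz'],
                   'b': np.random.randn(3)})
data = {'item1': df, 'item2': df}

# Regroup by column: one DataFrame per original column, whose columns
# are the dict keys. Each new frame then holds a single dtype.
by_column = {
    col: pd.DataFrame({key: frame[col] for key, frame in data.items()})
    for col in df.columns
}

print(by_column['b'].dtypes)
```

The numeric column `b` stays float64 because it is never stored alongside the strings in `a`.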

Note

Unfortunately Panel, being less commonly used than Series and DataFrame, has been slightly neglected feature-wise. A number of methods and options available in DataFrame are not available in Panel. This will get worked on, of course, in future releases. And faster if you join me in working on the codebase.

3.3 From DataFrame using `to_panel` method

This method was introduced in v0.7 to replace LongPanel.to_long, and converts a DataFrame with a two-level index to a Panel.

In [13]: midx = pd.MultiIndex(levels=[['one', 'two'], ['x','y']], labels=[[1,1,0,0],[1,0,1,0]])

In [14]: df = pd.DataFrame({'A' : [1, 2, 3, 4], 'B': [5, 6, 7, 8]}, index=midx)

In [15]: df.to_panel()
Out[15]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 2 (major_axis) x 2 (minor_axis)
Items axis: A to B
Major_axis axis: one to two
Minor_axis axis: x to y
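
The reshaping involved can be approximated without Panel by unstacking each column: the first index level becomes the rows (major axis) and the second becomes the columns (minor axis). This is a sketch under that assumption; `per_item` is an illustrative name, and the index is built with MultiIndex.from_arrays to produce the same rows as above.

```python
import pandas as pd

midx = pd.MultiIndex.from_arrays([['two', 'two', 'one', 'one'],
                                  ['y', 'x', 'y', 'x']])
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]}, index=midx)

# One 2-D frame per column: first index level becomes the rows,
# second index level becomes the columns.
per_item = {col: df[col].unstack() for col in df.columns}

print(per_item['A'])
```

Here `per_item['A']` and `per_item['B']` correspond to the two items of the resulting Panel.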

3.4 Item selection / addition / deletion

Similar to DataFrame functioning as a dict of Series, Panel is like a dict of DataFrames:

In [16]: wp['Item1']
Out[16]: 
                 A       B       C       D
2000-01-01 -0.0579 -0.3682 -1.1441  0.8612
2000-01-02  0.8002  0.7821 -1.0691 -1.0992
2000-01-03  0.2553  0.0097  0.6611  0.3793
2000-01-04 -0.0084  1.9525 -1.0567  0.5339
2000-01-05 -1.2270  0.0404 -0.5075 -0.2301

In [17]: wp['Item3'] = wp['Item1'] / wp['Item2']

The API for insertion and deletion is the same as for DataFrame. And as with DataFrame, if the item is a valid Python identifier, you can access it as an attribute and tab-complete it in IPython.
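
For reference, the DataFrame side of that analogy looks like the following short sketch, where columns are inserted, removed, and accessed as attributes. The frame and its column names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1.0, 2.0, 3.0], 'B': [2.0, 4.0, 6.0]})

df['C'] = df['A'] / df['B']   # insert a new column, dict-style
b = df.pop('B')               # remove a column and return it
del df['A']                   # delete a column in place

print(list(df.columns))       # ['C']
print(df.C.tolist())          # attribute access: [0.5, 0.5, 0.5]
```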

3.5 Transposing

A Panel can be rearranged using its transpose method (which does not make a copy by default unless the data are heterogeneous):

In [18]: wp.transpose(2, 0, 1)
Out[18]: 
<class 'pandas.core.panel.Panel'>
Dimensions: 4 (items) x 3 (major_axis) x 5 (minor_axis)
Items axis: A to D
Major_axis axis: Item1 to Item3
Minor_axis axis: 2000-01-01 00:00:00 to 2000-01-05 00:00:00
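
The axis permutation reads the same way as numpy.ndarray.transpose. As a rough sketch on a bare array (an assumption made for illustration, standing in for homogeneous data), the arguments name which old axis lands in each new position, and the result is a view rather than a copy:

```python
import numpy as np

values = np.zeros((3, 5, 4))            # items x major_axis x minor_axis
swapped = values.transpose(2, 0, 1)     # new axes: old 2, old 0, old 1

print(swapped.shape)                      # (4, 3, 5)
print(np.shares_memory(values, swapped))  # True: a view, not a copy
```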

3.6 Indexing / Selection

Operation	Syntax	Result
Select item	`wp[item]`	DataFrame
Get slice at major_axis label	`wp.major_xs(val)`	DataFrame
Get slice at minor_axis label	`wp.minor_xs(val)`	DataFrame

For example, using the earlier example data, we could do:

In [19]: wp['Item1']
Out[19]: 
                 A       B       C       D
2000-01-01 -0.0579 -0.3682 -1.1441  0.8612
2000-01-02  0.8002  0.7821 -1.0691 -1.0992
2000-01-03  0.2553  0.0097  0.6611  0.3793
2000-01-04 -0.0084  1.9525 -1.0567  0.5339
2000-01-05 -1.2270  0.0404 -0.5075 -0.2301

In [20]: wp.major_xs(wp.major_axis[2])
Out[20]: 
    Item1   Item2   Item3
A  0.2553  0.6046  0.4222
B  0.0097  2.1215  0.0046
C  0.6611  0.5977  1.1060
D  0.3793  0.5637  0.6729

In [21]: wp.minor_axis
Out[21]: Index([u'A', u'B', u'C', u'D'], dtype='object')

In [22]: wp.minor_xs('C')
Out[22]: 
             Item1   Item2   Item3
2000-01-01 -1.1441 -1.6525  0.6923
2000-01-02 -1.0691  1.1460 -0.9329
2000-01-03  0.6611  0.5977  1.1060
2000-01-04 -1.0567  1.3750 -0.7685
2000-01-05 -0.5075  0.3780 -1.3428
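
The orientation of the two cross-sections can be sketched with a plain 3-D array: fixing a position on one axis leaves a 2-D block whose columns are the items. This is an illustration only; `values`, `major_slice` and `minor_slice` are hypothetical names, and the data are random.

```python
import numpy as np
import pandas as pd

items = ['Item1', 'Item2']
major_axis = pd.date_range('1/1/2000', periods=5)
minor_axis = ['A', 'B', 'C', 'D']
values = np.random.randn(2, 5, 4)   # items x major_axis x minor_axis

# Fix one position on axis 1 (major): rows are minor labels, columns are items.
major_slice = pd.DataFrame(values[:, 2, :].T, index=minor_axis, columns=items)

# Fix one position on axis 2 (minor): rows are major labels, columns are items.
c = minor_axis.index('C')
minor_slice = pd.DataFrame(values[:, :, c].T, index=major_axis, columns=items)

print(major_slice.shape, minor_slice.shape)
```

The shapes mirror the outputs above: a major cross-section is indexed by the minor labels, and a minor cross-section is indexed by the major labels.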

3.7 Squeezing

Another way to change the dimensionality of an object is to squeeze an axis of length 1, which gives a result similar to wp['Item1']:

In [23]: wp.reindex(items=['Item1']).squeeze()
Out[23]: 
                 A       B       C       D
2000-01-01 -0.0579 -0.3682 -1.1441  0.8612
2000-01-02  0.8002  0.7821 -1.0691 -1.0992
2000-01-03  0.2553  0.0097  0.6611  0.3793
2000-01-04 -0.0084  1.9525 -1.0567  0.5339
2000-01-05 -1.2270  0.0404 -0.5075 -0.2301

In [24]: wp.reindex(items=['Item1'], minor=['B']).squeeze()
Out[24]: 
2000-01-01   -0.3682
2000-01-02    0.7821
2000-01-03    0.0097
2000-01-04    1.9525
2000-01-05    0.0404
Freq: D, Name: B, dtype: float64
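
The same squeeze method exists on DataFrame, which makes the behavior easy to try on a small 2-D example. In this sketch (the frame is made up for illustration), each axis of length 1 is dropped and axes with more than one element are left alone:

```python
import pandas as pd

df = pd.DataFrame({'A': [1.0, 2.0], 'B': [3.0, 4.0]})

col = df[['B']].squeeze()            # 2 x 1 frame -> Series named 'B'
cell = df.iloc[:1][['B']].squeeze()  # 1 x 1 frame -> scalar
same = df.squeeze()                  # 2 x 2 frame -> unchanged

print(type(col).__name__)   # Series
print(cell)                 # 3.0
```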

3.8 Conversion to DataFrame

A Panel can be represented in 2D form as a hierarchically indexed DataFrame. See the section on hierarchical indexing for more on this. To convert a Panel to a DataFrame, use the to_frame method:

In [25]: panel = pd.Panel(np.random.randn(3, 5, 4), items=['one', 'two', 'three'],
   ....:                  major_axis=pd.date_range('1/1/2000', periods=5),
   ....:                  minor_axis=['a', 'b', 'c', 'd'])
   ....: 

In [26]: panel.to_frame()
Out[26]: 
                     one     two   three
major      minor                        
2000-01-01 a     -0.5581 -0.2238 -1.3776
           b      0.0778  1.3974  0.4993
           c      0.6295  1.5039 -1.4053
           d     -1.0353 -0.4789  0.1626
...                  ...     ...     ...
2000-01-05 a     -1.2905 -0.3902  0.2525
           b      0.7879  1.2071  1.5006
           c      1.5157  0.1787  1.0532
           d     -0.2765 -1.0042 -2.3386

[20 rows x 3 columns]
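
The layout of that result can be reproduced from a raw 3-D array, which may help when reading it: every (major, minor) pair becomes one row, and each item becomes one column. This is a sketch of the layout only, not the to_frame implementation; `values`, `flat` and `frame` are illustrative names.

```python
import numpy as np
import pandas as pd

items = ['one', 'two', 'three']
major_axis = pd.date_range('1/1/2000', periods=5)
minor_axis = ['a', 'b', 'c', 'd']
values = np.random.randn(3, 5, 4)   # items x major_axis x minor_axis

# Rows enumerate every (major, minor) pair; each item becomes a column.
index = pd.MultiIndex.from_product([major_axis, minor_axis],
                                   names=['major', 'minor'])
flat = values.transpose(1, 2, 0).reshape(-1, len(items))
frame = pd.DataFrame(flat, index=index, columns=items)

print(frame.shape)  # (20, 3)
```

Moving the items axis last before reshaping is what lines up each row of `flat` with one (major, minor) pair in the index.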