7.3 Reshaping by Melt

The melt() function is useful to massage a DataFrame into a format where one or more columns are identifier variables, while all other columns, considered measured variables, are “unpivoted” to the row axis, leaving just two non-identifier columns, “variable” and “value”. The names of those columns can be customized by supplying the var_name and value_name parameters.

For instance,

In [1]: cheese = pd.DataFrame({'first' : ['John', 'Mary'],
   ...:                        'last' : ['Doe', 'Bo'],
   ...:                        'height' : [5.5, 6.0],
   ...:                        'weight' : [130, 150]})
   ...: 

In [2]: cheese
Out[2]: 
  first  height last  weight
0  John     5.5  Doe     130
1  Mary     6.0   Bo     150

In [3]: pd.melt(cheese, id_vars=['first', 'last'])
Out[3]: 
  first last variable  value
0  John  Doe   height    5.5
1  Mary   Bo   height    6.0
2  John  Doe   weight  130.0
3  Mary   Bo   weight  150.0

In [4]: pd.melt(cheese, id_vars=['first', 'last'], var_name='quantity')
Out[4]: 
  first last quantity  value
0  John  Doe   height    5.5
1  Mary   Bo   height    6.0
2  John  Doe   weight  130.0
3  Mary   Bo   weight  150.0

Another way to transform is to use the wide_to_long panel data convenience function.

In [5]: dft = pd.DataFrame({"A1970" : {0 : "a", 1 : "b", 2 : "c"},
   ...:                     "A1980" : {0 : "d", 1 : "e", 2 : "f"},
   ...:                     "B1970" : {0 : 2.5, 1 : 1.2, 2 : .7},
   ...:                     "B1980" : {0 : 3.2, 1 : 1.3, 2 : .1},
   ...:                     "X"     : dict(zip(range(3), np.random.randn(3)))
   ...:                    })
   ...: 

In [6]: dft["id"] = dft.index

In [7]: dft
Out[7]: 
  A1970 A1980  B1970  B1980         X  id
0     a     d    2.5    3.2  0.469112   0
1     b     e    1.2    1.3 -0.282863   1
2     c     f    0.7    0.1 -1.509059   2

In [8]: pd.wide_to_long(dft, ["A", "B"], i="id", j="year")
Out[8]: 
                X  A    B
id year                  
0  1970  0.469112  a  2.5
1  1970 -0.282863  b  1.2
2  1970 -1.509059  c  0.7
0  1980  0.469112  d  3.2
1  1980 -0.282863  e  1.3
2  1980 -1.509059  f  0.1