9.6 Creating Indicator Variables
You can extract dummy variables from string columns. For example, if the values are separated by a '|':
In [1]: s = pd.Series(['a', 'a|b', np.nan, 'a|c'])
In [2]: s.str.get_dummies(sep='|')
Out[2]:
   a  b  c
0  1  0  0
1  1  1  0
2  0  0  0
3  1  0  1
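A common next step is to attach the indicator columns to the frame they came from. The following is a minimal sketch, not part of the original example: the DataFrame, the column name `tags`, and the `tag_` prefix are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one '|'-separated string column
df = pd.DataFrame({"tags": ["a", "a|b", np.nan, "a|c"]})

# One indicator column per distinct token; the missing value yields a row of zeros
indicators = df["tags"].str.get_dummies(sep="|").add_prefix("tag_")

# Join on the shared index to keep the original column next to its indicators
result = df.join(indicators)
print(result)
```

The prefix avoids name clashes when the tokens could coincide with existing column names.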
String Index also supports get_dummies, which returns a MultiIndex.
New in version 0.18.1.
In [3]: idx = pd.Index(['a', 'a|b', np.nan, 'a|c'])
In [4]: idx.str.get_dummies(sep='|')
Out[4]:
MultiIndex(levels=[[0, 1], [0, 1], [0, 1]],
           labels=[[1, 1, 0, 1], [0, 1, 0, 0], [0, 0, 0, 1]],
           names=[u'a', u'b', u'c'])
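Each level of the returned MultiIndex corresponds to one token, and the level names are the tokens themselves. If a tabular view is easier to work with, the result can be converted to a DataFrame. This is a sketch that assumes a pandas version providing `MultiIndex.to_frame`; the variable names `mi` and `frame` are illustrative.

```python
import numpy as np
import pandas as pd

idx = pd.Index(["a", "a|b", np.nan, "a|c"])

# Index.str.get_dummies returns a MultiIndex with one level per token
mi = idx.str.get_dummies(sep="|")

# Turn the levels into ordinary columns; index=False gives a default integer index
frame = mi.to_frame(index=False)
print(list(mi.names))
print(frame)
```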
See also get_dummies().
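Assuming the cross-reference points at the top-level `pandas.get_dummies()`, the difference between the two can be sketched as follows: the string method splits each value on the separator, while the top-level function treats each whole string as a single category. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(["a", "a|b", np.nan, "a|c"])

# Splits on '|' first, so the columns are the individual tokens
split_dummies = s.str.get_dummies(sep="|")

# No splitting: each distinct string becomes its own column; NaN is dropped by default
whole_dummies = pd.get_dummies(s)

print(list(split_dummies.columns))
print(list(whole_dummies.columns))
```

The value dtype of the two results may differ between pandas versions, so compare them by column labels rather than by dtype.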