5.3 Iterating through groups

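With the GroupBy object in hand, iterating through the grouped data works much like itertools.groupby: each iteration yields the group name together with the rows belonging to that group. By default, groups are yielded in sorted key order: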

In [1]: df
Out[1]: 
     A      B       C       D
0  foo    one  0.4691 -0.8618
1  bar    one -0.2829 -2.1046
2  foo    two -1.5091 -0.4949
3  bar  three -1.1356  1.0718
4  foo    two  1.2121  0.7216
5  bar    two -0.1732 -0.7068
6  foo    one  0.1192 -1.0396
7  foo  three -1.0442  0.2719

In [2]: grouped = df.groupby('A')

In [3]: for name, group in grouped:
   ...:     print(name)
   ...:     print(group)
   ...: 
bar
     A      B       C       D
1  bar    one -0.2829 -2.1046
3  bar  three -1.1356  1.0718
5  bar    two -0.1732 -0.7068
foo
     A      B       C       D
0  foo    one  0.4691 -0.8618
2  foo    two -1.5091 -0.4949
4  foo    two  1.2121  0.7216
6  foo    one  0.1192 -1.0396
7  foo  three -1.0442  0.2719
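The comparison with itertools.groupby holds for the shape of the loop, but the two differ in how rows are collected. The following is a minimal sketch, reusing only the A column from the example above, that shows the difference: itertools.groupby merges only consecutive equal keys, while a pandas GroupBy gathers every row that shares a key regardless of position. The variable names runs and sizes are illustrative.

```python
import itertools

import pandas as pd

# Column A from the example DataFrame above.
keys = ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"]

# itertools.groupby starts a new group each time the key changes,
# so the same key can appear several times in the result.
runs = [(key, len(list(items))) for key, items in itertools.groupby(keys)]

# A pandas GroupBy collects all rows with the same key into one group
# and, by default, yields the groups in sorted key order.
df = pd.DataFrame({"A": keys})
sizes = [(name, len(group)) for name, group in df.groupby("A")]

print(runs)
print(sizes)
```

With itertools.groupby you would normally sort the input first to get one group per key; with pandas no such step is needed.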

In the case of grouping by multiple keys, the group name will be a tuple:

In [4]: for name, group in df.groupby(['A', 'B']):
   ...:     print(name)
   ...:     print(group)
   ...: 
('bar', 'one')
     A    B       C       D
1  bar  one -0.2829 -2.1046
('bar', 'three')
     A      B       C       D
3  bar  three -1.1356  1.0718
('bar', 'two')
     A    B       C       D
5  bar  two -0.1732 -0.7068
('foo', 'one')
     A    B       C       D
0  foo  one  0.4691 -0.8618
6  foo  one  0.1192 -1.0396
('foo', 'three')
     A      B       C       D
7  foo  three -1.0442  0.2719
('foo', 'two')
     A    B       C       D
2  foo  two -1.5091 -0.4949
4  foo  two  1.2121  0.7216

Because each group name is an ordinary Python tuple, you can unpack it directly in the for statement if you wish: for (k1, k2), group in df.groupby(['A', 'B']):.
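As a minimal sketch of tuple unpacking in the loop header, the example below rebuilds the DataFrame from the rounded values printed above and counts the rows in each (A, B) group. The dictionary name row_counts is illustrative only.

```python
import pandas as pd

# The example DataFrame, reconstructed from the rounded values shown above.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
    "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
    "C": [0.4691, -0.2829, -1.5091, -1.1356, 1.2121, -0.1732, 0.1192, -1.0442],
    "D": [-0.8618, -2.1046, -0.4949, 1.0718, 0.7216, -0.7068, -1.0396, 0.2719],
})

# Unpack the (A, B) key tuple directly in the for statement.
row_counts = {}
for (k1, k2), group in df.groupby(["A", "B"]):
    row_counts[(k1, k2)] = len(group)

for key, count in row_counts.items():
    print(key, count)
```

Unpacking only works when the name really is a tuple, that is, when grouping by more than one key.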

5.4 Selecting a group

A single group can be selected using GroupBy.get_group():

In [5]: grouped.get_group('bar')
Out[5]: 
     A      B       C       D
1  bar    one -0.2829 -2.1046
3  bar  three -1.1356  1.0718
5  bar    two -0.1732 -0.7068

For an object grouped on multiple columns, pass the keys as a tuple:

In [6]: df.groupby(['A', 'B']).get_group(('bar', 'one'))
Out[6]: 
     A    B       C       D
1  bar  one -0.2829 -2.1046
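Asking get_group() for a key that does not exist raises a KeyError. The sketch below shows one way to guard against that. The helper get_group_or_none is hypothetical and not part of pandas; it wraps get_group() and returns None for a missing key. The groups attribute, which maps each group name to the index labels of its rows, is used to inspect which keys are available.

```python
import pandas as pd

# The example DataFrame, reconstructed from the rounded values shown above.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
    "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
    "C": [0.4691, -0.2829, -1.5091, -1.1356, 1.2121, -0.1732, 0.1192, -1.0442],
    "D": [-0.8618, -2.1046, -0.4949, 1.0718, 0.7216, -0.7068, -1.0396, 0.2719],
})
grouped = df.groupby("A")

# grouped.groups maps each group name to the index labels of its rows.
bar_labels = list(grouped.groups["bar"])


def get_group_or_none(gb, key):
    """Hypothetical helper: return the group for `key`, or None if absent."""
    try:
        return gb.get_group(key)
    except KeyError:
        return None


bar = get_group_or_none(grouped, "bar")        # rows 1, 3, 5
missing = get_group_or_none(grouped, "baz")    # no such group
pair = get_group_or_none(df.groupby(["A", "B"]), ("bar", "one"))  # row 1

print(bar_labels)
print(missing)
print(pair)
```

Returning None is only one possible convention; depending on the caller, letting the KeyError propagate may be the clearer choice.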