2.6 Selection By Position

Warning

Whether a copy or a reference is returned for a setting operation may depend on the context. This is sometimes called chained assignment and should be avoided. See Returning a View versus Copy.
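As a minimal sketch of the warning above (the frame `df` and its values are invented for illustration): a chained form selects first and assigns on the intermediate result, which may or may not be a copy, whereas a single `.iloc` call with both row and column positions assigns on the original object.

```python
import numpy as np
import pandas as pd

# Hypothetical data, used only to illustrate the warning above.
df = pd.DataFrame(np.arange(12.0).reshape(4, 3), columns=list("abc"))

# Chained form (avoid): the first .iloc may hand back a copy, so the
# assignment might never reach df. Shown as a comment only.
#     df.iloc[0:2].iloc[:, 0] = -1.0

# Single-step form: one .iloc call with row and column positions.
df.iloc[0:2, 0] = -1.0

print(df)
```

Only the first two rows of the first column change; the rest of the frame is left as it was.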

pandas provides a suite of methods for purely integer-based indexing. The semantics closely follow Python and NumPy slicing. Indexing is 0-based. When slicing, the start bound is included, while the upper bound is excluded. Trying to use a non-integer, even a valid label, will raise an IndexError.
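The slicing rules above can be sketched on a small Series through the `.iloc` accessor. The data and variable names are made up for illustration, and because the exact exception class raised for a label may differ between pandas versions, the sketch catches both IndexError and TypeError rather than assuming one.

```python
import numpy as np
import pandas as pd

values = [10, 20, 30, 40, 50]                 # illustrative data
s = pd.Series(values, index=list("abcde"))

# 0-based: position 0 is the first element.
first = s.iloc[0]

# Half-open slice: positions 1 and 2 are returned, position 3 is not.
sliced = s.iloc[1:3]

# The same slice on a Python list and a NumPy array picks the same elements.
assert list(sliced) == values[1:3] == list(np.array(values)[1:3])

# A label is rejected even though "a" is a valid label of s.
# The exception class is an assumption here, so two are caught.
try:
    s.iloc["a"]
    label_rejected = False
except (IndexError, TypeError):
    label_rejected = True
```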

The .iloc attribute is the primary access method. The following are valid inputs:

  • An integer e.g. 5
  • A list or array of integers [4, 3, 0]
  • A slice object with ints 1:7
  • A boolean array
  • A callable, see Selection By Callable

In [1]: s1 = pd.Series(np.random.randn(5), index=list(range(0,10,2)))

In [2]: s1
Out[2]: 
0    1.0758
2   -0.1090
4    1.6436
6   -1.4694
8    0.3570
dtype: float64

In [3]: s1.iloc[:3]
Out[3]: 
0    1.0758
2   -0.1090
4    1.6436
dtype: float64

In [4]: s1.iloc[3]
Out[4]: -1.4693879595399115

Note that setting works as well:

In [5]: s1.iloc[:3] = 0

In [6]: s1
Out[6]: 
0    0.0000
2    0.0000
4    0.0000
6   -1.4694
8    0.3570
dtype: float64
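Each kind of input that .iloc accepts, as listed earlier in this section, can be tried on a Series in the same way. The following is a sketch with invented values; the labels are deliberately not 0..n-1 so that positions and labels cannot be confused.

```python
import numpy as np
import pandas as pd

# Illustrative Series with labels 0, 2, 4, 6, 8.
s = pd.Series([1.5, 2.5, 3.5, 4.5, 5.5], index=list(range(0, 10, 2)))

by_int = s.iloc[4]                        # an integer gives a scalar
by_list = s.iloc[[4, 3, 0]]               # a list of integers, in the given order
by_slice = s.iloc[1:4]                    # a slice with ints, upper bound excluded

mask = np.array([True, False, True, False, True])
by_mask = s.iloc[mask]                    # a boolean array of matching length

# A callable receives the object and returns any of the inputs above.
by_callable = s.iloc[lambda x: [0, len(x) - 1]]
```

The list form keeps the requested order, so `by_list` carries the labels 8, 6, 0 rather than a sorted index.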

With a DataFrame

In [7]: df1 = pd.DataFrame(np.random.randn(6,4),
   ...:                    index=list(range(0,12,2)),
   ...:                    columns=list(range(0,8,2)))
   ...: 

In [8]: df1
Out[8]: 
         0       2       4       6
0  -0.6746 -1.7769 -0.9689 -1.2945
2   0.4137  0.2767 -0.4720 -0.0140
4  -0.3625 -0.0062 -0.9231  0.8957
6   0.8052 -1.2064  2.5656  1.4313
8   1.3403 -1.1703 -0.2262  0.4108
10  0.8139  0.1320 -0.8273 -0.0765

Select via integer slicing

In [9]: df1.iloc[:3]
Out[9]: 
        0       2       4       6
0 -0.6746 -1.7769 -0.9689 -1.2945
2  0.4137  0.2767 -0.4720 -0.0140
4 -0.3625 -0.0062 -0.9231  0.8957

In [10]: df1.iloc[1:5, 2:4]
Out[10]: 
        4       6
2 -0.4720 -0.0140
4 -0.9231  0.8957
6  2.5656  1.4313
8 -0.2262  0.4108

Select via integer list

In [11]: df1.iloc[[1, 3, 5], [1, 3]]
Out[11]: 
         2       6
2   0.2767 -0.0140
6  -1.2064  1.4313
10  0.1320 -0.0765

In [12]: df1.iloc[1:3, :]
Out[12]: 
        0       2       4       6
2  0.4137  0.2767 -0.4720 -0.0140
4 -0.3625 -0.0062 -0.9231  0.8957

In [13]: df1.iloc[:, 1:3]
Out[13]: 
         2       4
0  -1.7769 -0.9689
2   0.2767 -0.4720
4  -0.0062 -0.9231
6  -1.2064  2.5656
8  -1.1703 -0.2262
10  0.1320 -0.8273

# this is also equivalent to df1.iat[1, 1]
In [14]: df1.iloc[1, 1]
Out[14]: 0.27666171294975661
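The equivalence with .iat mentioned in the comment above can be checked directly. This sketch uses a frame with the same label layout as df1 but made-up values, so the numbers differ from the session output.

```python
import numpy as np
import pandas as pd

# Illustrative frame: rows labelled 0, 2, ..., 10 and columns 0, 2, 4, 6.
df = pd.DataFrame(np.arange(24.0).reshape(6, 4),
                  index=list(range(0, 12, 2)),
                  columns=list(range(0, 8, 2)))

via_iloc = df.iloc[1, 1]   # row position 1, column position 1
via_iat = df.iat[1, 1]     # same positions through the scalar accessor

print(via_iloc, via_iat)
```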

To get a cross section by integer position (equivalent to df.xs(1) when the row labels are the default 0..n-1 integers):

In [15]: df1.iloc[1]
Out[15]: 
0    0.4137
2    0.2767
4   -0.4720
6   -0.0140
Name: 2, dtype: float64
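Since xs selects by label while .iloc selects by position, the two agree only when label and position refer to the same row. A sketch with an invented frame laid out like df1, where the row at position 1 carries the label 2:

```python
import numpy as np
import pandas as pd

# Illustrative frame: rows labelled 0, 2, ..., 10 and columns 0, 2, 4, 6.
df = pd.DataFrame(np.arange(24.0).reshape(6, 4),
                  index=list(range(0, 12, 2)),
                  columns=list(range(0, 8, 2)))

row = df.iloc[1]       # second row, by position
same_row = df.xs(2)    # the same row, by its label

# With default 0..n-1 row labels, position and label coincide.
df_default = df.reset_index(drop=True)
row_default = df_default.iloc[1]
xs_default = df_default.xs(1)
```

The returned Series is named after the row label, which is why the session output above shows `Name: 2` for position 1.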

Out-of-range slice indexes are handled gracefully, just as in Python/NumPy.

# These are allowed in Python/NumPy.
# In pandas, they only work starting from v0.14.0.
In [16]: x = list('abcdef')

In [17]: x
Out[17]: ['a', 'b', 'c', 'd', 'e', 'f']

In [18]: x[4:10]
Out[18]: ['e', 'f']

In [19]: x[8:10]
Out[19]: []

In [20]: s = pd.Series(x)

In [21]: s
Out[21]: 
0    a
1    b
2    c
3    d
4    e
5    f
dtype: object

In [22]: s.iloc[4:10]
Out[22]: 
4    e
5    f
dtype: object

In [23]: s.iloc[8:10]
Out[23]: Series([], dtype: object)

Note

Prior to v0.14.0, iloc would not accept out-of-bounds indexers for slices, i.e. a start or stop value that exceeds the length of the object being indexed.

Note that an out-of-range slice can result in an empty axis (e.g. an empty DataFrame being returned).

In [24]: dfl = pd.DataFrame(np.random.randn(5,2), columns=list('AB'))

In [25]: dfl
Out[25]: 
        A       B
0 -1.1877  1.1301
1 -1.4367 -1.4137
2  1.6079  1.0242
3  0.5696  0.8759
4 -2.2114  0.9745

In [26]: dfl.iloc[:, 2:3]
Out[26]: 
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3, 4]

In [27]: dfl.iloc[:, 1:3]
Out[27]: 
        B
0  1.1301
1 -1.4137
2  1.0242
3  0.8759
4  0.9745

In [28]: dfl.iloc[4:6]
Out[28]: 
        A       B
4 -2.2114  0.9745

A single indexer that is out of bounds will raise an IndexError. A list of indexers where any element is out of bounds will also raise an IndexError.

dfl.iloc[[4, 5, 6]]
IndexError: positional indexers are out-of-bounds

dfl.iloc[:, 4]
IndexError: single positional indexer is out-of-bounds
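The contrast between slices and explicit positions can be sketched as follows. The frame holds placeholder zeros, and `safe_take` is a hypothetical helper written for this example, not part of pandas; it shows one possible way to drop out-of-bounds positions before indexing.

```python
import numpy as np
import pandas as pd

dfl = pd.DataFrame(np.zeros((5, 2)), columns=list("AB"))  # placeholder values

# Slices are clipped to the positions that exist.
clipped_rows = dfl.iloc[4:6]
empty_cols = dfl.iloc[:, 2:3]

# A list with any out-of-bounds position raises IndexError.
try:
    dfl.iloc[[4, 5, 6]]
    list_raised = False
except IndexError:
    list_raised = True

# A single out-of-bounds position raises IndexError as well.
try:
    dfl.iloc[:, 4]
    single_raised = False
except IndexError:
    single_raised = True


def safe_take(frame, positions):
    """Hypothetical helper: keep only row positions that exist in frame."""
    n = len(frame)
    valid = [p for p in positions if -n <= p < n]
    return frame.iloc[valid]


kept = safe_take(dfl, [4, 5, 6])
```

Filtering silently discards the invalid positions, so it only suits cases where missing rows are acceptable; otherwise letting the IndexError surface is usually the safer choice.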