1.4 Frequently Used Options

The following is a walkthrough of the more frequently used display options.

display.max_rows and display.max_columns sets the maximum number of rows and columns displayed when a frame is pretty-printed. Truncated lines are replaced by an ellipsis.

In [1]: df = pd.DataFrame(np.random.randn(7,2))

In [2]: pd.set_option('max_rows', 7)

In [3]: df
Out[3]: 
          0         1
0  0.469112 -0.282863
1 -1.509059 -1.135632
2  1.212112 -0.173215
3  0.119209 -1.044236
4 -0.861849 -2.104569
5 -0.494929  1.071804
6  0.721555 -0.706771

In [4]: pd.set_option('max_rows', 5)

In [5]: df
Out[5]: 
           0         1
0   0.469112 -0.282863
1  -1.509059 -1.135632
..       ...       ...
5  -0.494929  1.071804
6   0.721555 -0.706771

[7 rows x 2 columns]

In [6]: pd.reset_option('max_rows')

display.expand_frame_repr allows for the the representation of dataframes to stretch across pages, wrapped over the full column vs row-wise.

In [7]: df = pd.DataFrame(np.random.randn(5,10))

In [8]: pd.set_option('expand_frame_repr', True)

In [9]: df
Out[9]: 
          0         1         2         3         4         5         6  \
0 -1.039575  0.271860 -0.424972  0.567020  0.276232 -1.087401 -0.673690   
1  0.404705  0.577046 -1.715002 -1.039268 -0.370647 -1.157892 -1.344312   
2  1.643563 -1.469388  0.357021 -0.674600 -1.776904 -0.968914 -1.294524   
3 -0.013960 -0.362543 -0.006154 -0.923061  0.895717  0.805244 -1.206412   
4 -1.170299 -0.226169  0.410835  0.813850  0.132003 -0.827317 -0.076467   

          7         8         9  
0  0.113648 -1.478427  0.524988  
1  0.844885  1.075770 -0.109050  
2  0.413738  0.276662 -0.472035  
3  2.565646  1.431256  1.340309  
4 -1.187678  1.130127 -1.436737  

In [10]: pd.set_option('expand_frame_repr', False)

In [11]: df
Out[11]: 
          0         1         2         3         4         5         6         7         8         9
0 -1.039575  0.271860 -0.424972  0.567020  0.276232 -1.087401 -0.673690  0.113648 -1.478427  0.524988
1  0.404705  0.577046 -1.715002 -1.039268 -0.370647 -1.157892 -1.344312  0.844885  1.075770 -0.109050
2  1.643563 -1.469388  0.357021 -0.674600 -1.776904 -0.968914 -1.294524  0.413738  0.276662 -0.472035
3 -0.013960 -0.362543 -0.006154 -0.923061  0.895717  0.805244 -1.206412  2.565646  1.431256  1.340309
4 -1.170299 -0.226169  0.410835  0.813850  0.132003 -0.827317 -0.076467 -1.187678  1.130127 -1.436737

In [12]: pd.reset_option('expand_frame_repr')

display.large_repr lets you select whether to display dataframes that exceed max_columns or max_rows as a truncated frame, or as a summary.

In [13]: df = pd.DataFrame(np.random.randn(10,10))

In [14]: pd.set_option('max_rows', 5)

In [15]: pd.set_option('large_repr', 'truncate')

In [16]: df
Out[16]: 
           0         1         2         3         4         5         6  \
0  -1.413681  1.607920  1.024180  0.569605  0.875906 -2.211372  0.974466   
1   0.545952 -1.219217 -1.226825  0.769804 -1.281247 -0.727707 -0.121306   
..       ...       ...       ...       ...       ...       ...       ...   
8  -2.484478 -0.281461  0.030711  0.109121  1.126203 -0.977349  1.474071   
9  -1.071357  0.441153  2.353925  0.583787  0.221471 -0.744471  0.758527   

           7         8         9  
0  -2.006747 -0.410001 -0.078638  
1  -0.097883  0.695775  0.341734  
..       ...       ...       ...  
8  -0.064034 -1.282782  0.781836  
9   1.729689 -0.964980 -0.845696  

[10 rows x 10 columns]

In [17]: pd.set_option('large_repr', 'info')

In [18]: df
Out[18]: 
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 10 columns):
0    10 non-null float64
1    10 non-null float64
2    10 non-null float64
3    10 non-null float64
4    10 non-null float64
5    10 non-null float64
6    10 non-null float64
7    10 non-null float64
8    10 non-null float64
9    10 non-null float64
dtypes: float64(10)
memory usage: 872.0 bytes

In [19]: pd.reset_option('large_repr')

In [20]: pd.reset_option('max_rows')

display.max_colwidth sets the maximum width of columns. Cells of this length or longer will be truncated with an ellipsis.

In [21]: df = pd.DataFrame(np.array([['foo', 'bar', 'bim', 'uncomfortably long string'],
   ....:                             ['horse', 'cow', 'banana', 'apple']]))
   ....: 

In [22]: pd.set_option('max_colwidth',40)

In [23]: df
Out[23]: 
       0    1       2                          3
0    foo  bar     bim  uncomfortably long string
1  horse  cow  banana                      apple

In [24]: pd.set_option('max_colwidth', 6)

In [25]: df
Out[25]: 
       0    1      2      3
0    foo  bar    bim  un...
1  horse  cow  ba...  apple

In [26]: pd.reset_option('max_colwidth')

display.max_info_columns sets a threshold for when by-column info will be given.

In [27]: df = pd.DataFrame(np.random.randn(10,10))

In [28]: pd.set_option('max_info_columns', 11)

In [29]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 10 columns):
0    10 non-null float64
1    10 non-null float64
2    10 non-null float64
3    10 non-null float64
4    10 non-null float64
5    10 non-null float64
6    10 non-null float64
7    10 non-null float64
8    10 non-null float64
9    10 non-null float64
dtypes: float64(10)
memory usage: 872.0 bytes

In [30]: pd.set_option('max_info_columns', 5)

In [31]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Columns: 10 entries, 0 to 9
dtypes: float64(10)
memory usage: 872.0 bytes

In [32]: pd.reset_option('max_info_columns')

display.max_info_rows: df.info() will usually show null-counts for each column. For large frames this can be quite slow. max_info_rows and max_info_cols limit this null check only to frames with smaller dimensions then specified. Note that you can specify the option df.info(null_counts=True) to override on showing a particular frame.

In [33]: df  =pd.DataFrame(np.random.choice([0,1,np.nan], size=(10,10)))

In [34]: df
Out[34]: 
     0    1    2    3    4    5    6    7    8    9
0  0.0  1.0  1.0  0.0  1.0  1.0  0.0  NaN  1.0  NaN
1  1.0  NaN  0.0  0.0  1.0  1.0  NaN  1.0  0.0  1.0
2  NaN  NaN  NaN  1.0  1.0  0.0  NaN  0.0  1.0  NaN
3  0.0  1.0  1.0  NaN  0.0  NaN  1.0  NaN  NaN  0.0
4  0.0  1.0  0.0  0.0  1.0  0.0  0.0  NaN  0.0  0.0
5  0.0  NaN  1.0  NaN  NaN  NaN  NaN  0.0  1.0  NaN
6  0.0  1.0  0.0  0.0  NaN  1.0  NaN  NaN  0.0  NaN
7  0.0  NaN  1.0  1.0  NaN  1.0  1.0  1.0  1.0  NaN
8  0.0  0.0  NaN  0.0  NaN  1.0  0.0  0.0  NaN  NaN
9  NaN  NaN  0.0  NaN  NaN  NaN  0.0  1.0  1.0  NaN

In [35]: pd.set_option('max_info_rows', 11)

In [36]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 10 columns):
0    8 non-null float64
1    5 non-null float64
2    8 non-null float64
3    7 non-null float64
4    5 non-null float64
5    7 non-null float64
6    6 non-null float64
7    6 non-null float64
8    8 non-null float64
9    3 non-null float64
dtypes: float64(10)
memory usage: 872.0 bytes

In [37]: pd.set_option('max_info_rows', 5)

In [38]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 10 columns):
0    float64
1    float64
2    float64
3    float64
4    float64
5    float64
6    float64
7    float64
8    float64
9    float64
dtypes: float64(10)
memory usage: 872.0 bytes

In [39]: pd.reset_option('max_info_rows')

display.precision sets the output display precision in terms of decimal places. This is only a suggestion.

In [40]: df = pd.DataFrame(np.random.randn(5,5))

In [41]: pd.set_option('precision',7)

In [42]: df
Out[42]: 
           0          1          2          3          4
0 -2.0490276  2.8466122 -1.2080493 -0.4503923  2.4239054
1  0.1211080  0.2669165  0.8438259 -0.2225400  2.0219807
2 -0.7167894 -2.2244851 -1.0611370 -0.2328247  0.4307933
3 -0.6654779  1.8298075 -1.4065093  1.0782481  0.3227741
4  0.2003243  0.8900241  0.1948132  0.3516326  0.4488815

In [43]: pd.set_option('precision',4)

In [44]: df
Out[44]: 
        0       1       2       3       4
0 -2.0490  2.8466 -1.2080 -0.4504  2.4239
1  0.1211  0.2669  0.8438 -0.2225  2.0220
2 -0.7168 -2.2245 -1.0611 -0.2328  0.4308
3 -0.6655  1.8298 -1.4065  1.0782  0.3228
4  0.2003  0.8900  0.1948  0.3516  0.4489

display.chop_threshold sets at what level pandas rounds to zero when it displays a Series of DataFrame. Note, this does not effect the precision at which the number is stored.

In [45]: df = pd.DataFrame(np.random.randn(6,6))

In [46]: pd.set_option('chop_threshold', 0)

In [47]: df
Out[47]: 
        0       1       2       3       4       5
0 -0.1979  0.9657 -1.5229 -0.1166  0.2956 -1.0477
1  1.6406  1.9058  2.7721  0.0888 -1.1442 -0.6334
2  0.9254 -0.0064 -0.8204 -0.6009 -1.0393  0.8248
3 -0.8241 -0.3377 -0.9278 -0.8401  0.2485 -0.1093
4  0.4320 -0.4607  0.3365 -3.2076 -1.5359  0.4098
5 -0.6731 -0.7411 -0.1109 -2.6729  0.8645  0.0609

In [48]: pd.set_option('chop_threshold', .5)

In [49]: df
Out[49]: 
        0       1       2       3       4       5
0  0.0000  0.9657 -1.5229  0.0000  0.0000 -1.0477
1  1.6406  1.9058  2.7721  0.0000 -1.1442 -0.6334
2  0.9254  0.0000 -0.8204 -0.6009 -1.0393  0.8248
3 -0.8241  0.0000 -0.9278 -0.8401  0.0000  0.0000
4  0.0000  0.0000  0.0000 -3.2076 -1.5359  0.0000
5 -0.6731 -0.7411  0.0000 -2.6729  0.8645  0.0000

In [50]: pd.reset_option('chop_threshold')

display.colheader_justify controls the justification of the headers. Options are ‘right’, and ‘left’.

In [51]: df = pd.DataFrame(np.array([np.random.randn(6), np.random.randint(1,9,6)*.1, np.zeros(6)]).T,
   ....:                   columns=['A', 'B', 'C'], dtype='float')
   ....: 

In [52]: pd.set_option('colheader_justify', 'right')

In [53]: df
Out[53]: 
        A    B    C
0  0.9331  0.3  0.0
1  0.2888  0.2  0.0
2  1.3250  0.2  0.0
3  0.5892  0.7  0.0
4  0.5314  0.1  0.0
5 -1.1987  0.7  0.0

In [54]: pd.set_option('colheader_justify', 'left')

In [55]: df
Out[55]: 
   A       B    C  
0  0.9331  0.3  0.0
1  0.2888  0.2  0.0
2  1.3250  0.2  0.0
3  0.5892  0.7  0.0
4  0.5314  0.1  0.0
5 -1.1987  0.7  0.0

In [56]: pd.reset_option('colheader_justify')