4 Working with missing data
In this section, we will discuss missing (also referred to as NA) values in pandas.
Note
The choice of using NaN internally to denote missing data was largely
for simplicity and performance reasons. It differs from the MaskedArray
approach of, for example, scikits.timeseries. We are hopeful that
NumPy will soon be able to provide a native NA type solution (similar to R)
performant enough to be used in pandas.
See the cookbook for some advanced strategies.
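As a minimal sketch of what the note above means in practice, the snippet below builds a small Series containing a NaN and inspects it. The variable name s and the sample values are illustrative only, and the snippet assumes a reasonably recent pandas in which Series.isna is available (older releases expose the same check as isnull).

```python
import numpy as np
import pandas as pd

# Illustrative data: three values, the middle one missing.
# NaN is a floating-point value, so the Series is stored as float64
# even though the other entries are whole numbers.
s = pd.Series([1, np.nan, 3])

print(s.dtype)            # dtype of the underlying data
print(s.isna().tolist())  # boolean mask marking the missing entry
print(s.sum())            # reductions skip NaN by default (skipna=True)
```

Because the missing marker is an ordinary float, no separate mask array is carried alongside the data, which is the simplicity and performance trade-off the note refers to.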
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: pd.options.display.max_rows = 8
In [4]: import matplotlib
In [5]: matplotlib.style.use('ggplot')
In [6]: import matplotlib.pyplot as plt