In [1]: tips.head()
Out[1]: 
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

5.2 WHERE

Filtering in SQL is done via a WHERE clause.

SELECT *
FROM tips
WHERE time = 'Dinner'
LIMIT 5;

DataFrames can be filtered in multiple ways; the most intuitive of which is using boolean indexing.

In [2]: tips[tips['time'] == 'Dinner'].head(5)
Out[2]: 
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

The above statement is simply passing a Series of True/False objects to the DataFrame, returning all rows with True.

In [3]: is_dinner = tips['time'] == 'Dinner'

In [4]: is_dinner.value_counts()
Out[4]: 
True     176
False     68
Name: time, dtype: int64

In [5]: tips[is_dinner].head(5)
Out[5]: 
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

Just like SQL’s OR and AND, multiple conditions can be passed to a DataFrame using | (OR) and & (AND).

-- tips of more than $5.00 at Dinner meals
SELECT *
FROM tips
WHERE time = 'Dinner' AND tip > 5.00;
# tips of more than $5.00 at Dinner meals
In [2]: tips[(tips['time'] == 'Dinner') & (tips['tip'] > 5.00)]
Out[2]: 
     total_bill   tip     sex smoker  day    time  size
23        39.42  7.58    Male     No  Sat  Dinner     4
44        30.40  5.60    Male     No  Sun  Dinner     4
47        32.40  6.00    Male     No  Sun  Dinner     4
52        34.81  5.20  Female     No  Sun  Dinner     4
..          ...   ...     ...    ...  ...     ...   ...
211       25.89  5.16    Male    Yes  Sat  Dinner     4
212       48.33  9.00    Male     No  Sat  Dinner     4
214       28.17  6.50  Female    Yes  Sat  Dinner     3
239       29.03  5.92    Male     No  Sat  Dinner     3

[15 rows x 7 columns]
-- tips by parties of at least 5 diners OR bill total was more than $45
SELECT *
FROM tips
WHERE size >= 5 OR total_bill > 45;
# tips by parties of at least 5 diners OR bill total was more than $45
In [3]: tips[(tips['size'] >= 5) | (tips['total_bill'] > 45)]
Out[3]: 
     total_bill   tip     sex smoker   day    time  size
59        48.27  6.73    Male     No   Sat  Dinner     4
125       29.80  4.20  Female     No  Thur   Lunch     6
141       34.30  6.70    Male     No  Thur   Lunch     6
142       41.19  5.00    Male     No  Thur   Lunch     5
..          ...   ...     ...    ...   ...     ...   ...
185       20.69  5.00    Male     No   Sun  Dinner     5
187       30.46  2.00    Male    Yes   Sun  Dinner     5
212       48.33  9.00    Male     No   Sat  Dinner     4
216       28.15  3.00    Male    Yes   Sat  Dinner     5

[13 rows x 7 columns]

NULL checking is done using the notnull() and isnull() methods.

In [4]: frame = pd.DataFrame({'col1': ['A', 'B', np.NaN, 'C', 'D'],
   ...:                       'col2': ['F', np.NaN, 'G', 'H', 'I']})
   ...: 

In [5]: frame
Out[5]: 
  col1 col2
0    A    F
1    B  NaN
2  NaN    G
3    C    H
4    D    I

Assume we have a table of the same structure as our DataFrame above. We can see only the records where col2 IS NULL with the following query:

SELECT *
FROM frame
WHERE col2 IS NULL;
In [6]: frame[frame['col2'].isnull()]
Out[6]: 
  col1 col2
1    B  NaN

Getting items where col1 IS NOT NULL can be done with notnull().

SELECT *
FROM frame
WHERE col1 IS NOT NULL;
In [7]: frame[frame['col1'].notnull()]
Out[7]: 
  col1 col2
0    A    F
1    B  NaN
3    C    H
4    D    I