5.5 UNION

UNION ALL can be performed using concat().

In [1]: df1 = pd.DataFrame({'city': ['Chicago', 'San Francisco', 'New York City'],
   ...:                     'rank': range(1, 4)})
   ...: 

In [2]: df2 = pd.DataFrame({'city': ['Chicago', 'Boston', 'Los Angeles'],
   ...:                     'rank': [1, 4, 5]})
   ...: 

In [3]: df1
Out[3]: 
            city  rank
0        Chicago     1
1  San Francisco     2
2  New York City     3

In [4]: df2
Out[4]: 
          city  rank
0      Chicago     1
1       Boston     4
2  Los Angeles     5

SELECT city, rank
FROM df1
UNION ALL
SELECT city, rank
FROM df2;
/*
         city  rank
      Chicago     1
San Francisco     2
New York City     3
      Chicago     1
       Boston     4
  Los Angeles     5
*/

In [5]: pd.concat([df1, df2])
Out[5]: 
            city  rank
0        Chicago     1
1  San Francisco     2
2  New York City     3
0        Chicago     1
1         Boston     4
2    Los Angeles     5

SQL’s UNION is similar to UNION ALL, however UNION will remove duplicate rows.

SELECT city, rank
FROM df1
UNION
SELECT city, rank
FROM df2;
-- notice that there is only one Chicago record this time
/*
         city  rank
      Chicago     1
San Francisco     2
New York City     3
       Boston     4
  Los Angeles     5
*/

In pandas, you can use concat() in conjunction with drop_duplicates().

In [6]: pd.concat([df1, df2]).drop_duplicates()
Out[6]: 
            city  rank
0        Chicago     1
1  San Francisco     2
2  New York City     3
1         Boston     4
2    Los Angeles     5