GroupBy

GroupBy objects are returned by groupby calls: pandas.DataFrame.groupby(), pandas.Series.groupby(), etc.

Indexing, iteration

GroupBy.__iter__() Groupby iterator
GroupBy.groups dict {group name -> group labels}
GroupBy.indices dict {group name -> group indices}
GroupBy.get_group(name[, obj]) Constructs NDFrame from group with provided name
Grouper([key, level, freq, axis, sort]) A Grouper allows the user to specify a groupby instruction for a target object
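As a rough illustration of the indexing and iteration members listed above, the sketch below groups a small, made-up DataFrame (the column names `key` and `value` are assumptions chosen for the example, not part of the API) and inspects the resulting object. Exact return types can vary between pandas versions, so treat this as a sketch rather than a specification.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "key": ["a", "b", "a", "b", "a"],
    "value": [1, 2, 3, 4, 5],
})
grouped = df.groupby("key")

# Iterating over a GroupBy yields (group name, sub-frame) pairs.
names = [name for name, frame in grouped]

# groups maps each group name to the index labels of its rows.
labels_a = list(grouped.groups["a"])

# indices maps each group name to the integer positions of its rows.
positions_b = grouped.indices["b"].tolist()

# get_group returns only the rows that belong to the named group.
group_a = grouped.get_group("a")

# A Grouper with key= describes the same grouping as passing the column name.
totals = df.groupby(pd.Grouper(key="key"))["value"].sum()
```

With the default integer index used here, the labels in `groups` and the positions in `indices` coincide; they differ as soon as the frame carries a non-default index.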

Function application

GroupBy.apply(func, *args, **kwargs) Apply function and combine results together in an intelligent way.
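GroupBy.aggregate(func, *args, **kwargs) Aggregate each group using func, returning one result per group
GroupBy.transform(func, *args, **kwargs) Apply func to each group and return a result indexed like the original object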
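The three function-application methods differ mainly in the shape of what they return. The following minimal sketch, on made-up data with assumed column names `key` and `value`, contrasts them on a single grouped column.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "key": ["a", "b", "a", "b"],
    "value": [1.0, 2.0, 3.0, 5.0],
})
grouped = df.groupby("key")["value"]

# aggregate reduces each group to a single value (one row per group).
totals = grouped.aggregate("sum")

# transform returns a result aligned with the original rows.
centered = grouped.transform(lambda s: s - s.mean())

# apply runs an arbitrary function per group and combines the results;
# here the function returns a scalar, so the result has one entry per group.
ranges = grouped.apply(lambda s: s.max() - s.min())
```

A reasonable rule of thumb: reach for aggregate when one value per group is wanted, transform when the result must line up with the input, and apply when neither fits.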

Computations / Descriptive Stats

GroupBy.count() Compute count of group, excluding missing values
GroupBy.cumcount([ascending]) Number each item in each group from 0 to the length of that group - 1.
GroupBy.first() Compute first of group values
GroupBy.head([n]) Returns first n rows of each group.
GroupBy.last() Compute last of group values
GroupBy.max() Compute max of group values
GroupBy.mean(*args, **kwargs) Compute mean of groups, excluding missing values
GroupBy.median() Compute median of groups, excluding missing values
GroupBy.min() Compute min of group values
GroupBy.nth(n[, dropna]) Take the nth row from each group if n is an int, or a subset of rows if n is a list of ints.
GroupBy.ohlc() Compute open, high, low and close values of each group, excluding missing values
GroupBy.prod() Compute prod of group values
GroupBy.size() Compute group sizes
GroupBy.sem([ddof]) Compute standard error of the mean of groups, excluding missing values
GroupBy.std([ddof]) Compute standard deviation of groups, excluding missing values
GroupBy.sum() Compute sum of group values
GroupBy.var([ddof]) Compute variance of groups, excluding missing values
GroupBy.tail([n]) Returns last n rows of each group
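Several of the entries above differ in how they treat missing values and in whether they reduce each group or keep the original rows. The sketch below, on made-up data with assumed column names, shows a few of them side by side; it is an illustration, not an exhaustive tour.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value in group "a".
df = pd.DataFrame({
    "key": ["a", "a", "b", "b", "b"],
    "value": [1.0, np.nan, 2.0, 4.0, 6.0],
})
grouped = df.groupby("key")["value"]

# count excludes missing values; size counts every row in the group.
counts = grouped.count()
sizes = grouped.size()

# mean and std skip missing values; std uses ddof=1 by default.
means = grouped.mean()
spread = grouped.std()

# cumcount numbers the rows within each group, starting at 0.
order = grouped.cumcount()

# head(n) keeps the first n rows of each group and preserves the original index.
top = grouped.head(1)
```

Group "a" illustrates the difference between count and size: it has two rows but only one non-missing value.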

The following methods are available in both SeriesGroupBy and DataFrameGroupBy objects, but may differ slightly: the DataFrameGroupBy version usually permits an axis argument, and often an argument that restricts application to columns of a specific data type.

DataFrameGroupBy.agg(arg, *args, **kwargs) Aggregate using input function or dict of {column -> function}
DataFrameGroupBy.all([axis, bool_only, ...]) Return whether all elements are True over requested axis
DataFrameGroupBy.any([axis, bool_only, ...]) Return whether any element is True over requested axis
DataFrameGroupBy.bfill([limit]) Backward fill the values
DataFrameGroupBy.corr([method, min_periods]) Compute pairwise correlation of columns, excluding NA/null values
DataFrameGroupBy.count() Compute count of group, excluding missing values
DataFrameGroupBy.cov([min_periods]) Compute pairwise covariance of columns, excluding NA/null values
DataFrameGroupBy.cummax([axis, skipna]) Return cumulative max over requested axis.
DataFrameGroupBy.cummin([axis, skipna]) Return cumulative minimum over requested axis.
DataFrameGroupBy.cumprod([axis]) Cumulative product for each group
DataFrameGroupBy.cumsum([axis]) Cumulative sum for each group
DataFrameGroupBy.describe([percentiles, ...]) Generate various summary statistics, excluding NaN values.
DataFrameGroupBy.diff([periods, axis]) 1st discrete difference of object
DataFrameGroupBy.ffill([limit]) Forward fill the values
DataFrameGroupBy.fillna([value, method, ...]) Fill NA/NaN values using the specified method
DataFrameGroupBy.hist(data[, column, by, ...]) Draw histogram of the DataFrame’s series using matplotlib / pylab.
DataFrameGroupBy.idxmax([axis, skipna]) Return index of first occurrence of maximum over requested axis.
DataFrameGroupBy.idxmin([axis, skipna]) Return index of first occurrence of minimum over requested axis.
DataFrameGroupBy.mad([axis, skipna, level]) Return the mean absolute deviation of the values for the requested axis
DataFrameGroupBy.pct_change([periods, ...]) Percent change over given number of periods.
DataFrameGroupBy.plot Class implementing the .plot attribute for groupby objects
DataFrameGroupBy.quantile([q, axis, ...]) Return values at the given quantile over requested axis, a la numpy.percentile.
DataFrameGroupBy.rank([axis, method, ...]) Compute numerical data ranks (1 through n) along axis.
DataFrameGroupBy.resample(rule, *args, **kwargs) Provide resampling when using a TimeGrouper
DataFrameGroupBy.shift([periods, freq, axis]) Shift each group by periods observations
DataFrameGroupBy.size() Compute group sizes
DataFrameGroupBy.skew([axis, skipna, level, ...]) Return unbiased skew over requested axis
DataFrameGroupBy.take(indices[, axis, ...]) Analogous to ndarray.take
DataFrameGroupBy.tshift([periods, freq, axis]) Shift the time index, using the index’s frequency if available.
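Many of the methods in this list operate within each group but return a result with the same index as the input. The sketch below shows that pattern for a handful of them on made-up data (the column names `key`, `x` and `y` are assumptions for the example). It deliberately avoids the axis arguments and the methods whose availability varies across pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the rows of group "a" are not contiguous.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b", "a"],
    "x": [1, 2, 10, 20, 4],
    "y": [1.0, np.nan, 5.0, np.nan, np.nan],
})
grouped = df.groupby("key")

# cumsum accumulates within each group, in original row order.
running = grouped.cumsum()

# shift moves values down by one row inside each group.
shifted = grouped.shift(1)

# diff is the difference from the previous row of the same group.
deltas = grouped.diff()

# ffill propagates the last valid value forward, never across groups.
filled = grouped.ffill()
```

Row 4 belongs to group "a", so its cumulative sum, shift and fill are computed from rows 0 and 1 rather than from the neighbouring rows of group "b".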

The following methods are available only for SeriesGroupBy objects.

SeriesGroupBy.nlargest(*args, **kwargs) Return the largest n elements.
SeriesGroupBy.nsmallest(*args, **kwargs) Return the smallest n elements.
SeriesGroupBy.nunique([dropna]) Returns number of unique elements in the group
SeriesGroupBy.unique() Return array of unique values in the object.
SeriesGroupBy.value_counts([normalize, ...]) Return the counts of unique values within each group
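The SeriesGroupBy-only methods are reached by selecting a single column from a grouped DataFrame, or by grouping a Series directly. The following is a small sketch on made-up data with assumed column names; the exact index layout of the results may differ slightly between pandas versions.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "key": ["a", "a", "a", "b", "b"],
    "value": [3, 1, 3, 7, 5],
})
grouped = df.groupby("key")["value"]  # a SeriesGroupBy

# nunique counts the distinct values in each group.
distinct = grouped.nunique()

# unique returns the distinct values of each group as an array.
uniques = grouped.unique()

# nlargest(n) keeps the n largest values per group,
# indexed by (group name, original row label).
largest = grouped.nlargest(1)

# value_counts tallies each value per group, indexed by (group name, value).
freq = grouped.value_counts()
```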

The following methods are available only for DataFrameGroupBy objects.

DataFrameGroupBy.corrwith(other[, axis, drop]) Compute pairwise correlation between rows or columns of two DataFrame objects.
DataFrameGroupBy.boxplot(grouped[, ...]) Make box plots from DataFrameGroupBy data.