4.7.7. statsmodels.tools.grouputils

Tools for working with groups

This module provides several functions for working with groups, and a Group class that keeps track of the different group representations and offers methods that make groups easier to work with.

Author: Josef Perktold
Author: Nathaniel Smith (recipe for sparse_dummies on the scipy user mailing list)

Created on Tue Nov 29 15:44:53 2011 : sparse_dummies
Created on Wed Nov 30 14:28:24 2011 : combine_indices
changes: add Group class

4.7.7.1. Notes

This reverses the design of the class I used before, where the class held the data and the group was auxiliary. Here the class holds only the group; no data is kept.
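The group-only design can be sketched as follows. This is a minimal illustration, not the module's Group class: the class name GroupOnly and its methods are hypothetical and chosen only to show that the object stores the labels while the data is passed in on each call.

```python
import numpy as np


class GroupOnly:
    """Hypothetical stand-in: stores the group labels, never the data."""

    def __init__(self, group):
        self.group = np.asarray(group)
        # unique labels and the integer code of each observation
        self.uni, self.group_int = np.unique(self.group, return_inverse=True)
        self.n_groups = len(self.uni)

    def group_sums(self, x):
        # the data arrives as an argument and is not kept on the object
        return np.bincount(self.group_int, weights=x, minlength=self.n_groups)

    def group_demean(self, x):
        x = np.asarray(x, dtype=float)
        counts = np.bincount(self.group_int, minlength=self.n_groups)
        means = self.group_sums(x) / counts
        # broadcast each group mean back to its observations
        return x - means[self.group_int]


g = GroupOnly(["a", "b", "a", "b"])
sums = g.group_sums([1.0, 2.0, 3.0, 4.0])
demeaned = g.group_demean([1.0, 2.0, 3.0, 4.0])
```

Because the object holds no data, the same instance can be reused for any array whose length matches the label array.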

sparse_dummies needs checking for corner cases, e.g. what if a category level has zero elements? This can happen with subset selection even if the original groups were defined as arange.
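The empty-level corner case can be reproduced with a small dense stand-in. The helper dense_dummies below is hypothetical and uses plain NumPy rather than a sparse matrix; it only illustrates how an indicator matrix whose width is inferred from the data behaves once subset selection removes some levels.

```python
import numpy as np


def dense_dummies(groups, n_levels=None):
    """Dense 0/1 indicator matrix from integer labels (illustration only)."""
    groups = np.asarray(groups)
    if n_levels is None:
        # inferring the width from the data drops trailing empty levels
        n_levels = int(groups.max()) + 1
    dummies = np.zeros((groups.shape[0], n_levels), dtype=np.int8)
    dummies[np.arange(groups.shape[0]), groups] = 1
    return dummies


labels = np.arange(4)             # original levels 0, 1, 2, 3
subset = labels[[0, 2, 2, 0]]     # levels 1 and 3 no longer occur

inferred = dense_dummies(subset)               # width inferred from the data
explicit = dense_dummies(subset, n_levels=4)   # width fixed by the caller
empty_levels = np.flatnonzero(explicit.sum(axis=0) == 0)
```

In this sketch an interior empty level (1) survives as an all-zero column, while a trailing empty level (3) disappears unless the number of levels is passed explicitly. Both outcomes can break code that assumes every column has at least one observation.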

Not all methods and options have been tried out since the refactoring.

A more efficient loop is needed when the groups are sorted; see GroupSorted.group_iter.
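When the labels are already sorted, each group is a contiguous block, so the loop can run over slices instead of boolean masks. The following is a sketch of that idea under the assumption of a sorted one-dimensional label array; the helper sorted_group_slices is hypothetical and is not taken from GroupSorted.

```python
import numpy as np


def sorted_group_slices(group):
    """Return one slice per contiguous block of equal labels."""
    group = np.asarray(group)
    if group.size == 0:
        return []
    # a new group starts wherever the label differs from its predecessor
    change = np.flatnonzero(group[1:] != group[:-1]) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((starts[1:], [group.size]))
    return [slice(int(s), int(e)) for s, e in zip(starts, ends)]


group = np.array([0, 0, 1, 1, 1, 3])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

slices = sorted_group_slices(group)
sums = [float(x[s].sum()) for s in slices]
```

Finding the boundaries takes one pass over the labels, and each slice is a view into the data, so no per-group mask has to be built.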

4.7.7.1.1. Functions

combine_indices(groups[, prefix, sep, ...]) use np.unique to get integer group indices for product, intersection
dummy_sparse(groups) create a sparse indicator from a group array with integer labels
group_sums(x, group[, use_bincount]) simple bincount version, again
group_sums_dummy(x, group_dummy) sum by groups given group dummy variable
npc_unique(ar[, return_index, return_inverse]) Find the unique elements of an array.
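The two techniques named in the summaries above, np.unique for combined integer group indices and np.bincount for group sums, can be sketched in plain NumPy. The helpers combine_codes and bincount_sums are hypothetical illustrations and do not reproduce the signatures or return values of the functions listed here.

```python
import numpy as np


def combine_codes(a, b):
    """Integer index for each observed (a, b) pair, via np.unique."""
    ua, inv_a = np.unique(a, return_inverse=True)
    ub, inv_b = np.unique(b, return_inverse=True)
    # encode each pair as a single integer, then compress to 0..k-1
    pair_code = inv_a * len(ub) + inv_b
    used, combined = np.unique(pair_code, return_inverse=True)
    labels = [(ua[c // len(ub)].item(), ub[c % len(ub)].item()) for c in used]
    return combined, labels


def bincount_sums(x, group):
    """Sum x within each integer group using np.bincount weights."""
    return np.bincount(group, weights=x)


region = np.array(["a", "a", "b", "b", "a"])
year = np.array([1, 2, 1, 1, 2])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

combined, labels = combine_codes(region, year)
sums = bincount_sums(x, combined)
```

Only pairs that actually occur receive an index, so the combined codes describe the intersection of the two groupings rather than their full product.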

4.7.7.1.2. Classes

Group(group[, name])
GroupSorted(group[, name])
Grouping(index[, names]) index : index-like