1. patsy

patsy is a Python package for describing statistical models and building design matrices. It is closely inspired by the ‘formula’ mini-language used in R and S.

1.1. Functions

balanced([factor_name, repeat]) Create simple balanced factorial designs for testing.
bs(x[, df, knots, degree, ...]) Generates a B-spline basis for x, allowing non-linear fits.
build_design_matrices(design_infos, data[, ...]) Construct several design matrices from DesignInfo objects.
cc(x[, df, knots, lower_bound, upper_bound, ...]) Generates a cyclic cubic spline basis for x (with the option of absorbing centering or more general parameter constraints), allowing non-linear fits.
center(x) A stateful transform that centers input data, i.e., subtracts the mean.
cr(x[, df, knots, lower_bound, upper_bound, ...]) Generates a natural cubic spline basis for x (with the option of absorbing centering or more general parameter constraints), allowing non-linear fits.
demo_data(*names[, nlevels, min_rows]) Create simple categorical/numerical demo data.
design_matrix_builders(termlists, ...[, ...]) Construct several DesignInfo objects from termlists.
dmatrices(formula_like[, data, eval_env, ...]) Construct two design matrices given a formula_like and data.
dmatrix(formula_like[, data, eval_env, ...]) Construct a single design matrix given a formula_like and data.
incr_dbuilder(formula_like, data_iter_maker) Construct a design matrix builder incrementally from a large data set.
incr_dbuilders(formula_like, data_iter_maker) Construct two design matrix builders incrementally from a large data set.
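scale(*args, **kwargs) An alias for standardize(x, center=True, rescale=True, ddof=0).
standardize(x[, center, rescale, ddof]) A stateful transform that standardizes input data, i.e., centers it (subtracts the mean) and rescales it (divides by the standard deviation).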
stateful_transform(class_) Create a stateful transform callable object from a class that fulfills the stateful transform protocol.
te(s1, .., sn[, constraints]) Generates a smooth of several covariates as a tensor product of the bases of the marginal univariate smooths s1, .., sn.
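
As a minimal sketch of how several of these functions fit together, the example below builds design matrices from a formula and then rebuilds a design for new data. The data values and variable names (y, x, g) are made up for illustration, and the call to build_design_matrices assumes a patsy version in which it accepts DesignInfo objects, as the signature above suggests.

```python
import numpy as np
from patsy import build_design_matrices, dmatrices, dmatrix

# Made-up data, for illustration only.
data = {
    "y": [1.0, 2.0, 3.0, 4.0],
    "x": [0.0, 1.0, 2.0, 3.0],
    "g": ["a", "b", "a", "b"],
}

# Two matrices: the response (left of "~") and the predictors (right of "~").
y, X = dmatrices("y ~ x + g", data)
column_names = X.design_info.column_names

# center() is stateful: it remembers the mean of x (1.5 for this data) ...
Xc = dmatrix("center(x)", data)

# ... and reuses that mean when the same design is rebuilt for new data.
(Xc_new,) = build_design_matrices([Xc.design_info], {"x": [1.5, 3.5]})
centered_new = np.asarray(Xc_new)[:, 1]  # column 0 is the intercept
```

Because the transform state is stored with the design, the new values are centered with the mean of the original data rather than their own mean, which is usually what prediction on held-out data requires.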

1.2. Classes

ContrastMatrix(matrix, column_suffixes) A simple container for a matrix used for coding categorical factors.
DesignInfo(column_names[, factor_infos, ...]) A DesignInfo object holds metadata about a design matrix.
DesignMatrix A simple numpy array subclass that carries design matrix metadata.
Diff Backward difference coding.
EvalEnvironment(namespaces[, flags]) Represents a Python execution environment.
EvalFactor(code[, origin]) A factor class that executes arbitrary Python code and supports stateful transforms.
FactorInfo(factor, type, state[, ...]) A FactorInfo object is a simple class that provides some metadata about the role of a factor within a model.
Helmert Helmert contrasts.
LinearConstraint(variable_names, coefs[, ...]) A linear constraint in matrix form.
LookupFactor(varname[, force_categorical, ...]) A simple factor class that looks up a named entry in the given data.
ModelDesc(lhs_termlist, rhs_termlist) A simple container representing the termlists parsed from a formula.
NAAction([on_NA, NA_types]) An NAAction object defines a strategy for handling missing data.
Origin(code, start, end) This represents the origin of some object in some string.
Poly([scores]) Orthogonal polynomial contrast coding.
SubtermInfo(factors, contrast_matrices, ...) A SubtermInfo object is a simple metadata container describing a single primitive interaction and how it is coded in our design matrix.
Sum([omit]) Deviation coding (also known as sum-to-zero coding).
Term(factors) The interaction between a collection of factor objects.
Treatment([reference]) Treatment coding (also known as dummy coding).
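
The following sketch illustrates two of the classes above under simple assumptions: the contrast classes are called directly to show the coding matrices they produce, and ModelDesc is used to inspect the terms parsed from a formula. The level names and the formula are hypothetical; in ordinary use these objects are normally created by patsy itself while it processes a formula.

```python
import numpy as np
from patsy import ModelDesc, Sum, Treatment

levels = ["a", "b", "c"]  # hypothetical category levels

# Treatment (dummy) coding: the first level is the reference by default.
treatment = Treatment().code_without_intercept(levels)
treatment_matrix = np.asarray(treatment.matrix)
treatment_suffixes = list(treatment.column_suffixes)

# Sum (deviation) coding: the last level is omitted by default.
deviation = Sum().code_without_intercept(levels)
deviation_matrix = np.asarray(deviation.matrix)

# ModelDesc holds the termlists parsed from each side of a formula.
desc = ModelDesc.from_formula("y ~ a + a:b")
lhs_names = [term.name() for term in desc.lhs_termlist]
rhs_names = [term.name() for term in desc.rhs_termlist]
```

Each call to code_without_intercept returns a ContrastMatrix, whose matrix attribute has one row per level and one column per coded contrast.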

1.3. Exceptions

PatsyError(message[, origin]) This is the main error type raised by Patsy functions.
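
A short sketch of catching this exception, assuming a deliberately malformed formula (the trailing "+" has no right-hand operand); the data are made up for illustration.

```python
from patsy import PatsyError, dmatrix

caught = False
message = None
origin = None

try:
    # Malformed on purpose: "+" is missing its second operand.
    dmatrix("x +", {"x": [1, 2, 3]})
except PatsyError as err:
    caught = True
    message = err.message  # human-readable description of the problem
    origin = err.origin    # Origin locating the problem in the formula string
```

The origin attribute, when set, points at the part of the formula that triggered the error, which can help when reporting problems back to users.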