1.1.2. patsy.bs¶
-
patsy.
bs
(x, df=None, knots=None, degree=3, include_intercept=False, lower_bound=None, upper_bound=None)¶ Generates a B-spline basis for
x
, allowing non-linear fits. The usual usage is something like:y ~ 1 + bs(x, 4)
to fit
y
as a smooth function ofx
, with 4 degrees of freedom given to the smooth.Parameters: - df – The number of degrees of freedom to use for this spline. The
return value will have this many columns. You must specify at least one
of
df
andknots
. - knots – The interior knots to use for the spline. If unspecified, then
equally spaced quantiles of the input data are used. You must specify at
least one of
df
andknots
. - degree – The degree of the spline to use.
- include_intercept – If
True
, then the resulting spline basis will span the intercept term (i.e., the constant function). IfFalse
(the default) then this will not be the case, which is useful for avoiding overspecification in models that include multiple spline terms and/or an intercept term. - lower_bound – The lower exterior knot location.
- upper_bound – The upper exterior knot location.
A spline with
degree=0
is piecewise constant with breakpoints at each knot, and the default knot positions are quantiles of the input. So if you find yourself in the situation of wanting to quantize a continuous variable intonum_bins
equal-sized bins with a constant effect across each bin, you can usebs(x, num_bins - 1, degree=0)
. (The- 1
is because one degree of freedom will be taken by the intercept; alternatively, you could leave the intercept term out of your model and usebs(x, num_bins, degree=0, include_intercept=True)
.A spline with
degree=1
is piecewise linear with breakpoints at each knot.The default is
degree=3
, which gives a cubic b-spline.This is a stateful transform (for details see stateful-transforms). If
knots
,lower_bound
, orupper_bound
are not specified, they will be calculated from the data and then the chosen values will be remembered and re-used for prediction from the fitted model.Using this function requires scipy be installed.
Note
This function is very similar to the R function of the same name. In cases where both return output at all (e.g., R’s
bs
will raise an error ifdegree=0
, while patsy’s will not), they should produce identical output given identical input and parameter settings.Warning
I’m not sure on what the proper handling of points outside the lower/upper bounds is, so for now attempting to evaluate a spline basis at such points produces an error. Patches gratefully accepted.
New in version 0.2.0.
- df – The number of degrees of freedom to use for this spline. The
return value will have this many columns. You must specify at least one
of