1.1.7. patsy.demo_data

patsy.demo_data(*names, nlevels=2, min_rows=5)

Create simple categorical/numerical demo data.

Pass in a set of variable names, and this function will return a simple data set using those variable names.

Names whose first letter falls in the range “a” through “m” are made categorical (with nlevels levels each). Names whose first letter falls in the range “p” through “z” are made numerical.

We attempt to produce a balanced design on the categorical variables, repeating the design as necessary to generate at least min_rows data points. Each categorical variable is returned as a list of strings.
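The row count implied by the paragraph above can be sketched with the standard library alone. This is not patsy's implementation: the helper name sketch_balanced_rows, the "a1"/"a2" level labels, and the row ordering are assumptions made for illustration.

```python
import itertools
import math


def sketch_balanced_rows(factor_names, nlevels=2, min_rows=5):
    """Hypothetical helper: one balanced design, repeated to reach min_rows."""
    # Assumed label pattern "<name><index>", e.g. "a1", "a2".
    levels = [
        [f"{name}{i}" for i in range(1, nlevels + 1)] for name in factor_names
    ]
    # One full balanced design has one row per combination of levels.
    design = list(itertools.product(*levels))
    # Repeat the whole design until there are at least min_rows rows.
    repeat = math.ceil(min_rows / len(design))
    rows = design * repeat
    # Return one list of strings per categorical variable.
    return {name: [row[k] for row in rows] for k, name in enumerate(factor_names)}


data = sketch_balanced_rows(["a", "b"])
print(len(data["a"]))  # 8
print(data["a"])
print(data["b"])
```

With two variables of two levels each, one balanced design has 4 rows, so reaching min_rows=5 takes two repetitions and yields 8 rows rather than exactly 5.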

Numerical data is generated by sampling from a normal distribution. A fixed random seed is used, so that identical calls to demo_data() will produce identical results. Numerical data is returned in a numpy array.
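As a quick check of the return types and the fixed seed described above, the following sketch calls demo_data() twice with the same arguments. It assumes patsy and numpy are installed and that the result can be indexed by variable name; the variable names "a", "b", and "x" are arbitrary choices.

```python
import numpy as np
from patsy import demo_data

# Two identical calls: "a" and "b" are categorical, "x" is numerical.
first = demo_data("a", "b", "x")
second = demo_data("a", "b", "x")

# Categorical variables come back as lists of strings.
print(first["a"])

# Numerical variables come back as numpy arrays.
print(type(first["x"]))

# Because the seed is fixed, both calls yield the same numerical values.
print(np.array_equal(first["x"], second["x"]))
```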

Example: