Generalized Additive Models
MLTK currently supports building low-dimensional additive components in GAMs: one-dimensional shape functions (GAM), pairwise feature interactions (GA2M), and sparse partially linear additive models (SPLAM).
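All of the learners below consume an MLTK `Instances` object (`trainSet` in the snippets). A minimal loading sketch, assuming MLTK's `InstancesReader`; the file names are placeholders:

```java
// Minimal sketch: load attribute descriptions and data files.
// Assumes mltk.core.io.InstancesReader; file names are placeholders.
Instances trainSet = InstancesReader.read("data.attr", "train.txt");
Instances testSet = InstancesReader.read("data.attr", "test.txt");
```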
GAM works on both classification and regression problems. The following code trains a GAM object using the standard method. The base learner is a tree ensemble with 100 trees, each with at most 3 leaves.
```java
GAMLearner learner = new GAMLearner();
// Base learner "tr:3:100": a tree ensemble with 100 trees,
// each with at most 3 leaves
learner.setBaseLearner("tr:3:100");
learner.setMaxNumIters(100);
learner.setLearningRate(0.01);
learner.setTask(Task.REGRESSION);
learner.setMetric(new RMSE());

GAM gam = learner.build(trainSet);
```
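Once built, the model can be applied to new data. A minimal sketch, assuming `GAM` implements MLTK's `Regressor` interface (so it exposes `regress(Instance)`) and that `testSet` was loaded as above:

```java
// Minimal sketch: compute test RMSE with the trained GAM.
// Assumes GAM exposes regress(Instance) via MLTK's Regressor interface.
double se = 0;
for (Instance instance : testSet) {
    double pred = gam.regress(instance);
    double err = pred - instance.getTarget();
    se += err * err;
}
System.out.println("Test RMSE: " + Math.sqrt(se / testSet.size()));
```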
GA2M works on both classification and regression problems. The following code takes a GAM object and learns pairwise feature interactions on (f1, f2), (f2, f3), and (f1, f3).
```java
// Feature interactions are specified as pairs of attribute indices
List<IntPair> terms = new ArrayList<>();
terms.add(new IntPair(0, 1)); // (f1, f2)
terms.add(new IntPair(1, 2)); // (f2, f3)
terms.add(new IntPair(0, 2)); // (f1, f3)

GA2MLearner learner = new GA2MLearner();
learner.setGAM(gam); // start from the GAM trained above
learner.setMaxNumIters(100);
learner.setTask(Task.REGRESSION);
learner.setMetric(new RMSE());
learner.setPairs(terms);
learner.setLearningRate(0.01);

GAM ga2m = learner.build(trainSet);
```
Note that MLTK currently supports feature interactions only on binned and nominal attributes. Numeric attributes should be discretized into binned attributes; it is recommended to discretize all features before building the GAM.
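A hypothetical sketch of this preprocessing step, assuming MLTK's `Discretizer` utility (`mltk.core.processor.Discretizer`); the exact method signatures may differ across MLTK versions:

```java
// Hypothetical sketch: bin every numeric attribute in place
// (here into at most 256 bins) before training GAM/GA2M.
for (int i = 0; i < trainSet.getAttributes().size(); i++) {
    if (trainSet.getAttributes().get(i).getType() == Attribute.Type.NUMERIC) {
        Discretizer.discretize(trainSet, i, 256); // assumed signature
    }
}
```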
Sparse partially linear additive models (SPLAMs) automatically discover which features should be included in the model and, for the included features, which enter nonlinearly and which stay linear. Currently only the cubic spline basis is supported for SPLAM. SPLAM is a special form of GAM that works on both classification and regression problems. The following code trains a GAM object using the standard method.
```java
SPLAMLearner learner = new SPLAMLearner();
learner.setNumKnots(10);     // cubic spline basis with 10 knots
learner.setMaxNumIters(100);
learner.setAlpha(0.6);       // balances linear vs. nonlinear terms
learner.setLambda(0.1);      // regularization strength
learner.setTask(Task.REGRESSION);

GAM gam = learner.build(trainSet);
```
The code above trains a SPLAM model using a cubic spline basis with 10 knots. `lambda` is the regularization parameter, and `alpha` (which should be in (0, 1]) controls the regularization on linear and nonlinear terms. When `alpha` is set to 1, we are essentially training a sparse additive model (SPAM), and therefore no linear terms will be included in the model. When `alpha` is close to 0, SPLAM reduces to the lasso and we are essentially training a linear model.
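To see the effect of `alpha` in practice, one option is a small validation sweep. A hypothetical sketch (the `evaluate` helper and `validSet` are stand-ins for your own validation code, not part of MLTK):

```java
// Hypothetical tuning loop: sweep alpha over (0, 1] and keep the
// model with the best validation score. evaluate(...) and validSet
// are stand-ins for your own validation code, not part of MLTK.
GAM bestModel = null;
double bestRmse = Double.POSITIVE_INFINITY;
for (double alpha : new double[] {0.2, 0.4, 0.6, 0.8, 1.0}) {
    SPLAMLearner learner = new SPLAMLearner();
    learner.setNumKnots(10);
    learner.setMaxNumIters(100);
    learner.setAlpha(alpha);
    learner.setLambda(0.1);
    learner.setTask(Task.REGRESSION);
    GAM model = learner.build(trainSet);
    double rmse = evaluate(model, validSet); // stand-in helper
    if (rmse < bestRmse) {
        bestRmse = rmse;
        bestModel = model;
    }
}
```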