Generalized Additive Models · yinlou/mltk Wiki · GitHub

Generalized Additive Models

Yin Lou edited this page May 18, 2017 · 3 revisions

MLTK currently supports building low-dimensional additive components in GAMs:

GAM

GAM works on both classification and regression problems. The following code trains a GAM object using the standard method. The base learner is a tree ensemble of 100 trees, each with at most 3 leaves.

GAMLearner learner = new GAMLearner();
learner.setBaseLearner("tr:3:100");
learner.setMaxNumIters(100);
learner.setLearningRate(0.01);
learner.setTask(Task.REGRESSION);
learner.setMetric(new RMSE());

GAM gam = learner.build(trainSet);
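To make the structure being learned concrete, here is a minimal, self-contained sketch of what an additive model is (illustrative only; the class and method names are assumptions, not MLTK internals): the prediction is an intercept plus a sum of one-dimensional shape functions, one per feature.

```java
import java.util.List;
import java.util.function.DoubleUnaryOperator;

// Illustrative sketch of the additive structure a GAM learns; not MLTK code.
public class AdditiveModelSketch {
    private final double intercept;
    private final List<DoubleUnaryOperator> shapeFunctions; // one f_i per feature

    public AdditiveModelSketch(double intercept, List<DoubleUnaryOperator> shapeFunctions) {
        this.intercept = intercept;
        this.shapeFunctions = shapeFunctions;
    }

    // F(x) = intercept + f_1(x_1) + ... + f_d(x_d)
    public double predict(double[] x) {
        double sum = intercept;
        for (int i = 0; i < shapeFunctions.size(); i++) {
            sum += shapeFunctions.get(i).applyAsDouble(x[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        AdditiveModelSketch model = new AdditiveModelSketch(
                1.0,
                List.of(x -> 2 * x,   // f_1: a linear shape function
                        x -> x * x)); // f_2: a nonlinear shape function
        System.out.println(model.predict(new double[] {3.0, 2.0})); // 1 + 6 + 4 = 11.0
    }
}
```

Because each f_i depends on a single feature, the learned components can be plotted and inspected individually, which is the main appeal of GAMs.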

GA2M

GA2M works on both classification and regression problems. The following code takes a GAM object and learns pairwise feature interactions on the pairs (f1, f2), (f2, f3), and (f1, f3).

List<IntPair> terms = new ArrayList<>();
terms.add(new IntPair(0, 1));
terms.add(new IntPair(1, 2));
terms.add(new IntPair(0, 2));

GA2MLearner learner = new GA2MLearner();
learner.setGAM(gam);
learner.setMaxNumIters(100);
learner.setTask(Task.REGRESSION);
learner.setMetric(new RMSE());
learner.setPairs(terms);
learner.setLearningRate(0.01);

GAM ga2m = learner.build(trainSet);
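Since interactions are only supported on binned and nominal attributes, a pairwise term can be pictured as a small two-dimensional lookup table indexed by bin indices. The sketch below is illustrative only (the class name is an assumption, not how MLTK stores its terms):

```java
// Illustrative sketch of a pairwise interaction term f_{ij} over two binned
// features; not MLTK internals.
public class PairwiseTermSketch {
    private final double[][] table; // table[a][b] = f_{ij}(bin a of x_i, bin b of x_j)

    public PairwiseTermSketch(double[][] table) {
        this.table = table;
    }

    // Evaluating the term is a constant-time table lookup.
    public double evaluate(int binI, int binJ) {
        return table[binI][binJ];
    }

    public static void main(String[] args) {
        // 2 bins for feature i, 3 bins for feature j
        PairwiseTermSketch term = new PairwiseTermSketch(new double[][] {
                {0.0, 0.5, 1.0},
                {-0.5, 0.0, 0.5}
        });
        System.out.println(term.evaluate(1, 2)); // 0.5
    }
}
```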

Note that MLTK currently only supports feature interactions on binned and nominal attributes. Numeric attributes should be discretized and converted into binned attributes. It is recommended to discretize all features before building a GAM.
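As a sketch of what discretization means here, equal-width binning maps a numeric value to one of a fixed number of bin indices. This is one simple scheme, shown for illustration only; MLTK ships its own discretization tools, and the method below is an assumption, not MLTK's API:

```java
// Equal-width discretization sketch; illustrative, not MLTK's discretizer.
public class BinningSketch {
    // Map v into one of numBins equal-width bins over [min, max],
    // clamping out-of-range values to the first or last bin.
    public static int bin(double v, double min, double max, int numBins) {
        if (v <= min) return 0;
        if (v >= max) return numBins - 1;
        return (int) ((v - min) / (max - min) * numBins);
    }

    public static void main(String[] args) {
        // 4 bins over [0, 8): each bin has width 2
        System.out.println(bin(0.0, 0.0, 8.0, 4)); // 0
        System.out.println(bin(3.0, 0.0, 8.0, 4)); // 1
        System.out.println(bin(7.9, 0.0, 8.0, 4)); // 3
    }
}
```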

SPLAM

Sparse partially linear additive models (SPLAMs) automatically discover which features should be included in the model and, for those included, which should enter as nonlinear features and which should stay linear. Currently only the cubic spline basis is supported for SPLAM. SPLAM is a special form of GAM that works on both classification and regression problems. The following code trains a GAM object using the standard method.

SPLAMLearner learner = new SPLAMLearner();
learner.setNumKnots(10);
learner.setMaxNumIters(100);
learner.setAlpha(0.6);
learner.setLambda(0.1);
learner.setTask(Task.REGRESSION);

GAM gam = learner.build(trainSet);

The code above trains a SPLAM model using a cubic spline basis with 10 knots. Lambda is the regularization parameter, and alpha (which should be in (0, 1]) controls the balance of regularization between linear and nonlinear terms. When alpha is set to 1, we are essentially training a sparse additive model (SPAM) and therefore no linear terms will be included in the model. When alpha is close to 0, SPLAM reduces to the lasso and we are essentially training a linear model.
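To illustrate what a cubic spline basis with knots looks like, the sketch below uses the truncated power representation, one standard construction: 1, x, x^2, x^3, plus one truncated cubic term (x - k_t)_+^3 per knot. This is for intuition only; MLTK's exact basis construction may differ.

```java
// Truncated power cubic spline basis sketch; one standard construction,
// not necessarily MLTK's exact basis.
public class SplineBasisSketch {
    // Returns [1, x, x^2, x^3, (x - k_1)_+^3, ..., (x - k_K)_+^3].
    public static double[] basis(double x, double[] knots) {
        double[] b = new double[4 + knots.length];
        b[0] = 1.0;
        b[1] = x;
        b[2] = x * x;
        b[3] = x * x * x;
        for (int t = 0; t < knots.length; t++) {
            double d = Math.max(x - knots[t], 0.0); // zero left of the knot
            b[4 + t] = d * d * d;
        }
        return b;
    }

    public static void main(String[] args) {
        double[] b = basis(2.0, new double[] {1.0, 3.0});
        System.out.println(java.util.Arrays.toString(b));
        // [1.0, 2.0, 4.0, 8.0, 1.0, 0.0]
    }
}
```

A feature whose coefficients on the nonlinear basis terms are all driven to zero contributes only through its linear term, which is exactly the linear/nonlinear split SPLAM's penalty controls via alpha.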
