ENH FEA add interaction constraints to HGBT #21020
So specifying a GAM means
interaction_cst=[{i} for i in range(X.shape[1])]
and doing all pairwise interactions would be
interaction_cst=[{i} for i in range(X.shape[1])] + [{i, j} for i, j in itertools.combinations(range(X.shape[1]), 2)]
or something like that?
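The two constraint lists in this comment can be built with a few lines of plain Python. This is a minimal sketch: the helper names `additive_constraints` and `pairwise_constraints` are made up for illustration, and only the list-of-sets shape of `interaction_cst` is taken from this thread.

```python
import itertools


def additive_constraints(n_features):
    """One singleton group per feature: no feature may interact with another."""
    return [{i} for i in range(n_features)]


def pairwise_constraints(n_features):
    """One group per unordered pair of features."""
    return [{i, j} for i, j in itertools.combinations(range(n_features), 2)]


# Small example with four features.
gam_cst = additive_constraints(4)
pair_cst = pairwise_constraints(4)
```

If overlapping groups are combined as a union, the singleton sets would add nothing on top of the pairs, since each `{i}` is contained in some `{i, j}`. That is an assumption about the semantics, not something verified here.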
After talking with @ogrisel and @adrinjalali: maybe an option would be to have special-case strings for the univariate and bivariate cases, and then this would be a good first step :)
@amueller: In my view, this is not necessary. These two cases are actually less important than one might think when working with interaction constraints. In practice you would simply limit the number of terminal nodes (to 2 and 3, respectively) instead of using constraints. The interesting cases are asymmetric ones, e.g. some variables are forced to act additively and others are not. There, the intended interface is actually very convenient.
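To make the two points in this comment concrete: a tree with k leaves has k - 1 splits, so it can involve at most k - 1 distinct features, which is why 2 leaves give a purely additive model and 3 leaves cap each tree at two features. The asymmetric case is where an explicit constraint list helps. The sketch below is illustrative only; the helper names and feature indices are hypothetical, and only the list-of-sets format comes from this thread.

```python
def max_features_per_tree(n_leaves):
    """A binary tree with n_leaves leaves has n_leaves - 1 splits,
    so it can use at most that many distinct features."""
    return n_leaves - 1


def asymmetric_constraints(additive, interacting):
    """One singleton group per additive feature, plus one group in
    which the remaining features may interact freely."""
    return [{f} for f in additive] + [set(interacting)]


# Hypothetical layout: features 0 and 1 act additively, 2-4 may interact.
interaction_cst = asymmetric_constraints(additive=[0, 1], interacting=[2, 3, 4])
```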
Why? That's not equivalent at all, is it?
I think the motivation for me is interpretable models, and you can get interpretable models that are using deeper trees, which is not the same as boosting stumps as far as I can see.
Hm I just realized that this interface doesn't allow restricting to interactions of two features, right? Passing all tuples can still result in trees using more than 2 features, right?
The resulting trees are not identical, but the resulting model structure is in the sense that the interaction constraints are fulfilled in both cases.
Regarding the pairwise case: If Christian implemented it correctly (I don't doubt!), then each branch in each tree will use only features of one constraint set. As such, each tree prediction will use only two features. But the tree will usually use three features.
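The branch-level behaviour described here can be sketched in plain Python. This is not the PR's implementation, just one reading of the rule: after splitting on some features along a path, a node may only split on features from constraint groups that contain all of them. The function name `allowed_features` is hypothetical.

```python
import itertools


def allowed_features(constraints, path_features):
    """Features a node may split on, given the features already used
    on the path from the root to that node."""
    path = set(path_features)
    allowed = set()
    for group in constraints:
        if path <= set(group):  # group is compatible with the whole path
            allowed |= set(group)
    return allowed


# All pairs over three features: (0, 1), (0, 2), (1, 2).
pairs = list(itertools.combinations(range(3), 2))

root_children = allowed_features(pairs, [0])    # root split on feature 0
left_branch = allowed_features(pairs, [0, 1])   # left child split on feature 1
right_branch = allowed_features(pairs, [0, 2])  # right child split on feature 2

# Each branch stays within one pair, yet the tree as a whole touches 0, 1 and 2.
features_in_tree = left_branch | right_branch
```

Under this reading, every root-to-leaf path stays inside a single pair, while the tree as a whole can still involve three features.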
+1. I had the same thought 😏 Maybe we can do that in a follow-up PR?
All pairwise interactions would just be
interaction_cst = list(itertools.combinations(range(n_features), 2))
Interesting. We need to check how other libraries do it. I wonder if this is a significantly different inductive bias compared to having each tree work only with a small subset of features.
I think allowing deep trees (or deep branches) with a large number of splits but on a small subset of the features (typically 1 or 2) is an interesting inductive bias (similar to GAMs): it allows for decision functions with complex non-linear feature-wise functions but largely decoupled inter-feature decisions. Relying on sequential decision stumps via more iterations of the gradient boosting algorithm is probably quite different from an inductive bias point of view.
The constraint is fulfilled but the models are not equivalent in any way, right? Aka what @ogrisel said, it's quite a different model.
One of the reasons people restrict to pairwise interactions is so that the full model can be visualized. That's much harder with three features. There is no way to achieve trees that are on pairs of features with this PR, right?