SVC with ADABoosting · Issue #16642 · scikit-learn/scikit-learn
Open
QuantumChamploo opened this issue Mar 5, 2020 · 8 comments
@QuantumChamploo

I am trying to use SVC with AdaBoost. With SVC, but not with other base estimators, the initial estimator does not seem to be trained.

Comparing the initial estimator of the ensemble to a single estimator with the same hyper-parameters:

Create an AdaBoost ensemble:

[Screenshot: Screen Shot 2020-03-04 at 9 15 10 PM]

Create a single SVC:

[Screenshot: Screen Shot 2020-03-04 at 9 15 20 PM]

Compare the accuracies:

[Screenshot: Screen Shot 2020-03-04 at 9 15 49 PM]

I have tried different hyper-parameters and had the same issue. The issue does not happen with RandomForest or DecisionTree classifiers.

@jnothman
Member
jnothman commented Mar 5, 2020

Please provide runnable code so that we can try to reproduce the issue.

@QuantumChamploo
Author

Archive.zip

I have attached a zip with a Python script and a Jupyter notebook showing the issue. Is this the right way to post it? I probably should have used a gist.

As a note, I have tried multiple ways to scale the input data, and the same issue happens.

@glemaitre
Member

You can directly post the example here between triple-backtick markers.

@QuantumChamploo
Author
QuantumChamploo commented Mar 6, 2020
from sklearn.datasets import fetch_openml
from sklearn import datasets
from sklearn.ensemble import AdaBoostClassifier

from sklearn.model_selection import train_test_split
from sklearn import metrics

from sklearn.svm import SVC
from sklearn import preprocessing

mnist = fetch_openml('mnist_784', cache=False)

X = mnist.data.astype('float32')
y = mnist.target.astype('int64')
#X = preprocessing.scale(X)
X /= 255.0

size = 10000

train_x = X[:size]
train_y = y[:size]

X_train, X_test, Y_train, Y_test = train_test_split(train_x, train_y, test_size=0.6, shuffle=True)

abc = AdaBoostClassifier(SVC(random_state=0, probability=True, tol=1e-5, gamma=.01), n_estimators=3, learning_rate=.9)
abc.fit(X_train, Y_train)

svc = SVC(random_state=0, probability=True, tol=1e-5, gamma=.01)
svc.fit(X_train, Y_train)

print("base acc")
print(abc.estimators_[0].score(X_test,Y_test))
print("svc with same training")
print(svc.score(X_test,Y_test))

@cmarmo
Contributor
cmarmo commented Oct 8, 2020

Hi @QuantumChamploo, I think this is the expected behavior. From the documentation:

The data modifications at each so-called boosting iteration consist of applying weights w_1, w_2, …, w_n to each of the training samples. Initially, those weights are all set to w_i = 1 / N, so that the first step simply trains a weak learner on the original data.

@ogrisel
Member
ogrisel commented Nov 9, 2020

Still, it's odd to get such a lower accuracy just by uniformly reweighting the first fit with sample_weight=np.ones(n_samples) / n_samples instead of sample_weight=None or, equivalently, sample_weight=np.ones(n_samples).
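
A minimal sketch of that comparison (my own illustration on load_digits rather than MNIST, not code from the thread): fitting SVC with the uniform 1/N weights that AdaBoost passes at its first iteration versus fitting it with no weights at all.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X /= 16
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Uniform weights, as used by AdaBoost at its first boosting iteration.
first_iter_weights = np.ones(len(y_train)) / len(y_train)

svc_weighted = SVC(gamma=.01, random_state=0)
svc_weighted.fit(X_train, y_train, sample_weight=first_iter_weights)

svc_plain = SVC(gamma=.01, random_state=0)
svc_plain.fit(X_train, y_train)

print("fit with w_i = 1/N:", svc_weighted.score(X_test, y_test))
print("fit with no weights:", svc_plain.score(X_test, y_test))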

@ogrisel
Member
ogrisel commented Nov 9, 2020

This is probably related to #15657: since some (most?) scikit-learn estimators already apply a 1/n_samples or 1/sample_weight.sum() factor in their inner loss computation, it's possible that AdaBoost should not do it a second time in its own outer fit method.

On the other hand, I would not have expected SVC to give such poor results when we reweight uniformly. Maybe this is because the regularization term of the loss function then starts to dominate the data-fit term, and we end up with an essentially constant predictor.
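
To make that intuition concrete, here is the standard weighted soft-margin SVC objective, as a sketch of how libsvm treats per-sample weights (my illustration, not something stated in the thread). With per-sample weights $s_i$, the solver roughly minimizes

$$\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} s_i \xi_i
\quad \text{s.t.} \quad y_i \left(w^\top \phi(x_i) + b\right) \ge 1 - \xi_i,\ \ \xi_i \ge 0,$$

so setting every $s_i = 1/n$ amounts to replacing C with C/n. With such a small effective C, the $\lVert w \rVert^2$ term dominates and the fitted decision function is nearly constant, which would explain the accuracy drop above.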

@glemaitre
Member

Indeed, to get the equivalence, one can multiply C by the number of samples.

FYI:

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC

X, y = datasets.load_digits(return_X_y=True)
X /= 16

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.6, shuffle=True, random_state=0
)

svc = SVC(
    C=1 * len(y_train), probability=True, tol=1e-5, gamma=.01, random_state=0
)

adaboost = AdaBoostClassifier(svc, n_estimators=3, learning_rate=.9)
adaboost.fit(X_train, y_train)

svc.set_params(C=1)
svc.fit(X_train, y_train)

print("First weak learner in AdaBoost")
print(adaboost.estimators_[0].score(X_test, y_test))
print("SVC learner alone")
print(svc.score(X_test, y_test))

Output:

First weak learner in AdaBoost
0.9341983317886933
SVC learner alone
0.9341983317886933

Of course, we still have an issue in scikit-learn because we do not have a consistent formulation of sample_weight across the different estimators.
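
As an illustration of that inconsistency (again a sketch of mine, not from the thread): a DecisionTreeClassifier is essentially invariant to rescaling all sample weights by a constant, while SVC is not, because for SVC the rescaling effectively changes the regularization strength.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X /= 16
uniform = np.ones(len(y)) / len(y)  # the weights AdaBoost uses at its first iteration

for Estimator in (DecisionTreeClassifier, SVC):
    unweighted = Estimator(random_state=0).fit(X, y)
    weighted = Estimator(random_state=0).fit(X, y, sample_weight=uniform)
    agreement = np.mean(unweighted.predict(X) == weighted.predict(X))
    # The tree should agree (almost) perfectly with its unweighted twin;
    # the SVC should not, because its effective C has been divided by n.
    print(Estimator.__name__, agreement)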

glemaitre added the Bug label and removed the Bug: triage label on Dec 14, 2020