Document what cost function LogisticRegression minimizes for each solver #10164
Open
@allComputableThings

Description

Different choices of solver for sklearn's LogisticRegression optimize different cost functions. This is highly confusing behavior, and it is particularly of concern if you want to publish what cost function you're using. In particular:

sklearn.LR(solver='liblinear') minimizes: L + lam*Rb
sklearn.LR(solver=<others>)    minimizes: L + lam*R
statsmodels.GLM(binomial)      minimizes: L/n + lam*Rb

where:

lam = 1/C
L   = logloss (summed over the training set, not averaged)
n   = training sample size
R   = square of the L2 norm of the feature weights
Rb  = square of the L2 norm of the feature weights and the intercept
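For concreteness, here is my reading of the two sklearn objectives written out in LaTeX (a sketch under the definitions above, assuming labels y_i in {-1, +1}; the conventional 1/2 factor on the penalty is folded into lam = 1/C):

$$\text{liblinear: } \min_{w,\,b}\ \sum_{i=1}^{n}\log\left(1+e^{-y_i(x_i^\top w+b)}\right)+\frac{1}{C}\left(\lVert w\rVert_2^2+b^2\right)$$

$$\text{other solvers: } \min_{w,\,b}\ \sum_{i=1}^{n}\log\left(1+e^{-y_i(x_i^\top w+b)}\right)+\frac{1}{C}\lVert w\rVert_2^2$$

The statsmodels objective is the liblinear one with the loss term divided by n.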

I was a little surprised to find that the logloss is not normalized by the training set size. I think this is uncommon, and it means the effective C changes with the amount of training data. Good thing, bad thing? I'm not sure; it seems unusual, but more importantly, what is minimized should be explicit.
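A quick way to see the scaling effect (a minimal sketch on a made-up synthetic dataset; duplicating the rows inflates L but leaves lam*R unchanged, so a fixed C effectively regularizes the larger set more weakly):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Small synthetic problem (any dataset would do).
X, y = make_classification(n_samples=50, n_features=5, random_state=0)

# Same data duplicated 10x: the distribution is unchanged,
# but the unnormalized logloss term L grows 10x.
Xd, yd = np.tile(X, (10, 1)), np.tile(y, 10)

for name, (Xi, yi) in [("1x data", (X, y)), ("10x data", (Xd, yd))]:
    clf = LogisticRegression(solver="lbfgs", C=1.0).fit(Xi, yi)
    # With a loss normalized by n these two would match; here they don't.
    print(name, clf.coef_.round(3))
```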

PS: #10001 is an excellent idea! The default liblinear cost function is just plain confusing.

Steps/Code to Reproduce

There's an example showing the different weights here:

https://stackoverflow.com/questions/47338695/why-does-the-choice-of-solver-result-in-different-weight-in-sklearn-logisticreg
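A stripped-down version of that comparison (a sketch, not the exact StackOverflow code; the dataset is synthetic, and the small C just makes the intercept-penalization difference easy to see):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, n_features=5, random_state=0)

# Same data, same C; only the solver (and hence the cost function) changes.
for solver in ("liblinear", "lbfgs"):
    clf = LogisticRegression(solver=solver, C=0.01).fit(X, y)
    print(solver, "coef:", clf.coef_.round(4),
          "intercept:", clf.intercept_.round(4))
```

With solver='liblinear' the intercept is pulled toward zero because it sits inside Rb; with 'lbfgs' it is left unpenalized.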

Expected Results

Actual Results

Versions

1.19.1
