DOC How is randomization handled · Issue #243 · fairlearn/fairlearn · GitHub

DOC How is randomization handled #243
@adrinjalali

Description

The docs say:

Randomization. In contrast with scikit-learn, estimators in fairlearn can produce randomized predictors. Randomization of predictions is required to satisfy many definitions of fairness. Because of randomization, it is possible to get different outputs from the predictor's predict method on identical data. For each of our methods, we provide explicit access to the probability distribution used for randomization.

sklearn does have randomization in many estimators (RandomForests, for example :P), but the randomness is always controlled by a random_state parameter. Setting this parameter is what makes it possible to go back and reproduce the results.

It is understandable if, in the context of fairness, the RNG shouldn't be fixed, but shouldn't the user be able to pass in a seed or an RNG and get reproducible results?

Also, the user can seed the RNG and still get probabilistic output for the same input. I could have:

clf = MyClassifier(random_state=42)
clf.fit(X, y)
clf.predict(x0)  # returns 0
clf.predict(x0)  # returns 1

but if the user runs the same script again, they'll get the same output as before.
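
To make the idea concrete, here is a minimal sketch in plain Python. It is not fairlearn's or scikit-learn's implementation: `RandomizedClassifier`, its `p_positive` parameter, and the `run_script` helper are hypothetical names made up for illustration, and it uses the standard-library `random.Random` where scikit-learn-style code would normally take a NumPy generator through `random_state`. It only shows that a seeded predictor can return different labels for the same input within one run while the whole sequence repeats across runs.

```python
import random


class RandomizedClassifier:
    """Hypothetical classifier that predicts 1 with a fixed probability."""

    def __init__(self, p_positive=0.5, random_state=None):
        self.p_positive = p_positive
        self.random_state = random_state

    def fit(self, X, y):
        # Accept either a ready-made generator or a seed (None means unseeded).
        if isinstance(self.random_state, random.Random):
            self.rng_ = self.random_state
        else:
            self.rng_ = random.Random(self.random_state)
        return self

    def predict(self, x):
        # Each call draws from the generator, so repeated calls on the same
        # input can differ even though the full sequence is reproducible.
        return int(self.rng_.random() < self.p_positive)


def run_script(seed):
    """Stand-in for running the same script from scratch with a given seed."""
    clf = RandomizedClassifier(p_positive=0.5, random_state=seed)
    clf.fit(X=None, y=None)  # training data is ignored in this toy example
    return [clf.predict("x0") for _ in range(20)]


first_run = run_script(seed=42)
second_run = run_script(seed=42)

print(first_run)
print(first_run == second_run)
```

Within a single run the list contains both 0 and 1 for the same `x0`, and the two runs produce identical lists because they start from the same seed. Passing `random_state=None` would give up that reproducibility, which is the trade-off this issue is asking about.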
