make_classification results are not proportional to weights · Issue #18717 · scikit-learn/scikit-learn
Closed
@daniel-yj-yang

Describe the bug

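The online documentation for make_classification says "weights: ... The proportions of samples assigned to each class." However, the class counts it returns do not match the requested proportions, and the relative gap grows as the minority class gets smaller.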

Steps/Code to Reproduce

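from collections import Counter

from sklearn.datasets import make_classification

# Request increasingly imbalanced two-class datasets and count the labels.
for class0_weight in [0.90, 0.99, 0.995, 0.999]:
    # Only the class-0 weight is given; the class-1 weight is inferred as the remainder.
    X, y = make_classification(n_samples=100000, weights=[class0_weight], random_state=1)
    print(Counter(y))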

Expected Results

Something similar to the following:
Counter({0: 90000, 1: 10000})
Counter({0: 99000, 1: 1000})
Counter({0: 99500, 1: 500})
Counter({0: 99900, 1: 100})
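The expected counts above assume that no labels are altered after the classes are sized. make_classification also takes a flip_y parameter (default 0.01), documented as the fraction of samples whose class is assigned randomly. A minimal sketch follows, assuming that label noise is what causes the mismatch reported here; the helper name class_counts is hypothetical and only wraps the original reproduction with flip_y=0.0.

```python
from collections import Counter

from sklearn.datasets import make_classification


def class_counts(weight, n_samples=100_000, flip_y=0.0):
    """Return label counts for a two-class dataset.

    flip_y=0.0 turns off the random label reassignment (assumed here
    to be the source of the mismatch).
    """
    _, y = make_classification(
        n_samples=n_samples,
        weights=[weight],   # class-1 weight is inferred as the remainder
        flip_y=flip_y,
        random_state=1,
    )
    return Counter(y)


for weight in [0.90, 0.99, 0.995, 0.999]:
    print(weight, class_counts(weight))
```

With flip_y=0.0 the counts should land within a few samples of the requested proportions; small differences can remain because the samples are divided among integer-sized clusters.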

Actual Results

Counter({0: 89607, 1: 10393})
Counter({0: 98525, 1: 1475})
Counter({0: 99016, 1: 984})
Counter({0: 99410, 1: 590})
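These counts are consistent with random label noise rather than with mis-sized classes. The sketch below is a back-of-the-envelope check, not the library's code: it assumes the default flip_y=0.01 and that each sample is independently reassigned, with probability flip_y, to a class drawn uniformly at random. The function name expected_minority_count is hypothetical.

```python
def expected_minority_count(weight, n_samples=100_000, flip_y=0.01, n_classes=2):
    """Expected size of class 1 when class 0 is given `weight`.

    Assumption: with probability flip_y a sample's label is redrawn
    uniformly from the n_classes labels; otherwise it is left alone.
    """
    untouched = (1.0 - flip_y) * (1.0 - weight)  # class-1 samples never redrawn
    redrawn = flip_y / n_classes                 # redrawn samples that land in class 1
    return n_samples * (untouched + redrawn)


for weight in [0.90, 0.99, 0.995, 0.999]:
    print(weight, round(expected_minority_count(weight)))
```

Under these assumptions the expected class-1 counts are about 10400, 1490, 995 and 599, close to the observed 10393, 1475, 984 and 590.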

Versions

System:
python: 3.8.3 (default, Jul 2 2020, 11:26:31) [Clang 10.0.0 ]
executable: /Users/.../opt/anaconda3/bin/python
machine: macOS-10.15.7-x86_64-i386-64bit

Python dependencies:
pip: 20.1.1
setuptools: 50.3.0
sklearn: 0.23.2
numpy: 1.19.2
scipy: 1.5.2
Cython: 0.29.21
pandas: 1.1.2
matplotlib: 3.3.2
joblib: 0.16.0
threadpoolctl: 2.1.0

Built with OpenMP: True
