Scaling features using MinMaxScaler makes DPGMM always have one cluster · Issue #6694 · scikit-learn/scikit-learn · GitHub
Closed
@HTCode

Description


I have noticed that if I scale my dataset using MinMaxScaler(), then DPGMM always creates a single cluster (label), whatever value I use for alpha. This might be related to a numerical precision issue.

If I don't rescale the data, or if I use StandardScaler() instead of MinMaxScaler(), this problem does not occur (i.e., DPGMM creates more than one cluster).

Is this a bug in sklearn.mixture.DPGMM, or did I miss something?

API is here: http://scikit-learn.org/stable/modules/generated/sklearn.mixture.DPGMM.html#sklearn.mixture.DPGMM

I have also tried this on the artificial data in this example (from the official site): http://scikit-learn.org/stable/auto_examples/mixture/plot_gmm.html#example-mixture-plot-gmm-py

It works, but if I rescale the generated dataset X by adding the following line, then DPGMM creates only one cluster:

from sklearn.preprocessing import MinMaxScaler
X = MinMaxScaler().fit_transform(X)  # added right after X is generated
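For reference, here is a minimal NumPy-only sketch (with made-up two-blob data, not the example's dataset) of how the two scalers transform data differently: min-max scaling compresses every feature into [0, 1], leaving per-feature variance well below 1, while standard scaling keeps zero mean and unit variance. That shrunken scale is one plausible way a min-max-scaled dataset could interact badly with DPGMM's priors, though this sketch does not itself prove that is the cause.

```python
import numpy as np

def min_max_scale(X):
    # Map each feature to [0, 1], as MinMaxScaler does by default.
    mn, mx = X.min(axis=0), X.max(axis=0)
    return (X - mn) / (mx - mn)

def standard_scale(X):
    # Zero mean, unit variance per feature, as StandardScaler does.
    return (X - X.mean(axis=0)) / X.std(axis=0)

rng = np.random.RandomState(0)
# Two well-separated blobs (hypothetical data, just for illustration).
X = np.vstack([rng.randn(100, 2), rng.randn(100, 2) + 20.0])

X_mm = min_max_scale(X)
X_std = standard_scale(X)

# Min-max scaling squeezes the data into the unit box, so each
# feature's variance is well below 1; standard scaling keeps it at 1.
print("min-max variances:", X_mm.var(axis=0))
print("standard variances:", X_std.var(axis=0))
```

The separation between the two blobs survives both transforms, but its absolute magnitude is much smaller after min-max scaling, which is the kind of scale change that can make a scale-sensitive prior dominate the likelihood.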
