[MRG] GaussianMixture with BIC/AIC #26735
@amueller @NicolasHug What do you think?
@jjerphan @ogrisel This PR is designed to greatly simplify users' lives when they try to do something like this tutorial: https://scikit-learn.org/stable/auto_examples/mixture/plot_gmm_selection.html. It is much simpler than other recently proposed PRs, such as #19562. We are grateful for your feedback and look forward to finalizing this.
@adam2392 Hey Adam, curious to hear what you think about this.
Hello Adam @adam2392, We have reviewed and addressed all feedback provided here. Below is a summary of the primary comments and our responses:
With these revisions, we believe the code is ready for the next review. Please let us know if there is anything further we should adjust. Thank you very much for your time and insights!
Hey @tingshanL, okay, thanks! I think this will require a discussion among some of the more senior maintainers on the team. Personally, I do see the use of an mclust-like algorithm within the Python/sklearn ecosystem, since I have used it in the past in R. However, we'll have to see what others say… I know they're busy, and including new models is not super easy, so thank you for your patience. Out of curiosity (perhaps you can just summarize in the PR description if we want to reserve the space for other discussion), how does this differ from mclust? If there are significant differences, what would need to be done to make this a 1-1 match? This is just some high-level info I'm curious about to guide the discussion, so feel free to not spend too much time addressing this question.
Reference Issues/PRs
Fixes #19338. Automates the model selection shown in the Gaussian Mixture Model Selection example. Adds a basic GaussianMixtureIC estimator that does not initialize with agglomerative clustering, as discussed in #19562.
What does this implement/fix? Explain your changes.
It automatically selects the best Gaussian mixture model, based on BIC or AIC, from a set of candidate models parameterized by:
Covariance constraints
Number of components
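For reviewers, the selection described above can be sketched with the existing GaussianMixture API. This is only an illustration and not the code in this PR: the helper name select_gmm_by_ic, its parameters, and the synthetic data are hypothetical. It assumes that a lower BIC or AIC value is better, as with the bic and aic methods of GaussianMixture.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def select_gmm_by_ic(
    X,
    n_components_range=range(1, 7),
    covariance_types=("spherical", "tied", "diag", "full"),
    criterion="bic",
    random_state=0,
):
    """Hypothetical helper (not part of scikit-learn or this PR).

    Fits one GaussianMixture per (covariance_type, n_components) pair
    and returns the fitted model with the lowest BIC or AIC.
    """
    if criterion not in ("bic", "aic"):
        raise ValueError("criterion must be 'bic' or 'aic'")

    best_model, best_score = None, np.inf
    for cov_type in covariance_types:
        for k in n_components_range:
            gm = GaussianMixture(
                n_components=k,
                covariance_type=cov_type,
                random_state=random_state,
            ).fit(X)
            # Lower information criterion means a better trade-off
            # between fit quality and model complexity.
            score = gm.bic(X) if criterion == "bic" else gm.aic(X)
            if score < best_score:
                best_model, best_score = gm, score
    return best_model, best_score


# Synthetic data for illustration: two well-separated 2-D blobs.
rng = np.random.default_rng(0)
X = np.vstack(
    [
        rng.normal(-5.0, 1.0, size=(200, 2)),
        rng.normal(5.0, 1.0, size=(200, 2)),
    ]
)

best, score = select_gmm_by_ic(X, criterion="bic")
print(best.n_components, best.covariance_type, score)
```

The exhaustive grid keeps the sketch simple; its cost grows with the number of covariance types times the number of component counts, so a narrower n_components_range is worth considering on large datasets.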
Any other comments?