-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
ENH Add HDBSCAN
as a new estimator in sklearn.cluster
#26385
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…#22616) Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> Co-authored-by: Julien Jerphanion <git@jjerphan.xyz> Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
…ikit-learn#25538) Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
…cikit-learn#24698) Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Since all PRs target the branch of this PR have been reviewed, merged, and are referenced in the issue, I thinking making the CI pass is sufficient to accept this PR. What do you think? |
I am going to mark it as a draft for now to resolve CI issues (it seems some changes weren't carried over due to git shenanigans) but will reopen once those are addressed. |
Should be good for review now @jjerphan @thomasjpfan!
There are some not-yet-reviewed changes. For your convenience, here are the novel changes (all of which are included in these five commits) These changes are comprised of:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is reasonable to wait for a third reviewer to approve this PR before merging it. |
cc: @scikit-learn/core-devs in case anyone would like to donate a third review :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The copyright here: https://github.com/scikit-learn-contrib/hdbscan/blob/master/LICENSE isn't really BSD, and doesn't allow to just use it, to me it seems we need a copy of that file here? Unless that copyright is changed or we somehow get a written permission from all contributors to that package.
I left a few comments just to make a few things more readable, otherwise I'm happy to merge.
@adrinjalali Should be ready for a second-pass :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PRETTY KEWL!!!
Thank you for working on integrating HDBSCAN in scikit-learn, @Micky774! 🎉 EDCF p> |
Thank you to everyone for all your help with this! This was a massive collaboration and I greatly appreciate your persistence and patience with review and API discussions <3 @lmcinnes Wanted to ping you as well! Thanks again to you and your team for the fantastic original implementation. The first major hurdle is done, now to slowly reintroduce some functionality (e.g. boruvka), optimize, and extend support (e.g. float32 data). |
…arn#26385) Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com> Co-authored-by: Julien Jerphanion <git@jjerphan.xyz> Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Reference Issues/PRs
Towards #24686
What does this implement/fix? Explain your changes.
Each change has been separately reviewed (see #24686 for details).
Edit: due to git shenanigans, some changes needed to be made within this PR. The novel changes included w/ this PR are:
cnp.*_t
typing with*_t
from_typedefs.pxd
*.shape[0]
pattern withlen(*)
forndarray
objectsAny other comments?
cc: @thomasjpfan @jjerphan @glemaitre