DOC Added information about space complexity to docs DBSCAN #26783

Merged (4 commits) on Jul 20, 2023

StefanieSenger
Contributor

Reference Issues/PRs

#26726

What does this implement/fix? Explain your changes.

Added information about space complexity to the docstring, because users were wondering about the huge RAM usage when the eps parameter is high and the min_samples parameter is low.

@github-actions
github-actions bot commented Jul 6, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: d200e03.

@adrinjalali
Member

I think you used Alt+Q to format the code (line length) and it has formatted the whole docstring, resulting in a larger than needed diff.

Could you please redo those? You can also configure VS Code so that Alt+Q only formats the current paragraph instead of the whole section.

@StefanieSenger
Contributor Author

Okay, did it. Thanks for your support!

@adrinjalali
Member

I'm wondering if this is good as is, or if it should be a note (.. note:: kinda thing) right before the example.

@StefanieSenger changed the title from "DOC Added information about space complexity" to "DOC Added information about space complexity to docs DBSCAN" on Jul 7, 2023
@StefanieSenger
Contributor Author

I've added a sentence to clarify what min_samples tunes, because I find its naming not very intuitive.

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@adrinjalali enabled auto-merge (squash) on July 13, 2023, 14:03
@adrinjalali merged commit 889b829 into scikit-learn:main on Jul 20, 2023
@StefanieSenger deleted the docs_dbscan branch on July 21, 2023, 09:28
punndcoder28 pushed a commit to punndcoder28/scikit-learn that referenced this pull request Jul 29, 2023
…earn#26783)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Sep 18, 2023
…earn#26783)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
jeremiedbb pushed a commit that referenced this pull request Sep 20, 2023
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
…earn#26783)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@kno10
Contributor
kno10 commented Feb 21, 2024

The claim "The worst case memory complexity of DBSCAN is O(n²), which can occur when the eps param is large and min_samples is low." is incorrect.

(Original) DBSCAN has linear memory requirements, and worst-case quadratic distance computations. The O(n²) memory use is due to the sklearn implementation, as it was already documented in the "Notes" just a bit further down.

In #28493 I propose a revised claim.
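
To make the distinction in the comment above concrete, here is a minimal pure-Python sketch. It is not scikit-learn's code: the helper names (`iter_neighborhoods`, `stored_entries`) and the toy data are assumptions made up for illustration. It contrasts keeping every eps-neighborhood in memory at once, which can approach n² stored indices when eps is large, with answering one neighborhood query at a time, which never holds more than n indices.

```python
import random


def iter_neighborhoods(points, eps):
    """Yield the eps-neighborhood of each point, one at a time.

    Brute force, Euclidean distance, 2-D points. Hypothetical helper for
    illustration only; it does not mirror any library's internals.
    """
    eps_sq = eps * eps
    for px, py in points:
        yield [
            j
            for j, (qx, qy) in enumerate(points)
            if (px - qx) ** 2 + (py - qy) ** 2 <= eps_sq
        ]


def stored_entries(points, eps):
    """Count the indices held in memory if ALL neighborhoods are kept at once."""
    all_neighborhoods = list(iter_neighborhoods(points, eps))
    return sum(len(nb) for nb in all_neighborhoods)


random.seed(0)
n = 200
points = [(random.random(), random.random()) for _ in range(n)]

# Small eps: each neighborhood is short, so the total stays far below n * n.
small = stored_entries(points, eps=0.05)

# eps = 2.0 exceeds the unit square's diagonal (about 1.414), so every point
# neighbors every other point and the stored total is exactly n * n.
large = stored_entries(points, eps=2.0)

# Querying one neighborhood at a time never holds more than n indices at once,
# although the number of distance computations is still quadratic.
peak = max(len(nb) for nb in iter_neighborhoods(points, eps=2.0))

print(small, large, n * n, peak)
```

With the large eps, the stored total equals n * n while the per-query peak equals n, which matches the point that the quadratic memory use comes from keeping all neighborhoods at once rather than from the algorithm itself.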
