ENH Splitter Injection and Refactoring of DepthFirstTreeBuilder's building mechanism by SamuelCarliles3 · Pull Request #67 · neurodata/scikit-learn · GitHub

ENH Splitter Injection and Refactoring of DepthFirstTreeBuilder's building mechanism #67

Open
wants to merge 38 commits into
base: submodulev3
Commits
8c09f7f
init split condition injection
SamuelCarliles3 Feb 16, 2024
ecfc9b1
wip
SamuelCarliles3 Feb 16, 2024
0c3d5c0
wip
SamuelCarliles3 Feb 16, 2024
5fd12a2
wip
SamuelCarliles3 Feb 20, 2024
b593ee0
injection progress
SamuelCarliles3 Feb 27, 2024
180fac3
injection progress
SamuelCarliles3 Feb 27, 2024
c207c3e
split injection refactoring
SamuelCarliles3 Feb 27, 2024
7cc71c1
added condition parameter passthrough prototype
SamuelCarliles3 Feb 29, 2024
2470d49
some tidying
SamuelCarliles3 Feb 29, 2024
ee3399f
more tidying
SamuelCarliles3 Feb 29, 2024
a079e4f
splitter injection refactoring
SamuelCarliles3 Mar 10, 2024
5397b66
cython injection due diligence, converted min_sample and monotonic_cs…
SamuelCarliles3 Mar 15, 2024
44f1d57
tree tests pass huzzah!
SamuelCarliles3 Mar 18, 2024
4f19d53
added some splitconditions to header
SamuelCarliles3 Mar 18, 2024
cb71be0
commented out some sample code that was substantially increasing peak…
SamuelCarliles3 Mar 21, 2024
e34be5c
added vector resize
SamuelCarliles3 Apr 9, 2024
aac802e
wip
SamuelCarliles3 Apr 10, 2024
c12f2fd
Merge branch 'submodulev3' into scarliles/splitter-injection-redux
SamuelCarliles3 Apr 15, 2024
a7f5e92
settling injection memory management for now
SamuelCarliles3 Apr 15, 2024
7a70a0b
added regression forest benchmark
SamuelCarliles3 Apr 22, 2024
d9ad68a
Merge pull request #2 from ssec-jhu/scarliles/regression-benchmark
SamuelCarliles3 Apr 22, 2024
893d588
ran black for linting check
SamuelCarliles3 Apr 23, 2024
548493c
Merge branch 'submodulev3' of github.com:ssec-jhu/scikit-learn into s…
SamuelCarliles3 Apr 23, 2024
e4b53ff
Merge branch 'submodulev3' into scarliles/regression-benchmark
SamuelCarliles3 Apr 23, 2024
089d901
Merge branch 'neurodata:submodulev3' into submodulev3
SamuelCarliles3 Apr 24, 2024
3ba5f74
Merge branch 'submodulev3' of github.com:ssec-jhu/scikit-learn into s…
SamuelCarliles3 Apr 24, 2024
cf285c1
Merge branch 'scarliles/splitter-injection-redux' into scarliles/regr…
SamuelCarliles3 Apr 24, 2024
ffc6328
Merge pull request #3 from ssec-jhu/scarliles/regression-benchmark
SamuelCarliles3 Apr 24, 2024
87c90fd
initial pass at refactoring DepthFirstTreeBuilder.build
SamuelCarliles3 May 23, 2024
51da586
some renaming to make closure pattern more obvious
SamuelCarliles3 May 28, 2024
6c117a2
added SplitRecordFactory
SamuelCarliles3 May 28, 2024
c7b675b
Merge branch 'scarliles/update-node-refactor2' into scarliles/update-…
SamuelCarliles3 May 28, 2024
9e7b131
SplitRecordFactory progress
SamuelCarliles3 May 28, 2024
a017669
build loop refactor
SamuelCarliles3 May 29, 2024
4325b0a
add_or_update tweak
SamuelCarliles3 May 29, 2024
78c3a1b
reverted to back out build body refactor
SamuelCarliles3 May 30, 2024
b8cc636
refactor baby step
SamuelCarliles3 May 30, 2024
f225658
update node refactor more baby steps
SamuelCarliles3 May 30, 2024
added vector resize
SamuelCarliles3 committed Apr 9, 2024
commit e34be5c58a6f26ed38634b2a7b53a95ed0aabe67
43 changes: 32 additions & 11 deletions sklearn/tree/_splitter.pyx
@@ -349,20 +349,41 @@ cdef class Splitter(BaseSplitter):
         self.min_samples_leaf_condition = MinSamplesLeafCondition()
         self.min_weight_leaf_condition = MinWeightLeafCondition()

-        self.presplit_conditions.push_back((<SplitCondition>self.min_samples_leaf_condition).t)
-        if presplit_conditions is not None:
-            for condition in presplit_conditions:
-                self.presplit_conditions.push_back((<SplitCondition>condition).t)
-
-        self.postsplit_conditions.push_back((<SplitCondition>self.min_weight_leaf_condition).t)
-        if postsplit_conditions is not None:
-            for condition in postsplit_conditions:
-                self.postsplit_conditions.push_back((<SplitCondition>condition).t)
+        self.presplit_conditions.resize(
+            (len(presplit_conditions) if presplit_conditions is not None else 0)
+            + (2 if self.with_monotonic_cst else 1)
+        )
+        self.postsplit_conditions.resize(
+            (len(postsplit_conditions) if postsplit_conditions is not None else 0)
+            + (2 if self.with_monotonic_cst else 1)
+        )
+
+        offset = 0
+        self.presplit_conditions[offset] = self.min_samples_leaf_condition.t
+        self.postsplit_conditions[offset] = self.min_weight_leaf_condition.t
+        offset += 1
+
         if(self.with_monotonic_cst):
             self.monotonic_constraint_condition = MonotonicConstraintCondition()
-            self.presplit_conditions.push_back((<SplitCondition>self.monotonic_constraint_condition).t)
-            self.postsplit_conditions.push_back((<SplitCondition>self.monotonic_constraint_condition).t)
+            # self.presplit_conditions.push_back((<SplitCondition>self.monotonic_constraint_condition).t)
+            # self.postsplit_conditions.push_back((<SplitCondition>self.monotonic_constraint_condition).t)
+            self.presplit_conditions[offset] = self.monotonic_constraint_condition.t
+            self.postsplit_conditions[offset] = self.monotonic_constraint_condition.t
+            offset += 1

+        # self.presplit_conditions.push_back((<SplitCondition>self.min_samples_leaf_condition).t)
+        if presplit_conditions is not None:
+            # for condition in presplit_conditions:
+            #     self.presplit_conditions.push_back((<SplitCondition>condition).t)
+            for i in range(len(presplit_conditions)):
+                self.presplit_conditions[i + offset] = presplit_conditions[i].t
+
+        # self.postsplit_conditions.push_back((<SplitCondition>self.min_weight_leaf_condition).t)
+        if postsplit_conditions is not None:
+            # for condition in postsplit_conditions:
+            #     self.postsplit_conditions.push_back((<SplitCondition>condition).t)
+            for i in range(len(postsplit_conditions)):
+                self.postsplit_conditions[i + offset] = postsplit_conditions[i].t


     def __reduce__(self):
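
For readers following the change, here is a minimal Python sketch of the slot layout this hunk appears to set up: size the container once, write the built-in condition at index 0, the optional monotonic-constraint condition at index 1, and the injected conditions from `offset` onward. The function name `layout_conditions` and the string placeholders are hypothetical stand-ins for illustration only; the actual code stores condition structs in C++ vectors from Cython, which this sketch does not try to reproduce.

```python
from typing import List, Optional, Sequence


def layout_conditions(
    builtin: str,
    injected: Optional[Sequence[str]],
    with_monotonic_cst: bool,
    monotonic: str = "monotonic_constraint",  # hypothetical placeholder name
) -> List[Optional[str]]:
    """Pre-size a list, then fill it by index (resize-then-assign pattern)."""
    n_injected = len(injected) if injected is not None else 0
    # One slot for the built-in condition, one more if monotonic constraints are on.
    slots: List[Optional[str]] = [None] * (n_injected + (2 if with_monotonic_cst else 1))

    offset = 0
    slots[offset] = builtin
    offset += 1

    if with_monotonic_cst:
        slots[offset] = monotonic
        offset += 1

    # Injected conditions occupy the remaining slots, in the order given.
    if injected is not None:
        for i in range(len(injected)):
            slots[i + offset] = injected[i]

    return slots


presplit = layout_conditions("min_samples_leaf", ["user_a", "user_b"], True)
postsplit = layout_conditions("min_weight_leaf", None, False)
print(presplit)
print(postsplit)
```

One thing the sketch makes visible, assuming the hunk is read this way: the earlier `push_back` version appended the monotonic-constraint condition after any injected conditions, whereas the indexed version places it before them.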