Improve IsolationForest average depth evaluation · scikit-learn/scikit-learn@0869944 · GitHub

Commit 0869944

Improve IsolationForest average depth evaluation
The equation in the original paper is not fully simplified:

    c(n) = 2 H(n-1) - 2 (n-1) / n,  where H(k) ~ ln(k) + gamma

The harmonic number is defined as:

    H(n) = sum_{j=1}^{n} (1 / j)

It follows that:

    H(n) = H(n-1) + 1 / n

and therefore:

    c(n) = 2 H(n-1) - 2 (n-1) / n
         = 2 (H(n-1) - 1 + 1 / n)
         = 2 (H(n) - 1)

Here we use the simpler equation to save calculations.
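The identity in the commit message can be checked with exact rational arithmetic, independently of scikit-learn. The following is a minimal sketch; the helper names `harmonic`, `c_paper`, and `c_simplified` are illustrative and do not appear in the library.

```python
from fractions import Fraction


def harmonic(k):
    """Exact k-th harmonic number H(k) = sum_{j=1}^{k} 1/j, with H(0) = 0."""
    return sum((Fraction(1, j) for j in range(1, k + 1)), Fraction(0))


def c_paper(n):
    """c(n) as written in the paper: 2 H(n-1) - 2 (n-1) / n."""
    return 2 * harmonic(n - 1) - Fraction(2 * (n - 1), n)


def c_simplified(n):
    """c(n) after simplification: 2 (H(n) - 1)."""
    return 2 * (harmonic(n) - 1)


# With exact harmonic numbers the two forms agree for every n >= 2.
for n in range(2, 200):
    assert c_paper(n) == c_simplified(n)
```

Because `Fraction` avoids floating-point rounding, the equality is exact. It holds only when H is the true harmonic number; once H(k) is replaced by the approximation ln(k) + gamma, the two forms are no longer numerically identical.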
1 parent a203b9e commit 0869944

1 file changed: +2 -4 lines changed

sklearn/ensemble/_iforest.py

Lines changed: 2 additions & 4 deletions
@@ -501,9 +501,7 @@ def _average_path_length(n_samples_leaf):
 
     average_path_length[mask_1] = 0.
     average_path_length[mask_2] = 1.
-    average_path_length[not_mask] = (
-        2.0 * (np.log(n_samples_leaf[not_mask] - 1.0) + np.euler_gamma)
-        - 2.0 * (n_samples_leaf[not_mask] - 1.0) / n_samples_leaf[not_mask]
-    )
+    average_path_length[not_mask] = 2.0 * (np.log(n_samples_leaf[not_mask])
+                                           + np.euler_gamma - 1.0)
 
     return average_path_length.reshape(n_samples_leaf_shape)
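Since both the removed and the added expression replace a harmonic number with ln(k) + gamma (at k = n-1 and k = n respectively), they give slightly different floating-point results. The standalone sketch below compares each against the exact value 2 (H(n) - 1). The function names and the hard-coded `EULER_GAMMA` constant (chosen to match the value of `np.euler_gamma`) are assumptions made for illustration; this is not the library's code.

```python
import math

# Euler-Mascheroni constant, same value as np.euler_gamma.
EULER_GAMMA = 0.5772156649015329


def c_old(n):
    """Removed expression: approximates H(n-1) by ln(n-1) + gamma."""
    return 2.0 * (math.log(n - 1.0) + EULER_GAMMA) - 2.0 * (n - 1.0) / n


def c_new(n):
    """Added expression: approximates H(n) by ln(n) + gamma."""
    return 2.0 * (math.log(n) + EULER_GAMMA - 1.0)


def c_exact(n):
    """Reference value 2 (H(n) - 1) from the exact harmonic sum."""
    return 2.0 * (math.fsum(1.0 / j for j in range(1, n + 1)) - 1.0)


for n in (3, 10, 100, 256, 1000):
    print(n, c_old(n), c_new(n), c_exact(n))
```

In this sketch the new expression is never further from the exact value than the old one for n >= 3, and the gap between the two shrinks quickly as n grows. Leaf sizes of 1 and 2 are handled separately by `mask_1` and `mask_2` in the diff, so they are left out of the comparison.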

0 commit comments
