3 | 3 | Understanding the decision tree structure
4 | 4 | =========================================
5 | 5 |
6 |   | -The decision tree structure could be analysed to gain further insight on the
  | 6 | +The decision tree structure can be analysed to gain further insight on the
7 | 7 | relation between the features and the target to predict. In this example, we
8 | 8 | show how to retrieve:
9 | 9 | - the binary tree structure;
10 |    | - - the nodes that were reaches by a sample using the decision_paths method;
11 |    | - - the leaf that was reaches by a sample using the apply method;
   | 10 | + - the nodes that were reached by a sample using the decision_paths method;
   | 11 | + - the leaf that was reached by a sample using the apply method;
12 | 12 | - the rules that were used to predict a sample;
13 | 13 | - the decision path shared by a group of samples.
14 | 14 |
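Lines 15-28 are not shown in this hunk. For context, a minimal sketch of a setup consistent with the names used below; the iris dataset, the split, and max_leaf_nodes=3 are assumptions, not taken from this changeset:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumed setup: any data yielding X_train, y_train, X_test would do.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=0)
estimator = DecisionTreeClassifier(max_leaf_nodes=3, random_state=0)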
29 | 29 | estimator.fit(X_train, y_train)
30 | 30 |
31 | 31 | # The decision estimator has an attribute called tree_ which stores the entire
32 |    | -# tree structure and allow to access to low level attribute. The binary tree
   | 32 | +# tree structure and allows access to low level attributes. The binary tree
33 | 33 | # tree_ is represented as a number of parallel arrays. The i-th element of each
34 | 34 | # array holds information about the node `i`. Node 0 is the tree's root. NOTE:
35 | 35 | # Some of the arrays only apply to either leaves or split nodes, resp. In this
42 | 42 | # - threshold, threshold value at the node
43 | 43 | #
44 | 44 |
45 |    | -# Using those array, we can parse the tree structure:
   | 45 | +# Using those arrays, we can parse the tree structure:
46 | 46 |
47 | 47 | print("The binary tree structure has %s nodes and has "
48 | 48 |       "the following tree structure:"
49 | 49 |       % estimator.tree_.node_count)
50 | 50 |
51 |    | -for i in np.arange(estimator.tree_.node_count):
   | 51 | +for i in range(estimator.tree_.node_count):
52 | 52 |     if estimator.tree_.children_left[i] == estimator.tree_.children_right[i]:
53 | 53 |         print("node=%s leaf node." % i)
54 | 54 |     else:
55 |    | -        print("node=%s test node: go to node %s if X[:, %s] <= %ss else %s."
   | 55 | +        print("node=%s test node: go to node %s if X[:, %s] <= %ss else to "
   | 56 | +              "node %s."
56 | 57 |               % (i,
57 | 58 |                  estimator.tree_.children_left[i],
58 | 59 |                  estimator.tree_.feature[i],
63 | 64 |
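Since the hunk above truncates the parsing loop, here is a self-contained sketch of the same traversal over the parallel arrays, assuming an estimator fitted as above; a node is a leaf exactly when its left and right children are both -1:

children_left = estimator.tree_.children_left
children_right = estimator.tree_.children_right
feature = estimator.tree_.feature
threshold = estimator.tree_.threshold

for i in range(estimator.tree_.node_count):
    if children_left[i] == children_right[i]:  # leaf: both children are -1
        print("node=%s leaf node." % i)
    else:
        print("node=%s test node: go to node %s if X[:, %s] <= %s else to node %s."
              % (i, children_left[i], feature[i], threshold[i], children_right[i]))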
64 | 65 | # First let's retrieve the decision path of each sample. The decision_paths
65 | 66 | # method allows to retrieve the node indicator function. A non zero element at
66 |    | -# position (i, j) indicates that the sample i goes # through the node j.
   | 67 | +# position (i, j) indicates that the sample i goes through the node j.
67 | 68 |
68 | 69 | node_indicator = estimator.decision_paths(X_test)
69 | 70 |
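The node indicator is a sparse matrix with one row per sample and one column per node. Assuming it comes back in SciPy CSR format, the nodes visited by one sample can be read directly from its index arrays (sample_id = 0 is only an illustrative choice):

sample_id = 0
# For a CSR matrix, the non-zero column indices of row `sample_id` are exactly
# the nodes this sample traverses, from the root down to its leaf.
node_index = node_indicator.indices[
    node_indicator.indptr[sample_id]:node_indicator.indptr[sample_id + 1]]
print("nodes used by sample %s: %s" % (sample_id, node_index))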
89 | 90 |     else:
90 | 91 |         threshold_sign = ">"
91 | 92 |
92 |    | -    print("rule %s : (X[%s, %s] (= %s) %s %s)"
   | 93 | +    print("rule %s from node %s : (X[%s, %s] (= %s) %s %s)"
93 | 94 |           % (i,
   | 95 | +             node_id,
94 | 96 |              sample_id,
95 | 97 |              estimator.tree_.feature[node_id],
96 | 98 |              X_test[i, estimator.tree_.feature[node_id]],
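The introduction also lists the apply method, which the truncated hunk does not reach; a short sketch under the same assumed setup:

# apply returns, for each sample, the index of the leaf it ends up in.
leaf_id = estimator.apply(X_test)
print("sample 0 ends up in leaf %s" % leaf_id[0])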