X

Berry M.J.A. – Data Mining Techniques For Marketing, Sales & Customer Relationship Management

The fact that trees can be (and sometimes are) used to estimate continuous values does not make it a good idea. A decision tree estimator can only generate as many discrete values as there are leaves in the tree. To estimate a continuous variable, it is preferable to use a continuous function. Regression models and neural network models are generally more appropriate for estimation.

Trees Grow in Many Forms

The tree in Figure 6.1 is a binary tree of nonuniform depth; that is, each nonleaf node has two children and leaves are not all at the same distance from the root.

In this case, each node represents a yes-or-no question, whose answer determines by which of two paths a record proceeds to the next level of the tree. Since any multiway split can be expressed as a series of binary splits, there is no real need for trees with higher branching factors. Nevertheless, many data mining tools are capable of producing trees with more than two branches. For example, some decision tree algorithms split on categorical variables by creating a branch for each class, leading to trees with differing numbers of branches at different nodes. Figure 6.3 illustrates a tree that uses both three-way and two-way splits for the same classification problem as the tree in Figures 6.1 and 6.2.

470643 c06.qxd 3/8/04 11:12 AM Page 171

Decision Trees 171

50%

tot units demand

Page: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154

Categories: Economics, finance
Oleg: