
204.3.6 The Decision Tree Algorithm

In this post we will walk through the decision tree algorithm step by step and see how the split criterion and the stopping criterion are decided.

The Decision Tree Algorithm

  • The major step is to identify the best splitting variable and the best split criterion
  • Once we have the split, we go to each segment and drill down further

Until stopped:

  1. Select a leaf node
  2. Find the best splitting attribute
  3. Split the node using the attribute
  4. Go to each child node and repeat steps 2 and 3

Stopping criteria:

  • Each leaf node contains examples of only one class
  • The algorithm has run out of attributes
  • There is no further significant information gain
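The loop above can be sketched in Python. This is a minimal illustration of the idea, not the implementation used later in the series; the dictionary-based tree format and the helper names are assumptions made for the sketch.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def build_tree(rows, labels, attributes):
    # Stopping criterion: every example in this node is of one type
    if len(set(labels)) == 1:
        return {"leaf": labels[0]}
    # Stopping criterion: the algorithm has run out of attributes
    if not attributes:
        return {"leaf": Counter(labels).most_common(1)[0][0]}

    parent = entropy(labels)

    def gain(attr):
        # Weighted child entropy for splitting on `attr`
        groups = {}
        for row, y in zip(rows, labels):
            groups.setdefault(row[attr], []).append(y)
        child = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
        return parent - child

    # Find the best splitting attribute by information gain
    best = max(attributes, key=gain)
    # Stopping criterion: no further significant information gain
    if gain(best) <= 0:
        return {"leaf": Counter(labels).most_common(1)[0][0]}

    # Split the node and recurse into each child segment
    tree = {"split_on": best, "children": {}}
    for value in set(row[best] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        tree["children"][value] = build_tree(
            [rows[i] for i in idx],
            [labels[i] for i in idx],
            [a for a in attributes if a != best])
    return tree
```

Each recursive call plays the role of "go to each child node and repeat steps 2 and 3" until one of the stopping criteria fires.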

The Decision Tree Algorithm – Demo

Entropy([4+,10-]) Overall = 86.3% (Impurity)

  • Entropy([1+,7-]) Male = 54.3%
  • Entropy([3+,3-]) Female = 100%
  • Information Gain for Gender = 86.3 − ((8/14) × 54.3 + (6/14) × 100) = 12.4
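These numbers are easy to verify in Python. Here is the entropy and information gain calculation for the gender split, using the class counts from the demo:

```python
import math

def entropy(pos, neg):
    """Shannon entropy (in bits) of a node with `pos` positives and `neg` negatives."""
    total = pos + neg
    e = 0.0
    for count in (pos, neg):
        if count:  # a count of 0 contributes nothing (0 * log 0 is taken as 0)
            p = count / total
            e -= p * math.log2(p)
    return e

overall = entropy(4, 10)   # ≈ 0.863, the 86.3% overall impurity
male    = entropy(1, 7)    # ≈ 0.543
female  = entropy(3, 3)    # = 1.0, a maximally impure node

# Information gain = parent entropy minus the weighted child entropies
gain_gender = overall - ((8/14) * male + (6/14) * female)   # ≈ 0.124
```

Expressed as percentages these are exactly the 86.3%, 54.3%, 100% and 12.4 figures above.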

Entropy([4+,10-]) Overall = 86.3% (Impurity)

  • Entropy([0+,9-]) Married = 0%
  • Entropy([4+,1-]) Unmarried = 72.1%
  • Information Gain for Marital Status = 86.3 − ((9/14) × 0 + (5/14) × 72.1) = 60.5
  • The information gain for Marital Status is higher, so it is taken as the first variable for segmentation
  • Now we consider the impure segment “Unmarried” (the “Married” node is already pure, with 0% entropy) and repeat the same process of looking for the best splitting variable within this sub-segment
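The marital status calculation can be checked the same way, using the counts from the demo:

```python
import math

def entropy(pos, neg):
    """Shannon entropy (in bits) of a node with `pos` positives and `neg` negatives."""
    total = pos + neg
    return -sum((c / total) * math.log2(c / total) for c in (pos, neg) if c)

overall   = entropy(4, 10)  # ≈ 0.863, the parent impurity
married   = entropy(0, 9)   # = 0.0 — a pure node
unmarried = entropy(4, 1)   # ≈ 0.722

gain_marital = overall - ((9/14) * married + (5/14) * unmarried)  # ≈ 0.605
```

Since 60.5 > 12.4, marital status beats gender and wins the first split, matching the conclusion above.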


Many Splits for a Single Variable

  • Sometimes a variable takes many distinct values
    • which leads to multiple split options for a single variable
    • and hence multiple information gain values for a single variable

What is the information gain for income?


  • There are multiple options for calculating the information gain of income.
  • For income, we consider all possible split scenarios and calculate the information gain for each.
  • Out of all these options, the split with the highest information gain is chosen as the best split for income.
  • So node partitioning for multi-valued attributes needs to be built into the decision tree algorithm.
  • We need to find the best splitting attribute along with the best split rule.
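One common way to do this is to try every two-way grouping of a multi-valued variable's levels and keep the grouping with the highest information gain. The sketch below illustrates that search; the income bands and target labels are hypothetical, not the post's dataset:

```python
import math
from itertools import combinations
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def best_binary_split(values, labels):
    """Try every two-way grouping of a categorical variable's levels and
    return the grouping with the highest information gain."""
    levels = sorted(set(values))
    parent = entropy(labels)
    best = (None, -1.0)
    # Every non-empty proper subset of levels defines one candidate split.
    # (Each partition is generated twice, once per side, which is harmless.)
    for r in range(1, len(levels)):
        for left in combinations(levels, r):
            left = set(left)
            left_labels  = [y for v, y in zip(values, labels) if v in left]
            right_labels = [y for v, y in zip(values, labels) if v not in left]
            child = (len(left_labels) * entropy(left_labels)
                     + len(right_labels) * entropy(right_labels)) / len(labels)
            if parent - child > best[1]:
                best = (left, parent - child)
    return best

# Hypothetical income bands and target labels, for illustration only
income = ["low", "low", "mid", "mid", "high", "high", "high"]
target = ["+", "+", "-", "-", "-", "-", "+"]
split, gain = best_binary_split(income, target)
```

This brute-force search over subsets is exponential in the number of levels, which is why practical implementations use shortcuts for high-cardinality variables; but it shows exactly what "the best split rule within a variable" means.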
