# Tag Archives: Decision Trees

## 204.3.10 Pruning a Decision Tree in Python

Pruning. Growing the tree beyond a certain level of complexity leads to overfitting. In our data, age doesn’t have any impact on the target variable, so growing the tree beyond Gender is not going to add any value; we need to cut it at Gender. This process of trimming a tree is called …
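The excerpt above describes cutting back a grown tree. A minimal sketch of the idea with scikit-learn's cost-complexity pruning parameter `ccp_alpha` (the synthetic data below is made up for illustration; the post's own dataset is not shown here):

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=42)

# A fully grown tree keeps splitting until every leaf is pure
full_tree = DecisionTreeClassifier(random_state=42).fit(X, y)

# ccp_alpha > 0 trims branches whose complexity outweighs their
# contribution, which is one way to automate the "cut it at Gender" idea
pruned_tree = DecisionTreeClassifier(ccp_alpha=0.01, random_state=42).fit(X, y)

print("leaves before pruning:", full_tree.get_n_leaves())
print("leaves after pruning: ", pruned_tree.get_n_leaves())
```

The pruned tree ends up with fewer leaves, at the cost of slightly lower training accuracy.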

## 204.3.9 The Problem of Overfitting the Decision Tree

So far we have built a tree, predicted with our model, and validated the tree. In this post we will handle the issue of overfitting a tree. First we will build another tree to see the problem of overfitting, and then we will find out how to solve it. Practice …
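The overfitting problem the post sets up can be seen directly by comparing training and test accuracy of an unrestricted tree. A small sketch on synthetic data (the dataset and split are my own illustration, not the post's):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# With no depth or size limits the tree memorizes the training data
deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

print("train accuracy:", deep.score(X_tr, y_tr))  # essentially perfect
print("test accuracy: ", deep.score(X_te, y_te))  # typically lower
```

The gap between the two scores is the signature of overfitting that pruning and stopping rules are meant to close.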

## 204.3.8 Practice : Validating the Tree

In the last post we built a decision tree and, after plotting it, explored the major characteristics of the tree. In this post we will practice how to validate the tree. Tree Validation: find the accuracy of the classification for the tree model. #Tree Validation predict1 = clf.predict(X) from sklearn.metrics import …
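The validation snippet in the excerpt is cut off. A self-contained sketch of the same step, reusing the post's variable names `clf`, `X`, `y`, and `predict1` but with synthetic stand-in data:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=1)
clf = DecisionTreeClassifier(max_depth=3, random_state=1).fit(X, y)

# Tree Validation
predict1 = clf.predict(X)
print(confusion_matrix(y, predict1))   # rows: actual, columns: predicted
print(accuracy_score(y, predict1))     # fraction of correct predictions
```

Note that validating on the same `X` used for training, as the excerpt does, measures training accuracy only; the later posts on overfitting explain why a held-out set gives a more honest number.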

## 204.3.7 Building a Decision Tree in Python

Here comes the fun part of this series: building a decision tree in Python and plotting it. But before that, let’s have a recap of how the decision tree algorithm works. The Decision Tree Algorithm (full version): until stopped, select a leaf node, select an attribute, partition the node …
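A minimal sketch of building and displaying a tree in scikit-learn, using the iris dataset as a stand-in (the post's own data and plotting choices are not shown in the excerpt; `export_text` is used here as a lightweight, text-only alternative to a graphical plot):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Fit a shallow tree so the printed structure stays readable
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# Text rendering of the fitted tree: split rules and leaf classes
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```

Each indented line shows one split condition; the leaves report the predicted class, matching the "select a leaf, select an attribute, partition" loop described above.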

## 204.3.6 The Decision Tree Algorithm

In this post we will understand the decision tree algorithm step by step: how the split criterion and the stop criterion are decided. The Decision Tree Algorithm: the major step is to identify the best split variable and the best split criterion. Once we have the split, then we have to go …
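The "identify the best split" step can be sketched for a single numeric feature: scan candidate thresholds and keep the one with the largest information gain. The helper names (`entropy`, `best_split`) and the toy data are my own illustration, not from the post:

```python
import numpy as np

def entropy(labels):
    # H = -sum p * log2(p) over the class proportions
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def best_split(x, y):
    """Try each observed value of one feature as a threshold and
    return the threshold giving the largest information gain."""
    best_t, best_gain = None, 0.0
    parent = entropy(y)
    for t in np.unique(x)[:-1]:          # last value can't split
        left, right = y[x <= t], y[x > t]
        w = len(left) / len(y)
        gain = parent - (w * entropy(left) + (1 - w) * entropy(right))
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain

# Two well-separated clusters: the clean cut should fall between them
x = np.array([1, 2, 3, 10, 11, 12])
y = np.array([0, 0, 0, 1, 1, 1])
threshold, gain = best_split(x, y)
print(threshold, gain)
```

Here the best threshold separates the classes perfectly, so the gain equals the full parent entropy of 1 bit.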

## 204.3.5 Information Gain in Decision Tree Split

In the previous post of this series we calculated the entropy for each split. In this post we will calculate the information gain, i.e. the decrease in entropy after the split. Information Gain: Information Gain = entropy before split − entropy after split. An easy way to understand it: information gain = (overall entropy at parent node) − (sum of weighted …
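The formula in the excerpt can be checked with a small worked example. The record counts below are hypothetical (the post's own Age-segment counts are truncated): a 100-record parent with 50+/50−, split into a 40-record child (30+/10−) and a 60-record child (20+/40−):

```python
import math

def entropy(p_pos):
    # Binary entropy in bits; a pure node has entropy 0
    if p_pos in (0.0, 1.0):
        return 0.0
    p_neg = 1 - p_pos
    return -p_pos * math.log2(p_pos) - p_neg * math.log2(p_neg)

parent = entropy(0.5)                      # 50+/50- parent: 1.0 bit

# Weighted entropy after the split: each child weighted by its share
weighted = 0.4 * entropy(30 / 40) + 0.6 * entropy(20 / 60)

gain = parent - weighted
print(round(gain, 4))                      # 0.1245
```

The gain is the entropy the split removed; the split-selection step picks the variable whose split maximizes this quantity.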

## 204.3.4 How to Calculate Entropy for Decision Tree Split?

Entropy Calculation – Example. Entropy at the root: total population at the root is 100 [50+, 50−]. Entropy(S) = −p₊·log₂(p₊) − p₋·log₂(p₋) = −0.5·log₂(0.5) − 0.5·log₂(0.5) = −(0.5)(−1) − (0.5)(−1) = 1, i.e. 100% impurity at the root. Entropy calculation after a split: Age splits the population into two segments, Segment-1: Age=”Young” and Segment-2: Age=”Old”. Entropy at Segment-1: the Age=”Young” segment has 60 records …
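The root-node arithmetic above is easy to reproduce in code: with 50 positives and 50 negatives, both class proportions are 0.5.

```python
import math

# Root node from the example: 100 records, [50+, 50-], so p+ = p- = 0.5
p_pos, p_neg = 0.5, 0.5

# Entropy(S) = -p+ * log2(p+) - p- * log2(p-)
entropy_root = -p_pos * math.log2(p_pos) - p_neg * math.log2(p_neg)
print(entropy_root)  # 1.0 — maximum impurity for a two-class node
```

A 50/50 node is the worst case for a binary target; any imbalance pushes the entropy below 1.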

## 204.3.3 How Do Decision Tree Splits Work?

The Decision Tree follows the ID3 (Iterative Dichotomiser 3) algorithm. This algorithm iteratively splits the data into segments, achieving a decrease in entropy and an increase in information gain with each split. The final goal is to achieve homogeneity in the final nodes. The two metrics of the decision tree algorithm are: Entropy : is …
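The iterative splitting that ID3 performs can be sketched as a short recursive function. Everything below (the `id3` helper, the binary-feature restriction, the toy data) is my own illustration of the idea, not the post's code:

```python
import numpy as np

def entropy(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def id3(X, y):
    """Toy ID3 on binary (0/1) features: recursively split on the
    attribute with the largest information gain, stopping when a
    node is homogeneous or no split helps."""
    if entropy(y) == 0.0:                          # pure leaf
        return {"label": int(y[0])}
    gains = []
    for j in range(X.shape[1]):
        mask = X[:, j] == 1
        if mask.all() or (~mask).all():            # attribute is constant
            gains.append(0.0)
            continue
        w = mask.mean()
        gains.append(entropy(y) - w * entropy(y[mask])
                     - (1 - w) * entropy(y[~mask]))
    j = int(np.argmax(gains))
    if gains[j] == 0.0:                            # no useful split left
        return {"label": int(np.bincount(y).argmax())}
    mask = X[:, j] == 1
    return {"attr": j, "yes": id3(X[mask], y[mask]),
            "no": id3(X[~mask], y[~mask])}

# Two binary attributes; only column 0 actually determines the label,
# so ID3 should split on it once and stop with two pure leaves
X = np.array([[1, 0], [1, 1], [0, 0], [0, 1]])
y = np.array([1, 1, 0, 0])
tree = id3(X, y)
print(tree)
```

The recursion stops as soon as a node is pure, which is exactly the homogeneity goal described above; the later posts on overfitting show why growing to full purity is not always desirable.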