### The Splitting Criterion

- The best split is the one that does the best job of separating the data into groups
- That is, a split where a single class (either 0 or 1) predominates in each group

#### Example: Sales Segmentation Based on Age

#### Example: Sales Segmentation Based on Gender

## Impurity (Diversity) Measures

- We are looking for an impurity (diversity) measure that gives a high score for the Age variable (high impurity after segmenting) and a low score for the Gender variable (low impurity after segmenting)

**Entropy**: characterizes the impurity/diversity of a segment; a measure of uncertainty/impurity
- Entropy measures the amount of information in a message
- S is a segment of training examples, `\(p_+\)` is the proportion of positive examples, and `\(p_-\)` is the proportion of negative examples
- Entropy(S) = `\(-p_+ \log_2 p_+ - p_- \log_2 p_-\)`
- Entropy is highest when the split has p of 0.5
- Entropy is lowest when the split is pure, i.e., p of 1
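As a sketch, the two-class entropy formula above can be computed with a small Python helper (the function name `entropy` is my own):

```python
import math

def entropy(p_pos):
    """Two-class entropy: -p+ * log2(p+) - p- * log2(p-).

    By convention, 0 * log2(0) is taken to be 0 (the term for an
    empty class contributes nothing).
    """
    p_neg = 1.0 - p_pos
    total = 0.0
    for p in (p_pos, p_neg):
        if p > 0:
            total -= p * math.log2(p)
    return total

print(entropy(0.5))  # maximally impure segment -> 1.0
print(entropy(1.0))  # pure segment -> 0.0
```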

### Entropy is highest when the split has p of 0.5

- Entropy(S) = `\(-p_+ \log_2 p_+ - p_- \log_2 p_-\)`
- Entropy is highest when the split has p of 0.5
- A 50-50 class ratio in a segment is maximally impure, hence entropy is high
- Entropy(S) = `\(-0.5 \log_2(0.5) - 0.5 \log_2(0.5) = 0.5 + 0.5\)`
- Entropy(S) = 1
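The substitution above can be checked numerically; since `\(\log_2(0.5) = -1\)`, each term contributes exactly 0.5:

```python
import math

# Entropy of a 50-50 segment: -0.5*log2(0.5) - 0.5*log2(0.5)
h = -0.5 * math.log2(0.5) - 0.5 * math.log2(0.5)
print(h)  # 1.0
```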

### Entropy is lowest when the split is pure, i.e., p of 1

- Entropy(S) = `\(-p_+ \log_2 p_+ - p_- \log_2 p_-\)`
- Entropy is lowest when the split is pure, i.e., p of 1
- A 100-0 class ratio in a segment is completely pure, hence entropy is low
- Entropy(S) = `\(-1 \log_2(1) - 0 \log_2(0)\)` (using the convention `\(0 \log_2 0 = 0\)`)
- Entropy(S) = 0
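The pure case relies on the `\(0 \log_2 0 = 0\)` convention, since `log2(0)` itself is undefined; a direct sketch:

```python
import math

def term(p):
    # Convention: 0 * log2(0) = 0, because log2(0) is undefined.
    return -p * math.log2(p) if p > 0 else 0.0

# Entropy of a 100-0 (pure) segment: -1*log2(1) - 0*log2(0)
h = term(1.0) + term(0.0)
print(h)  # 0.0
```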

### The lower the entropy, the better the split

- The lower the entropy, the better the split
- Entropy is formulated so that its value is high for impure segments
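To compare candidate splits such as the Age and Gender segmentations, a common approach is to score each split by the weighted average entropy of its child segments and prefer the lower score. A sketch, with hypothetical class counts chosen so that Gender separates the classes well and Age does not:

```python
import math

def entropy(p_pos):
    # Two-class entropy, with the 0*log2(0) = 0 convention.
    return sum(-p * math.log2(p) for p in (p_pos, 1 - p_pos) if p > 0)

def weighted_entropy(segments):
    # segments: list of (n_positive, n_negative) counts per child segment.
    n_total = sum(np + nn for np, nn in segments)
    return sum((np + nn) / n_total * entropy(np / (np + nn))
               for np, nn in segments)

# Hypothetical counts (illustration only):
gender_split = [(9, 1), (1, 9)]  # nearly pure segments
age_split = [(5, 5), (5, 5)]     # 50-50 in each segment

print(weighted_entropy(gender_split))  # ~0.47 -> lower, better split
print(weighted_entropy(age_split))     # 1.0
```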