**In the previous post we covered descriptive statistics. In this post we will look at dispersion measures and implement them in Python.**

**This post is an extension of the previous posts; we will continue with the data imported in 104.3.2 and 104.3.1.**

**Dispersion Measures: Variance and Standard Deviation**

**Dispersion**

- Just knowing the central tendency is not enough.
- Two variables might have the same mean and still be distributed very differently.
- Look at these two variables: the profits of two companies, A and B, for the last 14 quarters, in millions (MM).

Company A | Company B |
---|---|
43 | 17 |
44 | 15 |
0 | 12 |
25 | 17 |
20 | 15 |
35 | 18 |
-8 | 12 |
13 | 15 |
-10 | 12 |
-8 | 13 |
32 | 18 |
11 | 18 |
-8 | 14 |
21 | 14 |

- The average profit is 15 in both cases.
- Yet Company B has performed far more consistently than Company A.
- Company A even made losses in some quarters.
- Measures of dispersion become vital in such cases.
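The comparison above can be checked with a few lines of Python. This is a minimal sketch that is not part of the original notebook; the lists are copied from the profit table, and the variable names are illustrative assumptions.

```python
from statistics import mean

# Quarterly profits (in millions) copied from the table above
company_a = [43, 44, 0, 25, 20, 35, -8, 13, -10, -8, 32, 11, -8, 21]
company_b = [17, 15, 12, 17, 15, 18, 12, 15, 12, 13, 18, 18, 14, 14]

mean_a = mean(company_a)
mean_b = mean(company_b)

# Gap between the best and worst quarter, as a first rough look at spread
spread_a = max(company_a) - min(company_a)
spread_b = max(company_b) - min(company_b)

for name, m, lo, hi in [
    ("A", mean_a, min(company_a), max(company_a)),
    ("B", mean_b, min(company_b), max(company_b)),
]:
    print(f"Company {name}: mean={m}, lowest={lo}, highest={hi}")
```

Both means come out the same, while the lowest and highest quarters differ widely between the two companies, which is exactly what the mean alone hides.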

### Variance and Standard deviation

- Dispersion quantifies how far each point deviates from the mean.
- Variance is the average of the squared distances of each point from the mean.
- Variance is a fairly good measure of dispersion.
- The variance in profit is 352 for Company A and 4.9 for Company B.

$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$$
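The formula translates almost directly into code. The sketch below is an illustration rather than part of the original notebook: `population_variance` is a hypothetical helper name, and the two lists are the company profits from the table earlier in this post.

```python
def population_variance(values):
    """Average of the squared deviations from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


# Quarterly profits (in millions) from the company table
company_a = [43, 44, 0, 25, 20, 35, -8, 13, -10, -8, 32, 11, -8, 21]
company_b = [17, 15, 12, 17, 15, 18, 12, 15, 12, 13, 18, 18, 14, 14]

var_a = population_variance(company_a)
var_b = population_variance(company_b)
print(round(var_a, 1), round(var_b, 1))
```

Rounded, the two results agree with the 352 and 4.9 quoted above.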

**Variance Calculation**

Company A:

Value | Value – Mean | (Value – Mean)^2 |
---|---|---|
43 | 28 | 784 |
44 | 29 | 841 |
0 | -15 | 225 |
25 | 10 | 100 |
20 | 5 | 25 |
35 | 20 | 400 |
-8 | -23 | 529 |
13 | -2 | 4 |
-10 | -25 | 625 |
-8 | -23 | 529 |
32 | 17 | 289 |
11 | -4 | 16 |
-8 | -23 | 529 |
21 | 6 | 36 |
Mean = 15 | | Variance = 352 |

Company B:

Value | Value – Mean | (Value – Mean)^2 |
---|---|---|
17 | 2 | 4 |
15 | 0 | 0 |
12 | -3 | 9 |
17 | 2 | 4 |
15 | 0 | 0 |
18 | 3 | 9 |
12 | -3 | 9 |
15 | 0 | 0 |
12 | -3 | 9 |
13 | -2 | 4 |
18 | 3 | 9 |
18 | 3 | 9 |
14 | -1 | 1 |
14 | -1 | 1 |
Mean = 15 | | Variance = 4.9 |

### Standard Deviation

- Standard deviation is simply the square root of the variance.
- Variance gives a good idea of dispersion, but it is on the order of squares.
- As the formula makes clear, variance is expressed in the squared units of the original data.
- Standard deviation is a dispersion measure in the same units as the original data.

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$$
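As a quick check of the square-root relationship, the sketch below uses Python's standard `statistics` module on the company profits from earlier in this post. It is an illustration under those assumptions, not code from the original notebook.

```python
import math
from statistics import pstdev, pvariance

# Quarterly profits (in millions) from the company table
company_a = [43, 44, 0, 25, 20, 35, -8, 13, -10, -8, 32, 11, -8, 21]
company_b = [17, 15, 12, 17, 15, 18, 12, 15, 12, 13, 18, 18, 14, 14]

# Standard deviation as the square root of the population variance
sd_a = math.sqrt(pvariance(company_a))
sd_b = math.sqrt(pvariance(company_b))

# pstdev computes the same quantity in one call
print(round(sd_a, 2), round(pstdev(company_a), 2))
print(round(sd_b, 2), round(pstdev(company_b), 2))
```

Because the result is in millions rather than millions squared, it can be read directly against the profit figures: Company A's quarters typically sit much further from the mean of 15 than Company B's.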

### Variance and Standard deviation in Python

- Divide the Income data into two sets: USA vs. others.
- Find the variance of “education-num” in those two sets. Which one has the higher variance?

In [12]:

```
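# Keep only the rows for the United States (note the leading space in the value)
usa_income = Income_Data[Income_Data["native-country"] == ' United-States']
usa_income.shape  # (rows, columns)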
```

Out[12]:

In [13]:

```
# Keep the rows for every other country
other_income = Income_Data[Income_Data["native-country"] != ' United-States']
other_income.shape  # (rows, columns)
```

Out[13]:

- Variance and SD for USA

In [14]:

```
# Variance of education-num for the USA rows (pandas divides by n - 1 by default)
var_usa = usa_income["education-num"].var()
var_usa
```

Out[14]:

In [15]:

```
# Standard deviation of education-num for the USA rows
std_usa = usa_income["education-num"].std()
std_usa
```

Out[15]:

- Variance and SD for other countries

In [16]:

```
# Variance of education-num for the non-USA rows
var_other = other_income["education-num"].var()
var_other
```

Out[16]:

In [17]:

```
# Standard deviation of education-num for the non-USA rows
std_other = other_income["education-num"].std()
std_other
```

Out[17]:
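One detail is worth keeping in mind when comparing these results with the formula earlier in this post: pandas `.var()` and `.std()` use `ddof=1` by default, so they divide by n - 1 (the sample estimate), while the formula divides by n. The sketch below illustrates the gap on the small Company A profit series rather than on `Income_Data`; the variable names are illustrative assumptions.

```python
import pandas as pd

# Company A profits reused as a small example with a known answer
profits = pd.Series([43, 44, 0, 25, 20, 35, -8, 13, -10, -8, 32, 11, -8, 21])

sample_var = profits.var()            # default ddof=1: divides by n - 1
population_var = profits.var(ddof=0)  # divides by n, as in the formula
population_std = profits.std(ddof=0)  # square root of the population variance

print(round(sample_var, 1), round(population_var, 1), round(population_std, 2))
```

With only 14 values the two variances differ noticeably. On a dataset with many thousands of rows the difference between dividing by n and by n - 1 becomes negligible.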

### Practice : Variance and Standard deviation

- Dataset: “./Online Retail Sales Data/Online Retail.csv”
- What are the variance and standard deviation of “UnitPrice”?
- What are the variance and standard deviation of “Quantity”?
- Which of these two variables is more consistent?

In [18]:

```
# Variance of the UnitPrice column
var_UnitPrice = Retail['UnitPrice'].var()
var_UnitPrice
```

Out[18]:

In [19]:

```
# Standard deviation of the UnitPrice column
std_UnitPrice = Retail['UnitPrice'].std()
std_UnitPrice
```

Out[19]:

In [20]:

```
# Variance of the Quantity column
var_quantity = Retail['Quantity'].var()
var_quantity
```

Out[20]:

In [21]:

```
# Standard deviation of the Quantity column
std_quantity = Retail['Quantity'].std()
std_quantity
```

Out[21]:
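The four cells above can also be collapsed into a single call with `DataFrame.agg`. The sketch below is an illustration only: `retail_demo` is a hypothetical stand-in with invented values, not the real Online Retail data, so the numbers it prints say nothing about the actual answer to the exercise.

```python
import pandas as pd

# Hypothetical rows standing in for the Retail DataFrame; values are invented
retail_demo = pd.DataFrame({
    "UnitPrice": [2.0, 4.0, 4.0, 6.0],
    "Quantity": [1, 5, 10, 24],
})

# One row per statistic, one column per variable (pandas default ddof=1)
summary = retail_demo[["UnitPrice", "Quantity"]].agg(["var", "std"])
print(summary)
```

Applied to the real data, the same call would be `Retail[["UnitPrice", "Quantity"]].agg(["var", "std"])`, assuming `Retail` has been loaded as in the earlier posts.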