
# 204.5.1 Neural Networks : A Recap of Logistic Regression

Welcome to this blog series on neural networks. In series 204.5 we will go from the basics of neural networks to building a neural network model that recognizes images of handwritten digits and reads them correctly.

In this post we will briefly review how logistic regression works, since it can be considered a building block of a neural network.

### Recap of Logistic Regression

• The output is categorical (YES/NO type)
• Predictor variables are used to predict the categorical output
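Before the practice exercise, it may help to recall what logistic regression actually computes: a weighted sum of the predictors passed through the logistic (sigmoid) function, which squashes any real number into a probability between 0 and 1. The sketch below is illustrative only; the function names (`sigmoid`, `predict_probability`) and the example coefficients are ours, not from the dataset.

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(intercept, weights, x):
    """P(y=1 | x) for a logistic regression with the given coefficients."""
    z = intercept + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)

# A score of z = 0 corresponds to a probability of exactly 0.5,
# which is why 0.5 is the natural classification threshold.
print(sigmoid(0))                                # 0.5
print(predict_probability(-1.0, [0.5], [2.0]))   # z = -1 + 0.5*2 = 0, so 0.5
```

A single neuron in a neural network does exactly this: weighted sum, then a nonlinear activation.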

### Practice : Logistic Regression

• Dataset: Emp_Productivity/Emp_Productivity.csv
• Filter the data and take a subset of the above dataset. Filter condition: Sample_Set < 3
• Draw a scatter plot with Age on the X-axis and Experience on the Y-axis. Try to distinguish the two classes with colors or shapes (visualizing the classes)
• Build a logistic regression model to predict Productivity using Age and Experience
• Finally draw the decision boundary for this logistic regression model
• Create the confusion matrix
• Calculate the accuracy and error rates

### Solution

In [1]:
```#Load the dataset and look at the first few rows
import pandas as pd
Emp_Productivity_raw = pd.read_csv("Emp_Productivity/Emp_Productivity.csv")
Emp_Productivity_raw.head(10)
```
Out[1]:
```   Age  Experience  Productivity  Sample_Set
0  20.0         2.3             0           1
1  16.2         2.2             0           1
2  20.2         1.8             0           1
3  18.8         1.4             0           1
4  18.9         3.2             0           1
5  16.7         3.9             0           1
6  16.3         1.4             0           1
7  20.0         1.4             0           1
8  18.0         3.6             0           1
9  21.2         4.3             0           1
```
In [2]:
```#Filter the data and take a subset from above dataset . Filter condition is Sample_Set<3
Emp_Productivity1=Emp_Productivity_raw[Emp_Productivity_raw.Sample_Set<3]
Emp_Productivity1.shape
```
Out[2]:
`(74, 4)`
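The filter above is ordinary boolean-mask indexing. As a minimal sketch of the same idea on a hypothetical miniature of the data (the values below are made up, only the column names match), `.query()` gives an equivalent, often more readable form:

```python
import pandas as pd

# Hypothetical miniature of the Emp_Productivity data, just to show the filter
df = pd.DataFrame({
    "Age":          [20.0, 16.2, 30.1, 28.4],
    "Experience":   [2.3, 2.2, 8.0, 7.5],
    "Productivity": [0, 0, 1, 1],
    "Sample_Set":   [1, 2, 3, 3],
})

subset = df[df.Sample_Set < 3]      # boolean-mask filter, as in the solution
same = df.query("Sample_Set < 3")   # equivalent .query() form
print(subset.shape)                 # (2, 4): two rows pass the filter
```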
In [3]:
```#frequency table of Productivity variable
Emp_Productivity1.Productivity.value_counts()
```
Out[3]:
```1    41
0    33
Name: Productivity, dtype: int64```
In [4]:
```#The classification graph
#Draw a scatter plot with Age on the X-axis and Experience on the Y-axis. Distinguish the two classes with colors and markers.
import matplotlib.pyplot as plt
%matplotlib inline

fig = plt.figure()
ax1 = fig.add_subplot(111)

ax1.scatter(Emp_Productivity1.Age[Emp_Productivity1.Productivity==0],Emp_Productivity1.Experience[Emp_Productivity1.Productivity==0], s=10, c='b', marker="o", label='Productivity 0')
ax1.scatter(Emp_Productivity1.Age[Emp_Productivity1.Productivity==1],Emp_Productivity1.Experience[Emp_Productivity1.Productivity==1], s=10, c='r', marker="+", label='Productivity 1')
plt.legend(loc='upper left');
plt.show()
```
In [5]:
```#predict Productivity using age and experience
import statsmodels.formula.api as sm
model1 = sm.logit(formula='Productivity ~ Age+Experience', data=Emp_Productivity1)
fitted1 = model1.fit()
fitted1.summary()
```
```Optimization terminated successfully.
Current function value: 0.315987
Iterations 7
```
Out[5]:
```Dep. Variable:           Productivity   No. Observations:                   74
Model:                          Logit   Df Residuals:                       71
Method:                           MLE   Df Model:                            2
Date:                Tue, 15 Nov 2016   Pseudo R-squ.:                  0.5402
Time:                        16:08:12   Log-Likelihood:                -23.383
converged:                       True   LL-Null:                        -50.86
                                        LLR p-value:                 1.167e-12

                coef    std err          z      P>|z|      [95.0% Conf. Int.]
Intercept    -8.9361      2.061     -4.335      0.000       -12.976    -4.896
Age           0.2763      0.105      2.620      0.009         0.070     0.483
Experience    0.5923      0.298      1.988      0.047         0.008     1.176
```
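With the fitted coefficients in hand, a predicted probability can be computed by hand, which makes the model concrete. A hedged sketch using the rounded coefficients from the summary above (the function name `prob_productive` is ours):

```python
import math

# Rounded coefficients taken from the fitted model summary above
b0, b_age, b_exp = -8.9361, 0.2763, 0.5923

def prob_productive(age, experience):
    """P(Productivity = 1) under the fitted logistic model (rounded coefficients)."""
    z = b0 + b_age * age + b_exp * experience
    return 1.0 / (1.0 + math.exp(-z))

# Both coefficients are positive, so older, more experienced employees
# get a higher predicted probability of being productive
print(prob_productive(30, 8))   # high probability
print(prob_productive(18, 2))   # low probability
```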
In [6]:
```#coefficients of the fitted model
coef=fitted1.params
print(coef)
```
```Intercept    -8.9361
Age           0.2763
Experience    0.5923
dtype: float64
```
In [7]:
```# slope and intercept of the decision boundary line
# (set Intercept + b_Age*Age + b_Exp*Experience = 0 and solve for Experience)
params=fitted1.params
slope1=params['Age']/(-params['Experience'])
intercept1=params['Intercept']/(-params['Experience'])
slope1, intercept1
```
Out[7]:
`(-0.466..., 15.08...)`
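As a sanity check on that algebra: any point lying exactly on the boundary line should receive a predicted probability of exactly 0.5. A sketch using the rounded coefficients from the summary (values are ours only insofar as they are rounded to four decimals):

```python
import math

# Rounded coefficients from the summary above
b0, b_age, b_exp = -8.9361, 0.2763, 0.5923

# Decision boundary: Experience = intercept + slope * Age
slope = -b_age / b_exp
intercept = -b0 / b_exp

# Pick an arbitrary Age, read off the boundary Experience, and score that point
age = 25.0
experience = intercept + slope * age
z = b0 + b_age * age + b_exp * experience   # should be ~0 on the boundary
p = 1.0 / (1.0 + math.exp(-z))
print(round(p, 6))  # 0.5
```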
In [8]:
```#Finally draw the decision boundary for this logistic regression model
import matplotlib.pyplot as plt

fig = plt.figure()
ax1 = fig.add_subplot(111)

ax1.scatter(Emp_Productivity1.Age[Emp_Productivity1.Productivity==0],Emp_Productivity1.Experience[Emp_Productivity1.Productivity==0], s=10, c='b', marker="o", label='Productivity 0')
ax1.scatter(Emp_Productivity1.Age[Emp_Productivity1.Productivity==1],Emp_Productivity1.Experience[Emp_Productivity1.Productivity==1], s=10, c='r', marker="+", label='Productivity 1')
plt.legend(loc='upper left');

x_min, x_max = ax1.get_xlim()
ax1.plot([0, x_max], [intercept1, x_max*slope1+intercept1])
ax1.set_xlim([15,35])
ax1.set_ylim([0,10])
plt.show()
```
• Accuracy of the model
In [9]:
```#Predicting probabilities, then classes via a 0.5 threshold
predicted_values=fitted1.predict(Emp_Productivity1[["Age","Experience"]])

import numpy as np
threshold=0.5
predicted_class=np.zeros(predicted_values.shape)
predicted_class[predicted_values>threshold]=1

predicted_class
```
Out[9]:
```array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,
0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])```
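The zeros-then-mask step above is a common idiom, but it is equivalent to a single vectorized comparison. A minimal sketch on hypothetical probabilities (the array below stands in for the output of `fitted1.predict(...)`):

```python
import numpy as np

# Hypothetical predicted probabilities
predicted_values = np.array([0.1, 0.49, 0.5, 0.51, 0.9])
threshold = 0.5

# Two-step version used in the solution
predicted_class = np.zeros(predicted_values.shape)
predicted_class[predicted_values > threshold] = 1

# Equivalent one-liner: a boolean array cast to 0.0 / 1.0
one_liner = (predicted_values > threshold).astype(float)
print(predicted_class)  # [0. 0. 0. 1. 1.]
```

Note that a probability of exactly 0.5 falls on the "0" side, because the comparison is strict.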
In [10]:
```#Confusion Matrix, Accuracy and Error
from sklearn.metrics import confusion_matrix as cm
ConfusionMatrix = cm(Emp_Productivity1[['Productivity']],predicted_class)
print('Confusion Matrix :', ConfusionMatrix)
accuracy=(ConfusionMatrix[0,0]+ConfusionMatrix[1,1])/ConfusionMatrix.sum()
print('Accuracy : ',accuracy)
error=1-accuracy
print('Error: ',error)
```
```Confusion Matrix : [[31  2]
[ 2 39]]
Accuracy :  0.945945945946
Error:  0.0540540540541
```
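The accuracy above can be recovered directly from the printed confusion matrix: the diagonal holds the correct predictions (true negatives and true positives), so accuracy is the trace divided by the total. A sketch using the matrix values printed above:

```python
import numpy as np

# Confusion matrix printed above: rows = actual class, columns = predicted class
conf = np.array([[31,  2],
                 [ 2, 39]])

accuracy = np.trace(conf) / conf.sum()   # (31 + 39) / 74
error = 1 - accuracy
print(accuracy)
print(error)
```

With only 4 misclassified points out of 74, logistic regression separates these two classes well. In the next posts of this series we will see where a single linear decision boundary is not enough, and how stacking such units into a neural network handles those cases.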