204.4.2 Calculating Sensitivity and Specificity in Python

This post is an extension of the previous post. Here, we look at how to calculate the sensitivity and specificity of a model in Python.

Calculating Sensitivity and Specificity
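
For reference, the standard definitions are given here in terms of the confusion-matrix counts. This assumes that active_cust = 1 is treated as the positive class, which the post does not state explicitly. TP, FN, TN and FP stand for true positives, false negatives, true negatives and false positives.

```latex
\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{Specificity} = \frac{TN}{TN + FP}, \qquad
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
```

Sensitivity is the share of actual positives that the model catches. Specificity is the share of actual negatives that it correctly rejects.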

Building Logistic Regression Model

In [1]:
#Importing necessary libraries
import sklearn as sk
import pandas as pd
import numpy as np
import scipy as sp
In [2]:
#Importing the dataset (forward slashes work on Windows, Linux and macOS)
Fiber_df = pd.read_csv("datasets/Fiberbits/Fiberbits.csv")
#First five rows of the Fiber dataset
Fiber_df.head(5)
Out[2]:
   active_cust  income  months_on_network  Num_complaints  number_plan_changes  relocated  monthly_bill  technical_issues_per_month  Speed_test_result
0            0    1586                 85               4                    1          0           121                           4                 85
1            0    1581                 85               4                    1          0           133                           4                 85
2            0    1594                 82               4                    1          0           118                           4                 85
3            0    1594                 82               4                    1          0           123                           4                 85
4            1    1609                 80               4                    1          0           177                           4                 85
In [3]:
#Name of the columns/Variables
Fiber_df.columns
Out[3]:
Index(['active_cust', 'income', 'months_on_network', 'Num_complaints',
       'number_plan_changes', 'relocated', 'monthly_bill',
       'technical_issues_per_month', 'Speed_test_result'],
      dtype='object')
In [4]:
#Building and training a Logistic Regression model
import statsmodels.formula.api as smf
formula1 = ('active_cust ~ income + months_on_network + Num_complaints + number_plan_changes'
            ' + relocated + monthly_bill + technical_issues_per_month + Speed_test_result')
logistic1 = smf.logit(formula=formula1, data=Fiber_df)
fitted1 = logistic1.fit()
fitted1.summary()
Optimization terminated successfully.
         Current function value: 0.493647
         Iterations 9
Out[4]:
Logit Regression Results
Dep. Variable:    active_cust         No. Observations:    100000
Model:            Logit               Df Residuals:        99991
Method:           MLE                 Df Model:            8
Date:             Fri, 18 Nov 2016    Pseudo R-squ.:       0.2748
Time:             19:16:40            Log-Likelihood:      -49365.
converged:        True                LL-Null:             -68074.
                                      LLR p-value:         0.000

                                  coef    std err          z    P>|z|    [95.0% Conf. Int.]
Intercept                     -17.6101      0.301    -58.538    0.000    -18.200    -17.020
income                          0.0017   8.21e-05     20.820    0.000      0.002      0.002
months_on_network               0.0288      0.001     28.654    0.000      0.027      0.031
Num_complaints                 -0.6865      0.030    -22.811    0.000     -0.746     -0.628
number_plan_changes            -0.1896      0.008    -24.940    0.000     -0.205     -0.175
relocated                      -3.1626      0.040    -79.927    0.000     -3.240     -3.085
monthly_bill                   -0.0022      0.000    -13.995    0.000     -0.003     -0.002
technical_issues_per_month     -0.3904      0.007    -54.581    0.000     -0.404     -0.376
Speed_test_result               0.2222      0.002     93.435    0.000      0.218      0.227
In [5]:
###predicting values
predictors = ['income', 'months_on_network', 'Num_complaints', 'number_plan_changes',
              'relocated', 'monthly_bill', 'technical_issues_per_month', 'Speed_test_result']
predicted_values1 = fitted1.predict(Fiber_df[predictors])
predicted_values1[1:10]
Out[5]:
array([ 0.83701059,  0.83271114,  0.83117449,  0.80896979,  0.8520262 ,
        0.82713018,  0.85504571,  0.85131352,  0.85537857])
In [6]:
### Converting predicted values into classes using threshold
threshold=0.5

predicted_class1=np.zeros(predicted_values1.shape)
predicted_class1[predicted_values1>threshold]=1
predicted_class1
Out[6]:
array([ 1.,  1.,  1., ...,  1.,  1.,  1.])
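
The thresholding step can also be written as a single vectorized comparison. The following is a minimal sketch, not the post's own code: the helper name to_class and the example probabilities are made up for illustration.

```python
import numpy as np

def to_class(probabilities, threshold=0.5):
    """Return 1 where the probability is strictly above the threshold, else 0.

    Hypothetical helper; it mirrors the np.zeros + boolean-mask approach
    but yields integer classes in one expression.
    """
    probabilities = np.asarray(probabilities, dtype=float)
    return (probabilities > threshold).astype(int)

# Made-up probabilities, only to show the effect of the threshold
example_probs = np.array([0.837, 0.45, 0.80, 0.81])

classes_at_half = to_class(example_probs, threshold=0.5)
classes_at_08 = to_class(example_probs, threshold=0.8)

print(classes_at_half)
print(classes_at_08)
```

Because the comparison is strict, a probability exactly equal to the threshold is assigned to class 0, the same as in the mask-based version.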
In [7]:
#Confusion matrix, Accuracy, sensitivity and specificity
from sklearn.metrics import confusion_matrix

cm1 = confusion_matrix(Fiber_df[['active_cust']],predicted_class1)
print('Confusion Matrix : \n', cm1)

total1=sum(sum(cm1))
#####from confusion matrix calculate accuracy
accuracy1=(cm1[0,0]+cm1[1,1])/total1
print ('Accuracy : ', accuracy1)

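#####rows of cm1 are actual classes and columns are predicted classes, in label order 0, 1
#####with active_cust=1 as the positive class: TN=cm1[0,0], FP=cm1[0,1], FN=cm1[1,0], TP=cm1[1,1]
sensitivity1 = cm1[1,1]/(cm1[1,0]+cm1[1,1])
print('Sensitivity : ', sensitivity1 )

specificity1 = cm1[0,0]/(cm1[0,0]+cm1[0,1])
print('Specificity : ', specificity1)
Confusion Matrix : 
 [[29492 12649]
 [10847 47012]]
Accuracy :  0.76504
Sensitivity :  0.812527005306
Specificity :  0.699841009943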
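
Indexing cells of the confusion matrix by position is easy to get wrong, so it can help to name the four counts first. The sketch below assumes active_cust = 1 is the positive class and that the matrix follows scikit-learn's layout (rows are actual classes, columns are predicted classes, labels ordered 0 then 1). The function name rates_from_cm is hypothetical; the matrix is the one printed above for threshold 0.5.

```python
import numpy as np

def rates_from_cm(cm):
    """Return (accuracy, sensitivity, specificity) from a 2x2 confusion matrix.

    Assumes scikit-learn's layout with labels [0, 1]:
        [[TN, FP],
         [FN, TP]]
    """
    tn, fp, fn, tp = np.asarray(cm).ravel()
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return accuracy, sensitivity, specificity

# Confusion matrix reported in the post for threshold = 0.5
cm_half = np.array([[29492, 12649],
                    [10847, 47012]])

acc, sens, spec = rates_from_cm(cm_half)
print('Accuracy    : ', acc)
print('Sensitivity : ', sens)
print('Specificity : ', spec)
```

Under this convention the model identifies about 81% of active customers and about 70% of inactive customers at a threshold of 0.5.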

Changing Threshold to 0.8

In [8]:
### Converting predicted values into classes using new threshold
threshold=0.8

predicted_class1=np.zeros(predicted_values1.shape)
predicted_class1[predicted_values1>threshold]=1
predicted_class1
Out[8]:
array([ 1.,  1.,  1., ...,  1.,  1.,  1.])

Change in Confusion Matrix, Accuracy and Sensitivity-Specificity

In [9]:
#Confusion matrix, Accuracy, sensitivity and specificity
from sklearn.metrics import confusion_matrix

cm1 = confusion_matrix(Fiber_df[['active_cust']],predicted_class1)
print('Confusion Matrix : \n', cm1)

total1=sum(sum(cm1))
#####from confusion matrix calculate accuracy
accuracy1=(cm1[0,0]+cm1[1,1])/total1
print ('Accuracy : ', accuracy1)

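#####rows of cm1 are actual classes and columns are predicted classes, in label order 0, 1
#####with active_cust=1 as the positive class: TN=cm1[0,0], FP=cm1[0,1], FN=cm1[1,0], TP=cm1[1,1]
sensitivity1 = cm1[1,1]/(cm1[1,0]+cm1[1,1])
print('Sensitivity : ', sensitivity1 )

specificity1 = cm1[0,0]/(cm1[0,0]+cm1[0,1])
print('Specificity : ', specificity1)
Confusion Matrix : 
 [[37767  4374]
 [30521 27338]]
Accuracy :  0.65105
Sensitivity :  0.472493475518
Specificity :  0.896205595501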
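
Raising the threshold makes the model predict class 1 less often, so with active_cust = 1 as the positive class, sensitivity can only fall or stay the same while specificity can only rise or stay the same. The sketch below illustrates that trade-off on a small made-up dataset, not the Fiberbits data; the function name sens_spec_at is hypothetical.

```python
import numpy as np

def sens_spec_at(y_true, probs, threshold):
    """Return (sensitivity, specificity) when class 1 is predicted for probs > threshold."""
    y_true = np.asarray(y_true)
    pred = (np.asarray(probs) > threshold).astype(int)
    tp = np.sum((pred == 1) & (y_true == 1))
    fn = np.sum((pred == 0) & (y_true == 1))
    tn = np.sum((pred == 0) & (y_true == 0))
    fp = np.sum((pred == 1) & (y_true == 0))
    return tp / (tp + fn), tn / (tn + fp)

# Made-up labels and predicted probabilities, for illustration only
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
probs = np.array([0.10, 0.40, 0.60, 0.85, 0.55, 0.70, 0.90, 0.95])

results = {t: sens_spec_at(y_true, probs, t) for t in (0.5, 0.8)}
for t, (sens, spec) in results.items():
    print(f"threshold={t}: sensitivity={sens:.2f}, specificity={spec:.2f}")
# threshold=0.5: sensitivity=1.00, specificity=0.50
# threshold=0.8: sensitivity=0.50, specificity=0.75
```

Which threshold is preferable depends on whether missing an active customer or misclassifying an inactive one is more costly for the application.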
