This will be the last post in our Model Selection and Cross-Validation series. Bootstrap Methods: Bootstrapping is a powerful tool for getting an idea of the accuracy of a model and its test error. It can estimate the likely future performance of a given modeling procedure on new data not …
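The bootstrap idea in the excerpt above can be sketched as follows: resample the rows with replacement, fit on each bootstrap sample, and score on the out-of-bag rows. This is a minimal illustration on synthetic data; the toy linear model and variable names are my own assumptions, not from the post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a linear signal with unit-variance noise (illustration only).
n = 200
x = rng.uniform(0, 10, n)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, n)

def fit_and_mse(train_idx, test_idx):
    # Fit a least-squares line on the bootstrap sample,
    # then measure squared error on the rows left out.
    coef = np.polyfit(x[train_idx], y[train_idx], deg=1)
    pred = np.polyval(coef, x[test_idx])
    return np.mean((y[test_idx] - pred) ** 2)

B = 200
errors = []
for _ in range(B):
    boot = rng.integers(0, n, n)            # sample rows with replacement
    oob = np.setdiff1d(np.arange(n), boot)  # out-of-bag rows not drawn
    errors.append(fit_and_mse(boot, oob))

print(f"Bootstrap (out-of-bag) test-error estimate: {np.mean(errors):.3f}")
```

Because the noise here has variance 1, the out-of-bag MSE estimate should land near 1, which is exactly the "likely future performance" the post describes.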

Read More »

## 204.4.9 Model Bias-Variance Tradeoff

Model Bias and Variance. Overfitting is low bias with high variance: low training error ('low bias'), high testing error, and an unstable model ('high variance') whose coefficients change with small changes in the data. Underfitting is high bias with low variance: high training error ('high bias') …
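The "coefficients change with small changes in the data" symptom of high variance can be sketched directly: refit a simple and a very flexible model on repeated small samples and compare how much their coefficients move. The synthetic `sin` data and polynomial degrees below are my own illustrative assumptions, not from the post.

```python
import numpy as np

rng = np.random.default_rng(1)

def leading_coef_spread(degree, n_repeats=100, n=30):
    # Refit a polynomial of the given degree on freshly sampled data
    # and record how much its leading coefficient moves around.
    coefs = []
    for _ in range(n_repeats):
        x = rng.uniform(-1, 1, n)
        y = np.sin(3 * x) + rng.normal(0, 0.2, n)
        coefs.append(np.polyfit(x, y, degree)[0])
    return np.std(coefs)

# A simple model is stable across samples; a very flexible one is not.
print("degree 2 leading-coefficient std: ", leading_coef_spread(2))
print("degree 12 leading-coefficient std:", leading_coef_spread(12))
```

The degree-12 fit's coefficients swing far more between samples than the degree-2 fit's: that instability is the "high variance" half of the tradeoff.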

Read More »

## 204.4.5 What is a Best Model?

What is the best model, and how do we build it? A model with maximum accuracy / least error; a model that uses the maximum information available in the given data; a model with minimum squared error; a model that captures all the hidden patterns in the data; a model that produces the best …

Read More »

## 204.4.1 Model Selection and Cross Validation

Building a model is not that difficult. However, tuning the model and checking whether it works as intended is a different game. In this series, we will cover methods and metrics to validate a model and find the optimum model for our requirements. …

Read More »

## 204.3.11 Practice: Tree Building & Model Selection

This is the last post in this series. We will again build a model, validate its accuracy on test data, and prune the tree if needed. Tree Building & Model Selection: Import the fiber bits data. This is internet service provider data. The idea is to predict …
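The build-validate-prune workflow described above can be sketched with scikit-learn's cost-complexity pruning. The fiber-bits data is not reproduced here, so this sketch substitutes a synthetic classification dataset; the `ccp_alpha` value is an illustrative assumption, not a tuned choice from the post.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the fiber-bits data used in the post.
X, y = make_classification(n_samples=2000, n_features=10, n_informative=4,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# An unpruned tree memorises the training data.
full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("unpruned: train", round(full.score(X_tr, y_tr), 3),
      "test", round(full.score(X_te, y_te), 3))

# Cost-complexity pruning trades training fit for a smaller, stabler tree.
pruned = DecisionTreeClassifier(ccp_alpha=0.005, random_state=0).fit(X_tr, y_tr)
print("pruned:   train", round(pruned.score(X_tr, y_tr), 3),
      "test", round(pruned.score(X_te, y_te), 3),
      "| nodes:", pruned.tree_.node_count, "vs", full.tree_.node_count)
```

The unpruned tree scores perfectly on training data while the pruned tree gives up a little training accuracy for far fewer nodes, which is the validation-then-prune decision the post walks through.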

Read More »

## 204.2.6 Model Selection: Logistic Regression

We left some parts of the previous post regarding goodness of fit unfinished. We will cover them in this post and see whether we can improve our model based on AIC and BIC. We will also cover the various methods used for model selection in a series dedicated to it. How to …
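Comparing logistic regression models by AIC and BIC can be sketched by computing both criteria by hand from the fitted log-likelihood: AIC = 2k − 2·lnL and BIC = k·ln(n) − 2·lnL, where k counts estimated parameters. The synthetic data and feature subsets below are my own assumptions; a very large `C` is used to approximate an unpenalised fit.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=8, n_informative=3,
                           random_state=0)

def aic_bic(features):
    # Fit a (nearly) unpenalised logistic regression on the chosen columns,
    # then compute AIC and BIC from the log-likelihood by hand.
    Xs = X[:, features]
    model = LogisticRegression(C=1e6, max_iter=2000).fit(Xs, y)
    p = np.clip(model.predict_proba(Xs)[:, 1], 1e-12, 1 - 1e-12)
    loglik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    k = len(features) + 1  # coefficients + intercept
    n = len(y)
    return 2 * k - 2 * loglik, k * np.log(n) - 2 * loglik

print("all 8 features  (AIC, BIC):", aic_bic(list(range(8))))
print("first 3 features (AIC, BIC):", aic_bic([0, 1, 2]))
```

Lower is better for both criteria; BIC's k·ln(n) term penalises extra parameters more heavily than AIC's 2k once n exceeds about 8, so it tends to pick the smaller model.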

Read More »

## 203.4.7 Cross Validation

Choosing the Optimal Model: Unfortunately, there is no scientific method of choosing the optimal model complexity that gives the minimum test error. Training error is not a good estimate of the test error, and there is always a bias-variance tradeoff in choosing the appropriate complexity of the model. We can use cross-validation methods, boot …
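Since training error is a poor guide, the excerpt's suggestion of cross-validation can be sketched with k-fold scoring: refit the model on k−1 folds and score on the held-out fold, averaging across folds. The synthetic data and the choice of tree depths to compare are my own illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10, n_informative=4,
                           random_state=0)

# Compare model complexities by 10-fold cross-validated accuracy
# instead of (misleading) training accuracy.
for depth in (2, 5, None):
    scores = cross_val_score(
        DecisionTreeClassifier(max_depth=depth, random_state=0),
        X, y, cv=10)
    print(f"max_depth={depth}: mean CV accuracy = {scores.mean():.3f}")
```

The depth whose mean cross-validated accuracy is highest is the complexity the data itself supports, which is exactly the selection criterion the post develops.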

Read More »

## 203.4.6 Model Bias-Variance Tradeoff

Model Bias and Variance. Overfitting is low bias with high variance: low training error ('low bias'), high testing error, and an unstable model ('high variance') whose coefficients change with small changes in the data. Underfitting is high bias with low variance: high training error ('high bias') …

Read More »

## 203.4.5 Type of Datasets, Type of Errors and Problem of Overfitting

The Problem of Overfitting: In search of the best model on the given data, we add many predictors: polynomial terms, interaction terms, variable transformations, derived variables, indicator/dummy variables, etc. Most of the time we succeed in reducing the error. But which error is this? So by complicating the model we …
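The "which error is this?" question above can be sketched numerically: as polynomial terms are added, training error keeps falling, while error on fresh data behaves differently. The `sin` signal, noise level, and degrees below are my own illustrative assumptions, not from the post.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 40)
y = np.sin(3 * x) + rng.normal(0, 0.3, 40)
x_test = rng.uniform(-1, 1, 200)
y_test = np.sin(3 * x_test) + rng.normal(0, 0.3, 200)

def errors(degree):
    # Fit on the 40 training points, measure MSE on train and fresh test data.
    coef = np.polyfit(x, y, degree)
    tr = np.mean((y - np.polyval(coef, x)) ** 2)
    te = np.mean((y_test - np.polyval(coef, x_test)) ** 2)
    return tr, te

for d in (1, 3, 9, 15):
    tr, te = errors(d)
    print(f"degree {d:2d}: train MSE {tr:.3f}, test MSE {te:.3f}")
```

Training MSE falls monotonically with degree because the models are nested, but the error that matters is the test MSE, and that is the error the post's complicated model stops reducing.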

Read More »

## 203.4.4 What is a Best Model?

What is the best model, and how do we build it? A model with maximum accuracy / least error; a model that uses the maximum information available in the given data; a model with minimum squared error; a model that captures all the hidden patterns in the data; a model that produces the best …

Read More »