
# 203.7.4 The Bagging Algorithm

Let’s move on to the first type of ensemble method: the bagging algorithm.

We will cover the concept behind bagging and then implement it in R.

### The Bagging Algorithm

• Start with the training dataset D
• Draw k bootstrap sample sets from dataset D
• For each bootstrap sample i
• Build a classifier model \(M_i\)
• We will have a total of k classifiers \(M_1, M_2, \dots, M_k\)
• For classification, take a majority vote over the k outputs for the final prediction; for regression, take the average of the k predictions (a from-scratch sketch follows this list)
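
To make these steps concrete, here is a minimal from-scratch sketch of bagging for regression. The function names, the data frame `D`, and the response column `y` are our own illustrative choices, not part of any library:

```r
# A minimal from-scratch sketch of bagging for regression.
# Assumes a data frame D whose response column is named y.
bagging_regression <- function(D, k = 25) {
  n <- nrow(D)
  models <- vector("list", k)
  for (i in 1:k) {
    # Draw one bootstrap sample: n rows, selected with replacement
    idx <- sample(1:n, size = n, replace = TRUE)
    models[[i]] <- lm(y ~ ., data = D[idx, ])
  }
  models
}

# Consolidated prediction: average the k individual predictions
predict_bagged <- function(models, newdata) {
  preds <- sapply(models, predict, newdata = newdata)
  rowMeans(preds)
}
```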

### Why Bagging Works

• We select records one at a time, returning each selected record to the population so it has a chance to be selected again (sampling with replacement)
• If the individual samples are independent, the variance of the consolidated prediction is reduced. This is how we shrink the unavoidable errors that any single model makes
• In a given bootstrap sample, some observations may be selected multiple times while others may not be selected at all
• There is a proven result that a bootstrap sample contains, on average, only about 63% of the distinct records in the population; the remaining 37% are left out. This follows because each record escapes all n draws with probability \((1 - 1/n)^n \approx e^{-1} \approx 0.37\); see the simulation after this list
• So the data used in each of these models is not exactly the same. This makes our learning models closer to independent, which helps the predictors make uncorrelated errors
• Finally, the errors of the individual models partially cancel out, giving us a better ensemble model with higher accuracy
• Bagging is especially useful when the base model suffers from high variance (unstable predictions)
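
As a quick sanity check of the ~63% figure, we can draw one bootstrap sample from a large index set and count the distinct records it contains (this simulation is our own illustration):

```r
# Fraction of distinct records appearing in one bootstrap sample
set.seed(123)
n <- 100000
boot_idx <- sample(1:n, size = n, replace = TRUE)
length(unique(boot_idx)) / n   # close to 1 - exp(-1), i.e. ~0.632
```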

### LAB: Bagging Models

• Import the Boston house price data; it is part of the MASS package
• Get some basic meta details of the data
• Take 90% of the data for training and keep the remaining 10% as holdout data
• Build a single linear regression model on the training data
• On the holdout data, calculate the error (sum of squared deviations) of the regression model
• Build a regression model using the bagging technique, with at least 25 models
• On the holdout data, calculate the error (sum of squared deviations) of the consolidated bagged regression model
• What is the improvement of the bagged model when compared with the single model?

### Solution

```r
# Importing Boston house pricing data
library(MASS)
data(Boston)
head(Boston)
```

```
##      crim zn indus chas   nox    rm  age    dis rad tax ptratio  black
## 1 0.00632 18  2.31    0 0.538 6.575 65.2 4.0900   1 296    15.3 396.90
## 2 0.02731  0  7.07    0 0.469 6.421 78.9 4.9671   2 242    17.8 396.90
## 3 0.02729  0  7.07    0 0.469 7.185 61.1 4.9671   2 242    17.8 392.83
## 4 0.03237  0  2.18    0 0.458 6.998 45.8 6.0622   3 222    18.7 394.63
## 5 0.06905  0  2.18    0 0.458 7.147 54.2 6.0622   3 222    18.7 396.90
## 6 0.02985  0  2.18    0 0.458 6.430 58.7 6.0622   3 222    18.7 394.12
##   lstat medv
## 1  4.98 24.0
## 2  9.14 21.6
## 3  4.03 34.7
## 4  2.94 33.4
## 5  5.33 36.2
## 6  5.21 28.7
```
```r
dim(Boston)
```

```
## [1] 506  14
```
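
Beyond `head()` and `dim()`, a couple of other quick checks cover the basic meta details the lab asks for (this extra step is our own addition):

```r
# Variable types and a summary of the target variable
str(Boston)
summary(Boston$medv)
```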
```r
## Training and holdout sample
library(caret)
```

```
## Loading required package: lattice
## Loading required package: ggplot2
```
```r
set.seed(500)
# createDataPartition() returns row indices for a 90% training split
sampleseed <- createDataPartition(Boston$medv, p = 0.9, list = FALSE)

train_boston <- Boston[sampleseed, ]
test_boston  <- Boston[-sampleseed, ]

### Regression model
reg_model <- lm(medv ~ ., data = train_boston)
summary(reg_model)
```
```
##
## Call:
## lm(formula = medv ~ ., data = train_boston)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -15.4763  -2.7684  -0.4912   1.9030  26.4569
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  3.637e+01  5.534e+00   6.572 1.40e-10 ***
## crim        -1.042e-01  3.513e-02  -2.965 0.003195 **
## zn           4.482e-02  1.459e-02   3.073 0.002248 **
## indus        1.986e-02  6.566e-02   0.302 0.762462
## chas         2.733e+00  8.765e-01   3.118 0.001939 **
## nox         -1.844e+01  4.018e+00  -4.590 5.79e-06 ***
## rm           3.845e+00  4.670e-01   8.234 2.04e-15 ***
## age          8.782e-04  1.434e-02   0.061 0.951211
## dis         -1.488e+00  2.096e-01  -7.101 4.94e-12 ***
## rad          2.770e-01  6.993e-02   3.960 8.71e-05 ***
## tax         -1.062e-02  3.944e-03  -2.693 0.007348 **
## ptratio     -9.799e-01  1.385e-01  -7.073 5.92e-12 ***
## black        9.620e-03  2.827e-03   3.403 0.000726 ***
## lstat       -5.051e-01  5.706e-02  -8.852  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.787 on 444 degrees of freedom
## Multiple R-squared:  0.7309, Adjusted R-squared:  0.723
## F-statistic: 92.75 on 13 and 444 DF,  p-value: < 2.2e-16
```
```r
### Accuracy testing on holdout data
# Column 14 is medv, the target, so it is dropped before predicting
pred_reg <- predict(reg_model, newdata = test_boston[, -14])
reg_err  <- sum((test_boston$medv - pred_reg)^2)
reg_err
```

```
## [1] 918.5927
```
```r
### Bagging ensemble model
library(ipred)
# nbagg sets the number of bootstrap models (the lab asked for at least 25)
bagg_model <- bagging(medv ~ ., data = train_boston, nbagg = 30)

### Accuracy testing on holdout data
pred_bagg <- predict(bagg_model, newdata = test_boston[, -14])
bgg_err   <- sum((test_boston$medv - pred_bagg)^2)
bgg_err
```

```
## [1] 390.9028
```
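
As an aside, ipred can also estimate the ensemble's error from the out-of-bag records (the roughly 37% of records left out of each bootstrap sample), so no separate holdout set is needed for that estimate. A minimal sketch, assuming the `coob` argument of `ipred::bagging`:

```r
# Out-of-bag error estimate: coob = TRUE scores each record using only
# the models whose bootstrap samples did not contain it
oob_model <- bagging(medv ~ ., data = train_boston, nbagg = 30, coob = TRUE)
print(oob_model)  # prints the out-of-bag root mean squared error
```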
```r
### Overall improvement
reg_err
```

```
## [1] 918.5927
```

```r
bgg_err
```

```
## [1] 390.9028
```

```r
(reg_err - bgg_err) / reg_err
```

```
## [1] 0.5744547
```

The bagged model cuts the holdout error by about 57% compared with the single linear regression model.
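
For completeness, caret (already loaded above) exposes the same bagged-tree model through its unified `train()` interface via `method = "treebag"`, which wraps ipred. A minimal sketch under that assumption:

```r
# Alternative: fit the bagged model via caret's "treebag" method
bagg_caret <- train(medv ~ ., data = train_boston, method = "treebag")
pred_caret <- predict(bagg_caret, newdata = test_boston[, -14])
sum((test_boston$medv - pred_caret)^2)  # holdout squared error
```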