
# 204.7.8 Practice : Boosting

In the last post we covered the concepts and theory behind boosting algorithms.

In this post we will put those concepts into practice and build boosting models using scikit-learn in Python.

## Boosting

• The task: correctly categorize items based on their detailed feature specifications. More than 100 specifications have been collected.
• Build a decision tree model and check the training and testing accuracy.
• Build a boosted decision tree.
• Is there any improvement over the earlier decision tree?

In [20]:
```
#importing the datasets (pandas assumed; the path below is a placeholder)
import pandas as pd
menu_train = pd.read_csv("menu_train.csv")
```
In [21]:
```
#names of the 100 feature-specification columns
lab = list(menu_train.columns[1:101])
```
In [22]:
```
###building a decision tree on the training data ####
#importing the class directly avoids shadowing the sklearn.tree module
from sklearn.tree import DecisionTreeClassifier
tree = DecisionTreeClassifier()
#g and h are the training features and labels prepared earlier
tree.fit(g, h)
```
Out[22]:
```DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,
max_features=None, max_leaf_nodes=None, min_samples_leaf=1,
min_samples_split=2, min_weight_fraction_leaf=0.0,
presort=False, random_state=None, splitter='best')```
In [23]:
```
#####predicting the tree on test data ####
#(the original cell omitted this code; g_test/h_test are illustrative
# names for the test features and labels, and average='micro' is an
# assumption for the multi-class target)
from sklearn.metrics import f1_score
predictions = tree.predict(g_test)
f1_score(h_test, predictions, average='micro')
```
Out[23]:

0.70993535216059889

In [24]:
```
###Building a gradient boosting classifier ###
from sklearn.ensemble import GradientBoostingClassifier
#these settings are the library defaults, written out for reference
boost = GradientBoostingClassifier(learning_rate=0.1, n_estimators=100,
                                   subsample=1.0, min_samples_split=2,
                                   min_samples_leaf=1, max_depth=3)
```
In [25]:
```##calculating the time while fitting the Gradient boosting classifier
import datetime
start_time = datetime.datetime.now()
boost.fit(g,h)
end_time = datetime.datetime.now()
print(end_time-start_time)
```
```0:03:15.513182
```
In [26]:
```
###predicting the gradient boosting model on the test data
#(the original cell omitted this code; test-set names are illustrative)
from sklearn.metrics import f1_score
predictions = boost.predict(g_test)
f1_score(h_test, predictions, average='micro')
```
Out[26]:
`0.78717250765566504`

The gradient boosting model scores an F1 of about 0.79 on the test data, compared with about 0.71 for the single decision tree: an improvement of roughly 8 percentage points.
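The gap between a single tree and its boosted version can be reproduced end to end. The sketch below is illustrative only: it uses a synthetic stand-in dataset (not the course's menu data), and all variable names are our own.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# synthetic stand-in: 100 features, as in the menu data
X, y = make_classification(n_samples=2000, n_features=100,
                           n_informative=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1)

# a single, fully grown decision tree
dt = DecisionTreeClassifier(random_state=1).fit(X_train, y_train)
# gradient boosting with the library defaults (100 trees of depth 3)
gb = GradientBoostingClassifier(random_state=1).fit(X_train, y_train)

f1_dt = f1_score(y_test, dt.predict(X_test))
f1_gb = f1_score(y_test, gb.predict(X_test))
print(round(f1_dt, 3), round(f1_gb, 3))  # boosting usually scores higher
```

On data like this, the single tree overfits and the boosted ensemble generalizes noticeably better, mirroring the pattern in the cells above.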

In [27]:
```
##building an AdaBoost classifier ####
from sklearn.ensemble import AdaBoostClassifier
ada = AdaBoostClassifier()  #defaults: 50 estimators, learning_rate=1.0
ada.fit(g, h)
```
Out[27]:
```AdaBoostClassifier(algorithm='SAMME.R', base_estimator=None,
learning_rate=1.0, n_estimators=50, random_state=None)```
In [28]:
```
### Predicting the AdaBoost classifier on test data
#(the original cell omitted this code; ada is the classifier fitted
# above, and the test-set names are illustrative)
from sklearn.metrics import f1_score
predictions = ada.predict(g_test)
f1_score(h_test, predictions, average='micro')
```
Out[28]:

0.69555971418849949

AdaBoost did not give the improvement we expected: at about 0.70, its F1 score is slightly below even the single decision tree's.
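Out of the box, AdaBoost often underperforms until its two main knobs, `n_estimators` and `learning_rate`, are tuned (the depth of the base tree is a third knob, omitted here for brevity). A hedged sketch of a small grid search, again on synthetic stand-in data rather than the course's menu dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# synthetic stand-in: 100 features, as in the menu data
X, y = make_classification(n_samples=1000, n_features=100,
                           n_informative=15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# small grid over the two main AdaBoost hyperparameters
param_grid = {"n_estimators": [50, 200],
              "learning_rate": [0.5, 1.0]}
grid = GridSearchCV(AdaBoostClassifier(random_state=0),
                    param_grid, scoring="f1", cv=3)
grid.fit(X_train, y_train)

best_f1 = f1_score(y_test, grid.best_estimator_.predict(X_test))
print(grid.best_params_, round(best_f1, 3))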
