Characteristics of Bagging in Machine Learning

Bagging in Machine Learning:

Bagging is an ensemble method in which a separate model is trained on each of the bootstrap samples. The final model aggregates all of these sample models: for a numeric target variable (regression), the predicted outcome is the average of the individual models' predictions, while for classification the predicted class is decided by plurality vote.
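As a concrete illustration, here is a minimal from-scratch sketch of this procedure in Python. It is only a sketch under stated assumptions: the function names (bagging_fit, bagging_predict) are invented for this example, and the base model is assumed to follow the scikit-learn fit/predict interface.

    import numpy as np
    from scipy.stats import mode
    from sklearn.base import clone

    def bagging_fit(X, y, base_estimator, n_estimators=10, seed=0):
        """Train one clone of base_estimator per bootstrap sample."""
        rng = np.random.default_rng(seed)
        n = len(X)
        models = []
        for _ in range(n_estimators):
            idx = rng.integers(0, n, size=n)  # bootstrap: draw n rows with replacement
            models.append(clone(base_estimator).fit(X[idx], y[idx]))
        return models

    def bagging_predict(models, X, task="classification"):
        """Aggregate the sample models: average for regression, plurality vote otherwise."""
        preds = np.array([m.predict(X) for m in models])
        if task == "regression":
            return preds.mean(axis=0)                    # average of all models
        return mode(preds, axis=0, keepdims=False).mode  # plurality vote (numeric labels assumed)

Note that every model gets an equal say in the aggregate, which is the equal weighting described in the list of characteristics below.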

Bagging is short for bootstrap aggregation, so named because it draws a number of samples from the dataset with replacement, each sample set being regarded as a bootstrap sample. The results of the models trained on these bootstrap samples are then aggregated.
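To make the sampling step concrete, the snippet below draws a single bootstrap sample with NumPy (the ten-row toy dataset is purely illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    data = np.arange(10)                                     # a toy "dataset" of 10 rows
    sample = rng.choice(data, size=len(data), replace=True)  # one bootstrap sample
    print(sample)  # some rows appear more than once, others not at all

Because rows are drawn with replacement, a bootstrap sample of size n contains on average only about 63.2% of the unique original rows; the rest are said to be out-of-bag.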

Characteristics of Bagging:

1. Operates via equal weighting of models: each model's prediction counts the same in the aggregate.

2. Settles on a result using majority voting for classification (and averaging for regression).

3. Employs multiple instances of the same classifier, each trained on a different sample of one dataset.

4. Builds each model on a bootstrap sample drawn from the dataset by sampling with replacement, so every model sees only a subset of the unique examples.

5. Works best when the base classifier is unstable, since small changes in the training data then produce noticeably different models (see the sketch after this list).

6. Bagging can hurt a stable model by introducing artificial variability, from which inaccurate conclusions may be drawn.
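The sketch promised in point 5 bags an unstable learner, a fully grown decision tree, using scikit-learn's BaggingClassifier; the synthetic dataset and parameter values are illustrative rather than tuned.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    tree = DecisionTreeClassifier(random_state=0)                      # unstable base classifier
    bagged = BaggingClassifier(tree, n_estimators=50, random_state=0)  # equal-weight majority vote

    print("single tree :", cross_val_score(tree, X, y, cv=5).mean())
    print("bagged trees:", cross_val_score(bagged, X, y, cv=5).mean())

On a dataset like this the bagged ensemble typically scores higher than the single tree; with a stable learner the gap would largely disappear, which is exactly points 5 and 6 above.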

Advantages of Bagging:

1. Bagging improves the model's accuracy by reducing the variance of the base model.

2. It reduces the overfitting of the data.

3. It handles high-dimensional data more effectively than a single base model.

Disadvantages of Bagging:

1. Bagging helps only with less stable algorithms; with a stable base algorithm it offers little improvement.

2. It introduces a loss of interpretability, since predictions come from many models rather than one.

3. It can be computationally expensive, as many models must be trained and stored.