Bagging and other ensemble methods
Bagging is a technique for reducing generalization error by combining several models: train several models separately, then have them vote on (or average) the output for each test example.
Ensemble
Suppose we have $k$ regression models, and each model $i$ makes an error $\epsilon_i$ on each example, with the errors drawn from a zero-mean multivariate normal distribution with variances $\mathbb{E}[\epsilon_i^2] = v$ and covariances $\mathbb{E}[\epsilon_i \epsilon_j] = c$. The ensemble averages its members, so its error on each example is $\frac{1}{k}\sum_i \epsilon_i$, and its expected squared error is

$$\mathbb{E}\left[\Big(\frac{1}{k}\sum_i \epsilon_i\Big)^2\right] = \frac{1}{k^2}\,\mathbb{E}\left[\sum_i \Big(\epsilon_i^2 + \sum_{j \ne i} \epsilon_i \epsilon_j\Big)\right] = \frac{v}{k} + \frac{k-1}{k}\,c.$$

So, when the errors are perfectly correlated and $c = v$, the expected squared error reduces to $v$, and clearly the ensemble doesn't help at all. However, when the errors are perfectly uncorrelated and $c = 0$, the expected squared error of the ensemble is only $\frac{v}{k}$, much smaller than $v$. On average, the ensemble will perform at least as well as any of its members, and if the members make independent errors, the ensemble will perform significantly better than its members.
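As a sanity check, here is a minimal Monte Carlo verification of the formula above, assuming NumPy is available; the values of $k$, $v$, and $c$ are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
k, v, c = 10, 1.0, 0.3        # number of models, error variance, covariance
n_samples = 200_000

# Equicorrelated errors: v on the diagonal, c off the diagonal.
cov = np.full((k, k), c)
np.fill_diagonal(cov, v)

# Draw zero-mean correlated errors for the k models, then average them
# as the ensemble does.
eps = rng.multivariate_normal(np.zeros(k), cov, size=n_samples)
ensemble_err = eps.mean(axis=1)

print("empirical  :", np.mean(ensemble_err ** 2))   # ~0.37
print("theoretical:", v / k + (k - 1) / k * c)      # 0.37
```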
Bagging
Bagging is a method that allows the same kind of model, training algorithm, and objective function to be reused several times.
First, construct $k$ different datasets, each with the same number of examples as the original dataset, by sampling with replacement from the original dataset. With high probability, each constructed dataset is missing some of the original examples and contains several duplicates, and these differences cause the trained models to differ. Then model $i$ is trained on dataset $i$, and the members' predictions are combined at test time, as in the sketch below.
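Here is a minimal sketch of the procedure, assuming scikit-learn is available; DecisionTreeRegressor is only an illustrative choice of base model, and bagging_fit/bagging_predict are hypothetical helper names.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def bagging_fit(X, y, k=10, seed=0):
    """Train k copies of the same model, each on a bootstrap resample."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(k):
        idx = rng.integers(0, n, size=n)   # sample n examples with replacement
        models.append(DecisionTreeRegressor().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    """Average the members' predictions (for classifiers, vote instead)."""
    return np.mean([m.predict(X) for m in models], axis=0)

# Toy 1-D regression example.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
print(bagging_predict(bagging_fit(X, y, k=25), X)[:3])
```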
Different members of an ensemble of neural networks typically make partially independent errors even when trained on the same dataset, because of several sources of randomness (see the sketch after this list):
- random initialization
- random selection of minibatches
- differences in hyperparameters
- different outcomes of non-deterministic implementations of neural networks
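Below is a minimal sketch of an ensemble whose members differ only in their random seed, assuming PyTorch is available; the architecture, learning rate, and epoch count are illustrative.

```python
import torch
import torch.nn as nn

def train_member(X, y, seed, epochs=20):
    torch.manual_seed(seed)  # different random initialization per member
    model = nn.Sequential(nn.Linear(X.shape[1], 32), nn.ReLU(), nn.Linear(32, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    loader = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(X, y), batch_size=32, shuffle=True,
        generator=torch.Generator().manual_seed(seed))  # different minibatch order
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            nn.functional.mse_loss(model(xb), yb).backward()
            opt.step()
    return model

# Toy data; the five members see identical data but different seeds,
# which is often enough to produce partially independent errors.
X = torch.randn(256, 8)
y = X.sum(dim=1, keepdim=True) + 0.1 * torch.randn(256, 1)
members = [train_member(X, y, seed=s) for s in range(5)]
with torch.no_grad():
    ensemble_pred = torch.stack([m(X) for m in members]).mean(dim=0)
```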