Use the gradient boosting classes in Scikit-Learn to solve different classification and regression problems
In the first part of this article, we presented the gradient boosting algorithm and showed its implementation in pseudocode.
In this part of the article, we will explore the classes in Scikit-Learn that implement this algorithm, discuss their various parameters, and show how to use them to solve several classification and regression problems.
Although the XGBoost library (which will be covered in a future article) provides a more optimized and highly scalable implementation of gradient boosting, for small to medium-sized data sets it is often easier to use the gradient boosting classes in Scikit-Learn, which have a simpler interface and significantly fewer hyperparameters to tune.
Scikit-Learn provides the following classes that implement the gradient-boosted decision trees (GBDT) model (a minimal usage sketch follows the list):
- GradientBoostingClassifier is used for classification problems.
- GradientBoostingRegressor is used for regression problems.
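As a minimal sketch of how both classes are used with their default settings, the following example fits them on toy datasets of my own choosing (load_breast_cancer and a synthetic make_regression set, which are not part of this article's examples):

```python
from sklearn.datasets import load_breast_cancer, make_regression
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Classification on a toy dataset (illustrative choice, not from the article)
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = GradientBoostingClassifier(random_state=0)
clf.fit(X_train, y_train)
print("classification accuracy:", clf.score(X_test, y_test))

# Regression on a synthetic dataset (illustrative choice, not from the article)
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
reg = GradientBoostingRegressor(random_state=0)
reg.fit(X_train, y_train)
print("regression R^2:", reg.score(X_test, y_test))
```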
In addition to the standard decision tree parameters, such as criterion, max_depth (set by default to 3), and min_samples_split, these classes provide the following parameters (a sketch that sets them explicitly follows the list):
- loss — the loss function to be optimized. In GradientBoostingClassifier, this function can be ‘log_loss’ (the default) or ‘exponential’ (which makes gradient boosting behave like AdaBoost). In GradientBoostingRegressor, it can be ‘squared_error’ (the default), ‘absolute_error’, ‘huber’, or ‘quantile’ (see this article for the differences between these loss functions).
- n_estimators — the number of boosting iterations (defaults to 100).
- learning_rate — a factor that shrinks the contribution of each tree (defaults to 0.1).
- subsample — the fraction of samples used for training each tree (defaults to 1.0).
- max_features — the number of features to consider when searching for the best split in each…
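To make these parameters concrete, here is an illustrative sketch that sets them explicitly; the specific values are assumptions chosen for demonstration only, not recommendations from the article:

```python
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

clf = GradientBoostingClassifier(
    loss="log_loss",      # default classification loss
    n_estimators=200,     # number of boosting iterations
    learning_rate=0.05,   # shrinks the contribution of each tree
    subsample=0.8,        # fraction of samples used to fit each tree
    max_features="sqrt",  # number of features considered at each split
    max_depth=3,          # depth of the individual trees (the default)
    random_state=0,
)

reg = GradientBoostingRegressor(
    loss="huber",         # robust alternative to 'squared_error'
    n_estimators=300,
    learning_rate=0.05,
    subsample=0.8,
    max_features="sqrt",
    max_depth=3,
    random_state=0,
)
```

As a rule of thumb, lowering learning_rate usually needs to be compensated by increasing n_estimators, since each tree then contributes less to the final prediction.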