
Hyperparameter Tuning with GridSearchCV


In almost any Machine Learning project, we train several models on the dataset and choose the one with the best performance. Even then, there is room for improvement, as we cannot say for certain that this particular model is the best for the problem at hand. Hence, our aim is to improve the model in any way possible. One essential factor in the performance of these models is their hyperparameters; once we set appropriate values for these hyperparameters, the performance of a model can improve significantly. In this article, we will learn how to find optimal values for the hyperparameters of a model by using GridSearchCV.

What is GridSearchCV?

GridSearchCV is the process of performing hyperparameter tuning in order to determine the optimal values for a given model. As mentioned above, the performance of a model depends significantly on the values of its hyperparameters. Note that there is no way to know the best values for hyperparameters in advance, so ideally we would need to try all possible values to find the optimal ones. Doing this manually could take a considerable amount of time and resources, so we use GridSearchCV to automate the tuning of hyperparameters.

GridSearchCV is a function that comes with Scikit-learn's (or sklearn's) model_selection package, so an important point to note here is that we need to have the Scikit-learn library installed on the computer (e.g., via pip install scikit-learn). This function helps to loop through predefined hyperparameters and fit your estimator (model) on your training set. In the end, we can select the best parameters from the listed hyperparameters.

How does GridSearchCV work?

As mentioned above, we pass predefined values for hyperparameters to the GridSearchCV function. We do this by defining a dictionary in which we list a particular hyperparameter along with the values it can take. Here is an example:

 { 'C': [0.1, 1, 10, 100, 1000],
   'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
   'kernel': ['rbf', 'linear', 'sigmoid'] }

Here C, gamma and kernel are some of the hyperparameters of an SVM model. Note that the rest of the hyperparameters will be set to their default values.

GridSearchCV tries all the combinations of the values passed in the dictionary and evaluates the model for each combination using the cross-validation method. Hence, after using this function we get the accuracy/loss for every combination of hyperparameters, and we can choose the one with the best performance.
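To see exactly which combinations will be tried, we can enumerate the grid with Scikit-learn's ParameterGrid utility, which performs the same expansion that GridSearchCV does internally. A minimal sketch using the dictionary above:

from sklearn.model_selection import ParameterGrid

param_grid = {'C': [0.1, 1, 10, 100, 1000],
              'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
              'kernel': ['rbf', 'linear', 'sigmoid']}

# expand the grid into every individual combination
combinations = list(ParameterGrid(param_grid))
print(len(combinations))   # 5 * 5 * 3 = 75 candidate models
print(combinations[0])     # e.g. {'C': 0.1, 'gamma': 1, 'kernel': 'rbf'}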

How to use GridSearchCV?

In this section, we will see how to use GridSearchCV and also learn how it improves the performance of the model.

First, let us see the various arguments taken by the GridSearchCV function:

sklearn.model_selection.GridSearchCV(estimator, param_grid,scoring=None,
          n_jobs=None, iid='deprecated', refit=True, cv=None, verbose=0, 
          pre_dispatch="2*n_jobs", error_score=nan, return_train_score=False) 

We will briefly describe a few of these parameters; the rest you can see in the original documentation. (Note that this signature is from an older Scikit-learn release: the iid parameter was deprecated and has since been removed in newer versions.)

1. estimator: Pass the model instance for which you want to tune the hyperparameters.
2. param_grid: the dictionary object that holds the hyperparameters you want to try
3. scoring: evaluation metric that you want to use; you can simply pass a valid string/object of the evaluation metric
4. cv: number of cross-validation folds to try for each selected set of hyperparameters
5. verbose: you can set it to 1 to get a detailed printout while you fit the data to GridSearchCV
6. n_jobs: number of processes you wish to run in parallel for this task; if it is -1, it will use all available processors.
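As a quick illustration, here is a minimal sketch putting these arguments together (assuming an SVC estimator and accuracy as the metric):

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {'C': [0.1, 1, 10], 'kernel': ['rbf', 'linear']}

# 5-fold cross-validation, accuracy scoring, all processors in parallel
grid = GridSearchCV(estimator=SVC(),
                    param_grid=param_grid,
                    scoring='accuracy',
                    cv=5,
                    verbose=1,
                    n_jobs=-1)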

Now, let us see how to use GridSearchCV to improve the accuracy of our model. Here I am going to train the model twice, once without using GridSearchCV (using the default hyperparameters) and the second time using GridSearchCV to find the optimal values of hyperparameters for the dataset at hand. I am using the famous Breast Cancer Wisconsin (Diagnostic) Data Set, which I import directly from the Scikit-learn library.

# import all necessary libraries
import sklearn
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split

# load the dataset and split it into training and testing sets
dataset = load_breast_cancer()
X = dataset.data
Y = dataset.target
X_train, X_test, y_train, y_test = train_test_split(
                        X, Y, test_size=0.30, random_state=101)
# train the model on train set without using GridSearchCV 
model = SVC() 
model.fit(X_train, y_train) 
  
# print prediction results 
predictions = model.predict(X_test) 
print(classification_report(y_test, predictions)) 

OUTPUT:
 precision    recall  f1-score   support

           0       0.95      0.85      0.90        66
           1       0.91      0.97      0.94       105

    accuracy                           0.92       171
   macro avg       0.93      0.91      0.92       171
weighted avg       0.93      0.92      0.92       171

# defining parameter range
param_grid = {'C': [0.1, 1, 10, 100],
              'gamma': ['scale', 'auto'],
              'kernel': ['linear']}
  
grid = GridSearchCV(SVC(), param_grid, refit=True, verbose=3, n_jobs=-1)
  
# fitting the model for grid search 
grid.fit(X_train, y_train) 

# print best parameter after tuning 
print(grid.best_params_) 
grid_predictions = grid.predict(X_test) 
  
# print classification report 
print(classification_report(y_test, grid_predictions)) 
Output:
 {'C': 100, 'gamma': 'scale', 'kernel': 'linear'}
              precision    recall  f1-score   support

           0       0.97      0.91      0.94        66
           1       0.94      0.98      0.96       105

    accuracy                           0.95       171
   macro avg       0.96      0.95      0.95       171
weighted avg       0.95      0.95      0.95       171
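
Besides best_params_, the fitted grid object exposes a few other useful attributes; a short sketch (assuming pandas is available for tabulating cv_results_):

# best cross-validated score found during the search
print(grid.best_score_)

# the refitted estimator (available because refit=True retrains it on the whole training set)
print(grid.best_estimator_)

# per-combination details: parameters, mean test score, rank, fit times, etc.
import pandas as pd
results = pd.DataFrame(grid.cv_results_)
print(results[['params', 'mean_test_score', 'rank_test_score']])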

Many of you may think that these are the best values of hyperparameters for an SVM model in general. This is not the case: the above hyperparameters are the best only for the dataset we are working on. For any other dataset, the SVM model can have different optimal values of hyperparameters that improve its performance.

Difference between parameter and hyperparameter

Parameters are internal to the model, while hyperparameters are external to it and are explicitly specified to control the training process.
Predictions require the use of parameters, while model optimization requires the use of hyperparameters.
Parameters are estimated while the model is being trained, while hyperparameters are established before the model's training begins.
Parameters are learned and set by the model on its own, while hyperparameters are set manually by a machine learning engineer/practitioner.
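
A small sketch makes the distinction concrete (LogisticRegression is an arbitrary choice here):

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# C and max_iter are hyperparameters: chosen manually, before training starts
model = LogisticRegression(C=1.0, max_iter=5000)
model.fit(X, y)

# coef_ and intercept_ are parameters: learned by the model itself during fit()
print(model.coef_.shape)   # one learned weight per feature
print(model.intercept_)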

When you use cross-validation, you set aside a portion of your data to assess your model on. Cross-validation can be done in a variety of ways. The simplest notion is to use 70% (an arbitrary figure; it does not have to be 70%) of the data for training and the remaining 30% for evaluating the model's performance. To avoid overfitting, you need distinct data for training and for assessing the model. Other (somewhat harder) cross-validation approaches, such as k-fold cross-validation, are also commonly employed in practice.
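For instance, a 5-fold cross-validation of the default SVC can be run in one line with cross_val_score (a sketch reusing X_train and y_train from the code above):

from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# the training data is split into 5 folds; each fold serves once as the validation set
scores = cross_val_score(SVC(), X_train, y_train, cv=5)
print(scores)          # one accuracy score per fold
print(scores.mean())   # average performance across the folds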

Grid search is a technique for performing hyperparameter optimisation: given a model (e.g. a CNN) and a validation dataset, it is a method for finding the optimal combination of hyperparameters (an example of a hyperparameter is the learning rate of the optimiser). You have a number of models in this case, each with a different set of hyperparameters. Each of these parameter combinations that corresponds to a single model is said to lie on a point of a "grid". The goal is to train and evaluate each of these models, using cross-validation for instance, and then select the one that performed best.

This brings us to the end of this article, where we learned how to find the optimal hyperparameters of our model to get the best performance out of it.

Further Reading

  1. An Easy Guide to Gradient Descent in Machine Learning
  2. Support Vector Machine algorithm (SVM)
  3. Machine learning Tutorial
  4. What is Gradient Boosting and how is it different from AdaBoost
  5. Understanding the Ensemble method Bagging and Boosting
  6. What is Cross Validation in Machine Learning?

GridSearchCV FAQs

What is GridSearchCV used for?

GridSearchCV is a technique for finding the optimal parameter values from a given set of parameters in a grid. It is essentially a cross-validation technique. The model as well as the parameters need to be entered; after extracting the best parameter values, predictions are made.

How do you define GridSearchCV?

GridSearchCV is the process of performing hyperparameter tuning in order to determine the optimal values for a given model.

What does cv in GridSearchCV stand for?

GridSearchCV is also known as GridSearch cross-validation: an internal cross-validation technique is used to calculate the score for each combination of parameters on the grid.
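In recent Scikit-learn versions, cv=None defaults to 5-fold cross-validation; you can also pass an integer or an explicit splitter object, for example:

from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# an explicit splitter gives full control over how the folds are built
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
grid = GridSearchCV(SVC(), {'C': [0.1, 1, 10]}, cv=cv)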

How do you use GridSearchCV in regression?

GridSearchCV in regression can be used by following the steps below (see the sketch after this list):
Import the library, GridSearchCV.
Set up the data.
Define the model and its parameter grid.
Use GridSearchCV and print the results.
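
A minimal sketch of these steps, assuming Ridge regression on Scikit-learn's diabetes dataset and mean squared error as the metric:

from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# set up the data
X, y = load_diabetes(return_X_y=True)

# model and its parameter grid
param_grid = {'alpha': [0.01, 0.1, 1, 10, 100]}

# use a regression metric instead of classification accuracy
grid = GridSearchCV(Ridge(), param_grid, scoring='neg_mean_squared_error', cv=5)
grid.fit(X, y)
print(grid.best_params_)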

Does GridSearchCV use cross-validation?

GridSearchCV does, in fact, perform cross-validation. The idea is to hide a portion of your data set from the model so that the model can be tested on data it has not seen. As a result, you train your models on the training data and then test them on the testing data.
