An Introduction to Fine-Tuning Pre-Trained Transformers Models

Simplified using the HuggingFace Trainer object

Image from Unsplash by Markus Spiske

HuggingFace serves as a home to many popular open-source NLP models. A lot of these models are effective as-is, but they often require some training or fine-tuning to improve performance on your specific use case. As the LLM explosion continues, we'll take a step back in this article to revisit some of the core building blocks that HuggingFace provides for simplifying the training of NLP models.

Traditionally, NLP models can be trained using vanilla PyTorch, TensorFlow/Keras, and other popular ML frameworks. While you can go this route, it requires a deeper understanding of the framework you are using, as well as more code to write the training loop. HuggingFace's Trainer class provides a simpler way to interact with the NLP Transformers models that you want to utilize.

Trainer is a class specifically optimized for Transformers models, and it also provides tight integration with other HuggingFace libraries such as Datasets and Evaluate. At a more advanced level, Trainer also supports distributed training libraries and can easily be integrated with infrastructure platforms such as Amazon SageMaker.
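
To illustrate the Evaluate integration, here is a minimal sketch (an assumption about typical usage, not code taken from later in this article) of a compute_metrics callback: Trainer calls it with the model's logits and the reference labels after each evaluation pass, and Evaluate handles the metric computation.

```python
# Minimal sketch: plugging the Evaluate library into Trainer via a compute_metrics callback.
# The metric choice ("accuracy") and the dummy inputs below are illustrative assumptions.
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # Trainer passes a (logits, labels) tuple for each evaluation pass
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)

# Standalone check with dummy logits/labels
print(compute_metrics((np.array([[0.1, 0.9], [0.8, 0.2]]), np.array([1, 0]))))
```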

In this example, we'll take a look at using the Trainer class locally to fine-tune the popular BERT model on the IMDB dataset for a Text Classification use case (Large Movie Reviews Dataset Citation).
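
Before walking through each step, here is a rough end-to-end sketch of what that fine-tuning looks like with Trainer. The checkpoint name (bert-base-cased), the subset sizes, and the hyperparameters are illustrative assumptions rather than the exact configuration used later in the article.

```python
# End-to-end sketch: fine-tuning BERT on IMDB with the HuggingFace Trainer.
# Checkpoint, subset sizes, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# IMDB movie reviews: "text" and "label" columns (0 = negative, 1 = positive)
dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize(batch):
    # Pad/truncate reviews to BERT's maximum sequence length
    return tokenizer(batch["text"], padding="max_length", truncation=True)

tokenized = dataset.map(tokenize, batched=True)

# Small subsets keep the sketch quick; use the full splits for a real run
train_ds = tokenized["train"].shuffle(seed=42).select(range(2000))
eval_ds = tokenized["test"].shuffle(seed=42).select(range(500))

model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=2)

training_args = TrainingArguments(
    output_dir="bert-imdb",  # checkpoint directory (illustrative path)
    num_train_epochs=1,
    per_device_train_batch_size=16,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    # compute_metrics=compute_metrics,  # the Evaluate callback sketched above could be passed here
)

trainer.train()
print(trainer.evaluate())
```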

NOTE: This article assumes basic knowledge of Python and the domain of NLP. We won't get into any specific Machine Learning theory around model building or selection; this article is dedicated to understanding how we can fine-tune the existing pre-trained models available in the HuggingFace Model Hub.

Table of Contents

  1. Setup
  2. Fine-Tuning BERT
  3. Additional Resources & Conclusion

For this example, we'll be working in SageMaker Studio with a conda_python3 kernel on an ml.g4dn.12xlarge instance. Note that you can use a smaller instance type, but this might impact training speed depending on the number of CPUs/workers that are available.
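
As a quick sanity check once the kernel is running, something like the following can confirm that the libraries are installed and that PyTorch can see the instance's GPUs. The install line and the GPU count are assumptions about this particular setup rather than steps prescribed by the article.

```python
# Environment check for the notebook kernel described above (assumptions, not prescribed steps).
# In a notebook cell, install the libraries first, e.g.:
#   %pip install "transformers[torch]" datasets evaluate
import torch

print("CUDA available:", torch.cuda.is_available())
print("Visible GPUs:", torch.cuda.device_count())  # an ml.g4dn.12xlarge exposes 4 NVIDIA T4 GPUs
```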
