Building a Conformal Chatbot in Julia

Conformal Prediction, LLMs and HuggingFace (Part 1)

Towards Data Science

Large Language Models (LLMs) are all the rage right now. They are used for a wide range of tasks, including text classification, question answering, and text generation. In this tutorial, we will show how to conformalize a transformer language model for text classification using ConformalPrediction.jl.

Specifically, we are interested in the task of intent classification as illustrated in the sketch below. First, we feed a customer query into an LLM to generate embeddings. Next, we train a classifier to map these embeddings to possible intents. Of course, for this supervised learning problem we need training data consisting of inputs (queries) and outputs (labels indicating the true intent). Finally, we apply Conformal Prediction to quantify the predictive uncertainty of our classifier.

Conformal Prediction (CP) is a rapidly emerging methodology for Predictive Uncertainty Quantification. If you are unfamiliar with CP, you may want to first check out my 3-part introductory series on the subject, starting with this post.

High-level overview of a conformalized intent classifier. Image by author.
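
To make that last step concrete, here is a minimal sketch of how a classifier trained on query embeddings could be wrapped with ConformalPrediction.jl, which plugs into the MLJ ecosystem. Everything in it is a placeholder assumption for illustration: random numbers standing in for LLM embeddings, three made-up intent labels, a logistic classifier from MLJLinearModels, and a 95% coverage target.

using ConformalPrediction
using MLJ

# Placeholder data standing in for query embeddings (X) and intent labels (y):
X = MLJ.table(randn(100, 8))
y = coerce(rand(["card_lost", "card_arrival", "top_up"], 100), Multiclass)

# Any probabilistic MLJ classifier works; logistic regression is just an example:
LogisticClassifier = @load LogisticClassifier pkg=MLJLinearModels
clf = LogisticClassifier()

# Wrap the classifier in a conformal model (simple inductive CP, 95% coverage):
conf_model = conformal_model(clf; method=:simple_inductive, coverage=0.95)
mach = machine(conf_model, X, y)
fit!(mach)

# Predictions are now set-valued: each query maps to a set of plausible intents.
predict(mach, X)

The point of the sketch is that the conformal wrapper is model-agnostic: whatever classifier we end up training on the embeddings can be conformalized in the same way.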

We will use the Banking77 dataset (Casanueva et al., 2020), which consists of 13,083 queries from 77 intents related to banking. On the model side, we will use the DistilRoBERTa model, a distilled version of RoBERTa (Liu et al., 2019) fine-tuned on the Banking77 dataset.

🤗 HuggingFace

The model can be loaded from HF straight into our running Julia session using the Transformers.jl package.

This package makes working with HF models remarkably easy in Julia. Kudos to the devs! 🙏

Below we load the tokenizer tkr and the model mod. The tokenizer is used to convert the text into a sequence of integers, which is then fed into the model. The model outputs a hidden state, which is then fed into a classifier to get the logits for each class. Finally, the logits are passed through a softmax function to get the corresponding predicted probabilities. Below we run a few queries through the model to see how it performs.

# Load model from HF 🤗:
tkr = …
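
The snippet above is truncated, so here is a minimal sketch of what the full loading and inference steps can look like with Transformers.jl, under a few assumptions: the checkpoint name is a stand-in (any DistilRoBERTa model fine-tuned on Banking77 would do), the example query is made up, and softmax is borrowed from Flux.

using Transformers
using Transformers.HuggingFace
using Transformers.TextEncoders   # provides encode
using Flux: softmax

# Hypothetical checkpoint name; swap in the Banking77-fine-tuned DistilRoBERTa you use:
tkr = hgf"mrm8488/distilroberta-finetuned-banking77:tokenizer"
mod = hgf"mrm8488/distilroberta-finetuned-banking77:ForSequenceClassification"

# Run an example query through the pipeline described above:
query = "I lost my card. What should I do?"
tokens = encode(tkr, query)     # text -> token sequence
output = mod(tokens)            # forward pass -> logits for the 77 intents
probs = softmax(output.logit)   # logits -> predicted probabilities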
