How to Build a Local Open-Source LLM Chatbot With RAG
Introduction
Retrieval-Augmented Generation (RAG)

Large Language Models (LLMs) are remarkable at compressing knowledge about the world into their billions of parameters.

However, LLMs have two major limitations: their knowledge is frozen at the time of their last training run, and they tend to make up facts (hallucinate) when asked very specific questions.

Using the RAG technique, we can give pre-trained LLMs access to very specific information as additional context when answering our questions.
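To make the idea concrete, here is a minimal sketch (not taken from this article's implementation) of how retrieved text can be injected into a prompt; `retrieved_chunks` and its contents are hypothetical stand-ins for the output of a real retriever:

```python
# Minimal illustration of the core RAG idea: prepend retrieved text to the
# prompt so the LLM answers from the supplied context instead of its memory.
# `retrieved_chunks` is a hypothetical placeholder for a retriever's output.
retrieved_chunks = [
    "Gemma is a family of open-weight LLMs released by Google in 2024.",
]
question = "When was Gemma released?"

context = "\n".join(retrieved_chunks)
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {question}\n"
    "Answer:"
)
# `prompt` is then sent to the generator LLM (see the pipeline sketch below).
```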

In this article, I'll walk through the theory and practice of implementing Google's LLM Gemma with additional RAG capabilities using the Hugging Face transformers library, LangChain, and the Faiss vector database.

An overview of the RAG pipeline, which we will implement step by step, is shown in the figure below.

[Figure: Overview of the RAG pipeline implementation. Document storage: input documents → text chunks → encoder model → vector database. LLM prompting: user query → encoder model → vector database → top-k relevant chunks → generator LLM model, which then answers the query with the retrieved context. Image by author.]
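As a rough sketch of how the figure's two stages map onto the libraries named above, the snippet below indexes text chunks in Faiss and prompts Gemma with the top-k retrieved chunks. The encoder model, chunk size, and `k` are illustrative assumptions rather than values from this article, and the exact LangChain import paths depend on the installed version:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from transformers import pipeline

# Stage 1 -- document storage: split input documents into chunks and index them.
document = "<your document text here>"  # placeholder input
splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=64)
chunks = splitter.split_text(document)

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"  # assumed encoder model
)
db = FAISS.from_texts(chunks, embeddings)

# Stage 2 -- LLM prompting: retrieve top-k relevant chunks and generate an answer.
query = "What does the document say about X?"
top_chunks = db.similarity_search(query, k=4)
context = "\n".join(doc.page_content for doc in top_chunks)

generator = pipeline("text-generation", model="google/gemma-2b-it")
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
answer = generator(prompt, max_new_tokens=200)[0]["generated_text"]
print(answer)
```

Note that `google/gemma-2b-it` is a gated model on the Hugging Face Hub, so you may need to accept Google's license and authenticate (for example via `huggingface-cli login`) before it can be downloaded.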
