Zero to Advanced Prompt Engineering with Langchain in Python

A crucial aspect of Large Language Models (LLMs) is the number of parameters these models use for learning. The more parameters a model has, the better it can comprehend the relationships between words and phrases. This means that models with billions of parameters have the capacity to generate a variety of creative text formats and answer open-ended and difficult questions in an informative way.

LLMs such as ChatGPT, which utilize the Transformer model, are proficient in understanding and generating human language, making them useful for applications that require natural language understanding. However, they are not without their limitations, which include outdated knowledge, inability to interact with external systems, lack of context understanding, and sometimes generating plausible-sounding but incorrect or nonsensical responses, among others.

Addressing these limitations requires integrating LLMs with external data sources and capabilities, which can introduce complexity and demand extensive coding and data-handling skills. This, coupled with the challenges of understanding AI concepts and sophisticated algorithms, contributes to the learning curve associated with developing applications using LLMs.

Nevertheless, the integration of LLMs with other tools to form LLM-powered applications could redefine our digital landscape. The potential of such applications is vast, including improving efficiency and productivity, simplifying tasks, enhancing decision-making, and providing personalized experiences.

In this article, we'll delve deeper into these issues, exploring advanced techniques of prompt engineering with Langchain and offering clear explanations, practical examples, and step-by-step instructions on how to implement them.

Langchain, a state-of-the-art library, brings convenience and flexibility to designing, implementing, and tuning prompts. As we unpack the principles and practices of prompt engineering, you'll learn how to use Langchain's powerful features to leverage the strengths of SOTA generative AI models like GPT-4.

Understanding Prompts

Before diving into the technicalities of prompt engineering, it is crucial to understand the concept of prompts and their significance.

A 'prompt' is a sequence of tokens that are used as input to a language model, instructing it to generate a particular type of response. Prompts play a crucial role in steering the behavior of a model. They can impact the quality of the generated text and, when crafted appropriately, can help the model provide insightful, accurate, and context-specific results.

Prompt engineering is the art and science of designing effective prompts. The goal is to elicit the desired output from a language model. By carefully choosing and structuring prompts, one can guide the model toward generating more accurate and relevant responses. In practice, this involves fine-tuning the input phrasing to cater to the model's training and structural biases.

The sophistication of prompt engineering ranges from simple techniques, such as feeding the model relevant keywords, to more advanced methods involving the design of complex, structured prompts that use the internal mechanics of the model to its advantage.

Langchain: The Fastest Growing Prompt Tool

LangChain, launched in October 2022 by Harrison Chase, has become one of the most highly rated open-source frameworks on GitHub in 2023. It offers a simplified and standardized interface for incorporating Large Language Models (LLMs) into applications. It also provides a feature-rich interface for prompt engineering, allowing developers to experiment with different strategies and evaluate their results. By utilizing Langchain, you can perform prompt engineering tasks more effectively and intuitively.

LangFlow serves as a user interface for orchestrating LangChain components into an executable flowchart, enabling quick prototyping and experimentation.

LangChain fills an important gap in AI development for the masses. It enables an array of NLP applications such as virtual assistants, content generators, question-answering systems, and more, to solve a variety of real-world problems.

Rather than being a standalone model or provider, LangChain simplifies the interaction with diverse models, extending the capabilities of LLM applications beyond the constraints of a simple API call.

The Architecture of LangChain

LangChain’s fundamental components include Model I/O, Prompt Templates, Memory, Agents, and Chains.

Model I/O

LangChain facilitates a seamless connection with various language models by wrapping them with a standardized interface known as Model I/O. This makes it easy to switch models for optimization or better performance. LangChain supports various language model providers, including OpenAI, HuggingFace, Azure, Fireworks, and more.
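
As a brief, hedged illustration of what this standardized interface buys you, the sketch below swaps providers without changing the calling code. It reuses the OpenAI and HuggingFaceHub wrappers discussed later in this article; the model names, prompt, and API keys are assumptions made for the example.

from langchain.llms import OpenAI, HuggingFaceHub

# Both wrappers expose the same "call with a prompt, get a completion" interface,
# so the surrounding code does not change when the provider does.
openai_llm = OpenAI(model_name="text-davinci-003", temperature=0)
hf_llm = HuggingFaceHub(repo_id="google/flan-t5-xxl", model_kwargs={"temperature": 0.5, "max_length": 64})

prompt = "Summarize prompt engineering in one sentence."
for llm in (openai_llm, hf_llm):
    print(llm(prompt))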

Prompt Templates

These are used to manage and optimize interactions with LLMs by providing concise instructions or examples. Optimizing prompts enhances model performance, and their flexibility contributes significantly to the input process.

A simple example of a prompt template:

from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables=["subject"],
    template="What are the recent advancements in the field of {subject}?",
)
print(prompt.format(subject="Natural Language Processing"))

As we advance in complexity, we encounter more sophisticated patterns in LangChain, such as the Reason and Act (ReAct) pattern. ReAct is an important pattern for action execution, where the agent assigns a task to an appropriate tool, customizes the input for it, and parses its output to perform the task. The Python example below showcases a ReAct pattern. It demonstrates how a prompt is structured in LangChain, using a series of thoughts and actions to reason through a problem and produce a final answer:

PREFIX = """Answer the following question using the given tools:"""
FORMAT_INSTRUCTIONS = """Follow this format:
Question: {input_question}
Thought: your initial thought on the question
Action: your chosen action from [{tool_names}]
Action Input: your input for the action
Observation: the result of the action"""
SUFFIX = """Begin!
Question: {input}
Thought:{agent_scratchpad}"""

Memory

Memory is a critical concept in LangChain, enabling LLMs and tools to retain information over time. This stateful behavior improves the performance of LangChain applications by storing previous responses, user interactions, the state of the environment, and the agent's goals. The ConversationBufferMemory and ConversationBufferWindowMemory strategies help keep track of the full or recent parts of a conversation, respectively. For a more sophisticated approach, the ConversationKGMemory strategy allows encoding the conversation as a knowledge graph, which can be fed back into prompts or used to predict responses without calling the LLM.
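
As a minimal sketch of buffer memory in practice (the conversation content here is made up, and an OpenAI key is assumed to be configured), a ConversationChain can replay earlier turns into later prompts:

from langchain.llms import OpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

llm = OpenAI(temperature=0)
# The buffer memory stores every prior exchange and injects it into each new prompt.
conversation = ConversationChain(llm=llm, memory=ConversationBufferMemory())

conversation.predict(input="Hi, my name is Alice.")
# Because the first turn is replayed from memory, the model can recall the name.
print(conversation.predict(input="What is my name?"))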

Agents

An agent interacts with the world by performing actions and tasks. In LangChain, agents combine tools and chains for task execution. An agent can establish a connection to the outside world to retrieve information that augments LLM knowledge, thus overcoming their inherent limitations. Agents can decide, for instance, to pass calculations to a calculator or Python interpreter depending on the situation.

Agents are equipped with subcomponents:

  • Tools: These are functional components.
  • Toolkits: Collections of tools.
  • Agent Executors: This is the execution mechanism that allows choosing between tools.

Agents in LangChain also follow the Zero-shot ReAct pattern, where the decision is based only on the tool's description. This mechanism can be extended with memory in order to take the full conversation history into account. With ReAct, instead of asking an LLM to autocomplete your text, you can prompt it to respond in a thought/act/observation loop.
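
Below is a minimal sketch of a Zero-shot ReAct agent that delegates arithmetic to a calculator tool; the question is arbitrary and an OpenAI key is assumed to be configured:

from langchain.llms import OpenAI
from langchain.agents import AgentType, initialize_agent, load_tools

llm = OpenAI(temperature=0)
# "llm-math" wraps a calculator-style chain that the agent can hand calculations to.
tools = load_tools(["llm-math"], llm=llm)

agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
print(agent.run("What is 17 raised to the power of 0.43?"))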

Chains

Chains, as the term suggests, are sequences of operations that allow the LangChain library to process language model inputs and outputs seamlessly. These integral components of LangChain are fundamentally made up of links, which can be other chains, or primitives such as prompts, language models, or utilities.

Imagine a chain as a conveyor belt in a factory. Each step on this belt represents a certain operation, which could be invoking a language model, applying a Python function to a text, or even prompting the model in a particular way.

LangChain categorizes its chains into three types: Utility chains, Generic chains, and Combine Documents chains. We'll dive into Utility and Generic chains for our discussion.

  • Utility Chains are specifically designed to extract precise answers from language models for narrowly defined tasks. For example, let's take a look at the LLMMathChain. This utility chain enables language models to perform mathematical calculations. It accepts a question in natural language, and the language model in turn generates a Python code snippet which is then executed to produce the answer.
  • Generic Chains, on the other hand, serve as building blocks for other chains but cannot be used standalone. These chains, such as the LLMChain, are foundational and are often combined with other chains to perform intricate tasks. For instance, the LLMChain is frequently used to query a language model object by formatting the input based on a provided prompt template and then passing it to the language model (see the sketch after this list).
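
To make the distinction concrete, here is a minimal sketch, assuming an OpenAI key is configured, that runs a generic LLMChain next to the LLMMathChain utility chain; the prompts are arbitrary examples:

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, LLMMathChain

llm = OpenAI(temperature=0)

# Generic chain: a prompt template piped into the language model.
explain_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate(input_variables=["topic"], template="Explain {topic} in one sentence."),
)
print(explain_chain.run(topic="vector databases"))

# Utility chain: a natural-language math question answered via generated and executed code.
math_chain = LLMMathChain.from_llm(llm)
print(math_chain.run("What is 13 percent of 482?"))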

Step-by-step Implementation of Prompt Engineering with Langchain

We will walk you through the process of implementing prompt engineering using Langchain. Before proceeding, make sure that you have installed the necessary software and packages.

You can take advantage of popular tools like Docker, Conda, Pip, and Poetry for setting up LangChain. The relevant installation files for each of these methods can be found within the LangChain repository at https://github.com/benman1/generative_ai_with_langchain. This includes a Dockerfile for Docker, a requirements.txt for Pip, a pyproject.toml for Poetry, and a langchain_ai.yml file for Conda.
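
Assuming you have cloned that repository, the typical commands for each route would look roughly like the following sketch (adjust names and paths to your setup):

# Pip
pip install -r requirements.txt
# Conda
conda env create -f langchain_ai.yml
# Poetry
poetry install
# Docker
docker build -t langchain_ai .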

In this article we'll use Pip, the standard package manager for Python, to facilitate the installation and management of third-party libraries. If it isn't included in your Python distribution, you can install Pip by following the instructions at https://pip.pypa.io/.

To install a library with Pip, use the command pip install library_name.

However, Pip doesn't manage environments on its own. To handle different environments, we use the tool virtualenv.
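
A typical virtualenv workflow looks like this (a sketch; the environment name is arbitrary):

pip install virtualenv
virtualenv langchain_env
# On Windows:
langchain_env\Scripts\activate
# On macOS/Linux:
source langchain_env/bin/activate
pip install langchain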

In the following sections, we will discuss model integrations.

Step 1: Establishing Langchain

First, you need to install the Langchain package. We're using Windows. Run the following command in your terminal to install it:

pip install langchain

Step 2: Importing Langchain and other vital modules

Next, import Langchain along with other necessary modules. Here, we also import the transformers library, which is extensively used in NLP tasks.

import langchain
from transformers import AutoModelWithLMHead, AutoTokenizer

Step 3: Load Pretrained Model

OpenAI

OpenAI models can be conveniently interfaced with the LangChain library or the OpenAI Python client library. Notably, OpenAI furnishes an Embedding class for text embedding models. Two key LLM models are GPT-3.5 and GPT-4, differing mainly in token length. Pricing for each model can be found on OpenAI's website. While there are more sophisticated models like GPT-4-32K that accept longer contexts, their availability via the API is not always guaranteed.

Accessing these models requires an OpenAI API key. This can be done by creating an account on OpenAI's platform, setting up billing information, and generating a new secret key.

import os
os.environ["OPENAI_API_KEY"] = 'your-openai-token'

After successfully creating the key, you can set it as an environment variable (OPENAI_API_KEY) or pass it as a parameter during class instantiation for OpenAI calls.

Consider the following LangChain script that showcases interaction with an OpenAI model:

from langchain.llms import OpenAI
llm = OpenAI(model_name="text-davinci-003")
# The LLM takes a prompt as an input and outputs a completion
prompt = "Who is the president of the United States of America?"
completion = llm(prompt)
print(completion)

The model responds along the lines of:

The current President of the United States of America is Joe Biden.

In this example, the LLM object is queried directly: it takes the prompt as input, processes it using the specified OpenAI model, and returns the completion.

Hugging Face

Hugging Face provides the free-to-use Transformers Python library, compatible with PyTorch, TensorFlow, and JAX, which includes implementations of models like BERT, T5, etc.

Hugging Face also offers the Hugging Face Hub, a platform for hosting code repositories, machine learning models, datasets, and web applications.

To use Hugging Face as a provider for your models, you will need an account and API keys, which can be obtained from their website. The token can be made available in your environment as HUGGINGFACEHUB_API_TOKEN.
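
Mirroring the OpenAI setup above, the token can be set from Python (replace the placeholder with your own token):

import os
os.environ["HUGGINGFACEHUB_API_TOKEN"] = 'your-huggingface-token'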

Consider the following Python snippet that utilizes an open-source model developed by Google, the Flan-T5-XXL model:

from langchain.llms import HuggingFaceHub
llm = HuggingFaceHub(
    repo_id="google/flan-t5-xxl",
    model_kwargs={"temperature": 0.5, "max_length": 64},
)
prompt = "In which country is Tokyo?"
completion = llm(prompt)
print(completion)

This script takes a question as input and returns an answer, showcasing the knowledge and prediction capabilities of the model.

Step 4: Basic Prompt Engineering

To start with, we'll generate a simple prompt and see how the model responds.
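
Note that the snippets in this step assume a local text-to-text model loaded through the transformers classes imported in Step 2. As an assumed example (any comparable seq2seq checkpoint would work), you could load Google's Flan-T5 base model like this:

# Assumed example checkpoint; swap in any comparable text-to-text model.
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelWithLMHead.from_pretrained(model_name)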

prompt = 'Translate the following English text to French: "{0}"'
input_text = "Hello, how are you?"
input_ids = tokenizer.encode(prompt.format(input_text), return_tensors="pt")
generated_ids = model.generate(input_ids, max_length=100, do_sample=True, temperature=0.9)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))

In the above code snippet, we provide a prompt asking the model to translate English text into French. The language model then tries to translate the given text based on the prompt.

Step 5: Advanced Prompt Engineering

While the above approach works fine, it doesn't take full advantage of the power of prompt engineering. Let's improve upon it by introducing some more complex prompt structures.

prompt = 'As a highly proficient French translator, translate the following English text to French: "{0}"'
input_text = "Hello, how are you?"
input_ids = tokenizer.encode(prompt.format(input_text), return_tensors="pt")
generated_ids = model.generate(input_ids, max_length=100, do_sample=True, temperature=0.9)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))

In this code snippet, we modify the prompt to suggest that the translation is being done by a 'highly proficient French translator'. The change in the prompt can lead to improved translations, as the model now assumes the persona of an expert.

Building an Academic Literature Q&A System with Langchain

We'll build an Academic Literature Question and Answer system using LangChain that can answer questions about recently published academic papers.

Firstly, to set up the environment, we install the necessary dependencies.

pip install langchain arxiv openai transformers faiss-cpu

Following the installation, we create a new Python notebook and import the necessary libraries:

from langchain.llms import OpenAI
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.docstore.document import Document
import arxiv

The core of our Q&A system is the ability to fetch relevant academic papers related to a certain field, here Natural Language Processing (NLP), using the arXiv academic database. To accomplish this, we define a function get_arxiv_data(max_results=10). This function collects the most recent NLP paper summaries from arXiv and encapsulates them into LangChain Document objects, using the summary as content and the original entry id as the source.

We’ll use the arXiv API to fetch recent papers related to NLP:

def get_arxiv_data(max_results=10):
    search = arxiv.Search(
        query="NLP",
        max_results=max_results,
        sort_by=arxiv.SortCriterion.SubmittedDate,
    )

    documents = []

    for result in search.results():
        documents.append(Document(
            page_content=result.summary,
            metadata={"source": result.entry_id},
        ))
    return documents

This function retrieves the summaries of the most recent NLP papers from arXiv and converts them into LangChain Document objects. We're using the paper's summary and its unique entry id (URL to the paper) as the content and source, respectively.

Next, we define a small helper that runs the question-answering chain and prints its output:

def print_answer(question):
    print(
        chain(
            {
                "input_documents": sources,
                "question": question,
            },
            return_only_outputs=True,
        )["output_text"]
    )

Let's define our corpus and set up LangChain:

sources = get_arxiv_data(2)
chain = load_qa_with_sources_chain(OpenAI(temperature=0))

With our academic Q&A system now ready, we can test it by asking a question:

print_answer("What are the recent advancements in NLP?")

The output will be the answer to your question, citing the sources from which the information was extracted. For instance:

Recent advancements in NLP include Retriever-augmented instruction-following models and a novel computational framework for solving alternating current optimal power flow (ACOPF) problems using graphics processing units (GPUs).
SOURCES: http://arxiv.org/abs/2307.16877v1, http://arxiv.org/abs/2307.16830v1

You can easily switch models or alter the system as per your needs. For example, here we're switching to GPT-4, which ends up giving us a much better and more detailed response.

sources = get_arxiv_data(2)
chain = load_qa_with_sources_chain(OpenAI(model_name="gpt-4", temperature=0))
print_answer("What are the recent advancements in NLP?")

Asking the same question again now produces, for instance:

Recent advancements in Natural Language Processing (NLP) include the development of retriever-augmented instruction-following models for information-seeking tasks such as question answering (QA). These models can be adapted to various information domains and tasks without additional fine-tuning. However, they often struggle to stick to the provided knowledge and may hallucinate in their responses. Another advancement is the introduction of a computational framework for solving alternating current optimal power flow (ACOPF) problems using graphics processing units (GPUs). This approach utilizes a single-instruction, multiple-data (SIMD) abstraction of nonlinear programs (NLP) and employs a condensed-space interior-point method (IPM) with an inequality relaxation strategy. This strategy allows for the factorization of the KKT matrix without numerical pivoting, which has previously hampered the parallelization of the IPM algorithm.
SOURCES: http://arxiv.org/abs/2307.16877v1, http://arxiv.org/abs/2307.16830v1

A token in GPT-4 can be as short as one character or as long as one word. For instance, GPT-4-32K can process up to 32,000 tokens in a single run, while GPT-4-8K and GPT-3.5-turbo support 8,000 and 4,000 tokens, respectively. However, it is important to note that every interaction with these models comes with a cost that is directly proportional to the number of tokens processed, be it input or output.
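
To keep an eye on this, you can count tokens before sending a prompt. Below is a minimal sketch using the tiktoken library (a separate pip install); the sample text is arbitrary:

import tiktoken

# Count tokens the way OpenAI's models tokenize text.
encoding = tiktoken.encoding_for_model("gpt-4")
text = "What are the recent advancements in NLP?"
num_tokens = len(encoding.encode(text))
print(f"{num_tokens} tokens")  # you are billed for prompt and completion tokens alike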

In the context of our Q&A system, if a piece of academic literature exceeds the maximum token limit, the system will fail to process it in its entirety, affecting the quality and completeness of responses. To work around this issue, the text can be broken down into smaller parts that comply with the token limit.

FAISS (Facebook AI Similarity Search) assists in quickly finding the most relevant text chunks related to the user's query. It creates a vector representation of each text chunk and uses these vectors to identify and retrieve the chunks most similar to the vector representation of a given question.

It is important to remember that even with tools like FAISS, the need to divide the text into smaller chunks due to token limitations can sometimes result in the loss of context, affecting the quality of answers. Therefore, careful management and optimization of token usage are crucial when working with these large language models.

pip install faiss-cpu langchain

After ensuring the above libraries are installed, run

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores.faiss import FAISS
from langchain.text_splitter import CharacterTextSplitter

documents = get_arxiv_data(max_results=10)  # We can now feed in more data
document_chunks = []
splitter = CharacterTextSplitter(separator=" ", chunk_size=1024, chunk_overlap=0)
for document in documents:
    for chunk in splitter.split_text(document.page_content):
        document_chunks.append(Document(page_content=chunk, metadata=document.metadata))

search_index = FAISS.from_documents(document_chunks, OpenAIEmbeddings())
chain = load_qa_with_sources_chain(OpenAI(temperature=0))

def print_answer(question):
    print(
        chain(
            {
                "input_documents": search_index.similarity_search(question, k=4),
                "question": question,
            },
            return_only_outputs=True,
        )["output_text"]
    )

With the code complete, we now have a robust tool for querying the latest academic literature in the field of NLP.

Asking "What are the recent advancements in NLP?" again now returns, for instance:

Recent advancements in NLP include the use of deep neural networks (DNNs) for automatic text analysis and natural language processing tasks such as spell checking, language detection, entity extraction, author detection, question answering, and other tasks.
SOURCES: http://arxiv.org/abs/2307.10652v1, http://arxiv.org/abs/2307.07002v1, http://arxiv.org/abs/2307.12114v1, http://arxiv.org/abs/2307.16217v1

Conclusion

The integration of Large Language Models (LLMs) into applications has accelerated adoption across several domains, including language translation, sentiment analysis, and information retrieval. Prompt engineering is a powerful tool for maximizing the potential of these models, and Langchain is leading the way in simplifying this complex task. Its standardized interface, flexible prompt templates, robust model integration, and the innovative use of agents and chains ensure optimal outcomes for LLMs' performance.

However, despite these advancements, there are a few tips to keep in mind. As you use Langchain, it's essential to understand that the quality of the output depends heavily on the prompt's phrasing. Experimenting with different prompt styles and structures can yield improved results. Also, remember that while Langchain supports a variety of language models, each one has its strengths and weaknesses. Choosing the right one for your specific task is crucial. Lastly, it's important to remember that using these models comes with cost considerations, as token processing directly influences the cost of interactions.

As demonstrated in the step-by-step guide, Langchain can power robust applications, such as the Academic Literature Q&A system. With a growing user community and increasing prominence in the open-source landscape, Langchain promises to be a pivotal tool in harnessing the full potential of LLMs like GPT-4.
