Using LangChain: How to Add Conversational Memory to an LLM

User interactions need continuity. LangChain, a flexible framework for building applications around LLMs, addresses this with a feature known as Conversational Memory. It lets developers add memory to LLM applications so that the model can retain information from previous interactions and respond in context.

Conversational Memory is a core part of LangChain and is especially useful for building chatbots. In a stateless conversation, each interaction is treated in isolation. With Conversational Memory, the LLM can recall and use information from earlier exchanges, which makes the conversation flow more naturally and coherently.

  1. Initialize the LLM and ConversationChain

Let’s start by initializing the large language model and the conversation chain using LangChain. This sets the stage for implementing conversational memory.

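from langchain import OpenAI
from langchain.chains import ConversationChain

# First, initialize the large language model.
# NOTE: the constructor arguments are missing from the original listing.
# With none given, the API key is read from the OPENAI_API_KEY environment variable.
llm = OpenAI()

# Now initialize the conversation chain (arguments must be passed by keyword)
conversation_chain = ConversationChain(llm=llm)

  2. ConversationBufferMemory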

ConversationBufferMemory stores past interactions between the user and the AI in raw form, preserving the entire history. This lets the model understand and respond in context by taking the whole conversation into account in subsequent interactions.

from langchain.chains.conversation.memory import ConversationBufferMemory

# Assumes the OpenAI model (llm) has already been initialized elsewhere.
# Initialize the ConversationChain with ConversationBufferMemory.
# The arguments were truncated in the original listing; these follow from the text.
conversation_buf = ConversationChain(
    llm=llm,
    memory=ConversationBufferMemory()
)
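
To make the behaviour concrete, here is a minimal pure-Python sketch of what a raw buffer memory does conceptually. It is not LangChain's implementation: `ToyBufferMemory`, `toy_chat`, and the `echo_llm` stand-in are hypothetical names used only for illustration, and no API call is made.

```python
from typing import Callable, List, Tuple


class ToyBufferMemory:
    """Hypothetical stand-in for a raw buffer memory: keeps every turn verbatim."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def save(self, human: str, ai: str) -> None:
        # Store the exchange unchanged; nothing is summarized or dropped.
        self.turns.append((human, ai))

    @property
    def buffer(self) -> str:
        # Render the whole history as the text that is prepended to the prompt.
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


def toy_chat(llm: Callable[[str], str], memory: ToyBufferMemory, query: str) -> str:
    # Build the prompt from the full history plus the new query.
    prompt = f"{memory.buffer}\nHuman: {query}\nAI:".lstrip()
    reply = llm(prompt)
    memory.save(query, reply)
    return reply


def echo_llm(prompt: str) -> str:
    # Assumed placeholder model: reports the prompt length instead of calling an API.
    return f"(prompt had {len(prompt.split())} words)"


memory = ToyBufferMemory()
first_reply = toy_chat(echo_llm, memory, "My name is Ada.")
second_reply = toy_chat(echo_llm, memory, "What is my name?")
print(memory.buffer)
```

Because the second prompt carries the first exchange, the stand-in model sees a longer prompt on the second call. That is the trade-off of a raw buffer: full context, at the cost of a prompt that keeps growing.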



  3. Counting the Tokens

We add a count_tokens function so that we can track the number of tokens used in each interaction.

from langchain.callbacks import get_openai_callback

def count_tokens(chain, query):
    # Use get_openai_callback to track token usage
    with get_openai_callback() as cb:
        # Run the query through the conversation chain
        result = chain.run(query)
        # Print the total number of tokens used
        print(f'Spent a total of {cb.total_tokens} tokens')
    return result

  4. Checking the history

To check whether ConversationBufferMemory has saved the history, we can print the conversation history held in the chain’s memory, for example with `print(conversation_buf.memory.buffer)`. The output shows that the buffer saves every interaction in the chat history.

  5. ConversationSummaryMemory

With ConversationSummaryMemory, the conversation history is summarized before it is passed to the history parameter of the prompt. This helps control token usage, preventing tokens from being exhausted quickly and helping longer conversations stay within the model’s context window.

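from langchain.chains.conversation.memory import ConversationSummaryMemory

# Assumes the OpenAI model (llm) has already been initialized.
# The arguments were truncated in the original listing; the summary memory
# needs an LLM of its own to write the summaries.
conversation = ConversationChain(
    llm=llm,
    memory=ConversationSummaryMemory(llm=llm)
)

# Access and print the template attribute from ConversationSummaryMemory
print(conversation.memory.prompt.template)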


ConversationSummaryMemory has an advantage for longer conversations: it consumes more tokens at first, but its token usage grows more slowly as the conversation progresses. This makes summarization useful for prolonged interactions, where it uses tokens more efficiently than ConversationBufferMemory, whose usage grows linearly with the number of tokens in the chat. Even with summarization, however, token limits still constrain how much history can be carried over time.
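
The growth pattern described above can be illustrated with a small pure-Python simulation. This is only a sketch under stated assumptions: words stand in for tokens, `toy_summarize` is a hypothetical placeholder that truncates instead of asking an LLM for a summary, and the sample turns are made up.

```python
from typing import List


def count_words(text: str) -> int:
    # Crude token proxy: whitespace-separated words (real tokenizers differ).
    return len(text.split())


def toy_summarize(text: str, max_words: int = 20) -> str:
    # Assumed placeholder summarizer: keeps only the last max_words words.
    # A real summary memory would ask the LLM to write the summary instead.
    return " ".join(text.split()[-max_words:])


# Ten made-up exchanges of ten words each.
turns: List[str] = [
    f"Human: question {i} AI: a fairly long answer number {i}" for i in range(1, 11)
]

buffer_history = ""
summary_history = ""
buffer_sizes: List[int] = []
summary_sizes: List[int] = []

for turn in turns:
    # Buffer memory: append every turn verbatim, so the size grows with each turn.
    buffer_history = f"{buffer_history} {turn}".strip()
    buffer_sizes.append(count_words(buffer_history))

    # Summary memory: fold the new turn into the running summary, then compress.
    summary_history = toy_summarize(f"{summary_history} {turn}".strip())
    summary_sizes.append(count_words(summary_history))

print("buffer :", buffer_sizes)
print("summary:", summary_sizes)
```

The buffer history grows by the same amount every turn, while the summarized history levels off at the summarizer's cap. The simulation leaves out the extra tokens spent on the summarization calls themselves, which is why a real summary memory costs more in the first few turns.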

  6. ConversationBufferWindowMemory

We initialize the ConversationChain with ConversationBufferWindowMemory and set the parameter `k` to 1, which gives a windowed buffer with a window size of 1. Only the most recent interaction is retained in memory, and everything before it is discarded. This windowed buffer is useful when you want to maintain context with a limited history.

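from langchain.chains.conversation.memory import ConversationBufferWindowMemory

# Assumes llm has already been initialized.
# Initialize ConversationChain with ConversationBufferWindowMemory.
# The arguments were truncated in the original listing; k=1 follows from the text.
conversation = ConversationChain(
    llm=llm,
    memory=ConversationBufferWindowMemory(k=1)
)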
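
As a rough illustration of the windowing idea, the sketch below keeps only the last `k` exchanges using a bounded deque. `ToyWindowMemory` is a hypothetical class written for this example, not LangChain's implementation, and the sample exchanges are made up.

```python
from collections import deque
from typing import Deque, Tuple


class ToyWindowMemory:
    """Hypothetical sliding-window memory: keeps only the last k exchanges."""

    def __init__(self, k: int = 1) -> None:
        # deque(maxlen=k) silently discards the oldest exchange once it is full.
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    @property
    def buffer(self) -> str:
        # Render only the exchanges that are still inside the window.
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


window = ToyWindowMemory(k=1)
window.save("My name is Ada.", "Nice to meet you, Ada.")
window.save("I like chess.", "Chess is a great game.")
print(window.buffer)
```

With `k=1`, the first exchange has already been dropped by the time the second is saved, so a follow-up question about the user's name could no longer be answered from memory. A larger `k` trades more prompt tokens for a longer recall horizon.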



  7. ConversationSummaryBufferMemory

Here, a ConversationChain named conversation_sum_bufw is initialized with ConversationSummaryBufferMemory, for example by passing `ConversationSummaryBufferMemory(llm=llm, max_token_limit=650)` as the chain’s `memory` argument. This memory type combines summarization with a buffer window: it keeps a summary of the earlier interactions while retaining the most recent tokens verbatim, with a token limit of 650 to control memory usage.
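
The combination of a verbatim recent buffer and a running summary can be sketched in plain Python as follows. This is a simplified illustration, not LangChain's code: `ToySummaryBufferMemory` and its `_summarize` placeholder are hypothetical, words stand in for tokens, and the demo uses a deliberately small limit so that pruning is visible.

```python
from typing import List


def count_words(text: str) -> int:
    # Crude token proxy; a real implementation would use the model's tokenizer.
    return len(text.split())


class ToySummaryBufferMemory:
    """Hypothetical sketch: recent turns stay verbatim, older ones are summarized."""

    def __init__(self, max_token_limit: int = 650) -> None:
        self.max_token_limit = max_token_limit
        self.summary = ""
        self.recent: List[str] = []

    def _summarize(self, summary: str, turn: str) -> str:
        # Assumed placeholder: keep the first five words of the evicted turn.
        # A real summary-buffer memory would ask the LLM to update the summary.
        gist = " ".join(turn.split()[:5])
        return f"{summary} | {gist}".strip(" |")

    def save(self, turn: str) -> None:
        self.recent.append(turn)
        # Evict the oldest verbatim turns until the recent buffer fits the limit.
        while self.recent and sum(count_words(t) for t in self.recent) > self.max_token_limit:
            self.summary = self._summarize(self.summary, self.recent.pop(0))


memory = ToySummaryBufferMemory(max_token_limit=12)  # small limit for the demo
for i in range(1, 5):
    memory.save(f"Human: fact number {i} AI: noted fact {i}")

print("summary:", memory.summary)
print("recent :", memory.recent)
```

After four turns, only the latest turn remains verbatim and the three earlier ones survive as condensed entries in the summary. Raising the limit keeps more of the recent conversation word for word before anything is folded into the summary.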

In conclusion, LangChain offers a variety of options for managing conversation state with Large Language Models. The examples above show different ways to tailor conversational memory to specific scenarios. Beyond those listed, there are further options such as ConversationKGMemory (a knowledge-graph memory) and ConversationEntityMemory.

Whether you send the entire history, use summaries, track token counts, or combine these methods, it is important to explore the available options and choose the pattern that suits the use case. LangChain is flexible here: users can implement custom memory modules, combine multiple memory types within the same chain, integrate them with agents, and more.


Manya Goyal is an AI and Research consulting intern at MarktechPost. She is currently pursuing her B.Tech at Guru Gobind Singh Indraprastha University (Bhagwan Parshuram Institute of Technology). She is a Data Science enthusiast with a keen interest in the applications of artificial intelligence across various fields. She is a podcaster on Spotify and is passionate about exploring.
