Master LangChain: Building Advanced LLM Apps in Python

Introduction

LangChain is an open-source orchestration framework designed to create applications based on Large Language Models (LLMs). Available in both JavaScript and Python, it provides tools and abstractions that enhance the customization, accuracy, and relevance of the information these models generate. This article focuses on the Python API to illustrate how LangChain components can be used to build sophisticated LLM applications. But why is LangChain important? Let’s dive into its key components and understand their significance.

This article is aimed at developers and technical professionals interested in building agentic software using LangChain. You’ll learn about the key components of LangChain, including how to specify and use different LLMs, create prompt templates, build and manage chains, utilize output parsers, handle chat history, and integrate tools and agents effectively.

LLMs

LangChain offers an abstraction that allows different LLMs to be used interchangeably. For instance, we might choose `gpt-3.5-turbo` for data extraction, where a cheaper, faster model is sufficient, while preferring `gpt-4` for generating high-quality text for end users. Alternatively, we could use `Llama 3` for either task to reduce costs.

Additionally, we can adjust the temperature of the same LLM to achieve more deterministic results. To specify the LLM in a chain, consider the following example using OpenAI:

We can employ this LLM in the execution of a chain. Many other LLM integrations, maintained by the LangChain team or the community, can be imported from LangChain's modules, and custom models can be defined as well.

PromptTemplates

LangChain provides various prompt templates to simplify interaction with the LLM. These templates can be used individually or as part of a sequence. Two crucial factory methods for creating them are `from_template` and `from_messages`, each with its own pros and cons depending on the use case.

When employing a template, we define the input variables that will be filled in when our LLM is invoked. Let’s explore some examples that achieve practically the same result.

From Template

This method takes a string template and replaces the text in curly braces with the actual input during chain execution. It is particularly helpful when generating text without an existing chat history. Here’s an example:

 

You can also set a specific scenario for the AI to respond to, ensuring responses are based solely on that context:

 

  You will only answer questions if the information is present in this context:

 

From Messages

For continuing a conversation, the `from_messages` method builds the list of role-tagged messages sent to the LLM, much like the message format of the OpenAI chat API. In the History section below, this template supplies the prompt along with the chat history:

 

Chains

When utilizing the OpenAI API directly, multiple steps are required to run an LLM: build a prompt, invoke the model, and extract the output from the response. LangChain simplifies this process by introducing chains, which use a bash-like pipe syntax to feed the output of each step into the input of the next. Here’s an example:


The fascinating aspect of chains lies in the ability to concatenate multiple chains, linking them together to achieve more intricate tasks.

 

Output Parsers

Output parsers play a crucial role when working with LLMs. Relying on prompting alone to define the output format can be time-consuming and, depending on the model’s temperature, may fail when a specific output type is required. Two widely used parsers are `JsonOutputParser` and `StrOutputParser`.

  • `StrOutputParser` → John Doe

    This parser eliminates the need to extract text from the response object, ensuring the output is a plain string.
    To utilize it, simply build the chain as follows:
    chain = prompt | llm | StrOutputParser()
  • `JsonOutputParser` → {"name": "John", "lastname": "Doe"}
    Using the Pydantic library is recommended here: define a model and use it as the type format instruction:


This feature enables the extraction of various types of information from extensive texts or related datasets.

 

History

Chat history is fundamental for conversational chatbots or agents. Memory can be managed in various ways, from storing messages locally to using cloud storage providers like Upstash Redis:

 

Tools

LangChain offers a wide array of tools, including document loaders for various websites, in-memory vector databases, and text splitters. These tools streamline the process of gathering, storing, and retrieving information, making it easily accessible for the agent’s use.

Developers can also create custom tools tailored to the model’s requirements. These can perform diverse tasks such as sending emails, adding calendar events, or any other specified function. A custom tool is crafted from a defined function plus a description outlining the tool’s purpose and the inputs it needs to operate.

The agent (which we will discuss in a moment) is responsible for selecting the appropriate tool to fulfill a user’s request based on the tools’ descriptions. It then collects and processes the information needed to execute the chosen function, even asking the user for more details when required information is missing.

For instance, let’s define a tool to fetch web results. Writing the tool’s description (its docstring) carefully is crucial, as the model bases its decisions on it:

 

Agents

Every effective AI solution that completes tasks can be considered an “Agent.”

Based on our previous discussions, you can create one using LangChain. Such an agent may possess memory, use tools, rely on a specific model (LLM), and execute actions according to prompts, parsing the model’s output to decide the next step.

When utilizing an agent, additional logic is required to accomplish tasks. Initially, the model identifies the tool to be used and generates the corresponding inputs. The selected tool is then executed with the provided input, and the resulting output is passed back to the agent to generate a response for the end user.

To illustrate, let’s consider an Agent that can respond to queries and retrieve information from the web when necessary.

First, define the tools, establish the model, and bind them together so the model can recognize the available tools. Assuming we have already defined these tools in the previous step, we can now proceed with setting up our agent.

The prompt should include a scratchpad, which is essential for storing temporary data while utilizing various tools.

The chain is defined as usual. However, we also need to extract the intermediate steps recorded during execution (the array we will populate in the next step) and convert them into scratchpad messages.


Finally, let’s build a loop so the agent keeps executing tools as needed until the response is ready. The final result will be an instance of `AgentFinish`, from which we will simply return the log.

 

Conclusion

LangChain is a powerful tool that allows developers to carefully craft agentic software to achieve numerous and diverse tasks. By focusing on the flow of actions rather than the low-level implementation details, LangChain enables the development of sophisticated solutions more efficiently. Whether it’s managing chat histories, integrating various tools, or utilizing multiple LLMs, LangChain provides the flexibility and abstraction necessary to streamline the development process. For developers looking to build advanced AI applications, LangChain offers a robust framework that enhances customization, precision, and relevance, ultimately accelerating the path from concept to implementation.