Collaboration is a core part of an analyst’s day-to-day job. We frequently encounter high-level requests such as, “What will be the impact of the new feature?” or “What is going on with retention?”. Before jumping into writing queries and pulling data, we usually need to define the task more clearly: talk to stakeholders, understand their needs thoroughly, and determine how we can help best.
So, for an LLM-powered analyst, mastering the art of asking and answering follow-up questions is essential, since I can’t imagine an analyst working in isolation.
In this article, we will teach our LLM analyst to ask clarifying questions and follow long conversations. We will talk in detail about different memory implementations in LangChain.
We’ve already discussed many aspects of LLM agents in previous articles, so let me quickly summarise them. Also, LangChain has been updated since our last implementation, and it’s time to catch up.
LLM agents recap
Let’s quickly recap what we’ve already learned about LLM agents.
- We’ve discussed how to empower LLMs with external tools. Tools help models overcome their limitations (e.g., poor performance on maths tasks) and give them access to the world (e.g., your database or the internet). There’s a minimal tool definition sketch right after this recap.
- The core idea of LLM agents is to use the LLM as a reasoning engine that decides which actions to take and which tools to use. In this approach, you don’t hardcode the logic; you let the LLM decide on the next steps to reach the final goal.
- We’ve implemented an LLM-powered agent that can work with SQL databases and answer user requests.
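As a refresher, here is a minimal sketch of how a custom tool can be defined in LangChain. The get_table_columns tool and its hardcoded output are illustrative assumptions, not the actual code from the previous articles:

```python
from langchain.tools import tool

@tool
def get_table_columns(table: str) -> str:
    """Returns the list of columns for the given database table."""
    # A real implementation would query the database schema;
    # the hardcoded result below is just a placeholder.
    return "user_id, action_date, action_type"
```

The function’s docstring becomes the tool description that the LLM relies on when deciding whether to call the tool, so it’s worth keeping it accurate and specific.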
Since our last iteration, LangChain has been updated from version 0.0.350 to 0.1.0. The documentation and best practices for LLM agents have changed. This domain is developing quickly, so it’s no surprise that the tools are evolving, too. Let’s quickly go through the main changes. One of the most visible is the new package structure: integrations have moved out of the core langchain package into langchain-community and partner packages such as langchain-openai.
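For example, model integrations now live in separate packages. Here’s a minimal sketch of the updated setup (the model name is just an example):

```python
# pip install langchain langchain-openai

# In 0.1.x, OpenAI integrations moved out of the core langchain
# package into the langchain-openai partner package.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4-1106-preview", temperature=0)
```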
On top of that, LangChain has significantly improved its documentation: there’s now a clear, structured overview of the supported agent types and the differences between them.
It’s easier for models to work with tools that have a single input parameter, so some agent types are limited to such tools. However, in most real-life cases, tools take several arguments, so let’s focus on the agents capable of working with multiple inputs. That leaves us with just three options, outlined below; a code sketch for setting one up follows the list.
OpenAI tools
- It’s the most cutting-edge type of agent: it supports chat history, tools with multiple inputs and even parallel function calling.
- You can use it with recent OpenAI models (released after 1106), since they were fine-tuned for tool calling.
OpenAI functions
- OpenAI functions agents are close to OpenAI tools but slightly different under the hood.
- Such agents don’t support parallel function calling.
- You can use them with recent OpenAI models that were fine-tuned to work with functions (the complete list is here) or with compatible open-source LLMs.
Structured chat
- This approach is similar to ReAct: it instructs the agent to follow the Thought -> Action -> Observation framework.
- It doesn’t support parallel function calling, just like the OpenAI functions approach.
- You can use it with any model.
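Here’s a minimal sketch of how an OpenAI tools agent with a multi-input tool could be set up in LangChain 0.1.x. The get_metric tool, its placeholder output and the system prompt are illustrative assumptions rather than a real implementation:

```python
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain.tools import tool
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

# A hypothetical tool with multiple input parameters.
@tool
def get_metric(metric: str, month: str) -> str:
    """Returns the value of the specified metric for the given month."""
    return "42"  # placeholder instead of a real database query

llm = ChatOpenAI(model="gpt-4-1106-preview", temperature=0)
tools = [get_metric]

# The prompt must include an agent_scratchpad placeholder, where the
# executor injects intermediate tool calls and their results.
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful data analyst."),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

agent = create_openai_tools_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "What was the revenue in January?"})
```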
Also, you can notice that the experimental agent types we tried in the previous article, such as BabyAGI, Plan-and-execute and AutoGPT, are still not part of the suggested options. They might be included later (I hope), but for now I wouldn’t recommend using them in production.
After reading the new documentation, I’ve finally understood the difference between the OpenAI tools and OpenAI functions agents. With the OpenAI tools approach, an agent can call multiple tools within a single iteration, while the other agent types can’t. Let’s see how it works and why it matters.
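To make the difference concrete, here’s a sketch reusing the setup above (the comparison question is, again, just an illustrative example):

```python
from langchain.agents import create_openai_functions_agent

# A functions agent is created the same way, but it executes at most
# one function call per LLM round trip.
functions_agent = create_openai_functions_agent(llm, tools, prompt)

# This question needs two independent get_metric calls. The OpenAI
# tools agent can request both in a single LLM response and run them
# in the same iteration; the functions agent would need two iterations.
agent_executor.invoke({"input": "Compare the revenue for January and February"})
```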