Building a Chatbot

import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass.getpass()

from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-3.5-turbo")

DeepSeek

from langchain_openai import ChatOpenAI

api_key = ""  # fill in your DeepSeek API key

model = ChatOpenAI(
    base_url="https://api.deepseek.com/v1",
    api_key=api_key,
    model="deepseek-chat",
)

TogetherAI

import getpass
import os

os.environ["TOGETHER_API_KEY"] = getpass.getpass()

from langchain_openai import ChatOpenAI

model = ChatOpenAI(
    base_url="https://api.together.xyz/v1",
    api_key=os.environ["TOGETHER_API_KEY"],
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
)

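We start by using the model directly. ChatModels are instances of LangChain "Runnables", which means they expose a standard interface for interacting with them. To simply call the model, we can pass a list of messages to the .invoke method.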

from langchain_core.messages import HumanMessage

model.invoke([HumanMessage(content="Hi! I'm Bob")])

API reference: HumanMessage

AIMessage(content='Hello Bob! How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 12, 'total_tokens': 22}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-d939617f-0c3b-45e9-a93f-13dafecbd4b5-0', usage_metadata={'input_tokens': 12, 'output_tokens': 10, 'total_tokens': 22})

The model on its own has no concept of state. For example, if you ask a follow-up question:

model.invoke([HumanMessage(content="What's my name?")])

AIMessage(content="I'm sorry, I don't have access to personal information unless you provide it to me. How may I assist you today?", response_metadata={'token_usage': {'completion_tokens': 26, 'prompt_tokens': 12, 'total_tokens': 38}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-47bc8c20-af7b-4fd2-9345-f0e9fdf18ce3-0', usage_metadata={'input_tokens': 12, 'output_tokens': 26, 'total_tokens': 38})

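We can see that it does not take the previous conversation turn into context and cannot answer the question. This makes for a terrible chatbot experience!

To get around this, we need to pass the entire conversation history into the model. Let's see what happens when we do that: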

from langchain_core.messages import AIMessage

model.invoke(
    [
        HumanMessage(content="Hi! I'm Bob"),
        AIMessage(content="Hello Bob! How can I assist you today?"),
        HumanMessage(content="What's my name?"),
    ]
)

API reference: AIMessage

AIMessage(content='Your name is Bob. How can I help you, Bob?', response_metadata={'token_usage': {'completion_tokens': 13, 'prompt_tokens': 35, 'total_tokens': 48}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-9f90291b-4df9-41dc-9ecf-1ee1081f4490-0', usage_metadata={'input_tokens': 35, 'output_tokens': 13, 'total_tokens': 48})

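We can see that the model now answers correctly, because the earlier turns are part of its input.

This is the basic idea underpinning a chatbot's ability to interact conversationally. So how do we best implement it?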
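To make the point concrete without calling a real API, here is a minimal sketch in plain Python. The toy_model function and the (role, text) tuples are hypothetical stand-ins, not LangChain objects; the only thing the sketch is meant to show is that a stateless function can use earlier turns only when they are passed in again.

```python
import re


def toy_model(messages):
    """Stateless stand-in for a chat model: it sees only this call's messages."""
    # Look through the human messages of this call for an "I'm <name>" intro.
    name = None
    for role, text in messages:
        if role == "human":
            match = re.search(r"I'm (\w+)", text)
            if match:
                name = match.group(1)
    question = messages[-1][1]
    if "my name" in question:
        return f"Your name is {name}." if name else "I don't know your name."
    return "Hello!"


# Two separate calls: nothing carries over between them.
first = toy_model([("human", "Hi! I'm Bob")])
alone = toy_model([("human", "What's my name?")])

# One call with the full history: the earlier turn is part of the input.
with_history = toy_model(
    [("human", "Hi! I'm Bob"), ("ai", first), ("human", "What's my name?")]
)
print(alone)
print(with_history)
```

The second call fails only because the introduction was never part of its input, which is the same situation as the real model above.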

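3 Message History

We can use a Message History class to wrap our model and make it stateful. It keeps track of the model's inputs and outputs and stores them in some datastore. Future interactions then load those messages and pass them into the chain as part of the input. Let's see how to use it!

First, let's make sure langchain-community is installed, since we will use an integration from it to store message history.

After that, we can import the relevant classes and set up our chain, which wraps the model and adds this message history. A key part here is the function we pass in as get_session_history. This function is expected to take a session_id and return a Message History object. The session_id distinguishes separate conversations and should be passed in as part of the config when calling the new chain (we will show how to do that).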

from langchain_core.chat_history import (
    BaseChatMessageHistory,
    InMemoryChatMessageHistory,
)
from langchain_core.runnables.history import RunnableWithMessageHistory

store = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = InMemoryChatMessageHistory()
    return store[session_id]


with_message_history = RunnableWithMessageHistory(model, get_session_history)

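API reference: BaseChatMessageHistory | InMemoryChatMessageHistory | RunnableWithMessageHistory

We now need to create a config that we pass into the runnable every time. This config contains information that is not part of the input directly but is still useful. In this case, we want to include a session_id. It should look like: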

config = {"configurable": {"session_id": "abc2"}}

response = with_message_history.invoke(
    [HumanMessage(content="Hi! I'm Bob")],
    config=config,
)

response.content

'Hi Bob! How can I assist you today?'

response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

'Your name is Bob. How can I help you today, Bob?'

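Our chatbot now remembers things about us. If we change the config to reference a different session_id, we can see that it starts the conversation fresh.

config = {"configurable": {"session_id": "abc3"}}  # session_id changed to abc3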

response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

"I'm sorry, I cannot determine your name as I am an AI assistant and do not have access to that information."

We can see that it does not remember the earlier conversation.

We can also return to the earlier conversation simply by changing session_id back to abc2.

config = {"configurable": {"session_id": "abc2"}}

response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

'Your name is Bob. How can I help you today, Bob?'

This is how we can support a chatbot holding conversations with many users!
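The session behaviour shown above can be pictured with a small plain-Python sketch. It is a simplified illustration under stated assumptions, not how RunnableWithMessageHistory is implemented: with_history, get_history, and counting_model are hypothetical names, and messages are plain (role, text) tuples.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, text); simplified stand-in for message objects

store: Dict[str, List[Message]] = {}


def get_history(session_id: str) -> List[Message]:
    """Return the message list for a session, creating it on first use."""
    return store.setdefault(session_id, [])


def with_history(model: Callable[[List[Message]], str]):
    """Wrap a stateless model so each call loads, then updates, one session."""

    def invoke(new_messages: List[Message], session_id: str) -> str:
        history = get_history(session_id)
        reply = model(history + new_messages)  # model sees old + new messages
        history.extend(new_messages)           # record the input
        history.append(("ai", reply))          # record the output
        return reply

    return invoke


def counting_model(messages: List[Message]) -> str:
    """Hypothetical model that reports how many messages it was given."""
    return f"I can see {len(messages)} message(s)."


chat = with_history(counting_model)
r1 = chat([("human", "Hi! I'm Bob")], "abc2")
r2 = chat([("human", "What's my name?")], "abc2")  # same session: history grows
r3 = chat([("human", "What's my name?")], "abc3")  # new session: starts empty
print(r1, r2, r3, sep="\n")
```

Keying the store by session_id is what keeps conversations separate: switching the id switches which list is loaded and updated.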

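Right now, all we have done is add a simple persistence layer around the model. We can start to make it more sophisticated and personalized by adding a prompt template.

4 Prompt templates

Prompt templates help turn raw user information into a format the LLM can work with. In this case, the raw user input is just a message, which we pass to the LLM. Let's make that a bit more complicated. First, we will add a system message with some custom instructions (while still taking messages as input). Next, we will add more input besides just the messages.

First, let's add a system message. To do this, we will create a ChatPromptTemplate and use MessagesPlaceholder to pass all the messages in.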

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

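API reference: ChatPromptTemplate | MessagesPlaceholder

Note that this slightly changes the input type: rather than passing in a list of messages, we now pass in a dictionary with a messages key that contains a list of messages.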

response = chain.invoke({"messages": [HumanMessage(content="hi! I'm bob")]})

response.content

'Hello Bob! How can I assist you today?'

We can now wrap this in the same Message History object as before:

with_message_history = RunnableWithMessageHistory(chain, get_session_history)

config = {"configurable": {"session_id": "abc5"}}

response = with_message_history.invoke(
    [HumanMessage(content="Hi! I'm Jim")],
    config=config,
)

response.content

'Hi Jim! How can I assist you today?'

response = with_message_history.invoke(
    [HumanMessage(content="What's my name?")],
    config=config,
)

response.content

'Your name is Jim. How can I help you today, Jim?'

Now let's make our prompt a little more complicated. Let's assume the prompt template now looks like this:

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | model

Note that we have added a new language input to the prompt. We can now invoke the chain and pass in a language of our choice.

response = chain.invoke(
    {"messages": [HumanMessage(content="hi! I'm bob")], "language": "Spanish"}
)

response.content

'¡Hola Bob! ¿Cómo puedo ayudarte hoy?'
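As a rough picture of what the template step does with that dictionary, the following plain-Python sketch fills the {language} variable and expands the messages placeholder. format_prompt is a hypothetical helper written for illustration, not part of LangChain, and messages are simplified to (role, text) tuples.

```python
def format_prompt(inputs):
    """Build the final message list from a dict of inputs (simplified)."""
    system_text = (
        "You are a helpful assistant. "
        "Answer all questions to the best of your ability in {language}."
    ).format(language=inputs["language"])
    # The system message comes first; the placeholder expands to the
    # caller's messages, in order.
    return [("system", system_text)] + list(inputs["messages"])


formatted = format_prompt(
    {"messages": [("human", "hi! I'm bob")], "language": "Spanish"}
)
for role, text in formatted:
    print(f"{role}: {text}")
```

Each key in the input dictionary corresponds to one variable in the template, which is why the input has to be a dictionary once the prompt has more than one variable.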

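Let's now wrap this more complicated chain in a Message History class. This time, because the input has multiple keys, we need to specify the correct key to use for saving the chat history.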

with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)

config = {"configurable": {"session_id": "abc11"}}

response = with_message_history.invoke(
    {"messages": [HumanMessage(content="hi! I'm todd")], "language": "Spanish"},
    config=config,
)

response.content

'¡Hola Todd! ¿En qué puedo ayudarte hoy?'

response = with_message_history.invoke(
    {"messages": [HumanMessage(content="whats my name?")], "language": "Spanish"},
    config=config,
)

response.content

'Tu nombre es Todd.'

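Tips

I think a few concepts here are especially important:

HumanMessage/AIMessage are message types.
session_id is the key that determines which set of messages is used.
RunnableWithMessageHistory is what ties message history to the model. It takes the model (or chain) and a get_session_history function, and get_session_history takes a session_id and returns the corresponding message history.
If a prompt template is used, the argument passed to invoke is a dictionary rather than a list of messages, and the dictionary's keys are the variable names defined in the prompt template.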

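5 Managing conversation history

One important concept to understand when building chatbots is how to manage conversation history.

Left unmanaged, the list of messages grows without bound and can overflow the LLM's context window.

It is therefore important to add a step that limits the size of the messages you pass in.

Importantly, this step should run before the prompt template but after the previous messages have been loaded from Message History.

To do this, we can add a simple step in front of the prompt that modifies the messages key appropriately, and then wrap that new chain in the Message History class.

LangChain comes with a few built-in helpers for managing a list of messages. In this case, we will use the trim_messages helper to reduce how many messages we send to the model. The trimmer lets us specify how many tokens to keep, along with other parameters such as whether to always keep the system message and whether to allow partial messages: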

from langchain_core.messages import SystemMessage, trim_messages

trimmer = trim_messages(
    max_tokens=65,
    strategy="last",
    token_counter=model,
    include_system=True,
    allow_partial=False,
    start_on="human",
)

messages = [
    SystemMessage(content="you're a good assistant"),
    HumanMessage(content="hi! I'm bob"),
    AIMessage(content="hi!"),
    HumanMessage(content="I like vanilla ice cream"),
    AIMessage(content="nice"),
    HumanMessage(content="whats 2 + 2"),
    AIMessage(content="4"),
    HumanMessage(content="thanks"),
    AIMessage(content="no problem!"),
    HumanMessage(content="having fun?"),
    AIMessage(content="yes!"),
]

trimmer.invoke(messages)

API reference: SystemMessage | trim_messages

[SystemMessage(content="you're a good assistant"),
 HumanMessage(content='whats 2 + 2'),
 AIMessage(content='4'),
 HumanMessage(content='thanks'),
 AIMessage(content='no problem!'),
 HumanMessage(content='having fun?'),
 AIMessage(content='yes!')]
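To see why the oldest turns are the ones that disappear, here is a plain-Python sketch of a "keep the last messages that fit" strategy. It is not the trim_messages implementation: trim_last and count_tokens are hypothetical, the token counter simply counts whitespace-separated words, and the budget of 15 is in those toy units rather than real model tokens.

```python
def count_tokens(text):
    """Toy token counter: one token per whitespace-separated word (assumption)."""
    return len(text.split())


def trim_last(messages, max_tokens):
    """Keep the system message plus the most recent messages within budget."""
    system = [m for m in messages[:1] if m[0] == "system"]
    rest = messages[len(system):]
    budget = max_tokens - sum(count_tokens(text) for _, text in system)

    kept = []
    # Walk backwards from the newest message, keeping whole messages only.
    for role, text in reversed(rest):
        cost = count_tokens(text)
        if cost > budget:
            break
        kept.append((role, text))
        budget -= cost
    kept.reverse()

    # Drop leading messages until the kept history starts on a human turn.
    while kept and kept[0][0] != "human":
        kept.pop(0)
    return system + kept


history = [
    ("system", "you're a good assistant"),
    ("human", "hi! I'm bob"),
    ("ai", "hi!"),
    ("human", "I like vanilla ice cream"),
    ("ai", "nice"),
    ("human", "whats 2 + 2"),
    ("ai", "4"),
    ("human", "thanks"),
    ("ai", "no problem!"),
    ("human", "having fun?"),
    ("ai", "yes!"),
]
trimmed = trim_last(history, max_tokens=15)
for role, text in trimmed:
    print(f"{role}: {text}")
```

With this toy counter the result keeps the system message and everything from "whats 2 + 2" onward, so the introduction "hi! I'm bob" is no longer part of what would be sent.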

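To use it in our chain, we just need to run the trimmer before passing the messages input to the prompt.

Now if we ask the model for our name, it will not know it, because we trimmed that part of the chat history: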

from operator import itemgetter

from langchain_core.runnables import RunnablePassthrough

chain = (
    RunnablePassthrough.assign(messages=itemgetter("messages") | trimmer)
    | prompt
    | model
)

response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="what's my name?")],
        "language": "English",
    }
)
response.content

API reference: RunnablePassthrough

"I'm sorry, but I don't have access to your personal information. How can I assist you today?"

But if we ask about information that is within the last few messages, it remembers:

response = chain.invoke(
    {
        "messages": messages + [HumanMessage(content="what math problem did i ask")],
        "language": "English",
    }
)
response.content

'You asked "what\'s 2 + 2?"'
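The chain above relies on a step that rewrites only the messages key and lets the other inputs through unchanged. A minimal plain-Python sketch of that idea follows; assign and keep_last_two are hypothetical helpers for illustration and are not the RunnablePassthrough API.

```python
from operator import itemgetter


def assign(**updates):
    """Return a step that copies the input dict and overwrites selected keys."""

    def step(inputs):
        result = dict(inputs)  # other keys (e.g. "language") pass through
        for key, fn in updates.items():
            result[key] = fn(inputs)  # each new value is computed from the input
        return result

    return step


def keep_last_two(messages):
    """Hypothetical trimmer: keep only the two most recent messages."""
    return messages[-2:]


get_messages = itemgetter("messages")
trim_step = assign(messages=lambda inputs: keep_last_two(get_messages(inputs)))

out = trim_step({"messages": ["m1", "m2", "m3"], "language": "English"})
print(out)
```

Because the step returns a full dictionary, the prompt that runs next still receives both messages and language.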

Now let's wrap this in the Message History:

with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="messages",
)

config = {"configurable": {"session_id": "abc20"}}

response = with_message_history.invoke(
    {
        "messages": messages + [HumanMessage(content="whats my name?")],
        "language": "English",
    },
    config=config,
)

response.content

"I'm sorry, I don't have access to that information. How can I assist you today?"

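As expected, the first message, in which we stated our name, has been trimmed. In addition, there are now two new messages in the chat history (our latest question and the latest response). This means that even more information that used to be accessible in our conversation history is no longer available! In this case, our initial math question has been trimmed from the history as well, so the model no longer knows about it: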

response = with_message_history.invoke(
    {
        "messages": [HumanMessage(content="what math problem did i ask?")],
        "language": "English",
    },
    config=config,
)

response.content

"You haven't asked a math problem yet. Feel free to ask any math-related question you have, and I'll be happy to help you with it."

6 Streaming

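Now we have a functioning chatbot. However, one really important UX consideration for chatbot applications is streaming. LLMs can sometimes take a while to respond, so to improve the user experience, most applications stream back each token as it is generated. This lets the user see progress.

It is actually very easy to do!

All chains expose a .stream method, and chains that use message history are no different. We can simply use that method to get a streaming response.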

config = {"configurable": {"session_id": "abc15"}}
for r in with_message_history.stream(
    {
        "messages": [HumanMessage(content="hi! I'm todd. tell me a joke")],
        "language": "English",
    },
    config=config,
):
    print(r.content, end="|")

|Hi| Todd|!| Sure|,| here|'s| a| joke| for| you|:| Why| couldn|'t| the| bicycle| find| its| way| home|?| Because| it| lost| its| bearings|!| 😄||
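The consumption pattern in that loop can be tried without a model. The sketch below uses a hypothetical generator, stream_reply, that yields a fixed reply word by word; real chunks from .stream are message chunk objects rather than strings, so this only illustrates the "print each piece as it arrives, then join" pattern.

```python
import time


def stream_reply(text, delay=0.0):
    """Yield a reply piece by piece, as a streaming model would (toy version)."""
    for word in text.split(" "):
        time.sleep(delay)  # stands in for generation latency
        yield word + " "


chunks = []
for chunk in stream_reply("Why couldn't the bicycle find its way home?"):
    chunks.append(chunk)
    print(chunk, end="|", flush=True)  # show each piece as soon as it arrives
print()

# Joining the pieces recovers the complete reply once streaming has finished.
full_reply = "".join(chunks).strip()
```

Collecting the chunks while printing them is a common choice when the full reply is also needed afterwards, for example to log it.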