Log and register AI agents

2025-09-16

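Log AI agents using Mosaic AI Agent Framework. Logging an agent is the foundation of the development process. Logging captures a "point in time" snapshot of the agent's code and configuration so that you can evaluate the quality of that configuration.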

Requirements

Databricks recommends installing the latest version of databricks-sdk.

%pip install databricks-sdk

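Code-based logging

Databricks recommends using MLflow's Models from Code functionality when logging agents.

In this approach, the agent's code is captured as a Python file and the Python environment is captured as a list of packages. When the agent is deployed, the Python environment is restored and the agent's code is executed to load the agent into memory, so that it can be invoked when the endpoint is called.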
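The "code is executed to load the agent" idea can be pictured with a toy, standard-library-only sketch. This is an analogy, not MLflow's implementation: the `MODEL` variable, `AGENT_SOURCE`, and `load_agent_from_code` are hypothetical names standing in for what `mlflow.models.set_model()` and the serving environment do.

```python
import runpy
import tempfile
from pathlib import Path

# Contents of a toy agent file. In a real project this would be agent.py.
AGENT_SOURCE = '''
def agent(messages):
    """Toy agent: echo the last user message."""
    return {"content": "You said: " + messages[-1]["content"]}

# Stand-in for mlflow.models.set_model(...): expose the object to load.
MODEL = agent
'''


def load_agent_from_code(path):
    """Execute the file at `path` and return the object it exposes as MODEL."""
    namespace = runpy.run_path(str(path))
    return namespace["MODEL"]


with tempfile.TemporaryDirectory() as tmp:
    agent_file = Path(tmp) / "agent.py"
    agent_file.write_text(AGENT_SOURCE)
    # "Loading the serving environment": run the code file to get the agent.
    loaded = load_agent_from_code(agent_file)
    # "A serving request comes in": call the loaded agent.
    reply = loaded([{"role": "user", "content": "hello"}])
```

The point of the sketch is that only the source file is stored; the agent object is rebuilt by running that file at load time.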

You can combine this approach with pre-deployment validation APIs such as mlflow.models.predict() to ensure that the agent runs reliably when deployed for serving.
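As a rough illustration of pre-deployment validation, the sketch below first checks that an input example has the chat-style `messages` shape used in the examples on this page, then passes it to `mlflow.models.predict()`. The helper names `check_chat_input` and `validate_before_deploy` are hypothetical, and the MLflow import is deferred so that the shape check can run without MLflow installed.

```python
def check_chat_input(example):
    """Return True if `example` looks like {"messages": [{"role": ..., "content": ...}, ...]}."""
    messages = example.get("messages") if isinstance(example, dict) else None
    if not isinstance(messages, list) or not messages:
        return False
    return all(
        isinstance(m, dict)
        and isinstance(m.get("role"), str)
        and isinstance(m.get("content"), str)
        for m in messages
    )


def validate_before_deploy(model_uri, input_example):
    """Run the logged agent once before deployment (hypothetical helper)."""
    if not check_chat_input(input_example):
        raise ValueError("input_example is not a chat-style request")

    import mlflow  # deferred import: only needed for the actual validation call

    # Runs a single prediction against the logged model in an isolated environment.
    mlflow.models.predict(model_uri=model_uri, input_data=input_example)
```

In a driver notebook this might be called as `validate_before_deploy(logged_agent_info.model_uri, input_example)` right after logging, so that environment or input problems surface before the endpoint is created.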

For an example of code-based logging, see the ResponsesAgent authoring example notebooks.

Infer the model signature during logging

Note

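Databricks recommends authoring agents with the ResponsesAgent interface. If you use ResponsesAgent, you can skip this section; MLflow automatically infers a valid signature for your agent.

If you do not use the ResponsesAgent interface, you must specify your agent's MLflow Model Signature at logging time using one of the following methods:

Manually define the signature.
Use MLflow's Model Signature inference capabilities to automatically generate the agent's signature from an input example that you provide. This approach is more convenient than manually defining the signature.

The MLflow Model Signature validates inputs and outputs to ensure that the agent interacts correctly with downstream tools such as AI Playground and the review app. It also tells other applications how to use the agent effectively.

The LangChain and PyFunc examples below use Model Signature inference.

If you would rather define a Model Signature explicitly at logging time, see MLflow docs - How to log models with signatures.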
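To make signature inference more concrete, the sketch below builds a chat-style input example and wraps `mlflow.models.infer_signature()`. The helper names `make_input_example` and `infer_agent_signature` are hypothetical, and the output example is assumed to come from actually invoking your agent; the MLflow import is deferred so that the example-building part runs on its own.

```python
def make_input_example(question):
    """Build a chat-style input example (same shape as the examples on this page)."""
    return {"messages": [{"role": "user", "content": question}]}


def infer_agent_signature(input_example, output_example):
    """Infer an MLflow ModelSignature from example input and output (hypothetical helper)."""
    from mlflow.models import infer_signature  # deferred: requires MLflow to be installed

    return infer_signature(input_example, output_example)


example = make_input_example("What is Retrieval-augmented Generation?")
```

When you pass `input_example` to `log_model()` as in the examples on this page, MLflow performs the inference for you, so a wrapper like this is mainly useful for inspecting the signature before logging.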

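Code-based logging with LangChain

The following instructions and code sample show how to log an agent with LangChain.

Create a notebook or Python file with your code. In this example, the notebook or file is named agent.py. The notebook or file must contain a LangChain agent, referred to here as lc_agent.
Include mlflow.models.set_model(lc_agent) in the notebook or file.
Create a new notebook to serve as the driver notebook (called driver.py in this example).
In the driver notebook, use the following code to run agent.py and log the result to an MLflow model: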
```
mlflow.langchain.log_model(lc_model="/path/to/agent.py", resources=list_of_databricks_resources)
```
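The resources parameter declares the Databricks-managed resources needed to serve the agent, such as a vector search index or a serving endpoint that serves a foundation model. For more information, see Implement automatic authentication passthrough.
Deploy the model. See Deploy an agent for generative AI applications.
When the serving environment is loaded, agent.py is executed.
When a serving request comes in, lc_agent.invoke(...) is called.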


import mlflow

code_path = "/Workspace/Users/first.last/agent.py"
config_path = "/Workspace/Users/first.last/config.yml"

# Input example used by MLflow to infer Model Signature
input_example = {
  "messages": [
    {
      "role": "user",
      "content": "What is Retrieval-augmented Generation?",
    }
  ]
}

# example using langchain
with mlflow.start_run():
  logged_agent_info = mlflow.langchain.log_model(
    lc_model=code_path,
    model_config=config_path, # If you specify this parameter, this configuration is used by agent code. The development_config is overwritten.
    artifact_path="agent", # This string is used as the path inside the MLflow model where artifacts are stored
    input_example=input_example, # Must be a valid input to the agent
    example_no_conversion=True, # Required
  )

print(f"MLflow Run: {logged_agent_info.run_id}")
print(f"Model URI: {logged_agent_info.model_uri}")

# To verify that the model has been logged correctly, load the agent and call `invoke`:
model = mlflow.langchain.load_model(logged_agent_info.model_uri)
model.invoke(input_example)

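Code-based logging with PyFunc

The following instructions and code sample show how to log an agent with PyFunc.

Create a notebook or Python file with your code. In this example, the notebook or file is named agent.py. The notebook or file must contain a PyFunc class named PyFuncClass.
Include mlflow.models.set_model(PyFuncClass) in the notebook or file.
Create a new notebook to serve as the driver notebook (called driver.py in this example).
In the driver notebook, use the following code to run agent.py and log the result to an MLflow model with log_model():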
```
mlflow.pyfunc.log_model(python_model="/path/to/agent.py", resources=list_of_databricks_resources)
```
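The resources parameter declares the Databricks-managed resources needed to serve the agent, such as a vector search index or a serving endpoint that serves a foundation model. For more information, see Implement automatic authentication passthrough.
Deploy the model. See Deploy an agent for generative AI applications.
When the serving environment is loaded, agent.py is executed.
When a serving request comes in, PyFuncClass.predict(...) is called.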

import mlflow
from mlflow.models.resources import (
    DatabricksServingEndpoint,
    DatabricksVectorSearchIndex,
)

code_path = "/Workspace/Users/first.last/agent.py"
config_path = "/Workspace/Users/first.last/config.yml"

# Input example used by MLflow to infer Model Signature
input_example = {
  "messages": [
    {
      "role": "user",
      "content": "What is Retrieval-augmented Generation?",
    }
  ]
}

with mlflow.start_run():
  logged_agent_info = mlflow.pyfunc.log_model(
    python_model=code_path, # Path to the agent code file defined above
    artifact_path="agent",
    input_example=input_example, # Must be a valid input to the agent
    example_no_conversion=True,
    # Databricks-managed resources the agent needs at serving time
    resources=[
      DatabricksServingEndpoint(endpoint_name="databricks-meta-llama-3-3-70b-instruct"),
      DatabricksVectorSearchIndex(index_name="prod.agents.databricks_docs_index"),
    ]
  )

print(f"MLflow Run: {logged_agent_info.run_id}")
print(f"Model URI: {logged_agent_info.model_uri}")

# To verify that the model has been logged correctly, load the agent and call `predict`:
model = mlflow.pyfunc.load_model(logged_agent_info.model_uri)
model.predict(input_example)

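Authentication for Databricks resources

AI agents often need to authenticate to other resources to complete tasks. For example, a deployed agent might need to access a vector search index to query unstructured data, or access the Prompt Registry to load dynamic prompts.

Automatic authentication passthrough and on-behalf-of authentication both require configuration during agent logging.

Register the agent to Unity Catalog

Before you deploy the agent, you must register it to Unity Catalog. Registering packages the agent as a model in Unity Catalog, so you can use Unity Catalog permissions to authorize access to the resources in the agent.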

import mlflow

mlflow.set_registry_uri("databricks-uc")

catalog_name = "test_catalog"
schema_name = "schema"
model_name = "agent_name"

model_name = catalog_name + "." + schema_name + "." + model_name
uc_model_info = mlflow.register_model(model_uri=logged_agent_info.model_uri, name=model_name)

See mlflow.register_model().
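The registration step can also be wrapped in a small helper. The sketch below is one possible shape, not an official API: `uc_model_name` and `register_agent` are hypothetical names, and rejecting empty or dotted name parts is an assumption made here so that the three-level `catalog.schema.model` name stays unambiguous.

```python
def uc_model_name(catalog, schema, model):
    """Join the three levels into `catalog.schema.model`, rejecting empty or dotted parts."""
    parts = [catalog, schema, model]
    for part in parts:
        if not part or "." in part:
            raise ValueError(f"invalid name part: {part!r}")
    return ".".join(parts)


def register_agent(model_uri, catalog, schema, model):
    """Register a logged agent to Unity Catalog (hypothetical wrapper)."""
    import mlflow  # deferred import: only needed when actually registering

    mlflow.set_registry_uri("databricks-uc")
    return mlflow.register_model(
        model_uri=model_uri,
        name=uc_model_name(catalog, schema, model),
    )
```

With the values from the example above, `uc_model_name("test_catalog", "schema", "agent_name")` yields the same full name that the string concatenation produces.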

Next steps

Add traces to an AI agent.
Deploy an AI agent.
