Locally deployed vllm serving Qwen2.5-7B-Instruct: in stream mode, the tool_calls arguments JSON string is not properly closed

This topic was created 220 days ago; the information in it may have since evolved or changed.

Problem: in stream mode, the JSON in function.arguments returned by the model is left unclosed. In non-stream mode the response is correct.

Correct response: {"tool_name": "get-user", "arguments": {"name": "张三"}}

Incorrect response: {"tool_name": "get-user", "arguments": {"name": "张三"}

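Only the locally deployed Qwen2.5-7B-Instruct has this problem, and switching to models of other parameter sizes still reproduces it. I also tested Qwen2.5-7B-Instruct on Alibaba Bailian, and there the returned function.arguments is correctly formed JSON.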

To rule out a client package problem, I tested with curl. The problem occurs almost 100% of the time.
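When testing with curl, the streamed fragments have to be joined by hand before the result can be judged. The sketch below shows one way to do that in Python, assuming the chunks follow the OpenAI-style streaming shape (`choices[].delta.tool_calls[].function.arguments`). The helper names and the two sample chunk lists are hypothetical, written by hand to mirror the correct and incorrect responses above; they are not captured server output.

```python
import json


def collect_tool_call_arguments(chunks):
    """Concatenate streamed function.arguments fragments per tool-call index."""
    buffers = {}
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            for call in delta.get("tool_calls") or []:
                index = call.get("index", 0)
                fragment = (call.get("function") or {}).get("arguments") or ""
                buffers[index] = buffers.get(index, "") + fragment
    return buffers


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def make_chunk(fragment):
    """Build a minimal, hand-written chunk carrying one arguments fragment."""
    return {
        "choices": [
            {"delta": {"tool_calls": [{"index": 0, "function": {"arguments": fragment}}]}}
        ]
    }


# Hypothetical fragments: the second list is missing the final closing brace.
good_chunks = [make_chunk('{"tool_name": "get-user", '), make_chunk('"arguments": {"name": "张三"}}')]
bad_chunks = [make_chunk('{"tool_name": "get-user", '), make_chunk('"arguments": {"name": "张三"}')]

for label, chunks in (("good", good_chunks), ("bad", bad_chunks)):
    joined = collect_tool_call_arguments(chunks)[0]
    print(label, is_valid_json(joined), joined)
```

Running the same check on the decoded `data:` lines from a real curl session makes it easy to confirm whether the closing brace is missing from the stream itself rather than being lost by a client library.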

docker compose deployment configuration

services:
  vllm-service:
    image: 172.16.99.11:5000/vllm/vllm-openai:latest
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    environment:
      - HF_ENDPOINT=https://hf-mirror.com
    volumes:
      - ~/.cache/huggingface:/root/.cache/huggingface
    ports:
      - "8000:8000"
    ipc: host
    command:
      - "--model"
      - "Qwen/Qwen2.5-7B-Instruct"
      - "--enable-auto-tool-choice"
      - "--tool-call-parser"
      - "hermes"

Request body

{
  "model": "Qwen/Qwen2.5-7B-Instruct",
  "messages": [
    {
      "role": "system",
      "content": "# 角色\n 你是一位高效的工具使用专家，擅长根据用户需求选择合适的工具并逐步调用，以解决复杂的问题。你能够利用一组工具逐步解决问题，并且每次工具调用的参数都基于前一次工具调用的结果。\n\n## 技能\n### 技能 1: 选择和调用工具\n- **任务**：根据用户的需求，选择合适的工具并逐步调用。\n - 说明调用的工具名称及其功能。\n - 解释为什么要选择这个工具以及它如何帮助解决问题。\n - 如果需要调用多个工具，详细说明每个工具的调用顺序及其原因。\n\n### 技能 2: 参数设置与结果分析\n- **任务**：为每次工具调用设置合适的参数，并基于前一次工具调用的结果进行调整。\n - 详细说明每次工具调用的参数设置。\n - 分析前一次工具调用的结果，解释如何根据这些结果调整当前工具的参数。\n - 提供每一步的详细输出，确保用户可以理解每一步的操作及其结果。\n\n### 技能 3: 总结和反馈\n- **任务**：总结提交给你的数据，并提供最终的解决方案或建议。\n - 汇总所有工具调用的结果，形成一个完整的解决方案。\n - 提供最终的总结报告，包括问题的解决过程、使用的工具及其效果。\n - 如果有进一步的建议或改进措施，也一并在总结中提出。\n\n## 限制\n- 只回答与工具使用相关的问题，不涉及其他领域的内容。\n- 确保每次工具调用的参数设置合理，并基于前一次工具调用的结果进行调整。\n- 在调用工具时，必须详细说明调用的原因和预期结果。\n- 所有步骤和结果必须清晰地呈现给用户，确保用户能够理解整个过程。\n- 保持专业和客观的态度，避免过度复杂的解释，确保用户易于理解。\n"
    },
    {
      "role": "user",
      "content": "查询张三的用户信息"
    },
    {
      "role": "assistant",
      "content": "",
      "tool_calls": [
        {
          "id": "chatcmpl-tool-fba9a74429774828a76d5ca105cadd7f",
          "type": "function",
          "function": {
            "name": "mcp_sse_list_tools",
            "arguments": "{}"
          }
        }
      ]
    },
    {
      "role": "tool",
      "content": "MCP Server tools list: \n[{'name': 'get-student', 'description': '用户查询，使用用户姓名查询系统用户数据，包含用户的基础信息', 'parameters': {'type': 'object', 'properties': {'name': {'type': 'string', 'description': '姓名'}}, 'required': ['name'], 'additionalProperties': False}}]",
      "tool_call_id": "chatcmpl-tool-fba9a74429774828a76d5ca105cadd7f"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "mcp_sse_list_tools",
        "description": "Fetch MCP Server tools list (Gets a list of MCP tools in addition to existing tools).",
        "parameters": {
          "properties": {},
          "required": [],
          "type": "object"
        }
      }
    },
    {
      "type": "function",
      "function": {
        "name": "mcp_sse_call_tool",
        "description": "Call MCP Server tool.",
        "parameters": {
          "properties": {
            "arguments": {
              "description": "Tool arguments (JSON string in the python dict[str, Any] format).",
              "type": "string"
            },
            "tool_name": {
              "description": "Name of the tool to execute.",
              "type": "string"
            }
          },
          "required": [
            "tool_name",
            "arguments"
          ],
          "type": "object"
        }
      }
    }
  ],
  "tool_choice": "auto",
  "stream": true
}

Qwen2.5-7B-Instruct

stream mode

JSON closure issue

3 replies 2025-08-18 02:26:26 +08:00

harlen

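220 days ago

It is covered in the API documentation:
1. Declare json in the response_format parameter
2. The prompt must contain the word json
3. If max_tokens is too short or otherwise unsuitable, the JSON string will be truncated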
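Point 3 can be checked from the response itself. In the OpenAI chat completions format, a choice whose `finish_reason` is `"length"` stopped because it hit the token limit. The sketch below scans streamed chunks for that value; the function name and the sample chunks are hypothetical, and it assumes the server reports `finish_reason` in the same way.

```python
def truncated_by_token_limit(chunks):
    """Return True if any choice in the streamed chunks ended with finish_reason 'length'."""
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            if choice.get("finish_reason") == "length":
                return True
    return False


# Hand-written examples of a final chunk, not captured server output.
ended_on_tool_call = [{"choices": [{"delta": {}, "finish_reason": "tool_calls"}]}]
ended_on_limit = [{"choices": [{"delta": {}, "finish_reason": "length"}]}]

print(truncated_by_token_limit(ended_on_tool_call))
print(truncated_by_token_limit(ended_on_limit))
```

If the arguments are unclosed while the stream ends with a reason other than `"length"`, max_tokens is unlikely to be the cause.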

BenchWidth

220 days ago

@harlen Which documentation do you mean, the openai docs or the vllm deployment docs? I'll go take a look.

DefoliationM

127 days ago via Android

This is a vllm bug, see https://github.com/vllm-project/vllm/issues/18108
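While running a vllm version that still shows the problem, one possible stopgap is to repair the arguments on the client. The sketch below is not taken from the linked issue. It is a hypothetical helper that appends whatever closing braces or brackets are missing at the end of the string and returns the parsed value, or None when the text cannot be safely repaired.

```python
import json


def close_unbalanced_json(text):
    """Append missing closing braces/brackets and parse; return None if not repairable."""
    stack = []          # closers still owed, innermost last
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]":
            # A closer that does not match the innermost opener is not a truncation.
            if not stack or stack.pop() != ch:
                return None
    if in_string:
        # Cut off in the middle of a string: guessing the rest would invent data.
        return None
    candidate = text + "".join(reversed(stack))
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None


broken = '{"tool_name": "get-user", "arguments": {"name": "张三"}'
print(close_unbalanced_json(broken))
```

This only covers the case shown in the question, where trailing closers are missing. Falling back to a non-stream request, which the original post reports as returning correct JSON, avoids guessing altogether.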