你把工具的 JSON Schema 写得漂漂亮亮,上线第一天 Agent 就开始调错函数、参数乱填、超时不重试。本文用 Python 逐一拆解 Function Calling 的 5 个工程陷阱,并给出可运行的解决方案。
一、5 个陷阱一览
| # | 陷阱 | 后果 | 生产影响 |
|---|---|---|---|
| 1 | Schema 描述太模糊 | 模型选错工具 | 用户得到错误结果 |
| 2 | 参数校验缺失 | 非法值透传 | 下游服务崩溃 |
| 3 | 没有重试与降级 | 一次失败 = 整体失败 | 成功率 < 85% |
| 4 | Tool Call 阻塞主流程 | 串行调用慢 | P95 延迟爆炸 |
| 5 | Streaming 工具调用处理不当 | 参数截断 | 参数解析失败 |
二、完整实现:一个生产级 Tool Executor
2.1 基础框架:带校验的工具注册器
# tool_registry.pyimportinspectimportjsonfromdataclassesimportdataclass,fieldfromtypingimportAny,Callable,OptionalfrompydanticimportBaseModel,ValidationError,create_model@dataclassclassToolDef:"""工具定义"""name:strdescription:strfunc:Callable parameters_schema:dict# JSON Schema# 工程化配置max_retries:int=2timeout_seconds:int=30fallback_func:Optional[Callable]=None# 降级函数classToolRegistry:"""工具注册中心:注册、校验、执行"""def__init__(self):self._tools:dict[str,ToolDef]={}defregister(self,tool:ToolDef):self._tools[tool.name]=tooldefget_openai_schema(self)->list[dict]:"""生成 OpenAI 兼容的 tools 参数"""return[{"type":"function","function":{"name":t.name,"description":t.description,"parameters":t.parameters_schema,},}fortinself._tools.values()]defexecute(self,name:str,arguments:dict)->dict:"""执行工具调用(带重试 + 降级 + 校验)"""ifnamenotinself._tools:return{"error":f"Unknown tool:{name}"}tool=self._tools[name]# 陷阱2修复:参数校验validated=self._validate_args(tool,arguments)ifisinstance(validated,dict)and"error"invalidated:returnvalidated# 执行(带重试)returnself._execute_with_retry(tool,validated)def_validate_args(self,tool:ToolDef,arguments:dict)->dict:"""用 Pydantic 动态校验参数"""# 从 JSON Schema 生成 Pydantic modelprops=tool.parameters_schema.get("properties",{})required=tool.parameters_schema.get("required",[])fields={}forname,propinprops.items():py_type=self._schema_type_to_python(prop)default=...ifnameinrequiredelseNonedescription=prop.get("description","")fields[name]=(py_type,inspect.Parameter.emptyifnameinrequiredelsefield(default=default))# 注意:这里只做类型校验,不做业务逻辑校验ifnotfields:returnarguments# 无参数工具ModelClass=create_model(f"Args_{tool.name}",**fields)try:instance=ModelClass(**arguments)returninstance.model_dump()exceptValidationErrorase:return{"error":f"Validation failed:{e.errors()}"}def_schema_type_to_python(self,prop:dict):"""JSON Schema type -> Python type"""type_map={"string":str,"integer":int,"number":float,"boolean":bool,"array":list,"object":dict,}returntype_map.get(prop.get("type"),str)def_execute_with_retry(self,tool:ToolDef,args:dict)->dict:"""陷阱3修复:带重试 + 超时 + 降级的执行"""importasyncioimporttime last_error=Noneforattemptinrange(tool.max_retries+1):try:start=time.monotonic()result=tool.func(**args)elapsed=time.monotonic()-startifelapsed>tool.timeout_seconds:raiseTimeoutError(f"Tool{tool.name}timeout ({elapsed:.1f}s >{tool.timeout_seconds}s)")return{"success":True,"result":result,"attempts":attempt+1,"elapsed_ms":int(elapsed*1000),}exceptExceptionase:last_error=str(e)ifattempt<tool.max_retries:wait=2**attempt# 指数退避time.sleep(wait)continue# 所有重试耗尽,尝试降级iftool.fallback_func:try:fallback_result=tool.fallback_func(**args)return{"success":True,"result":fallback_result,"fallback":True,"original_error":last_error,}exceptExceptionasfe:return{"error":f"Tool failed + fallback failed:{last_error}|{fe}"}return{"error":f"Tool{tool.name}failed after{tool.max_retries+1}attempts:{last_error}"}2.2 陷阱1修复:编写高质量的 Tool Description
# tools/definitions.pyfromtool_registryimportToolDef# ❌ 糟糕的描述——模型不知道何时调用BAD_SEARCH_TOOL=ToolDef(name="search",description="Search something",# ← 太模糊...)# ✅ 好的描述——告诉模型 WHEN + WHAT + 参数约束SEARCH_TOOL=ToolDef(name="search_knowledge_base",description=("Search the internal knowledge base for technical documentation. ""Use this when the user asks about internal APIs, architecture, ""or product specifications. Do NOT use for general knowledge questions ""(those should be answered directly)."),parameters_schema={"type":"object","properties":{"query":{"type":"string","description":"Search keywords. Use exact technical terms. ""Max 200 characters."},"category":{"type":"string","enum":["api","architecture","product","oncall"],"description":"Document category to narrow search. ""Use 'api' for endpoint docs, 'architecture' for system design."},},"required":["query"],},func=search_kb,max_retries=1,timeout_seconds=10,)2.3 陷阱3深入:不同错误类型的重试策略
# retry_policy.pyfromenumimportEnumclassErrorCategory(Enum):RETRYABLE="retryable"# 网络错误、429 限流 — 重试就对了FALLBACK="fallback"# 超时 — 改用降级方案FATAL="fatal"# 参数错误、权限不足 — 直接失败defcategorize_error(error:Exception)->ErrorCategory:"""根据异常类型决定重试策略"""importrequestsifisinstance(error,TimeoutError):returnErrorCategory.FALLBACKifisinstance(error,requests.HTTPError):status=error.response.status_codeifhasattr(error,'response')else500ifstatusin(429,503,502):returnErrorCategory.RETRYABLEifstatus==408:returnErrorCategory.FALLBACKreturnErrorCategory.FATALifisinstance(error,(ConnectionError,ConnectionResetError)):returnErrorCategory.RETRYABLE# 参数校验错误 -> 不应该重试ifisinstance(error,(ValueError,TypeError)):returnErrorCategory.FATALreturnErrorCategory.RETRYABLE# 未知错误默认重试2.4 陷阱4修复:并行 Tool Call 执行
# parallel_executor.pyimportasynciofromconcurrent.futuresimportThreadPoolExecutor,as_completedclassParallelToolExecutor:"""并行执行多个 tool calls"""def__init__(self,registry:"ToolRegistry",max_workers:int=5):self.registry=registry self.executor=ThreadPoolExecutor(max_workers=max_workers)defexecute_batch(self,tool_calls:list[dict])->list[dict]:""" 并行执行多个独立 tool calls。 注意:只并行化互不依赖的调用;有依赖关系的需要串行。 """futures={}fori,tcinenumerate(tool_calls):name=tc["function"]["name"]args=json.loads(tc["function"]["arguments"])future=self.executor.submit(self.registry.execute,name,args)futures[future]=i results=[None]*len(tool_calls)forfutureinas_completed(futures):idx=futures[future]results[idx]=future.result()returnresults# 使用示例# executor = ParallelToolExecutor(registry)# results = executor.execute_batch(response.choices[0].message.tool_calls)2.5 陷阱5修复:Streaming 模式下的 Tool Call 累积
Streaming 模式下,tool call 的参数是分块到达的。如果直接解析——会拿到不完整的 JSON。
# streaming_tool_handler.pyimportjsonclassStreamingToolAccumulator:"""累积 streaming 模式下分块到达的 tool call 参数"""def__init__(self):self._accumulators:dict[int,dict]={}deffeed(self,delta)->Optional[dict]:"""喂入一个 delta chunk。 Returns: Optional[dict]: 如果参数已完整,返回 (index, name, arguments) 否则返回 None """ifnotdelta.tool_calls:returnNonefortc_deltaindelta.tool_calls:idx=tc_delta.indexifidxnotinself._accumulators:self._accumulators[idx]={"id":tc_delta.idor"","name":"","arguments":"",}acc=self._accumulators[idx]iftc_delta.functionandtc_delta.function.name:acc["name"]+=tc_delta.function.nameiftc_delta.functionandtc_delta.function.arguments:acc["arguments"]+=tc_delta.function.argumentsiftc_delta.id:acc["id"]=tc_delta.idreturnNone# 参数可能还不完整,继续等待deffinalize(self)->list[dict]:"""在所有 chunks 接收完后调用,尝试解析参数"""results=[]foridx,accinsorted(self._accumulators.items()):try:args=json.loads(acc["arguments"])exceptjson.JSONDecodeError:# 参数截断了——尝试修复(补结尾括号)args=self._attempt_repair(acc["arguments"])results.append({"id":acc["id"],"type":"function","function":{"name":acc["name"],"arguments":json.dumps(args),},})self._accumulators.clear()returnresultsdef_attempt_repair(self,partial_json:str)->dict:"""尝试修复截断的 JSON"""# 统计未闭合的括号open_braces=partial_json.count('{')-partial_json.count('}')open_brackets=partial_json.count('[')-partial_json.count(']')repaired=partial_json repaired+=']'*open_brackets repaired+='}'*open_braces# 如果最后一个 key 没有 value,补 nullifrepaired.rstrip().endswith(':'):repaired+=' null'try:returnjson.loads(repaired)exceptjson.JSONDecodeError:return{"_error":"unparseable","_raw":partial_json[:200]}2.6 完整 Agent Loop
# agent.pyimportjsonfromopenaiimportOpenAIfromtool_registryimportToolRegistryfromstreaming_tool_handlerimportStreamingToolAccumulatorclassFunctionCallingAgent:"""生产级 Function Calling Agent"""MAX_TURNS=10# 防止无限循环def__init__(self,client:OpenAI,registry:ToolRegistry):self.client=client self.registry=registrydefrun(self,user_message:str,model:str="gpt-4o")->str:messages=[{"role":"user","content":user_message}]forturninrange(self.MAX_TURNS):response=self.client.chat.completions.create(model=model,messages=messages,tools=self.registry.get_openai_schema(),tool_choice="auto",)msg=response.choices[0].message# 没有 tool call → 返回最终回复ifnotmsg.tool_calls:returnmsg.content# 处理 tool callsmessages.append(msg.model_dump())fortcinmsg.tool_calls:fn_name=tc.function.name fn_args=json.loads(tc.function.arguments)print(f"[TURN{turn}] Calling{fn_name}({fn_args})")result=self.registry.execute(fn_name,fn_args)messages.append({"role":"tool","tool_call_id":tc.id,"content":json.dumps(result,ensure_ascii=False),})return"Max turns exceeded"三、陷阱对比总结
| 陷阱 | 无修复 | 有修复 |
|---|---|---|
| Schema 模糊 | 30%+ 工具选择错误 | < 5% |
| 无参数校验 | 下游服务不定期崩溃 | Pydantic 拦截所有非法参数 |
| 无重试 | 网络波动导致 10%+ 失败 | 指数退避后成功率 > 99% |
| 串行执行 | 3 个独立 tool call 耗时 6s | 并行耗时 2.5s |
| Streaming 截断 | 参数解析失败率 ~8% | < 1% (含 JSON 修复) |
四、两个额外建议
4.1 工具返回值的 Token 预算
deftruncate_tool_result(result:dict,max_chars:int=4000)->dict:"""工具返回值太长会炸 context window,必须截断"""result_str=json.dumps(result,ensure_ascii=False)iflen(result_str)>max_chars:return{"truncated":True,"full_length":len(result_str),"preview":result_str[:max_chars]+"...","hint":"Use more specific parameters to narrow results.",}returnresult4.2 区分tool_choice: "auto"vs"required"
auto:模型自己决定要不要调工具。适合大部分场景。required:强制模型必须调工具。适合"每一步都必须出结构化数据"的场景。none:禁止调工具。适合预处理步骤(如摘要、翻译)。
在生产环境中,我们通常在 Agent 的第一步用required(强制查知识库),后续步骤用auto。
五、总结
Function Calling 看起来就是"写个 JSON Schema 就完了",但真正上线后,Schema 描述、参数校验、重试策略、并行执行、Streaming 处理——这五个维度每个没做好都会导致生产事故。本文的实现是一个可以直接用的骨架,按你的需求补充工具函数即可。
完整代码可直接运行。依赖:openai,pydantic。