Using tools safely
Tool calling lets models invoke structured functions (HTTP APIs, DB queries, tickets, code execution). Security is determined by schemas, authorization, sandboxing, and monitoring, not by politely worded instructions in the prompt.
Misconfiguration maps directly to categories in the OWASP Top 10 for LLM Applications (2025), such as LLM06 Excessive Agency (tools that do too much) and LLM05 Improper Output Handling (tool results trusted blindly downstream). See Risk landscape for vocabulary; this page is implementation-focused.
Design checklist
- Explicit tool set — Avoid mega-tools (“run arbitrary curl” or “query any table”). Prefer small verbs with typed parameters so the model cannot smuggle open-ended intent.
- Server-side authorization — Bind each call to verified session identity from your auth layer; never trust model-supplied user IDs, tenant IDs, or file paths without lookup.
- Allowlists — URLs, hostnames, HTTP methods, and SQL read-only modes fixed per environment; deny by default.
- Confirmation — Preview side effects (email recipients, money movement, bulk deletes) before execution; require explicit user confirm for irreversible actions.
- Idempotency keys — For payments or writes, prevent duplicate execution on retries (models and gateways retry often).
- Rate and cost limits — Cap invocations per session and per tool to mitigate loops and “denial of wallet” (LLM10).
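
Three of the checklist items (server-side authorization, idempotency keys, and per-tool call caps) can be sketched in a single tool handler. This is a minimal illustration, not a reference implementation: `ToolGateway`, `Session`, `ToolError`, `close_ticket`, and the in-memory dictionaries are all hypothetical names, and a real deployment would back them with its auth layer and a durable store.

```python
from dataclasses import dataclass, field


class ToolError(Exception):
    """Raised when a tool call is rejected before any side effect happens."""


@dataclass(frozen=True)
class Session:
    """Identity verified by the auth layer, never taken from model output."""
    user_id: str
    tenant_id: str


@dataclass
class ToolGateway:
    """Hypothetical wrapper that applies the checklist to one small tool."""
    ticket_owner: dict = field(default_factory=dict)  # ticket_id -> tenant_id
    max_calls_per_tool: int = 5
    closed: list = field(default_factory=list)        # stand-in for the side effect
    _calls: dict = field(default_factory=dict)        # (user_id, tool) -> count
    _results: dict = field(default_factory=dict)      # replay key -> stored result

    def close_ticket(self, session: Session, ticket_id: str, idempotency_key: str) -> dict:
        # Server-side authorization: the model supplies ticket_id, but ownership
        # is looked up here and compared with the verified session. Unknown
        # tickets fail the comparison, so the default is deny.
        if self.ticket_owner.get(ticket_id) != session.tenant_id:
            raise ToolError("not authorized for this ticket")

        # Idempotency: a retry with the same key returns the stored result
        # instead of running the side effect again.
        replay_key = (session.user_id, "close_ticket", idempotency_key)
        if replay_key in self._results:
            return self._results[replay_key]

        # Rate limit: cap new executions per user and per tool.
        counter_key = (session.user_id, "close_ticket")
        used = self._calls.get(counter_key, 0)
        if used >= self.max_calls_per_tool:
            raise ToolError("call limit reached for close_ticket")
        self._calls[counter_key] = used + 1

        self.closed.append(ticket_id)
        result = {"ticket_id": ticket_id, "status": "closed"}
        self._results[replay_key] = result
        return result
```

In this sketch the authorization check runs before the replay lookup, so a cached result is never returned to a caller who does not own the ticket, and replays do not count against the call cap.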
Failure modes to test
| Failure | Example | Mitigation |
|---|---|---|
| Injection into arguments | Model emits `'; DROP TABLE--` into a SQL tool | Parameterized queries inside tool impl; no raw string concat. |
| SSRF via HTTP tool | Model picks `http://169.254.169.254/` | Egress allowlists; block metadata IPs; require signed URLs. |
| Confused deputy | Model passes victim’s resource ID to admin tool | Per-request authz check in tool handler. |
| Over-broad retrieval | RAG returns another tenant’s chunk | Tenant filters in retrieval; never trust model for partition key. |
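
The first two rows of the table can be turned into regression tests against small helpers like the ones below. This is a sketch under stated assumptions: `ALLOWED_HOSTS`, `is_url_allowed`, `is_blocked_ip`, `find_orders`, and the `orders` table are hypothetical. The URL check inspects only the URL string, so the address a hostname actually resolves to should also be passed through something like `is_blocked_ip` at connect time.

```python
import ipaddress
import sqlite3
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical per-environment allowlist


def is_blocked_ip(address: str) -> bool:
    """Return True for addresses an HTTP tool should never reach."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return True  # fail closed on anything that does not parse
    return (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_reserved or ip.is_multicast or ip.is_unspecified
    )


def is_url_allowed(url: str) -> bool:
    """Deny by default: HTTPS on the default port to an allowlisted hostname."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        return False
    if parts.scheme != "https" or port not in (None, 443):
        return False
    # IP literals such as 169.254.169.254 are not in the allowlist, so they
    # are rejected along with every other unknown host.
    return parts.hostname in ALLOWED_HOSTS


def find_orders(conn: sqlite3.Connection, tenant_id: str, status: str) -> list:
    """tenant_id comes from the verified session; status comes from the model."""
    # Placeholders keep the model-supplied value as data, never as SQL text.
    return conn.execute(
        "SELECT id, status FROM orders WHERE tenant_id = ? AND status = ?",
        (tenant_id, status),
    ).fetchall()
```

With placeholders, an argument such as `'; DROP TABLE orders--` is compared as a literal string and simply matches no rows.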
Schema tips
- Prefer enums and bounded integers over free text.
- Validate ranges (amounts, dates) in application code after the model returns JSON—models hallucinate plausible numbers.
- Version schemas; fail closed if the model returns unknown fields unless you explicitly allow extras.
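
As a rough illustration of these tips, the validator below checks a model-produced argument object in application code. Everything specific here is an assumption made for the example: the `issue_refund`-style fields (`schema_version`, `currency`, `amount_cents`), the `Currency` enum, and the limit `MAX_AMOUNT_CENTS` are hypothetical, and a schema library could replace the hand-written checks.

```python
import json
from enum import Enum


class Currency(str, Enum):
    USD = "USD"
    EUR = "EUR"


SUPPORTED_VERSION = 1      # hypothetical schema version
MAX_AMOUNT_CENTS = 50_000  # hypothetical business limit
ALLOWED_FIELDS = {"schema_version", "currency", "amount_cents"}


def parse_refund_args(raw: str) -> dict:
    """Validate model output; raise ValueError on anything unexpected."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("arguments must be a JSON object")

    # Fail closed: unknown or missing fields are errors, not warnings.
    unknown = set(data) - ALLOWED_FIELDS
    missing = ALLOWED_FIELDS - set(data)
    if unknown or missing:
        raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")

    if data["schema_version"] != SUPPORTED_VERSION:
        raise ValueError("unsupported schema version")

    # Enum instead of free text: values outside the enum are rejected.
    if not isinstance(data["currency"], str):
        raise ValueError("currency must be a string")
    currency = Currency(data["currency"])

    # Bounded integer: reject floats, booleans, and out-of-range amounts.
    amount = data["amount_cents"]
    if isinstance(amount, bool) or not isinstance(amount, int):
        raise ValueError("amount_cents must be an integer")
    if not 1 <= amount <= MAX_AMOUNT_CENTS:
        raise ValueError("amount_cents out of range")

    return {"currency": currency, "amount_cents": amount}
```

Raising a single exception type keeps the caller simple: any `ValueError` means the tool call is dropped before the handler runs.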