
Tool use safely

Tool calling lets models invoke structured functions (HTTP APIs, DB queries, tickets, code execution). Security depends on schemas, authorization, sandboxing, and monitoring, not on politely worded instructions in the prompt.

Misconfiguration maps directly to categories in the OWASP Top 10 for LLM Applications, such as LLM06 Excessive Agency (tools that do too much) and LLM05 Improper Output Handling (tool results trusted blindly downstream). See Risk landscape for vocabulary; this page is implementation-focused.

Design checklist

  1. Explicit tool set — Avoid mega-tools (“run arbitrary curl” or “query any table”). Prefer small verbs with typed parameters so the model cannot smuggle open-ended intent.
  2. Server-side authorization — Bind each call to the verified session identity from your auth layer; never trust model-supplied user IDs, tenant IDs, or file paths without checking them server-side against that identity.
  3. Allowlists — Fix the permitted URLs, hostnames, HTTP methods, and SQL access mode (read-only where possible) per environment; deny everything else by default.
  4. Confirmation — Preview side effects (email recipients, money movement, bulk deletes) before execution; require explicit user confirmation for irreversible actions.
  5. Idempotency keys — For payments or writes, prevent duplicate execution on retries (models and gateways retry often).
  6. Rate and cost limits — Cap invocations per session and per tool to mitigate loops and “denial of wallet” (LLM10).
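
Several of the checklist items can be combined in a single tool handler. The following is a minimal sketch, not a definitive implementation: the `close_ticket` tool, the `Session` class, the in-memory stores, and the limit of five calls per minute are all hypothetical names and values chosen for illustration.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Session:
    """Identity established by the auth layer, never by the model."""
    user_id: str
    tenant_id: str
    calls: list = field(default_factory=list)  # timestamps of recent tool calls


# Hypothetical in-memory stores; a real service would use a database.
TICKET_OWNERS = {"T-1": "tenant-a", "T-2": "tenant-b"}
COMPLETED = {}  # (tenant_id, idempotency_key) -> earlier result
MAX_CALLS_PER_MINUTE = 5  # assumed cap, tune per tool


class ToolError(Exception):
    """Raised when a tool call is rejected."""


def close_ticket(session: Session, args: dict, idempotency_key: str) -> dict:
    """One narrow verb with a typed parameter instead of a generic update tool."""
    now = time.monotonic()
    # Rate limit: keep only calls from the last 60 seconds, then check the cap.
    session.calls = [t for t in session.calls if now - t < 60]
    if len(session.calls) >= MAX_CALLS_PER_MINUTE:
        raise ToolError("rate limit exceeded")
    session.calls.append(now)

    # Idempotency: a retried call returns the stored result without re-executing.
    key = (session.tenant_id, idempotency_key)
    if key in COMPLETED:
        return COMPLETED[key]

    ticket_id = args.get("ticket_id")
    if not isinstance(ticket_id, str):
        raise ToolError("ticket_id must be a string")

    # Authorization: compare the resource owner with the session tenant.
    # Any tenant_id the model placed in args is ignored.
    if TICKET_OWNERS.get(ticket_id) != session.tenant_id:
        raise ToolError("not authorized for this ticket")

    result = {"ticket_id": ticket_id, "status": "closed"}
    COMPLETED[key] = result
    return result
```

In this sketch the session object is the only source of identity, so a model that emits another tenant's ID in its arguments gains nothing. Scoping the idempotency key to the tenant is a design choice that keeps one tenant from replaying another tenant's stored result.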

Failure modes to test

| Failure | Example | Mitigation |
| --- | --- | --- |
| Injection into arguments | Model emits '; DROP TABLE-- into a SQL tool | Parameterized queries inside the tool implementation; no raw string concatenation. |
| SSRF via HTTP tool | Model picks http://169.254.169.254/ | Egress allowlists; block metadata IPs; require signed URLs. |
| Confused deputy | Model passes victim’s resource ID to admin tool | Per-request authz check in the tool handler. |
| Over-broad retrieval | RAG returns another tenant’s chunk | Tenant filters in retrieval; never trust the model for the partition key. |
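
Two of these mitigations are easy to turn into test fixtures. The sketch below uses only the Python standard library and assumes a hypothetical `orders` table and a hypothetical allowlist containing `api.example.com`; it shows one way to apply an egress allowlist and a parameterized, tenant-filtered query.

```python
import ipaddress
import sqlite3
from urllib.parse import urlsplit

# Hypothetical per-environment allowlist; everything else is denied.
ALLOWED_HOSTS = {"api.example.com"}


def is_allowed_url(url: str) -> bool:
    """Return True only for https URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL
    host = parts.hostname
    if parts.scheme != "https" or host is None:
        return False
    try:
        # Reject every IP literal, which covers 169.254.169.254 and loopback.
        ipaddress.ip_address(host)
        return False
    except ValueError:
        pass  # not an IP literal, continue with the hostname check
    return host in ALLOWED_HOSTS


def find_orders(conn: sqlite3.Connection, tenant_id: str, status: str) -> list:
    """tenant_id comes from the verified session; status comes from the model."""
    cur = conn.execute(
        # Placeholders keep model-supplied text out of the SQL statement itself.
        "SELECT id FROM orders WHERE tenant_id = ? AND status = ? ORDER BY id",
        (tenant_id, status),
    )
    return [row[0] for row in cur.fetchall()]
```

A hostname check alone does not stop an allowed name from resolving to an internal address, so a production HTTP tool would likely also need to validate the resolved IP at connection time.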

Schema tips

  • Prefer enums and bounded integers over free text.
  • Validate ranges (amounts, dates) in application code after the model returns JSON—models hallucinate plausible numbers.
  • Version schemas; fail closed if the model returns unknown fields unless you explicitly allow extras.
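
These tips can be sketched as a validator that runs in application code after the model returns JSON. The field names, the currency enum, the amount ceiling, and the 90-day window below are hypothetical values for an imagined refund tool, not part of any particular API.

```python
from datetime import date, timedelta

SCHEMA_VERSION = 1
ALLOWED_FIELDS = {"schema_version", "currency", "amount_cents", "due_date"}
CURRENCIES = {"USD", "EUR"}   # enum instead of free text
MAX_AMOUNT_CENTS = 1_000_000  # assumed business limit
MAX_DAYS_AHEAD = 90           # assumed scheduling window


def validate_refund_args(args: dict, today: date) -> dict:
    """Return cleaned arguments, or raise ValueError on anything outside the schema."""
    # Fail closed: unknown fields are rejected rather than silently dropped.
    unknown = set(args) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if args.get("schema_version") != SCHEMA_VERSION:
        raise ValueError("unsupported schema version")

    currency = args.get("currency")
    if not isinstance(currency, str) or currency not in CURRENCIES:
        raise ValueError("currency must be one of the allowed codes")

    amount = args.get("amount_cents")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int):
        raise ValueError("amount_cents must be an integer")
    if not 1 <= amount <= MAX_AMOUNT_CENTS:
        raise ValueError("amount_cents out of range")

    try:
        due = date.fromisoformat(args.get("due_date"))
    except (TypeError, ValueError):
        raise ValueError("due_date must be an ISO 8601 date") from None
    if not today <= due <= today + timedelta(days=MAX_DAYS_AHEAD):
        raise ValueError("due_date out of range")

    return {"currency": currency, "amount_cents": amount, "due_date": due}
```

Passing `today` as a parameter rather than reading the clock inside the function keeps the range check deterministic in tests.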