Agents overview

An agent typically runs a loop: plan → call tools → read results → repeat. Security issues compound across iterations: a small mistake in tool choice or arguments can have real-world side effects, and prompt injection (untrusted content steering the model) can hijack the entire loop.
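The loop above can be sketched in a few lines. This is a toy illustration, not any particular framework's API; `plan_step` and `TOOLS` are hypothetical stand-ins for the model call and the tool registry.

```python
# Minimal sketch of the plan → call tools → read results loop.
# plan_step and TOOLS are hypothetical names, not a real framework.

def plan_step(history):
    """Stand-in for a model call that picks the next tool (or stops)."""
    if not history:
        return {"tool": "search", "args": {"query": "status"}}
    return None  # done after one step in this toy example

TOOLS = {"search": lambda query: f"results for {query!r}"}

def run_agent():
    history = []
    while (step := plan_step(history)) is not None:
        tool = TOOLS[step["tool"]]     # tool choice: where mistakes compound
        result = tool(**step["args"])  # arguments may carry injected content
        history.append((step, result)) # results are read back into the next plan
    return history
```

Note the two attack surfaces the prose names: the arguments passed to a tool, and the results read back into the next planning step.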

Industry taxonomies such as the OWASP Top 10 for LLM Applications (2025) label the related failures (excessive agency, improper output handling, sensitive information disclosure). See AI Security — Risk landscape for the full list; the articles here focus on what to implement.

Core ideas

  • Trust boundary — Anything the model reads (web, files, tools) is untrusted unless you have a cryptographic or procedural guarantee.
  • Least privilege — Fewer, narrower tools beat a general-purpose shell or “one HTTP node that can reach anything.”
  • Human gates — Required for irreversible or regulated actions; automation should fail closed when uncertain.
  • Observability — Log every tool invocation with a correlation ID so incidents can be reconstructed after the fact (data/logging posture).
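The least-privilege, human-gate, and observability ideas combine naturally in a single tool-invocation wrapper. A minimal sketch, assuming hypothetical tool names (`read_ticket`, `send_email`, etc.) and a simple approval flag in place of a real review workflow:

```python
import logging
import uuid

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

# Hypothetical registry: a few narrow tools, no general-purpose shell.
ALLOWED_TOOLS = {"read_ticket", "draft_reply"}
IRREVERSIBLE = {"send_email", "delete_record"}  # require a human gate

def invoke(tool, args, approved=False):
    call_id = str(uuid.uuid4())  # correlation ID for reconstruction
    if tool not in ALLOWED_TOOLS | IRREVERSIBLE:
        log.warning("call_id=%s denied unknown tool=%s", call_id, tool)
        raise PermissionError(f"tool {tool!r} not allowed")  # least privilege
    if tool in IRREVERSIBLE and not approved:
        log.info("call_id=%s gated tool=%s", call_id, tool)
        raise PermissionError("human approval required")     # fail closed
    log.info("call_id=%s invoking tool=%s args=%s", call_id, tool, args)
    return {"call_id": call_id, "tool": tool, "args": args}
```

Unknown tools are denied rather than passed through, and gated tools fail closed until a human approves, so the default outcome of uncertainty is inaction plus a log entry.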

Articles

Topic               Article
Tool design         Tool use safely
Orchestration       Multi-agent
Approvals           Human-in-the-loop
External servers    MCP security
Execution           Sandboxing