🔒 Security Boundaries in Agentic Architectures
Security, AI
Key Point
Most agents today run generated code with full access to your secrets. As more agents adopt coding-agent patterns, they are becoming multi-component systems whose components each need a different level of trust.
The Four-Actor Model
An agentic system has four distinct actors, each with a different trust level:
1. Agent (trust level: low)
The agent is the LLM-driven runtime. It is subject to prompt injection and unpredictable behavior, so information should be revealed to it on a need-to-know basis.
2. Agent Secrets (trust level: high)
Agent secrets are the credentials the system needs to function, including API tokens, database credentials, and SSH keys. These become dangerous when other components can access them directly.
3. Generated Code Execution (trust level: very low)
The programs the agent creates and executes are the wildcard. Generated code can do anything the language runtime allows.
4. Filesystem (trust level: medium)
The filesystem and the broader environment the system runs in. The environment cannot safely give the agent full access or let it run arbitrary programs without a security boundary.
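One way to make the four actors concrete is to write them down as data. The sketch below is only an illustration: the `Trust` enum, the `ACTORS` table, and the rule in `may_access_directly` (a requester may touch a resource directly only if it is at least as trusted) are assumptions made for this example, not something the source prescribes.

```python
from enum import IntEnum


class Trust(IntEnum):
    VERY_LOW = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Trust levels as listed in the four-actor model; the key names are illustrative.
ACTORS = {
    "agent": Trust.LOW,
    "agent_secrets": Trust.HIGH,
    "generated_code": Trust.VERY_LOW,
    "filesystem": Trust.MEDIUM,
}


def may_access_directly(requester: str, resource: str) -> bool:
    """Assumed policy: direct access needs at least the resource's trust level."""
    return ACTORS[requester] >= ACTORS[resource]


if __name__ == "__main__":
    for requester in ("agent", "generated_code"):
        for resource in ("agent_secrets", "filesystem"):
            allowed = may_access_directly(requester, resource)
            print(f"{requester} -> {resource}: {'direct' if allowed else 'needs a boundary'}")
```

Under this assumed rule, neither the agent nor its generated code reaches secrets or the filesystem directly; anything they need has to pass through a boundary.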
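Attack Example: Prompt Injection
An attacker plants malicious instructions in a log file. The agent reads the log, is tricked into running a script, and the script sends SSH keys and AWS credentials to an external server.
Four Security Architectures (least to most secure)
1. Zero Boundaries (default)
No boundaries between any of the four actors. The agent can read .env files and SSH keys. On a server, this means access to all credentials.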
2. Secret Injection Without Sandboxing
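A secret injection proxy intercepts outbound network traffic and injects credentials only as requests travel to their intended endpoint, so the secrets are never directly visible to the agent or its generated code. But it doesn't prevent misuse during active runtime: generated code can still send unexpected requests through the proxy, and those requests will carry the injected credentials.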
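A minimal sketch of the injection step, assuming credentials are keyed by destination host. The host `api.example.com`, the token value, and the function name `inject_credentials` are hypothetical; a real proxy would sit on the network path rather than being called as a function.

```python
from urllib.parse import urlsplit

# Hypothetical mapping of destination host -> credential, held only by the proxy.
CREDENTIALS_BY_HOST = {
    "api.example.com": "token-for-example-api",
}


def inject_credentials(url: str, headers: dict[str, str]) -> dict[str, str]:
    """Return outbound headers, adding a credential only for known HTTPS hosts."""
    parts = urlsplit(url)
    # Drop any Authorization header the caller supplied; only the proxy sets it.
    outbound = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    token = CREDENTIALS_BY_HOST.get(parts.hostname)
    if token is not None and parts.scheme == "https":
        outbound["Authorization"] = f"Bearer {token}"
    return outbound
```

A request to any other host leaves the proxy without a credential, which blocks simple exfiltration of the token. It does not stop generated code from making unwanted calls to `api.example.com` itself, which is the runtime-misuse gap noted above.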
3. Shared Sandbox
Wrapping the agent harness and generated code in a shared VM isolates them from the wider environment. But the agent and the generated program still share the same security context.
4. Separating Agent Compute from Sandbox Compute (recommended)
Run the agent harness and generated code on independent compute, in separate VMs with distinct security contexts. Secrets live in one context; the filesystem and generated code execution live in another.
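The separation can be approximated locally to see the idea, with a large caveat: the sketch below uses a child process as a stand-in for a separate VM, which is a much weaker boundary than the architecture describes. The function name `run_generated_code` and the variable allowlist are assumptions for illustration.

```python
import os
import subprocess
import sys

# Variables the interpreter may need to start; nothing secret-bearing is passed on.
_ENV_ALLOWLIST = ("PATH", "SYSTEMROOT", "LD_LIBRARY_PATH")


def run_generated_code(code: str, timeout_s: float = 10.0) -> subprocess.CompletedProcess:
    """Run generated code in a child process without the harness environment.

    A child process only illustrates the split; it is not a real sandbox.
    """
    clean_env = {k: os.environ[k] for k in _ENV_ALLOWLIST if k in os.environ}
    return subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores PYTHON* variables
        env=clean_env,
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )


if __name__ == "__main__":
    os.environ["HARNESS_API_TOKEN"] = "secret-held-by-harness"  # hypothetical secret
    result = run_generated_code("import os; print(os.environ.get('HARNESS_API_TOKEN'))")
    print(result.stdout.strip())
```

The harness keeps its secret in its own environment, while the generated code starts with an environment that never contained it. In a real deployment the same split would be enforced by separate VMs or sandboxes, not by an environment filter.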
Design Principles
- The harness should never expose its own credentials to the agent directly
- The agent should access capabilities through scoped tool invocations
- Tools should be as narrow as possible, because every parameter is subject to prompt injection
- Generated programs that need their own credentials are a separate concern
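These principles can be sketched as a narrow tool that the harness builds and hands to the agent. Everything named here is hypothetical: the `tracker.example.com` endpoint, the issue ID format, and the `make_get_issue_tool` factory. The transport is passed in as a callable so the example stays self-contained.

```python
import re
from typing import Callable

# A transport takes (url, headers) and returns the response body.
Transport = Callable[[str, dict[str, str]], str]

# Assumed ID format, e.g. "ABC-123"; anything else is rejected before a request is built.
_ISSUE_ID = re.compile(r"[A-Z]{2,10}-\d{1,6}")


def make_get_issue_tool(token: str, transport: Transport) -> Callable[[str], str]:
    """Build a tool that fetches one issue. The token stays inside the closure."""

    def get_issue(issue_id: str) -> str:
        # The only agent-controlled input is the ID, and it is strictly validated.
        if not _ISSUE_ID.fullmatch(issue_id):
            raise ValueError("issue_id must look like ABC-123")
        url = f"https://tracker.example.com/issues/{issue_id}"
        # Host and headers are fixed by the harness, never chosen by the agent.
        return transport(url, {"Authorization": f"Bearer {token}"})

    return get_issue
```

The agent can ask for an issue by ID but never sees the token and cannot redirect the request to another host or path. A broader tool, such as one accepting an arbitrary URL, would turn that parameter into an injection target.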
Key Insight
"The missing piece is running the agent harness and the programs the agent generates on independent compute, in separate VMs or sandboxes with distinct security contexts."