References
Academic papers, blog posts, and resources on LLM security and prompt injection.
Foundational Papers
Prompt Injection Attacks
| Paper | Authors | Year | Key Contribution |
|---|---|---|---|
| Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection | Greshake et al. | 2023 | Foundational paper on indirect prompt injection; attacks on Bing Chat, code assistants |
| Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition | Schulhoff et al. | 2023 | Large-scale prompt injection competition; taxonomy of attack techniques |
| Prompt Injection attack against LLM-integrated Applications | Liu et al. | 2023 | Systematic study of prompt injection in integrated applications |
Defense Techniques
| Paper | Authors | Year | Key Contribution |
|---|---|---|---|
| Defending Against Indirect Prompt Injection Attacks With Spotlighting | Hines et al. (Microsoft) | 2024 | Marks untrusted input via delimiting, datamarking, or encoding; reduces attack success from >50% to <2% |
| StruQ: Defending Against Prompt Injection with Structured Queries | Chen et al. | 2024 | Separates instructions and data into distinct channels via structured queries; model fine-tuned to follow only the instruction channel |
| CaMeL: Defeating Prompt Injections by Design | Google DeepMind | 2025 | Capability-based security architecture; typed data flow |
| Jatmo: Prompt Injection Defense by Task-Specific Finetuning | Piet et al. | 2023 | Fine-tuning models to resist injection |
| Design Patterns for Securing LLM Agents against Prompt Injections | Beurer-Kellner et al. | 2025 | Catalog of architectural design patterns with security-utility tradeoff analysis (ETH / IBM / Google / Microsoft co-authored) |
Jailbreaking & Red Teaming
Key Blog Posts & Articles
Simon Willison's Prompt Injection Series
Essential reading from the person who named and defined prompt injection:
- Prompt injection attacks against GPT-3 (2022) — Original post naming the vulnerability
- Delimiters won't save you from prompt injection (2023) — Why simple defenses fail
- Dual LLM pattern (2023) — Architectural defense pattern
- The Dual LLM pattern for building AI assistants that can resist prompt injection (2023) — Detailed implementation guide
- Full series
Other Notable Posts
- Anthropic: Many-shot jailbreaking (2024) — Long-context attacks
- OpenAI: Prompt injection — OpenAI's acknowledgment that injection is "unlikely to ever be fully solved"
- NCSC — Prompt Injection Is Not SQL Injection (Dec 2025)
- Anthropic — Disrupting AI-Orchestrated Espionage (Sep 2025) — First documented AI-orchestrated cyberattack
- Microsoft MSRC — How Microsoft Defends Against Indirect Prompt Injection (Jul 2025)
- tldrsec — Prompt Injection Defenses (comprehensive catalog)
- HiddenLayer — 2026 AI Threat Landscape Report
Standards & Frameworks
| Resource | Organization | Description |
|---|---|---|
| OWASP Top 10 for LLM Applications (2025) | OWASP | Industry standard risk ranking; LLM01 = Prompt Injection |
| OWASP Top 10 for Agentic Applications (2026) | OWASP | Agentic-specific risks |
| OWASP GenAI Data Security Risks & Mitigations (2026) | OWASP | Data security for GenAI |
| NIST AI Risk Management Framework | NIST | Broader AI risk guidance |
| NIST AI 600-1 — Generative AI Risk Management Profile | NIST | GenAI-specific risk profile |
| NIST SP 800-218A — Secure Software Development for GenAI | NIST | Secure development practices for GenAI |
| MITRE ATLAS | MITRE | Adversarial threat landscape for AI systems — 66 techniques, 46 subtechniques as of Oct 2025 |
Tools Documentation
| Tool | Documentation | Focus |
|---|---|---|
| Vigil ⚠️ inactive since 2023 | Multi-layer detection | YARA, vectors, ML, canaries |
| LLM Guard | Runtime guardrails | Input/output scanning |
| Garak | Red teaming | Vulnerability probing |
| NeMo Guardrails | Dialog control | Colang DSL |
| Rebuff ⚠️ archived May 2025 | Self-hardening detection | Canary tokens |
Datasets
| Dataset | Source | Description |
|---|---|---|
| Vigil Prompt Injection Dataset | deadbits | Embeddings of known attacks |
| HackAPrompt Dataset | Schulhoff et al. | Competition submissions |
| Jailbreak Chat (defunct) | Community | Crowdsourced jailbreaks |
Conference Talks
- DEF CON 31 — Hacking AI: Prompt Injection and More (2023)
- Black Hat USA 2023 — Compromising LLMs: The Indirect Prompt Injection Threat
Real-World Incidents & CVEs
- CVE-2025-34291 — Langflow RCE (CVSS 9.4) — AI agent framework account takeover
- CVE-2025-32711 — Microsoft 365 Copilot EchoLeak — First zero-click attack against an AI agent
- CVE-2025-6514 — mcp-remote OS command injection — Package had over 437K downloads at disclosure
- Slack AI Data Exfiltration (Aug 2024) — Private-channel data exfiltration via indirect prompt injection
- PhantomRaven Supply Chain Attack (2025) — 126 malicious npm packages via remote dynamic dependencies, 86K downloads
Contributing
Found a relevant paper or resource? Open a PR to add it!