Skip to content

References

Academic papers, blog posts, and resources on LLM security and prompt injection.


Foundational Papers

Prompt Injection Attacks

Paper Authors Year Key Contribution
Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection Greshake et al. 2023 Foundational paper on indirect prompt injection; attacks on Bing Chat, code assistants
Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition Schulhoff et al. 2023 Large-scale prompt injection competition; taxonomy of attack techniques
Prompt Injection attack against LLM-integrated Applications Liu et al. 2023 Systematic study of prompt injection in integrated applications

Defense Techniques

Paper Authors Year Key Contribution
Defending Against Indirect Prompt Injection Attacks With Spotlighting Hines et al. (Microsoft) 2024 Random delimiter defense; reduces attack success from >50% to <2%
StruQ: Defending Against Prompt Injection with Structured Queries Chen et al. 2024 Structured data extraction as defense
CaMeL: Defeating Prompt Injections by Design Google DeepMind 2025 Capability-based security architecture; typed data flow
Jatmo: Prompt Injection Defense by Task-Specific Finetuning Piet et al. 2023 Fine-tuning models to resist injection
Design Patterns for Securing LLM Agents against Prompt Injections Beurer-Kellner et al. 2025 Catalog of architectural design patterns with security-utility tradeoff analysis (ETH / IBM / Google / Microsoft co-authored)

Jailbreaking & Red Teaming

Paper Authors Year Key Contribution
Garak: A Framework for Security Probing Large Language Models NVIDIA 2024 Comprehensive LLM vulnerability scanner
Universal and Transferable Adversarial Attacks on Aligned Language Models Zou et al. 2023 Automated adversarial suffix generation
Do Anything Now: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on LLMs Shen et al. 2023 Analysis of jailbreak techniques in the wild
Zhan et al. — Adaptive Attacks Break Defenses Against Indirect Prompt Injection (NAACL 2025) Zhan et al. 2025 Adaptive attacks defeating current defenses
Heverin et al. — Systematically Analysing Prompt Injection Vulnerabilities in Diverse LLM Architectures (2025) Heverin et al. 2025 Systematic analysis across architectures
Fu et al. — Imprompter: Tricking LLM Agents into Improper Tool Use Fu et al. 2025 Attacks on agent tool use
PoisonedRAG — Knowledge Corruption Attacks to RAG (USENIX Security 2025) Zou et al. 2025 5 crafted documents can manipulate AI responses 90% of the time

Key Blog Posts & Articles

Simon Willison's Prompt Injection Series

Essential reading from the person who named and defined prompt injection:

Other Notable Posts


Standards & Frameworks

Resource Organization Description
OWASP Top 10 for LLM Applications (2025) OWASP Industry standard risk ranking; LLM01 = Prompt Injection
OWASP Top 10 for Agentic Applications (2026) OWASP Agentic-specific risks
OWASP GenAI Data Security Risks & Mitigations (2026) OWASP Data security for GenAI
NIST AI Risk Management Framework NIST Broader AI risk guidance
NIST AI 600-1 — Generative AI Risk Management Profile NIST GenAI-specific risk profile
NIST SP 800-218A — Secure Software Development for GenAI NIST Secure development practices for GenAI
MITRE ATLAS MITRE Adversarial threat landscape for AI systems — 66 techniques, 46 subtechniques as of Oct 2025

Tools Documentation

Tool Documentation Focus
Vigil ⚠️ inactive since 2023 Multi-layer detection YARA, vectors, ML, canaries
LLM Guard Runtime guardrails Input/output scanning
Garak Red teaming Vulnerability probing
NeMo Guardrails Dialog control Colang DSL
Rebuff ⚠️ archived May 2025 Self-hardening detection Canary tokens

Datasets

Dataset Source Description
Vigil Prompt Injection Dataset deadbits Embeddings of known attacks
HackAPrompt Dataset Schulhoff et al. Competition submissions
Jailbreak Chat (defunct) Community Crowdsourced jailbreaks

Conference Talks


Real-World Incidents & CVEs


Contributing

Found a relevant paper or resource? Open a PR to add it!