Abstract

Prompt injection attacks are a significant threat to the security of LLM-integrated applications.  These attacks exploit the lack of a clear separation between instructions/prompts and user data.  I will introduce the notion of structured queries, a general approach to tackle this problem by explicitly separating prompt and data and training LLMs to respect this separation.  I will describe how to adjust standard instruction tuning to respect this separation, and show the resulting models provide significant improvements in robustness against prompt injection.