Private Retrieval-Augmented Generation

Abstract

Retrieval-Augmented Generation (RAG) is an essential component of the LLM pipeline: it looks up information related to a prompt in order to improve the accuracy of the generated text. RAG databases and RAG queries are often sensitive, yet existing mechanisms for search over encrypted data do not provide the high semantic accuracy that RAG requires. In this talk, I will present Compass, a semantic search system over encrypted data whose accuracy is comparable to that of state-of-the-art unencrypted RAG algorithms. Compass protects data, queries, and search results from a fully compromised server. At the core of Compass is a co-design of cryptographic and machine learning algorithms: Compass contributes a cryptography-friendly algorithm for traversing the graph of Hierarchical Navigable Small World (HNSW), a top-performing nearest-neighbor search index, on top of a white-box adaptation of Oblivious RAM (ORAM). Through these techniques, Compass achieves user-perceived latencies of a few seconds and is orders of magnitude faster than baselines for encrypted embedding search.
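
For readers unfamiliar with graph-based nearest-neighbor search, the following is a minimal plaintext sketch of best-first traversal on a single-layer proximity graph, the kind of search that an HNSW-style index performs within one layer. It is not Compass's algorithm: there is no encryption and no ORAM here, and the function name, the `ef` and `k` parameters, and the toy data are assumptions made for illustration. Note that the sequence of nodes visited depends on the query, which is the kind of access pattern an oblivious memory layer is meant to hide.

```python
import heapq
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_search(vectors, neighbors, query, entry, ef=4, k=2):
    """Best-first search on a proximity graph (illustrative, plaintext only).

    vectors:   dict node_id -> embedding (tuple of floats)
    neighbors: dict node_id -> list of adjacent node_ids
    ef:        size of the working result set (hypothetical parameter name)
    k:         number of nearest node ids to return
    """
    visited = {entry}
    d0 = l2(vectors[entry], query)
    candidates = [(d0, entry)]  # min-heap: closest unexplored node first
    best = [(-d0, entry)]       # max-heap (negated): ef closest nodes so far

    while candidates:
        dist, node = heapq.heappop(candidates)
        # Stop once the closest candidate is worse than the worst kept result.
        if len(best) >= ef and dist > -best[0][0]:
            break
        for nb in neighbors[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = l2(vectors[nb], query)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the farthest kept result

    ranked = sorted((-neg_d, n) for neg_d, n in best)
    return [n for _, n in ranked][:k]


# Toy data (assumed): five points on a line, linked as a chain.
vectors = {i: (float(i), 0.0) for i in range(5)}
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

result = greedy_search(vectors, neighbors, (3.2, 0.0), entry=0, ef=2, k=2)
```

On this toy chain, starting from node 0 and querying near 3.2, the search walks along the chain and returns nodes 3 and 4, the two points closest to the query.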