Challenges in Making LLMs Safe and Robust

In her presentation at the Large Language Models and Transformers, Part 1 Boot Camp, Aditi Raghunathan (CMU) addresses the root causes of numerous safety concerns and wide-ranging attacks on current large language models. Using a simple illustrative problem, she walks through several defense strategies, evaluates their strengths and weaknesses, and draws connections to the broader literature on safety and robustness in machine learning.