Challenges in Making LLMs Safe and Robust

In her presentation at the Large Language Models and Transformers, Part 1 Boot Camp, Aditi Raghunathan (CMU) addresses the root causes of numerous safety concerns and wide-ranging attacks on current large language models. Using a simple illustrative problem, she walks through several defense strategies, evaluates their strengths and weaknesses, and draws connections to the broader literature on safety and robustness in machine learning.