This is the first part of a special year-long program on LLMs and Transformers that spans AY 24-25. The program is inspired by the success of the Simons Workshop on LLMs and Transformers held in August 2023.
The overarching goal of this program is to understand the ongoing revolution in transformers and large language models (LLMs) through a wide lens, in a relaxed setting that facilitates discussion, debate, and intellectual cross-pollination. At a conceptual level, LLMs profoundly change the landscape for theories of human language, of the brain and computation, and of the nature of human intelligence. In linguistics, they provide a new way to think about grammar, semantics, and conceptual representation. In neuroscience, vector models provide a new approach to computational models of the brain. In cognitive science, they challenge our notions of what the essential elements of human intelligence are.
The program will also explore concrete questions about transformers as models of computation. These include algorithmic ideas for reducing the complexity of training to nearly linear in the length of the input, as well as scaling laws that describe how cross-entropy loss scales with model size, dataset size, and amount of compute. The program will also explore how scaling laws might help in understanding high-level outcomes such as the emergence of complex skills in LLMs.
At a practical level, it is clear that LLMs will have a profound impact on human society, and issues of alignment, trust, and security will play a central role. Alignment refers to the gap between complex human values and the mechanisms that drive AI decision-making. Related issues include trustworthiness (how do we know the model will do what it is intended to do?), interpretability (can we identify with certainty why a machine learning algorithm delivers a specific answer?), safety (can we safeguard against destructive actions by ML algorithms or by humans using them?), security (can we protect data and systems from adversaries?), and fairness (can we safeguard against bias?). The legal and regulatory dimension of technological developments in AI, and its practical interaction with the capabilities and design of large language models, will be another key area of inquiry.
Three workshops will be held in the Fall as part of this program (dates TBD):
- Boot Camp
- Workshop 1: Transformers as a computational model
- Workshop 2: Alignment, trust, watermarking, and copyright issues in LLMs