Special Year on Large Language Models and Transformers, Part 2

About

This is the second part of a special year-long program on large language models and transformers that spans the 2024–2025 academic year. The spring semester is supported, in part, through a partnership with IVADO.

This program's overarching goal is to understand the ongoing revolution in transformers and large language models (LLMs) through a wide lens, in a relaxed setting that facilitates discussion, debate, and intellectual cross-pollination. At a conceptual level, LLMs profoundly change the landscape for theories of human language, of the brain and computation, and of the nature of human intelligence. In linguistics, they provide a new way to think about grammar, semantics, and conceptual representation. In neuroscience, vector models provide a new approach to computational models of the brain. In cognitive science, they challenge our notions of which elements of human intelligence are essential.

The program will also explore concrete questions about transformers as models of computation. These include algorithmic ideas for reducing the complexity of training to nearly linear in the length of the input, as well as scaling laws that describe how cross-entropy loss scales with model size, data set size, and amount of compute. The program will further explore how scaling laws might help in understanding high-level outcomes such as the emergence of complex skills in LLMs.

At a practical level, it is clear that LLMs will have a profound impact on human society, and issues of alignment, trust, and security will play a central role. Alignment concerns the gap between complex human values and the mechanisms that drive AI decision-making. Related issues include trustworthiness (How do we know the model will do what it is intended to do?); interpretability (Can we identify with certainty why a machine learning algorithm delivers a specific answer?); safety (Can we safeguard against destructive actions by ML algorithms or by humans using them?); security (Can we protect data and systems from adversaries?); and fairness (Can we safeguard against bias?). The legal and regulatory dimension of technological developments in AI, as well as its practical interaction with the capabilities and design of large language models, will be another key area of inquiry.

Three additional workshops took place in Fall 2024, during Part 1 of the program.