Abstract

The talk will present evidence that today's large language models (LLMs) display somewhat deeper "understanding" than one would naively expect. This understanding concerns their own "skills".

1. When asked to solve a task by combining a set of k simpler skills (a "test of compositional capability"), they are able to do so even when those particular combinations of skills never appeared in their training data.

2. They show an ability to reason about their own learning processes [Didolkar, Goyal et al'24], which is reminiscent of "metacognitive knowledge" [Flavell'76] in humans. For instance, given examples of an evaluation task, they can produce a catalog of suitably named skills relevant to solving each example of that task. This catalog is meaningful in the sense that incorporating it into training and reasoning pipelines improves performance on that task, including for other, unrelated LLMs (a minimal illustrative sketch follows this list). They can also generate powerful synthetic datasets for instruction-following [Kaur, Park et al'24] and MATH [Shah et al'24].
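
The Python sketch below illustrates, rather than reproduces, the kind of pipeline described in item 2: an LLM is prompted to name the skills behind each example of a task, the names are collected into a catalog, and the catalog is then used to probe combinations of k skills as in item 1. The function names, prompts, and the `llm` callable are hypothetical stand-ins, not the interfaces or prompts used in the cited papers.

```python
# Illustrative sketch only: `llm` is a hypothetical stand-in for any
# text-completion call, and the prompts and skill names are invented,
# not taken from the cited papers.
import json
import random
from typing import Callable, List


def extract_skill_catalog(llm: Callable[[str], str], examples: List[str]) -> List[str]:
    """Ask the model to name the skills each example requires, then deduplicate."""
    catalog = set()
    for example in examples:
        reply = llm(
            "Return a JSON array of short names for the distinct skills needed "
            f"to solve this problem:\n\n{example}"
        )
        catalog.update(json.loads(reply))
    return sorted(catalog)


def compositional_probe(llm: Callable[[str], str], catalog: List[str], k: int = 3) -> str:
    """Ask the model to demonstrate a random combination of k skills at once,
    a combination it is unlikely to have seen verbatim during training."""
    skills = random.sample(catalog, k)
    return llm(
        "Write a short piece of text that simultaneously demonstrates these skills, "
        f"labeling where each one is used: {', '.join(skills)}."
    )
```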

We discuss mechanisms by which such complex understanding could arise, including a theory by [Arora, Goyal'23] that tries to explain (1).

Video Recording