Description

In their legendary World War II effort to decipher the Enigma code, I.J. Good and Alan Turing derived an equally enigmatic estimator for the probabilities of unlikely and even unseen events. It estimates the probability of events by considering not just the number of times they appeared in the sample, but also how many times other, possibly unrelated, events were observed. Though not well understood for over half a century, the Good-Turing estimator has proved invaluable in practice and is used in a variety of applications, including natural-language processing, bioinformatics and ecology. We will review the estimator, its early heuristic explanations, rigorous proofs of its efficacy that emerged over the past few years, the best possible performance of any estimator, and some unexpected implications.
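
As a rough illustration of the idea (not taken from the lecture itself), the basic, unsmoothed form of the Good-Turing estimator can be sketched in Python. The function name `good_turing` and the toy sample are assumptions made here for illustration. The sketch assigns an item seen r times the probability (r + 1) * N_{r+1} / (N_r * n), where N_r is the number of distinct items seen exactly r times and n is the sample size, and it assigns all unseen items together the probability N_1 / n.

```python
from collections import Counter


def good_turing(sample):
    """Basic, unsmoothed Good-Turing estimates (illustrative sketch only).

    Returns (missing_mass, estimates), where missing_mass is the total
    probability assigned to all unseen items and estimates maps each
    observed item to its estimated probability.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must be non-empty")

    counts = Counter(sample)                 # r: how often each item appeared
    freq_of_freq = Counter(counts.values())  # N_r: number of items seen exactly r times

    # Total probability of everything not yet observed: N_1 / n.
    missing_mass = freq_of_freq[1] / n

    estimates = {}
    for item, r in counts.items():
        # The estimate for this item depends on how many OTHER items were
        # seen r and r + 1 times, not only on its own count.
        # Counter returns 0 for absent keys, so the estimate is 0 when
        # no item was seen exactly r + 1 times.
        estimates[item] = (r + 1) * freq_of_freq[r + 1] / (freq_of_freq[r] * n)
    return missing_mass, estimates


# Toy sample (hypothetical): "a" seen 3 times, "b" twice, "c" and "d" once.
missing_mass, estimates = good_turing(list("aaabbcd"))
```

In this unsmoothed form the most frequent item receives an estimate of zero, because no item was seen one more time than it was. Practical implementations therefore usually smooth the N_r values before applying the formula.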

Light refreshments will be served before the lecture at 3:30 p.m.

Sections of the recording containing clips from "The Imitation Game" have been removed from the YouTube version. The full recording can be downloaded via the link below.
