Abstract
I will present my recent work on constructing generalization bounds for deep neural networks, with the goal of understanding existing learning algorithms and proposing new ones. The tightness of generalization bounds varies widely: it depends on the complexity of the learning task and the amount of data available, but also on how much information about the data and the learning algorithm the bound takes into account. My work is particularly concerned with data- and algorithm-dependent bounds that are numerically nonvacuous. I will first present computational techniques that evaluate PAC-Bayes bounds built from parameters obtained by stochastic gradient descent (SGD), and discuss the limitations of these bounds for particular choices of prior and posterior distributions over the parameters. I will then describe my recent progress on tightening these bounds by constructing data-distribution-dependent priors using the training data.
Joint work with Daniel M. Roy (University of Toronto, Vector Institute), Waseem Gharbieh (Element AI), Alexander Lacoste (Element AI), and Chin Wei (MILA and Element AI).
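For background, a minimal sketch of the kind of bound involved: the generic PAC-Bayes-kl inequality, which is standard in the literature and not specific to the results above; all notation below (S, m, delta, Q, P, the risks) is assumed for illustration rather than taken from the abstract.

% With probability at least 1 - \delta over an i.i.d. training sample S of size m,
% simultaneously for all posterior distributions Q over the network parameters:
\[
  \mathrm{kl}\!\left( \hat{L}_S(Q) \,\middle\|\, L_{\mathcal{D}}(Q) \right)
  \;\le\;
  \frac{\mathrm{KL}(Q \,\|\, P) + \ln \frac{2\sqrt{m}}{\delta}}{m}
\]
% Here \hat{L}_S(Q) is the empirical risk of the randomized predictor Q,
% L_{\mathcal{D}}(Q) its population risk, P the prior, and
% kl(q \| p) = q \ln(q/p) + (1-q)\ln((1-q)/(1-p)) the binary KL divergence.
% Numerically nonvacuous bounds require keeping KL(Q \| P) small, e.g. by centering Q
% at SGD solutions and, as in the work described above, choosing P using the training data.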