Representation, optimisation and generalisation in deep learning

Presenter: Professor Peter Bartlett, UC Berkeley

Deep neural networks have improved state-of-the-art performance for prediction problems across an impressive range of application areas, and they have become a central ingredient in AI systems. This talk considers factors that affect their performance, describing some recent results in two directions. First, we investigate the impact of depth on representation and optimization properties of these networks. We focus on deep residual networks, which have been widely adopted for computer vision applications because they exhibit fast training, even for very deep networks. We show that as the depth of these networks increases, they are able to represent a smooth invertible map with a simpler representation at each layer, and that this implies a desirable property of the functional optimization landscape that arises from regression with deep function compositions: stationary points are global optima. Second, we consider the generalization behavior of deep networks, that is, how their performance on training data compares to predictive accuracy. In particular, we aim to understand how to measure the complexity of functions computed by these networks. For multiclass classification problems, we present a margin-based generalization bound that scales with a certain margin-normalized "spectral complexity," involving the product of the spectral norms of the weight matrices in the network. We show how the bound gives insight into the observed performance of these networks in practical problems.

Joint work with Steve Evans and Phil Long, and with Matus Telgarsky and Dylan Foster.

Biography - Peter Bartlett:

Peter Bartlett is a professor in the Computer Science Division and Department of Statistics and Associate Director of the Simons Institute for the Theory of Computing at the University of California at Berkeley. His research interests include machine learning and statistical learning theory. He is the co-author, with Martin Anthony, of the book Neural Network Learning: Theoretical Foundations. He has served as an associate editor of the journals Bernoulli, the Journal of Artificial Intelligence Research, the Journal of Machine Learning Research, the IEEE Transactions on Information Theory, Machine Learning, and Mathematics of Control Signals and Systems, and as program committee co-chair for COLT and NIPS. He was awarded the Malcolm McIntosh Prize for Physical Scientist of the Year in Australia in 2001, and was chosen as an Institute of Mathematical Statistics Medallion Lecturer in 2008, and an IMS Fellow and Australian Laureate Fellow in 2011. He was elected to the Australian Academy of Science in 2015.

About Maths Colloquium

The Mathematics Colloquium is directed at students and academics working in the fields of pure and applied mathematics, and statistics.

We aim to present expository lectures that appeal to our wide audience.

Information for speakers

Maths colloquia are usually held on Mondays, from 2pm to 3pm, in various locations at St Lucia.

Presentations are 50 minutes, plus five minutes for questions and discussion.

Available facilities include:

computer
data projector
chalkboard or whiteboard

To avoid technical difficulties on the day, please contact us in advance of your presentation to discuss your requirements.

Venue

Hawken Engineering Building (50)

Room:

N202