Speaker: Nhat Ho
Affiliation: University of Texas, Austin, USA


The growth in scope and complexity of modern data sets presents the field of statistics and data science with numerous inferential and computational challenges, among them how to deal with various forms of heterogeneity. Latent variable models provide a principled approach to modeling heterogeneous collections of data. However, due to over-parameterization, parameter estimation and the recovery of latent structures in these models have been observed to exhibit non-standard statistical and computational behavior. In this talk, we provide new insights into this behavior under mixture models, a building block of latent variable models.

From the statistical viewpoint, we propose a general framework, based on the Wasserstein distance, for studying the convergence rates of parameter estimation in mixture models. Our study makes explicit the links between model singularities, convergence rates of parameter estimation, and the algebraic geometry of the parameter space for mixtures of continuous distributions. Building on these Wasserstein convergence rates, we propose a novel Merge-Truncate-Merge procedure that consistently estimates the true number of components in mixture models.
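To make the Wasserstein viewpoint concrete, here is a minimal illustrative sketch (not the exact construction from the talk): the first-order Wasserstein distance between two discrete mixing measures on the real line, computed from the closed-form identity W1(P, Q) = ∫ |F_P(x) − F_Q(x)| dx. The atom/weight representation and the example measures are assumptions chosen for illustration.

```python
def wasserstein_1d(atoms_p, weights_p, atoms_q, weights_q):
    """First-order Wasserstein distance between two discrete measures on R,
    computed as the integral of |F_P(x) - F_Q(x)| over the real line."""
    points = sorted(set(atoms_p) | set(atoms_q))
    total = 0.0
    for left, right in zip(points, points[1:]):
        # Both CDFs are constant on [left, right); evaluate them at `left`.
        cdf_p = sum(w for a, w in zip(atoms_p, weights_p) if a <= left)
        cdf_q = sum(w for a, w in zip(atoms_q, weights_q) if a <= left)
        total += abs(cdf_p - cdf_q) * (right - left)
    return total

# Example: a true single-atom mixing measure versus an over-specified fit
# that splits the same mass across two nearby atoms.
G_true = ([0.0], [1.0])
G_fit = ([-0.1, 0.1], [0.5, 0.5])
print(wasserstein_1d(*G_fit, *G_true))
```

The point of measuring distances between mixing measures (rather than parameters directly) is that it remains meaningful when the fitted model has more components than the truth, as in the over-fitted example above.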

From the computational side, we study the non-asymptotic behavior of the Expectation-Maximization (EM) algorithm, which is a stable method, and of higher-order, unstable algorithms, including Newton's method, under over-specified settings of mixture models, in which the likelihood need not be strongly concave or, equivalently, the Fisher information matrix may be singular. Focusing on the simple setting of a two-component mixture with equal weights fit to data from a multivariate Gaussian distribution, we demonstrate that EM updates converge to a fixed point at Euclidean distance O((d/n)^{1/4}) from the true parameter after O((n/d)^{1/2}) steps, where d is the dimension and n the sample size. In contrast, Newton's method achieves the same statistical accuracy as the EM algorithm in exponentially fewer steps: the number of iterations drops from O((n/d)^{1/2}) to O(log(n/d)).
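A minimal one-dimensional sketch of the over-specified setting described above (d = 1; the sample size and seed are assumptions for illustration): fit the symmetric mixture 0.5·N(theta, 1) + 0.5·N(−theta, 1) to data actually drawn from a single N(0, 1). For this model the EM iteration reduces to the closed-form update theta' = (1/n) Σ x_i · tanh(theta · x_i), and the iterates creep toward the true value theta = 0 at the slow, sublinear rate the talk describes.

```python
import math
import random

def em_update(theta, xs):
    # One EM step for fitting 0.5*N(theta,1) + 0.5*N(-theta,1).
    # The E-step posterior weights collapse into the tanh factor,
    # giving the closed-form update theta' = mean of x * tanh(theta * x).
    return sum(x * math.tanh(theta * x) for x in xs) / len(xs)

random.seed(0)
# True model is a single N(0,1): the two-component fit is over-specified,
# so the Fisher information at the truth (theta = 0) is singular.
xs = [random.gauss(0.0, 1.0) for _ in range(5000)]

theta = 1.0
for _ in range(200):
    theta = em_update(theta, xs)
# theta shrinks slowly toward 0; its eventual scale is of order (1/n)^{1/4},
# matching the O((d/n)^{1/4}) statistical radius, rather than O((d/n)^{1/2}).
```

Because the log-likelihood is locally flat at theta = 0, each EM step contracts only weakly, which is why O((n/d)^{1/2}) iterations are needed; a second-order method exploits the flat curvature directly and needs only O(log(n/d)) steps.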

Our result shows that unstable algorithms can be preferable to their stable counterparts in this simple mixture setting. As a consequence, it refutes the belief that algorithmic stability is necessary for obtaining statistically efficient estimators.

About Statistics, modelling and operations research seminars

Students, staff and visitors to UQ are welcome to attend our regular seminars.

The events are jointly run by our Operations research and Statistics and probability research groups.

The Statistics, modelling and operations research (SMOR) Seminar series seeks to celebrate and disseminate research and developments across the broad spectrum of quantitative sciences. The SMOR series provides a platform for communication of both theoretical and practical developments, as well as interdisciplinary topics relating to applied mathematics and statistics.


Via Zoom (https://uqz.zoom.us/j/85172010876)