Keynote Speakers
Peter Hall Lecture
by Professor Jianqing Fan, Princeton University
Title
Classification and diffusion-induced neural density estimators and simulators for generative AI
Abstract
Neural network-based methods for conditional density estimation have recently gained substantial attention, as various neural density estimators have outperformed classical approaches in real-data experiments. Despite these empirical successes, implementation can be challenging due to the need to ensure non-negativity and unit-mass constraints, and theoretical understanding remains limited. In particular, it is unclear whether such estimators can adaptively achieve faster convergence rates when the underlying density exhibits a low-dimensional structure. This paper addresses these gaps by proposing a structure-agnostic neural density estimator, called the classification-induced neural density estimator and simulator (CINDES), that is straightforward to implement and provably adaptive, attaining faster rates when the true density admits a low-dimensional composition structure. Another key contribution of our work is to show that the proposed estimator integrates naturally into generative sampling pipelines, most notably score-based diffusion models, where it achieves provably faster convergence when the underlying density is structured. We validate its performance through extensive simulations and a real-data application. In a separate work, we also prove the optimality of score-based diffusion models for density estimation when the target density admits a factorizable, low-dimensional, nonparametric structure. The main challenge is that the low-dimensional, factorizable structure no longer holds for most diffused timesteps, and it is very difficult to show that these diffused score functions can be well approximated without a significant increase in the number of network parameters.
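As background, the classification route to density estimation can be illustrated with a generic density-ratio construction (a standard trick, not necessarily the specific construction used in CINDES): train a probabilistic classifier to distinguish samples from the unknown target density p from samples of a known reference density q; at the classifier's optimum the logit equals log p(x) − log q(x), so p(x) = q(x) · exp(logit(x)), with no explicit handling of non-negativity or unit-mass constraints. A minimal pure-Python sketch, with all sample sizes and learning rates chosen arbitrarily:

```python
import math
import random

random.seed(0)

def phi(x):
    # quadratic features suffice here because log(p/q) is quadratic in x
    return [1.0, x, x * x]

def sigmoid(t):
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

# label 1: samples from the unknown target p = N(0, 1)
# label 0: samples from a known reference q = Uniform(-4, 4), so q(x) = 1/8
n = 500
data = [(random.gauss(0.0, 1.0), 1) for _ in range(n)]
data += [(random.uniform(-4.0, 4.0), 0) for _ in range(n)]

# plain logistic regression fit by full-batch gradient descent
w = [0.0, 0.0, 0.0]
lr = 0.2
for _ in range(2000):
    grad = [0.0, 0.0, 0.0]
    for x, y in data:
        f = phi(x)
        s = sigmoid(sum(wj * fj for wj, fj in zip(w, f)))
        for j in range(3):
            grad[j] += (s - y) * f[j]
    for j in range(3):
        w[j] -= lr * grad[j] / (2 * n)

def p_hat(x):
    # density-ratio trick: p(x) = q(x) * exp(logit(x)) for balanced classes
    logit = sum(wj * fj for wj, fj in zip(w, phi(x)))
    return (1.0 / 8.0) * math.exp(logit)
```

At the optimum, p_hat(0) should be near the standard normal density 1/sqrt(2*pi) ≈ 0.40, and the estimate decays in the tails, even though the classifier was never told to produce a valid density.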
(Joint work with Yihong Gu, Dehao Dai, Mukherjee, and Ximing Li)
Biography
Jianqing Fan, an Academician of Academia Sinica and a member of the Royal Academy of Belgium, is the Frederick L. Moore Professor at Princeton University. He was previously a professor at UNC-Chapel Hill, UCLA, and the Chinese University of Hong Kong, and served as president of the Institute of Mathematical Statistics and of the International Chinese Statistical Association. He is a joint editor of the Journal of the American Statistical Association and was co-editor of The Annals of Statistics, Probability Theory and Related Fields, the Econometrics Journal, the Journal of Econometrics, and the Journal of Business and Economic Statistics. His research interests include high-dimensional statistics, data science, machine learning, the mathematics of AI, financial economics, and computational biology. He has coauthored four books and published over 300 papers. His work has been recognized by the 2000 COPSS Presidents' Award, the Morningside Gold Medal of Applied Mathematics, a Guggenheim Fellowship, the P. L. Hsu Prize, the Guy Medal in Silver, the Noether Distinguished Scholar Award, the Le Cam Award and Lecture, the Frontiers of Science Award, and the Wald Memorial Award and Lecture, as well as by election as a Fellow of the American Association for the Advancement of Science, the Institute of Mathematical Statistics, the American Statistical Association, and the Society for Financial Econometrics.
Keynote Lecture
by Professor Daniela Witten, University of Washington
Title
Testing hypotheses via orthogonalization
Abstract
Classical hypothesis testing frameworks break down in contemporary settings in which null hypotheses are increasingly abstract, the same data are used to both generate and test hypotheses, and minimal assumptions about the underlying data are made. In this work, we propose a new framework for conducting valid hypothesis tests in broad contexts. We propose to add and subtract external noise generated from a symmetric shift-family to our data, X, to partition it into two pieces, X1 and X2. We provide a generic strategy for orthogonalizing X2 against X1 under the null hypothesis H0, then show that testing whether the orthogonalization was successful provides a valid test of H0 under mild assumptions. Remarkably, this framework extends naturally to the post-selection inference setting with minimal modifications: we simply select a hypothesis on X1, then perform orthogonalization under the selected null. As our approach neither requires pre-specification of the selection mechanism, nor is restricted to a small class of data-generating distributions, it dramatically expands the settings for which valid post-selection inference can be conducted. We showcase the flexibility of our proposal in a number of case studies. This is joint work with Ameer Dharamshi (University of Washington).
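A well-known special case of this add-and-subtract construction is the Gaussian one (shown here purely as an illustration with a known noise level, not as the talk's general shift-family machinery): if X ~ N(mu, sigma^2) and Z ~ N(0, sigma^2) is external noise, then X1 = X + Z and X2 = X − Z are independent, since Cov(X+Z, X−Z) = Var(X) − Var(Z) = 0 and the pair is jointly Gaussian. A hypothesis can then be selected using X1 and tested using X2. A quick empirical check in pure Python:

```python
import math
import random

random.seed(1)

mu, sigma, n = 3.0, 2.0, 20000

# external noise with the same (known) variance as the data
x  = [random.gauss(mu, sigma) for _ in range(n)]
z  = [random.gauss(0.0, sigma) for _ in range(n)]
x1 = [a + b for a, b in zip(x, z)]   # piece used to select a hypothesis
x2 = [a - b for a, b in zip(x, z)]   # piece used to test it

def mean(v):
    return sum(v) / len(v)

def corr(u, v):
    mu_u, mu_v = mean(u), mean(v)
    cov = mean([(a - mu_u) * (b - mu_v) for a, b in zip(u, v)])
    sd_u = math.sqrt(mean([(a - mu_u) ** 2 for a in u]))
    sd_v = math.sqrt(mean([(b - mu_v) ** 2 for b in v]))
    return cov / (sd_u * sd_v)

# Cov(X1, X2) = Var(X) - Var(Z) = 0, so the empirical correlation is
# near zero, while each piece still centers on mu.
```

Matching the noise variance to the data variance is what makes the two pieces independent; a mismatched variance leaves residual dependence between X1 and X2.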
Biography
Daniela Witten is a professor of Statistics and Biostatistics at the University of Washington, and the Dorothy Gilford Endowed Chair in Mathematical Statistics. She develops statistical machine learning methods for high-dimensional data, with a focus on unsupervised learning.
She has received a number of awards for her research in statistical machine learning: most notably the Spiegelman Award from the American Public Health Association for a (bio)statistician under age 40, and the Presidents’ Award from the Committee of Presidents of Statistical Societies for a statistician under age 41.
Daniela is a co-author of the textbook “An Introduction to Statistical Learning”, and previously served as Joint Editor of the Journal of the Royal Statistical Society, Series B.
Keynote Lecture
by Professor Xuming He, Washington University in St. Louis
Title
Distributed-oracle estimation for high-dimensional quantile regression
Abstract
Quantile regression (QR) is a valuable tool for analyzing heterogeneous covariate effects across the entire outcome distribution, including the lower and upper tails. However, implementing QR in high-dimensional settings where data are decentralized presents computational and communication hurdles. We propose a communication-efficient estimator for high-dimensional QR designed for data distributed across multiple machines. To use folded-concave penalties, we develop an iterative multi-step (IM) algorithm utilizing a surrogate smoothed quantile loss. This approach effectively balances statistical efficiency with communication constraints. To provide a theoretical foundation for our method, we introduce the concept of a distributed oracle estimator and demonstrate that the IM estimator converges to this oracle with high probability. Furthermore, we extend our framework to enable distributed inference for specific low-dimensional components of interest. This talk is based on joint work with Songshan Yang, Yifan Gu, and Hangfang Yang.
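The role of a smoothed surrogate can be illustrated generically (this sketch uses sigmoid smoothing of the quantile check loss for a single intercept; the talk's surrogate loss and distributed IM algorithm are more involved): the check loss rho_tau(u) = u(tau − 1{u<0}) has a non-differentiable kink at zero, and replacing its subgradient tau − 1{u<0} with the smooth score tau − sigmoid(−u/h) lets plain gradient methods recover a quantile.

```python
import math
import random

random.seed(2)

def sigmoid(t):
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def smoothed_quantile(xs, tau, h=0.05, lr=0.5, iters=300):
    """Estimate the tau-quantile by gradient ascent on a smoothed score."""
    theta = 0.0
    for _ in range(iters):
        # smoothed score: tau - sigmoid(-(x - theta)/h) ~ tau - 1{x < theta};
        # its root is (approximately) the tau-quantile of the sample
        g = sum(tau - sigmoid(-(x - theta) / h) for x in xs) / len(xs)
        theta += lr * g
    return theta

xs = [random.gauss(0.0, 1.0) for _ in range(5000)]
med = smoothed_quantile(xs, 0.5)   # population value: 0
q90 = smoothed_quantile(xs, 0.9)   # population value: about 1.2816
```

The bandwidth h controls the trade-off: larger h gives a smoother, easier optimization problem at the cost of a small bias relative to the exact quantile, which is the basic tension that a smoothed quantile loss must manage.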
Biography
Xuming He is the Kotzubei-Beckmann Distinguished Professor and Inaugural Chair of Statistics and Data Science at Washington University in St. Louis. His research focuses on robust statistics, semiparametric regression, Bayesian inference, and post-selection inference. His interdisciplinary work promotes the application of statistics and data science across various fields, including biosciences, public health, and socioeconomic studies.
He is an elected Fellow of the American Association for the Advancement of Science (AAAS), the American Statistical Association (ASA), and the Institute of Mathematical Statistics (IMS). He is Past President of the International Statistical Institute (ISI), and currently serves as a Joint Editor of the Journal of the Royal Statistical Society – Series B. His recent honors and awards include the IMS Carver Medal (2022), ASA Founders Award (2021), and the Gottfried E. Noether Distinguished Scholar Award (2025) from the ASA.


