# Previous Wellner Lectures

## The Spring 2024 Jon A. Wellner Lecture

Tuesday, January 30, 2024

2:30 p.m. – Lecture in TLC 029

3:30 p.m. – Refreshments in the IRIC atrium

### Title

Data thinning and its applications

### Speaker

**Daniela Witten, Ph.D.**

Professor of Biostatistics and Dorothy Gilford Endowed Chair in Mathematical Statistics, University of Washington

We propose data thinning, a new approach for splitting an observation from a known distributional family with unknown parameter(s) into two or more independent parts that sum to yield the original observation, and that follow the same distribution as the original observation, up to a (known) scaling of a parameter. This proposal is very general, and can be applied to a broad class of distributions within the natural exponential family, including the Gaussian, Poisson, negative binomial, Gamma, and binomial distributions, among others. Furthermore, we generalize data thinning to enable splitting an observation into two or more parts that can be combined to yield the original observation using an operation other than addition; this enables the application of data thinning far beyond the natural exponential family. Data thinning has a number of applications to model selection, evaluation, and inference. For instance, cross-validation via data thinning provides an attractive alternative to the “usual” approach of cross-validation via sample splitting, especially in unsupervised settings in which the latter is not applicable. We will present an application of data thinning to single-cell RNA-sequencing data, in a setting where sample splitting is not applicable. This is joint work with Anna Neufeld (Fred Hutch), Ameer Dharamshi (University of Washington), Lucy Gao (University of British Columbia), and Jacob Bien (University of Southern California).
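In the Poisson case, the thinning described in the abstract takes a particularly simple form: a binomial split of each count yields two independent Poisson parts that sum to the original, with the rate parameter scaled by a known factor. A minimal sketch (the split fraction `eps = 0.3` is an illustrative choice, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

def thin_poisson(x, eps):
    """Split X ~ Poisson(lam) into independent parts
    X1 ~ Poisson(eps * lam) and X2 ~ Poisson((1 - eps) * lam),
    with X1 + X2 = X exactly."""
    x1 = rng.binomial(x, eps)  # each unit of the count is assigned to part 1 w.p. eps
    return x1, x - x1

# Thin a Poisson sample and check the parts sum to the original observation.
x = rng.poisson(lam=10.0, size=100_000)
x1, x2 = thin_poisson(x, eps=0.3)
assert np.all(x1 + x2 == x)
```

One part can then be used to fit a model and the other to evaluate it, which is what makes the construction a substitute for sample splitting in unsupervised settings.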

Daniela Witten is a professor of Statistics and Biostatistics at University of Washington, and the Dorothy Gilford Endowed Chair in Mathematical Statistics. She develops statistical machine learning methods for high-dimensional data, with a focus on unsupervised learning.

She has received a number of awards for her research in statistical machine learning: most notably the Spiegelman Award from the American Public Health Association for a (bio)statistician under age 40, and the Presidents' Award from the Committee of Presidents of Statistical Societies for a statistician under age 41.

Daniela is a co-author of the textbook "Introduction to Statistical Learning" and, beginning in 2023, serves as Joint Editor of the Journal of the Royal Statistical Society, Series B.

## The Fall 2022 Jon A. Wellner Lecture

Thursday, September 29, 2022

3 p.m. – Lecture in TLC 029

4 p.m. – Refreshments in the IRIC Atrium

### Title

Inference for Longitudinal Data After Adaptive Sampling

### Speaker

**Susan A. Murphy, Ph.D.**

Mallinckrodt Professor of Statistics and of Computer Science,

Radcliffe Alumnae Professor at the Radcliffe Institute,

Harvard University

Adaptive sampling methods, such as reinforcement learning (RL) and bandit algorithms, are increasingly used for the real-time personalization of interventions in digital applications like mobile health and education. As a result, there is a need to be able to use the resulting adaptively collected user data to address a variety of inferential questions, including questions about time-varying causal effects. However, current methods for statistical inference on such data (a) make strong assumptions regarding the environment dynamics, e.g., assume the longitudinal data follows a Markovian process, or (b) require data to be collected with one adaptive sampling algorithm per user, which excludes algorithms that learn to select actions using data collected from multiple users. These are major obstacles preventing the wider use of adaptive sampling algorithms in practice. In this work, we provide statistical inference for the common Z-estimator based on adaptively sampled data. The inference (a) is valid even when observations are non-stationary and highly dependent over time, and (b) allows the online adaptive sampling algorithm to learn using the data of all users. Furthermore, our inference method is robust to misspecification of the reward models used by the adaptive sampling algorithm. This work is motivated by our work in designing the Oralytics oral health clinical trial, in which an RL adaptive sampling algorithm will be used to select treatments, yet valid statistical inference is essential for conducting primary data analyses after the trial is over.
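To make the setting concrete, here is a toy sketch of the two ingredients the abstract combines: data collected by an adaptive sampling algorithm (an epsilon-greedy bandit here, as an illustrative stand-in), and a Z-estimator, i.e., a parameter defined as the solution of an estimating equation. All names and parameter values are hypothetical; the talk's contribution concerns *valid* standard errors for such estimators, which this sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(2)

# Adaptive data collection: epsilon-greedy over two arms.
# With probability 1 - eps play the arm with the higher running mean reward.
true_means = {0: 0.3, 1: 0.5}   # hypothetical two-arm example
eps, n = 0.2, 5000
counts, sums = np.zeros(2), np.zeros(2)
actions, rewards = [], []
for _ in range(n):
    running = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
    a = int(np.argmax(running)) if rng.random() > eps else int(rng.integers(2))
    y = rng.normal(true_means[a], 1.0)
    counts[a] += 1
    sums[a] += y
    actions.append(a)
    rewards.append(y)

# Z-estimator: theta_a solves the estimating equation
#   sum_i 1{A_i = a} (Y_i - theta_a) = 0  for each arm a,
# i.e., the per-arm sample mean over the adaptively collected data.
actions, rewards = np.array(actions), np.array(rewards)
theta = np.array([rewards[actions == a].mean() for a in (0, 1)])
```

Because the action at each step depends on past rewards, the observations feeding each per-arm mean are dependent; that dependence is exactly what makes naive confidence intervals for `theta` invalid and motivates the inferential theory in the talk.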

Susan Murphy’s research focuses on improving sequential, individualized decision making in digital health. She developed the micro-randomized trial for use in constructing digital health interventions; this trial design is in use across a broad range of health-related areas. Her lab works on online learning algorithms for developing personalized digital health interventions. Dr. Murphy is a member of the National Academy of Sciences and of the National Academy of Medicine, both of the US National Academies. In 2013 she was awarded a MacArthur Fellowship for her work on experimental designs to inform sequential decision making. She is a Fellow of the College on Problems in Drug Dependence, Past President of the Institute of Mathematical Statistics, Past President of the Bernoulli Society, and a former editor of the Annals of Statistics.

## The Spring 2022 Jon A. Wellner Lecture

Tuesday, March 29, 2022

3 p.m. – Lecture in TLC 031

4 p.m. – Refreshments in the IRIC Atrium

### Title

Fitting stochastic epidemic models to noisy surveillance data: are we there yet?

### Speaker

**Vladimir N. Minin, Ph.D.**

Professor, Department of Statistics and Associate Director of the Infectious Disease Science Initiative, University of California, Irvine

Stochastic epidemic models describe how infectious diseases spread through a population of interest. These models are constructed by first assigning individuals to compartments (e.g., susceptible, infectious, and recovered) and then defining a stochastic process that governs the evolution of the sizes of these compartments through time. I will review multiple lines of attack on the challenging and not fully solved problem of fitting these models to noisy infectious disease surveillance data. These solutions involve a range of mathematical techniques: particle filter Markov chain Monte Carlo algorithms, approximations of stochastic differential equations, and Poisson random measure-based Bayesian data augmentation. Importantly, many of these computational strategies open the door for integration of multiple infectious disease surveillance data streams, including less conventional ones (e.g., pathogen wastewater monitoring and genomic surveillance). Such data integration is critical for making key parameters of stochastic epidemic models identifiable. I will illustrate the state-of-the-art statistical inference for stochastic epidemic models using influenza, Ebola, and SARS-CoV-2 surveillance data and will conclude with open problems and challenges that remain to be addressed.
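The compartmental setup described above can be simulated exactly with the Gillespie algorithm, which is the generative model that the inference methods in the talk aim to fit. A minimal SIR sketch (parameter values and function names are illustrative, not from the talk):

```python
import numpy as np

def sir_gillespie(beta, gamma, s0, i0, t_max, rng):
    """Exact stochastic simulation (Gillespie algorithm) of an SIR model.
    Events: infection S -> I at rate beta*S*I/N; recovery I -> R at rate gamma*I."""
    s, i, r, t = s0, i0, 0, 0.0
    n = s0 + i0
    times, infected = [t], [i]
    while i > 0 and t < t_max:
        rate_inf = beta * s * i / n
        rate_rec = gamma * i
        total = rate_inf + rate_rec
        t += rng.exponential(1.0 / total)      # exponential waiting time to next event
        if rng.random() < rate_inf / total:    # which event fired?
            s -= 1; i += 1
        else:
            i -= 1; r += 1
        times.append(t); infected.append(i)
    return np.array(times), np.array(infected)

rng = np.random.default_rng(1)
times, infected = sir_gillespie(beta=0.5, gamma=0.25, s0=990, i0=10,
                                t_max=100.0, rng=rng)
```

The statistical difficulty the talk addresses is that surveillance data report only noisy, partial functions of such trajectories (e.g., weekly case counts), not the event times themselves.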

Minin’s research interests revolve around developing statistically rigorous solutions to problems that arise in biological sciences. These solutions often involve formulating stochastic models that can describe complex dynamics of biological systems and devising computationally efficient algorithms to fit these models to data. Minin is currently most active in infectious disease epidemiology, working on Bayesian estimation of disease transmission model parameters. His other research interests include phylogenetics, population genetics, computational immunology, and systems biology. Minin received a B.S. in Mathematics from Odesa National University, an M.S. in Mathematics from the University of Idaho, and a Ph.D. in Biomathematics from the University of California, Los Angeles.

## The 2019 Jon A. Wellner Lecture

Tuesday, September 24, 2019

3 p.m. – Refreshments in the Paul J. Joyce Faculty-Staff Lounge, Brink Hall

4 p.m. – Lecture in TLC 031

### Title

Nonparametric Inference Under Shape Constraints: Past, Present and Future

### Speaker

**Richard J. Samworth, Ph.D.**

Professor of Statistical Science and Director of the Statistical Laboratory, University of Cambridge

Traditionally, we think of statistical methods as being divided into parametric approaches, which can be restrictive, but where estimation is typically straightforward (e.g. using maximum likelihood), and nonparametric methods, which are more flexible but often require careful choices of tuning parameters. The area of nonparametric inference under shape constraints sits somewhere in the middle, seeking in some ways the best of both worlds. I will give an introduction to this currently very active area, providing some history, recent developments and a future outlook.
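A concrete example of the "best of both worlds" point: the least squares fit under a monotonicity constraint is fully nonparametric yet requires no tuning parameter, and is computable by the pool adjacent violators algorithm (PAVA). A minimal sketch:

```python
import numpy as np

def pava(y):
    """Pool Adjacent Violators: least squares fit of a non-decreasing
    sequence to y. Fully nonparametric, no tuning parameter."""
    y = np.asarray(y, dtype=float)
    blocks = []  # list of (mean, weight) for constant blocks of the fit
    for v in y:
        mean, w = v, 1.0
        # Merge with preceding blocks while monotonicity is violated.
        while blocks and blocks[-1][0] > mean:
            m2, w2 = blocks.pop()
            mean = (mean * w + m2 * w2) / (w + w2)
            w += w2
        blocks.append((mean, w))
    return np.concatenate([np.full(int(w), m) for m, w in blocks])

print(pava([1, 3, 2, 4]).tolist())  # → [1.0, 2.5, 2.5, 4.0]
```

The violating pair (3, 2) is pooled to its average 2.5; the shape constraint alone determines where the fit is locally constant, playing the role that a bandwidth or penalty parameter would in a generic nonparametric method.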

Professor Richard J. Samworth's main research interests are in nonparametric and high-dimensional statistics. Particular topics include shape-constrained estimation problems; data perturbation methods (e.g. subsampling, bootstrap sampling, random projections, knockoffs); nonparametric classification; (conditional) independence testing; estimation of entropy and other functionals; changepoint detection and estimation; missing data; variable selection; and applications, including genetics, archaeology and oceanography.

**Awards**

- Fellow of the American Statistical Association
- Fellow of the Institute of Mathematical Statistics
- The Royal Statistical Society's Research Prize in 2008
- The Guy Medal in Bronze, Royal Statistical Society, in 2012
- The COPSS Presidents' Award 2018
- IMS Medallion Lecture 2018
- The Adams Prize 2017
- Philip Leverhulme Prize 2014

## The 2018 Jon A. Wellner Lecture

Thursday, September 6, 2018

3 p.m. – Refreshments in Paul Joyce Faculty-Staff Lounge, Brink Hall

4 p.m. – Lecture in TLC 031

### Title

New Multiplier Inequalities and Applications

### Speaker

**Jon A. Wellner, Ph.D.**

Professor of Statistics and Biostatistics, University of Washington

Multiplier inequalities have proved to be one of the key tools of modern empirical process theory, with applications to central limit theorems, bootstrap theory, and weighted likelihood methods in statistics. In this talk I will review some classical multiplier inequalities, present a new multiplier inequality, and discuss several statistical applications. The applications include new results concerning convergence rates of least squares estimators (LSE) in regression models with possibly heavy-tailed errors. Particular cases involving sparse linear regression and shape restrictions will be mentioned.
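For readers unfamiliar with the area, a classical multiplier inequality of the kind reviewed in the talk can be stated as follows (in the standard form found in empirical process texts; the notation here is generic rather than taken from the talk). For i.i.d. symmetric multipliers \(\xi_1,\dots,\xi_n\) independent of the processes \(Z_1,\dots,Z_n\), and i.i.d. Rademacher signs \(\varepsilon_i\),

```latex
\frac{1}{2}\,\|\xi_1\|_{1}\,
\mathbb{E}\Bigl\|\sum_{i=1}^{n}\varepsilon_i Z_i\Bigr\|_{\mathcal F}
\;\le\;
\mathbb{E}\Bigl\|\sum_{i=1}^{n}\xi_i Z_i\Bigr\|_{\mathcal F}
\;\le\;
2\,\|\xi_1\|_{2,1}\,\max_{1\le k\le n}
\mathbb{E}\Bigl\|\sum_{i=1}^{k}\varepsilon_i Z_i\Bigr\|_{\mathcal F},
\qquad
\|\xi_1\|_{2,1}=\int_{0}^{\infty}\sqrt{\mathbb{P}\bigl(|\xi_1|>t\bigr)}\,dt .
```

Inequalities of this type transfer bounds for symmetrized (Rademacher) empirical processes to multiplier processes, which is what makes them useful for bootstrap theory and for regression with possibly heavy-tailed errors.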

[This talk is based on the University of Washington Ph.D. work of Qiyang (Roy) Han.]