Jon A. Wellner Lecture
The Spring 2024 Jon A. Wellner Lecture
Tuesday, January 30, 2024
2:30 p.m. – Lecture in TLC 029
3:30 p.m. – Refreshments in the IRIC atrium
Data thinning and its applications
Daniela Witten, Ph.D. – Professor, Biostatistics
Dorothy Gilford Endowed Chair in Mathematical Statistics, University of Washington
We propose data thinning, a new approach for splitting an observation from a known distributional family with unknown parameter(s) into two or more independent parts that sum to yield the original observation, and that follow the same distribution as the original observation, up to a (known) scaling of a parameter. This proposal is very general, and can be applied to a broad class of distributions within the natural exponential family, including the Gaussian, Poisson, negative binomial, Gamma, and binomial distributions, among others. Furthermore, we generalize data thinning to enable splitting an observation into two or more parts that can be combined to yield the original observation using an operation other than addition; this enables the application of data thinning far beyond the natural exponential family. Data thinning has a number of applications to model selection, evaluation, and inference. For instance, cross-validation via data thinning provides an attractive alternative to the “usual” approach of cross-validation via sample splitting, especially in unsupervised settings in which the latter is not applicable. We will present an application of data thinning to single-cell RNA-sequencing data, in a setting where sample splitting is not applicable. This is joint work with Anna Neufeld (Fred Hutch), Ameer Dharamshi (University of Washington), Lucy Gao (University of British Columbia), and Jacob Bien (University of Southern California).
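For readers unfamiliar with the idea, the additive case can be illustrated with the Poisson family. If X ~ Poisson(lambda) and, given X, X1 ~ Binomial(X, eps), then X1 and X2 = X - X1 are independent, with X1 ~ Poisson(eps * lambda) and X2 ~ Poisson((1 - eps) * lambda). The sketch below is not taken from the talk; the function name thin_count, the parameter eps, and the example counts are hypothetical and chosen only for illustration.

```python
import random


def thin_count(x, eps, rng):
    """Split a non-negative integer count x into (x1, x2) with x1 + x2 == x.

    Each of the x units is assigned to the first part independently with
    probability eps, so x1 given x is Binomial(x, eps). If x is Poisson
    distributed, the two parts are independent Poisson counts.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    x1 = sum(1 for _ in range(x) if rng.random() < eps)
    return x1, x - x1


# Hypothetical count data (for example, expression counts for one gene).
rng = random.Random(0)
counts = [12, 0, 7, 30, 3]

# Thin every observation into two parts: one to fit a model, one to evaluate it.
parts = [thin_count(x, 0.5, rng) for x in counts]
fit_part = [p[0] for p in parts]
eval_part = [p[1] for p in parts]
```

Unlike sample splitting, every observation appears in both parts, which is why this style of split can still be used in unsupervised settings where no observation can be held out entirely.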
About the Speaker
Daniela Witten is a professor of Statistics and Biostatistics at the University of Washington, and the Dorothy Gilford Endowed Chair in Mathematical Statistics. She develops statistical machine learning methods for high-dimensional data, with a focus on unsupervised learning.
She has received a number of awards for her research in statistical machine learning: most notably the Spiegelman Award from the American Public Health Association for a (bio)statistician under age 40, and the Presidents' Award from the Committee of Presidents of Statistical Societies for a statistician under age 41.
Daniela is a co-author of the textbook "An Introduction to Statistical Learning" and, since 2023, has served as Joint Editor of the Journal of the Royal Statistical Society, Series B.
About the Jon A. Wellner Lecture
The Jon A. Wellner Lecture was established by Jon and Vera Wellner to provide educational experiences outside the classroom for students and faculty, and to help raise the profile of the University of Idaho by bringing well-known experts in statistics and probability to Moscow.