Contact

Physical Address:
Brink Hall 300

Mailing Address:
875 Perimeter Drive, MS 1103
Moscow, ID 83844-1103

Phone: 208-885-6742

Fax: 208-885-5843

Email: mathstat@uidaho.edu

Web: Department of Mathematics and Statistical Science

Jon A. Wellner Lecture

The Spring 2024 Jon A. Wellner Lecture

Tuesday, January 30, 2024
2:30 p.m. – Lecture in TLC 029
3:30 p.m. – Refreshments in the IRIC atrium

Title

Data thinning and its applications

Speaker

Daniela Witten, Ph.D. – Professor, Biostatistics
Dorothy Gilford Endowed Chair of Mathematical Statistics, University of Washington

Abstract

We propose data thinning, a new approach for splitting an observation from a known distributional family with unknown parameter(s) into two or more independent parts that sum to yield the original observation, and that follow the same distribution as the original observation, up to a (known) scaling of a parameter. This proposal is very general, and can be applied to a broad class of distributions within the natural exponential family, including the Gaussian, Poisson, negative binomial, Gamma, and binomial distributions, among others. Furthermore, we generalize data thinning to enable splitting an observation into two or more parts that can be combined to yield the original observation using an operation other than addition; this enables the application of data thinning far beyond the natural exponential family. Data thinning has a number of applications to model selection, evaluation, and inference. For instance, cross-validation via data thinning provides an attractive alternative to the “usual” approach of cross-validation via sample splitting, especially in unsupervised settings in which the latter is not applicable. We will present an application of data thinning to single-cell RNA-sequencing data, in a setting where sample splitting is not applicable. This is joint work with Anna Neufeld (Fred Hutch), Ameer Dharamshi (University of Washington), Lucy Gao (University of British Columbia), and Jacob Bien (University of Southern California).
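
For readers new to the idea, the Poisson case may be the simplest to picture. The sketch below is an illustration only, not the speaker's software: it relies on the classical Poisson thinning property, under which a count X ~ Poisson(lam) split by a Binomial(X, eps) draw yields two independent parts distributed as Poisson(eps * lam) and Poisson((1 - eps) * lam). The function name `thin_poisson` and the values of `lam` and `eps` are hypothetical choices made for this example.

```python
import numpy as np


def thin_poisson(x, eps, rng):
    """Split Poisson counts x into two parts that sum to x.

    Hypothetical helper for illustration: given x, the first part is
    drawn as Binomial(x, eps); the second part is the remainder.
    """
    x = np.asarray(x)
    x1 = rng.binomial(x, eps)  # each unit of count goes to part 1 with prob. eps
    x2 = x - x1                # the two parts always sum to the original count
    return x1, x2


rng = np.random.default_rng(0)
lam = 5.0   # assumed "unknown" Poisson mean, fixed here for the simulation
eps = 0.3   # assumed thinning fraction

x = rng.poisson(lam, size=100_000)
x1, x2 = thin_poisson(x, eps, rng)

# Empirical means should be close to eps * lam and (1 - eps) * lam,
# and the sample correlation between the parts should be close to zero.
print("mean of part 1:", x1.mean())
print("mean of part 2:", x2.mean())
print("correlation:", np.corrcoef(x1, x2)[0, 1])
```

In a cross-validation setting, one part could be used to fit a model and the other to evaluate it, which is the role that separate training and test samples play in sample splitting.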

About the Speaker

Daniela Witten is a professor of Statistics and Biostatistics at the University of Washington, and the Dorothy Gilford Endowed Chair in Mathematical Statistics. She develops statistical machine learning methods for high-dimensional data, with a focus on unsupervised learning.

She has received a number of awards for her research in statistical machine learning, most notably the Spiegelman Award from the American Public Health Association for a (bio)statistician under age 40, and the Presidents' Award from the Committee of Presidents of Statistical Societies for a statistician under age 41.

Daniela is a co-author of the textbook "An Introduction to Statistical Learning", and beginning in 2023 will serve as Joint Editor of the Journal of the Royal Statistical Society, Series B.

About the Jon A. Wellner Lecture

The Jon A. Wellner Lecture was established by Jon and Vera Wellner to provide educational experiences outside the classroom for students and faculty and to help raise the profile of the University of Idaho by bringing well-known experts in the fields of statistics and probability to Moscow.
