Bootstrap and Resampling Methods
Semester 2 2023
http://www.tau.ac.il/~saharon/Resampling.html


Lecturer: Saharon Rosset, Schreiber 203, saharon@tauex.tau.ac.il
Office hrs: By appointment.

 
Final take home sign-up sheet (Deadline: 20 June 2023)
(since this is a small class, if neither date range works for someone, write to me and we can look for a personalized alternative)
 

Make-up class plan (updated, since we have been asked to reach 12 classes in the semester): Friday 23/6, 11:00-13:30

Announcements and handouts

12 March
Intro to Bootstrap presentation
Uniform example

20 March
Homework 1, due on 3 April at the beginning of the Pesach vacation.

16 April
Homework 2, due on 30 April before class.

30 April
Code for stamps hypothesis testing example.
Presentation on Efron et al.’s analysis of bootstrap for phylogenetic tree inference (originally prepared by Aya Vituri)

13 May
Homework 3, due on 28 May before class.

14 May
MCMC intro.

8 June
Homework 4, due on 26 June (first take home date). For the extra credit part of Problem 3, use Assaf’s Python code.

11 June
Code for class today: Circles in a square, Gibbs sampling for beta-binomial

18 June
Assaf’s presentation

25 June
Importance sampling example, Permutation testing with IS implemented via MCMC!

Syllabus

The goal of this course is to introduce the main ideas and uses of the Bootstrap and related methods.
The first part of the course will follow the book “An Introduction to the Bootstrap” by Efron and Tibshirani.
We will cover chapters 1-19 and possibly some material from later chapters.
The rest of the course will cover some of the following areas, as time and the mutual interest of instructor and students dictate:

 
Preliminary plan:
Week 1: Introduction, up to chapter 6
Week 2: Chap. 7-10
Week 3: Chap. 11-12
Weeks 4-7: Chap. 13-20
Weeks 8-9: Bootstrap applications from the literature
Weeks 10-12: Markov Chain Monte Carlo (MCMC) introduction and applications, importance sampling

The final grade will be based on a combination of homework and a final take home exam. The homework and exam will require a combination of theoretical work and some programming and data analysis.

Prerequisites

Solid knowledge of mathematical foundations: Calculus; Linear Algebra
Undergraduate courses in: Probability; Statistical Theory; Applied Statistics (e.g., Regression)
Statistical programming experience in R is an advantage

Main textbook

An Introduction to the Bootstrap by Efron and Tibshirani (1993, Chapman and Hall). The library has several hard copies, and all TAU users now also have online access to the electronic copy.

Other resources

Papers on bootstrap we may discuss:
Confidence limits on phylogenies: an approach using the bootstrap by Felsenstein
Bootstrap confidence levels for phylogenetic trees by Efron et al. 1996
Heuristics of instability and stabilization in model selection by Breiman

Resources on MCMC and related topics:
The Monte-Carlo Method by Paul J. Atzberger
An Introduction to MCMC for Machine Learning by Andrieu et al.

Computing

The course will require some use of statistical modeling software. It is strongly recommended to use R (freely available for PC/Unix/Mac).
The R Project website also contains extensive documentation.
A basic “getting you started in R” tutorial, which uses the Boston Housing Data (thanks to Giles Hooker).
Modern Applied Statistics with S-PLUS by Venables and Ripley is an excellent source of statistical computing help for R/S-PLUS.