Introduction to Statistical Learning

 
Semester 2 2015/16
Tuesday 14-17, Dan-David 204
Home page on http://www.tau.ac.il/~saharon/IntroStatLearn.html
Lecturer: Saharon Rosset
Schreiber 022
saharon@post.tau.ac.il
Office hours: Thursday 16-17 or by appointment (please coordinate in advance in either case).

Announcements and handouts

(8 March) R code from class to analyze the prostate data using least squares regression and nearest neighbors.
(15 March) R code from class to analyze the advertising data using least squares regression.
(18 March) Homework 1, due on 5/4 in class. k-NN R code to be used for problem 1.
(22 March) Nature paper from 2009 introducing Google Flu Trends (GFT)
(29 March) R code from class to analyze the default data using logistic regression and LDA.
(5 April) R code for Chapter 4 from the book website to demonstrate classification methods.
(19 April) Homework 2, due on 3/5 in class.
(3 May) R code from class verifying the theoretical result on optimism.
(8 May) Homework 3, due on 24/5 in class.
(25 May) Homework 4, due on 9/6 in my mailbox on floor 1 of Schreiber (or 7/6 in class). It uses the code boost.r.
(7 June) Bootstrap presentation from class

Syllabus

The goal of this course is to introduce the basic ideas of "modern" statistical learning and predictive modeling from statistical, theoretical and computational perspectives, together with applications in big data.
The topics we will cover include:
Both the class material and the homework will combine theoretical aspects with practical implementation and demonstrations on data.
The grade will be a combination of homework problem sets (about six overall, worth about 30% of final grade) and a final in-class exam (about 70% of final grade).

Prerequisites

Basic knowledge of mathematical foundations: Calculus; Linear Algebra
Undergraduate courses in: Probability; Regression; Theoretical Statistics (possibly in parallel)
Statistical programming experience in R is an advantage

Textbook

An Introduction to Statistical Learning with Applications in R by James, Witten, Hastie and Tibshirani.

Computing

The book labs, all class demonstrations and any code given in the HW will be in R (freely available for PC/Unix/Mac). There is no requirement to use it, but it is highly recommended.
The R Project website also contains extensive documentation.
A basic "getting you started in R" tutorial is also available; it uses the Boston Housing Data (thanks to Giles Hooker).
Modern Applied Statistics with S-PLUS by Venables and Ripley is an excellent source of statistical computing help for R/S-PLUS.


