Topics in Statistical Genetics

Semester 2 2015/16
Thursday 13-16, Dan-David 202
Home page on ~saharon/StatsGenetics.html
Lecturer: Saharon Rosset
Schreiber 022
Office hours: Thursday 16-17 or by appointment (please coordinate in advance in either case).
Special dates: Extra classes on Sunday 17.4 and Tuesday 19.4 (same hour and room).


The goal of this course is to introduce some of the major topics in Genetics and to develop a statistical perspective on them.
We will start with a brief introduction to concepts in Genetics and gradually elaborate on the statistical aspects of the questions that come up. As needed, we will introduce relevant areas of statistics in some detail.
In the latter part of the course we will pick a hot current research topic and concentrate on it for a few weeks.
The final grade will be based on a combination of 3-4 homework assignments, a final take-home exam, and possibly a class presentation.
Tentative topics list (each topic 1-2 weeks):

Announcements and handouts

(25 February) Class 1 presentation.
(3 March) R code from class for fitting and testing on mtDNA data.
African mtDNA paper which describes the data used.
(17 March) Topics discussed in class today include:
  1. 4×4 mutation models: definitions, estimation and hypothesis testing/model selection.
    Relevant reading material: any textbook in statistical genetics, such as Yang (2006), and the review by Huelsenbeck and Crandall, which we discussed in class.
  2. STR mutation models: simple random walk and the (δμ)² method; more complex and realistic models; existence of a stationary distribution; estimation and model selection.
    Relevant reading material: Chap. 11 in Nielsen (2005) on models of microsatellite evolution, and Whittaker et al. (2003), whose data is used in HW1.
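As a small illustration of the simple random walk model for STRs, the sketch below (in Python rather than the R used in class, and not the code from class) computes the exact distribution of the net change in repeat count along one lineage. It assumes a symmetric single-step model in which, in each generation, the repeat count moves up or down by one with total probability mu; the function names and parameter values are hypothetical. Under these assumptions the expected squared difference between two lineages that split a given number of generations ago is 2 * mu * generations, and this linear growth in time is the property that (δμ)²-type distances rely on.

```python
from collections import defaultdict


def stepwise_distribution(mu, generations):
    """Exact distribution of the net change in repeat count along one lineage.

    Assumed model: in each generation the count goes +1 or -1 with
    probability mu / 2 each, and stays unchanged with probability 1 - mu.
    """
    dist = {0: 1.0}
    for _ in range(generations):
        nxt = defaultdict(float)
        for size, p in dist.items():
            nxt[size] += p * (1.0 - mu)
            nxt[size + 1] += p * mu / 2.0
            nxt[size - 1] += p * mu / 2.0
        dist = dict(nxt)
    return dist


def expected_squared_difference(mu, generations):
    """E[(A - B)^2] for two independent lineages that started from the same count."""
    dist = stepwise_distribution(mu, generations)
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    # A and B are independent with equal means, so E[(A - B)^2] = 2 * Var(A).
    return 2.0 * var


if __name__ == "__main__":
    mu, generations = 0.01, 200  # illustrative values only
    print(expected_squared_difference(mu, generations), 2 * mu * generations)
```

The more realistic models mentioned above (for example, length-dependent or biased mutation) would change the update inside the loop, and the linear relationship would no longer be guaranteed.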
Homework 1 due 31 March in class. Resources for this homework:
mtDNA mutation counts for problem 1.
mtDNA loci list for problem 1.
The paper by Whittaker et al. (2003) for problem 3 is available in pdf or html.
(31 March) Today in class we concentrated on phylogenetic trees: structure learning and inference, with a detailed derivation of the "dynamic program" approach to calculating the likelihood on a tree.
There are plenty of resources to read about this, including the review by Huelsenbeck and Crandall and the book Inferring Phylogenies by Felsenstein.
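The "dynamic program" for the likelihood on a tree can be sketched as follows. This is a minimal Python illustration (not the derivation or code from class) for a single site. It assumes the Jukes-Cantor model with branch lengths in expected substitutions per site and a uniform distribution at the root. The tree encoding, a leaf as a base letter and an internal node as a tuple (left, t_left, right, t_right), is a hypothetical choice made for this example.

```python
import math

BASES = "ACGT"


def jc69_prob(i, j, t):
    """Jukes-Cantor probability of base j after time t, starting from base i."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


def conditional_likelihoods(node):
    """Return L[x] = P(observed leaves below node | state x at node)."""
    if isinstance(node, str):  # leaf: the observed base has likelihood 1
        return {x: 1.0 if x == node else 0.0 for x in BASES}
    left, t_left, right, t_right = node
    l_left = conditional_likelihoods(left)
    l_right = conditional_likelihoods(right)
    out = {}
    for x in BASES:
        # Sum over the unobserved state at each child, then multiply,
        # because the two subtrees are independent given the state x.
        from_left = sum(jc69_prob(x, y, t_left) * l_left[y] for y in BASES)
        from_right = sum(jc69_prob(x, y, t_right) * l_right[y] for y in BASES)
        out[x] = from_left * from_right
    return out


def site_likelihood(tree):
    """Likelihood of one site, averaging over a uniform root distribution."""
    cond = conditional_likelihoods(tree)
    return sum(0.25 * cond[x] for x in BASES)


if __name__ == "__main__":
    tree = (("A", 0.1, "A", 0.1), 0.05, "G", 0.2)  # illustrative tree
    print(site_likelihood(tree))
```

Each internal node is visited once, so the cost is linear in the number of nodes, rather than exponential as it would be when summing over all joint assignments of internal states. For a full alignment, the per-site log likelihoods would be summed.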
Homework 2 due 19 April in class. PHYLIP homepage for problem 1.
The primate data for problem 1.
HapMap YRI Chromosome 22 dataset for problem 3.
(14 April) R code for analyzing the kidney disease data shown in class.
(17 April) Paper by Tang et al. (2005) about estimating ancestry which we discussed in class.
R code implementing the approach.
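To give a feel for EM-based ancestry estimation, here is a deliberately simplified Python sketch. It is neither the full procedure of Tang et al. (2005) nor the R code linked above. It assumes that the allele frequencies of the ancestral populations are known, that loci are biallelic and independent, and that each observed allele copy comes from population k with probability q[k]. The function name and the data layout are hypothetical.

```python
def em_admixture(alleles, freqs, n_iter=100):
    """Estimate one individual's ancestry proportions q by EM.

    alleles: list of (locus_index, allele) pairs, allele in {0, 1},
             one pair per observed allele copy.
    freqs[k][l]: frequency of allele 1 at locus l in ancestral
                 population k (assumed known in this sketch).
    Assumes every observed allele has positive probability in at
    least one population with q[k] > 0.
    """
    n_pops = len(freqs)
    q = [1.0 / n_pops] * n_pops
    for _ in range(n_iter):
        totals = [0.0] * n_pops
        for locus, allele in alleles:
            # E-step: posterior probability that this copy came from population k.
            weights = []
            for k in range(n_pops):
                p = freqs[k][locus]
                weights.append(q[k] * (p if allele == 1 else 1.0 - p))
            s = sum(weights)
            for k in range(n_pops):
                totals[k] += weights[k] / s
        # M-step: q[k] is the average posterior membership over allele copies.
        q = [t / len(alleles) for t in totals]
    return q


if __name__ == "__main__":
    # Illustrative frequencies for two populations at three loci.
    freqs = [[0.9, 0.8, 0.1], [0.2, 0.1, 0.7]]
    alleles = [(0, 1), (0, 1), (1, 1), (1, 0), (2, 0), (2, 1)]
    print(em_admixture(alleles, freqs))
```

With only a handful of loci the estimate is very noisy; in practice many markers are used, and the ancestral allele frequencies are typically estimated jointly rather than fixed.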
(19 April) Papers on PCA in GWAS: Genes Mirror Geography Within Europe by Novembre et al.; PCA Corrects for Stratification by Price et al.
Code: Running PCA on movies example and simulated genetic data. Comparable example using the EM approach.
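The idea of using PCA to expose population structure can be sketched on simulated data as follows. This Python example is independent of the class code linked above, and all names and parameter values in it are hypothetical. It simulates genotypes (allele counts 0/1/2) for two populations with different allele frequencies, centers each SNP, and finds the leading principal component by power iteration on the individual-by-individual Gram matrix. It is a simplification: the per-SNP variance normalization used by Price et al. is omitted.

```python
import random


def simulate_genotypes(n_per_pop, freqs_by_pop, rng):
    """Rows are individuals, columns are SNPs, entries are allele counts 0/1/2."""
    rows = []
    for freqs in freqs_by_pop:
        for _ in range(n_per_pop):
            # Two independent allele draws per SNP; True + True == 2 in Python.
            rows.append([(rng.random() < p) + (rng.random() < p) for p in freqs])
    return rows


def first_pc_scores(genotypes, n_iter=200):
    """Leading eigenvector of X X^T, which is proportional to the PC1 scores."""
    n = len(genotypes)
    m = len(genotypes[0])
    # Center each SNP column at its sample mean.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    x = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    # n x n Gram matrix of inner products between individuals.
    gram = [[sum(a * b for a, b in zip(x[i], x[k])) for k in range(n)]
            for i in range(n)]
    # Power iteration: repeatedly multiply by the Gram matrix and renormalize.
    v = [1.0] + [0.0] * (n - 1)
    for _ in range(n_iter):
        w = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = sum(val * val for val in w) ** 0.5
        v = [val / norm for val in w]
    return v


if __name__ == "__main__":
    rng = random.Random(1)
    geno = simulate_genotypes(8, [[0.2] * 40, [0.8] * 40], rng)
    for i, score in enumerate(first_pc_scores(geno)):
        print(i, round(score, 3))
```

With allele frequencies this different, the first 8 and last 8 individuals should fall on opposite sides of zero on PC1. Real populations are usually far less differentiated, which is why many more SNPs are needed.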
Homework 3 due 5 May in class.
(5 May) Popular presentation on GWAS.
Book chapter on mixed models in genetics.
The famous paper by Yang et al. (2010) applying mixed models to GWAS.
(8 May) Homework 4 due 7 June in our last class.


Basic knowledge of mathematical foundations: Calculus; Linear Algebra
Undergraduate courses in: Probability; Theoretical Statistics
Statistical programming experience in R is an advantage
Prior basic knowledge in Biology and Genetics is an advantage

Some recommended books

Human Evolutionary Genetics by Jobling, Hurles and Tyler-Smith
An excellent introduction to Human Genetics, with a quantitative flavor
Principles of Population Genetics by Hartl and Clark
Comprehensive overview of computational methods in Genetics
Statistical Methods in Molecular Evolution edited by R. Nielsen
Collection of tutorials and reviews on major topics in Statistical Genetics


The course will require some use of statistical modeling software. It is strongly recommended to use R (freely available for PC/Unix/Mac).
R Project website also contains extensive documentation.
A basic "getting you started in R" tutorial. Uses the Boston Housing Data (thanks to Giles Hooker).
Modern Applied Statistics with Splus by Venables and Ripley is an excellent source for statistical computing help for R/Splus.
