02-711 Computational Molecular Biology and Genomics

Cross-listed as 03-511, 03-711, 15-495, and 15-856

Announcements
 
1/12/09 - The first class will be held on Monday, 10:30-12:50 in Mellon Institute 409. See you there!
1/14/09- Sign up for scribing here.
- Quiz on Wednesday!
- No class on Monday 1/19/09 (MLK)
1/19/09- Supplemental readings for last Wednesday (1/14/09) are available for pickup in front of 411L
1/30/09- Website is back under our control
- Handouts for lectures 1-5 are out
- Recitation slides are up in the syllabus section
- Solution for quiz #1 is up
2/8/09- Handouts (6, 7, and 8) and scribe notes (6-7) are out.
2/9/09- Handout 8 is updated to reflect new content added in the lecture
- Project proposal is due 2/16/09
2/27/09- Handouts and readings are now posted online
- Practice midterm and solution are available
- Midterm on Monday 3/2/09
4/2/09- Updated scribing schedule. Click Here
Course Description
 
This course provides an advanced introduction to computational molecular biology and genomics. The course particularly focuses on computational methods relevant to: (1) sequencing - DNA sequencing and genome assembly; (2) genetic variation - coalescent theory, haplotype inference, allele frequency estimation; (3) phenotypic variation - linkage analysis, association studies; and (4) evolution - multiple sequence alignment, comparative genomics, phylogeny reconstruction.
Prerequisite
 
Students are expected to have the following background:
  • Knowledge of basic principles and skills in computer science (15-211, 15-451) and in biology (03-121).
  • Familiarity with basic probability theory and linear algebra.
Class Schedule
 
Lecture: Mon Wed 10:30-11:50am, Mellon Institute 409.
Discussion section: to be announced.
Teaching Staff
 

Professor: Su-In Lee
Office: Mellon Institute 411L
Office hours: Tuesday 9-10am
Phone: (412) 268-4659
Email: 4123silee@cs.stanford.edu3214 (remove numbers)

Teaching Assistant: Ming-Chi Tsai
Office: Mellon Institute 654F
Office hours: Wednesday, 3:30-4:30pm
Phone: (412) 368-9733
Email: 5129mingchit@andrew.cmu.edu9215 (remove numbers)

Course Requirements
 
There are four course requirements:
  1. Midterm. The midterm will cover the material from the first part of the course.
  2. Project. The term project may be done with 1-2 partners. Students should submit a proposal (due 2/9), a milestone (due 3/23), and a final report (in the final week). Projects will be presented to the class during the final 1-2 days of the course.
  3. Quiz. Each Wednesday (except in exam weeks), there will be a 10-15 minute quiz on the material covered in the previous week.
  4. Scribing. Each class will be scribed by a student. The scribed notes are due at the end of the week in which the lecture is given. Please sign up to scribe a particular lecture beforehand and email the scribed notes to the TA as an attachment.
Grades will be determined as follows: 20% midterm, 60% project, 10% quiz, and 10% scribing.
Syllabus
 
Lecture | Date | Material Covered | Reading | Scribe | Handouts
1 | 1/12/09 | Introduction | - | - | -

0. Mathematical tools
2 | 1/14/09 | Probability theory review and probabilistic models | Koller and Friedman (Ch. 2, pg. 29-62) | - | -
3 | 1/21/09 | Probabilistic models | Koller and Friedman | Class3 | Quiz 1; Solution
4 | 1/26/09 | Maximum likelihood estimation | Koller and Friedman | Class4 | -
5 | 1/28/09 | Expectation maximization | Prof Andrew Ng's CS229 lecture note | - | Recitation; Quiz 2

1. Retrieving DNA sequence information
6 | 2/2/09 | DNA sequencing and genome assembly | - | Class6 | -
7 | 2/4/09 | Genome assembly | - | Class7; Class 6-7 (v2) | Quiz 3
8 | 2/9/09 | Whole-genome shotgun sequencing | ARACHNE (Batzoglou et al, 2002); Arachne 2 (Jaffe et al, 2003) | Class8 | -

2. Understanding genetic variation
9 | 2/11/09 | Hardy-Weinberg equilibrium | Hartl and Clark (Ch. 2) | - | Quiz 4
10 | 2/16/09 | Linkage disequilibrium I | Hartl and Clark (Ch. 2) | Class10 | -
11 | 2/18/09 | Linkage disequilibrium II | Hartl and Clark (Ch. 2) | Class11 | Quiz 5; Solution
12 | 2/23/09 | Introduction to coalescent models | Hudson et al, 1990 | Class12; Class12 (v2); Class12 (v3) | -
13 | 2/25/09 | Allele frequency distributions | Hudson et al, 1990 | Class13 | Quiz 6
- | 3/2/09 | Midterm | - | - | Practice Midterm; Solution
14 | 3/4/09 | Coalescent models with recombination | Hudson et al, 1990 | Class14 | -
15 | 3/16/09 | Maximum Likelihood Estimate for Allele Frequencies | - | Class15 | -
16 | 3/18/09 | Expectation-Maximization | Stephens et al. AJHG 2001 | - | -
17 | 3/23/09 | Haplotype inference -- EM based algorithm | Excoffier and Slatkin. Mol Biol Evol 1995 | Class17 | -
18 | 3/25/09 | Haplotype Inference | Clark. Mol Biol Evol 1990 | - | -

3. Genotype to phenotype
19 | 3/30/09 | Association studies | Zaykin et al. Hum Hered 2002 | - | -
20 | 4/1/09 | Haplotype based association tests | Zaykin et al. Hum Hered 2002 | Class20; Class20 (V2) | -
21 | 4/6/09 | Indirect association; genotype imputation | Marchini et al. Nat Gen 2007 | Class21; Class21a | -

4. Evolution
22 | 4/8/09 | Linkage Study | - | Class22 | -
23 | 4/13/09 | Linkage Study II | - | Class23 | -
24 | 4/15/09 | - | - | - | -
25 | 4/20/09 | Multiple sequence alignment | - | Class25 | -
26 | 4/22/09 | Multiple sequence alignment | - | - | -

5. Final project
- | 4/27/09 | Final presentation I | - | - | -
- | 4/29/09 | Final presentation II | - | - | -