University of Pittsburgh
Carnegie Mellon University

Joint CMU-Pitt Ph.D. Program in Computational Biology

Robert F. Murphy and Ivet Bahar, Directors

Curriculum - Core Course

02-710/MSCBIO 2070 Computational Genomics

Dramatic advances in experimental technology and computational analysis are fundamentally transforming the basic nature and goals of biological research. The emergence of new frontiers in biology, such as evolutionary genomics and systems biology, demands new methodologies that can confront quantitative issues of substantial computational and mathematical sophistication. In this course we will discuss classical approaches and the latest methodological advances in the context of the following biological problems: 1) computational genomics, focusing on gene finding, motif detection, and sequence evolution; 2) analysis of high-throughput biological data, such as gene expression data, focusing on issues ranging from data acquisition to pattern recognition and classification; 3) molecular and regulatory evolution, focusing on phylogenetic inference and regulatory network evolution; and 4) systems biology, concerning how to combine sequence, expression, and other biological data sources to infer the structure and function of different systems in the cell. On the computational side, this course focuses on modern machine learning methodologies for computational problems in molecular biology and genetics, including probabilistic modeling, inference and learning algorithms, pattern recognition, data integration, time series analysis, and active learning.

Similar courses that have been taught previously are listed below.

CMU 03-711 Computational Molecular Biology and Genomics

An advanced introduction to computational molecular biology, using an applied algorithms approach. The first part of the course will cover established algorithmic methods, including pairwise sequence alignment and dynamic programming, multiple sequence alignment, fast database search heuristics, hidden Markov models for molecular motifs, and phylogeny reconstruction. The second part of the course will explore emerging computational problems driven by the newest genomic research. Course work includes four to six problem sets, a midterm, and a final exam. A project based on recent results from the genomics literature will be required of students taking 03-711.

15-899B Computational Genomics: From Experimental Data to Systems Biology

Recent advances in high-throughput experimental methods in molecular biology hold great promise. We now have the complete DNA sequence for many organisms. DNA microarrays have been used to measure the expression levels of thousands of genes, and more recently microarrays have been exploited to measure genome-wide protein-DNA binding events. While useful, these datasets present many computational challenges. In addition to analyzing individual data sources, principled computational methods are required to combine these data sources and infer genetic interaction networks. In this class we will discuss statistical and algorithmic approaches to contemporary problems in functional genomics, with an emphasis on fusing diverse data sources to model systems in the cell. Topics include: DNA sequence data, binding motifs, gene expression data normalization, clustering and visualization, cancer classification, continuous dynamic models, protein-DNA binding, protein interaction networks, information fusion, graphical models, and systems biology.

Pitt HUGEN 2024 Statistical Methods in Bioinformatics

This course will introduce several of the most important current topics in bioinformatics, with emphasis on applications of state-of-the-art methods and software in the analysis of biological data. The first half of the course will deal with algorithms related to DNA and protein sequence analysis. It will include limited computer lab time introducing popular software used for database searches (e.g., BLAST) and motif recognition. The second half of the course will cover DNA microarrays and related high-throughput technologies. We will introduce the different technologies available for monitoring gene expression and discuss data preprocessing and quality issues. We will then discuss statistical methods for analyzing microarray data, including clustering and classification techniques. Software tools for microarray analysis will be introduced in computer lab sessions.