Computational Genomics

Computational Genomics entails efforts to digest the daunting quantity of genomic and proteomic data now available by systematic development and application of probability and statistics theories, information technologies and data mining techniques. Linguistic methods are viewed as promising tools for elucidating sequence-structure-function relations and complementing computational genomics studies. Computational genomics targets understanding gene/protein function, identifying and characterizing cellular regulatory networks, and discerning the link between genes and diseases. Discovery and processing of this information is pivotal in the development of novel gene therapy strategies and tools.

 

Faculty

Machine Learning Department, Lane Center for Computational Biology, and Biological Sciences

My primary research areas are computational biology, bioinformatics and machine learning. I am heading the Systems Biology Group at the School of Computer Science at CMU. Our group develops computational methods for understanding the interactions, dynamics and conservation of complex biological systems. Our work addresses issues ranging from the experimental design level to the systems biology level. I am also interested in how shared principles between computation and biology can be used to improve our understanding of both fields. We are looking at algorithms used by nature to see if we can obtain new ideas on how to design better algorithms for distributed computing systems while at the same time inferring new insights regarding information processing in biology.
CAREER
OVERTON

Lab Website

Departments of Computational & Systems Biology and Biomedical Informatics

Our ultimate goal is to investigate the molecular mechanisms of chronic diseases. We develop new computational methods to model biological processes and mine high-dimensional, multi-modal biomedical data. We are very interested in the effect of gene regulatory networks in disease.

Lab Website

Department of Computational and Systems Biology 

Our overarching goal is to understand how the functions of proteins and their networks change over time. We are particularly focused on the process of co-evolution within functional networks and the ways by which proteins influence each other during evolution. We study a variety of organisms ranging from single cells to primates. Currently we perform experiments in yeast and bacteria to retrace the evolution of protein function, while our computational studies are based in yeast species, Drosophila, and mammals.

Lab Website

KAUFMAN

Department of Biomedical Informatics

Research Interests:
  • Application of decision theory, probability theory, Bayesian statistics, and artificial intelligence to biomedical informatics research problems
  • Causal modeling and discovery from clinical and high-throughput molecular data
  • Computer-aided medical diagnosis and prediction
  • Machine-learning approaches to improving patient safety
  • Biosurveillance of disease outbreaks

Lab Website

Departments of Biological Sciences and Computer Science

Current research efforts in the laboratory include

  • statistical tests for recognizing significant patterns in gene organization on chromosomes,
  • the role of large scale duplication in the evolution of insulin,
  • homology identification for multi-domain protein families,
  • tree-based methods for estimating gene duplication times,
  • the role of duplication in pathway evolution.

Lab Website

PACKARD

Department of Biomedical Informatics and Intelligent Systems Program 

Dr. Ganapathiraju’s primary area of research is in Systems Biology, specifically on protein-protein interaction prediction at the system level. The outcomes of this research will subsequently be applied to translational bioinformatics. A second core area is in Sequence Analysis, for pattern mining in whole-genome and whole-proteome sequences, with application of suffix array data structures for preprocessing the genome sequences.

Lab Website

BRAINS

Department of Biological Sciences

Lab Website

Department of Biological Sciences

My laboratory is investigating the following critical questions: (1) if and how the distributions of bacterial strains present in carriage differ from those in infections, (2) how related these different strains are to one another, (3) in what time frame and to what extent strains modify their genetic compositions, (4) what factors affect their genomic plasticity, and (5) what molecules facilitate intra- and inter-species communications and host interactions.

Lab Website

Department of Biological Sciences

Our group is broadly interested in how genomes control cell fate decisions during embryonic development and also later regeneration. We are also interested in understanding how these processes have evolved and how the evolution of genomes leads to the evolution of morphology. We especially focus our research on (i) how gene regulation has evolved (which includes transcriptional evolution of gene regulatory networks, or GRNs), (ii) how proteins evolve biochemical differences in function, and (iii) the evolutionary basis of regeneration. We are currently also very focused on understanding the regulatory mechanisms governing regeneration.

Lab Website

Department of Biomedical Informatics

Research Interests:
  • Application of artificial intelligence, machine learning, Bayesian networks, and other computational methods to problems in biology, medicine, and translational research
  • Modeling of interactome networks and human diseases
  • Personalized medicine and cancer bioinformatics
  • Medical decision support systems
  • Biosurveillance system development
  • Image processing

Lab Website

Lane Center for Computational Biology and Machine Learning Department

Research interests: My main research interests lie in developing statistical machine learning techniques to address significant methodological problems in computational genomics. Recent advances in genome-wide profiling technology have allowed researchers to probe various aspects of biological systems on a system-wide scale, such as the transcriptome, proteome, metabolome, and epigenome. In addition, it is expected that in the future, genome sequencing will become a routine process that can be applied to a large number of individuals. Given the high-dimensional nature of genome-scale data in which many entities interact with each other in a complex manner, I’m interested in developing statistical machine learning techniques for discovering the genetic basis of diseases and disease-related biological processes with the ultimate goal of personalized medicine.

Lab Website

SLOAN
CAREER

Lane Center for Computational Biology

We are interested in designing graph and optimization algorithms to extract insight from biological data. In particular, we focus on the following classes of problems:

  • Protein interactions and networks: Evolution of interactions; protein function prediction; clustering within networks; protein structure prediction. This work is supported by NSF grant EF-0849899 and by NSF grant CCF-1053918/CCF-1256087 (CAREER award).
  • Genomics & genome assembly: RNA-seq expression quantification; genome assembly; overlapping genes in bacteria; transcription termination in bacteria (see the TransTermHP program for predicting Rho-independent terminators). This work was supported by NSF grant IIS-0812111 (PI: Mihai Pop) and currently by NIH grant 1R21HG006913.
  • Viral evolution: Reassortment in the influenza genome. This work is supported by NIH grant 1R21AI085376.
  • Chromatin structure and function: Algorithms for determining the spatial organization of eukaryotic genomes from Chromosome Conformation Capture data. (Previously supported by a UMIACS New Research Frontiers Award.)

Lab Website

SLOAN
CAREER

Departments of Developmental Biology, and Computational & Systems Biology

How do different organs and tissues arise? What are the genetic and epigenetic mechanisms that drive this development? To address these questions we design statistical methods and algorithms and apply them to large-scale, genome-wide data. Ultimately, our goal is to generate, test, and confirm hypotheses that are relevant to human health.

Lab Website

Department of Biological Sciences and Lane Center for Computational Biology

We are interested in the regulation of alternative pre-mRNA splicing, its contribution to cell function and development, and its disruption in disease.

Lab Website

Department of Biomedical Informatics

Dr. Lu’s research focuses on computational methods for identifying signaling pathways underlying biological processes and diseases, as well as statistical methods for acquiring knowledge from biomedical literature. He was trained in pharmacology and works in the field of bioinformatics following NLM-sponsored postdoctoral training in biomedical informatics. His research interests concentrate on applying latent variable models to the simulation of biological signaling systems and to text mining.

Currently, Dr. Lu is working on developing his research in translational bioinformatics and systems/computational biology and its application to specific domains relevant to human disease. He is pursuing collaboration in the area of natural language processing and text mining with the eventual goal of establishing a Center or Institute in Translational Bioinformatics.

Lab Website

Department of Biological Sciences

Gene expression varies between individuals and species, and this variation is largely responsible for phenotypic diversity and disease. Research in the McManus lab focuses on understanding the genetic causes of variation in gene expression. Gene expression involves transcription of DNA into mRNA, alternative splicing of mRNA, translation of mRNA into proteins, and regulation of mRNA and protein levels through turnover pathways. Differences in the regulatory networks controlling these processes lead to gene expression variation. Our lab uses high-throughput sequencing and bioinformatics to compare regulation of alternative splicing and mRNA translation in closely related species of fruit flies and yeast. We also use these tools to investigate the structure of large RNAs genome-wide. RNA structures play important roles in gene expression, yet very little is known about the structures of most large RNAs.

Lab Website

Department of Biological Sciences

Our lab is focused on understanding how complex developmental programs that control morphology evolve. To do this, we study rapidly evolving traits of the Drosophila model system: pigmentation, male genitalia, and gene expression traits.

Lab Website

Department of Statistics and Lane Center for Computational Biology

Research Interests:

The primary goal of Dr. Roeder’s research group is to develop statistical tools for finding associations between patterns of genetic variation and complex disease. Current data typically involves Next Generation Sequencing and gene expression, including RNAseq. Her methodological work is motivated by studies of schizophrenia, autism and other genetic disorders.

Lab Website

Language Technologies Institute, Department of Machine Learning, Department of Computer Science, and Lane Center for Computational Biology

My research interests are in: (1) forecasting epidemics, where our long-term vision is to make epidemiological forecasting as universally accepted and useful as weather forecasting is today; (2) Information and Communication Technologies for Development (ICT4D), specifically Spoken Language Technologies for Development (SLT4D); and (3) modeling the evolution of viral epidemics.

Lab Website

Departments of Biological Sciences and Computer Science, and Lane Center for Computational Biology

My research interests are in the area of computational molecular biology and the modeling and simulation of biological systems. My group is currently working most actively on three topics: simulation methods for macromolecular assembly systems, with special focus on more realistic models of assembly in cellular environments; methods for analysis of human genetic variation data, most recently focused on phylogenetics and population substructure analysis; and application of phylogenetic methods to study cancer progression.

Lab Website

CAREER
PECASE

Department of Biostatistics

My research interests focus on statistical applications of genomics and bioinformatics. We mainly work on data mining of high-throughput genomic and proteomic data (such as microarray, next-generation sequencing and mass spectrometry data) and develop methods in candidate marker detection, supervised machine learning (classification), unsupervised machine learning (clustering) and other topics driven by biological problems. Related research also includes statistical modelling, statistical computing and graphical visualization of data. Collaboration with biology labs plays an important role, as it is the source of most of our projects and methodological ideas.

Lab Website

Lane Center for Computational Biology

My research focuses on understanding complex human diseases (in particular, asthma and substance abuse and addiction) by undertaking integrative approaches, which combine biology/medicine, computational and statistical learning, bioinformatics, and genomics. I am interested in applying and developing computational algorithms, software and tools i) to help identify genetic and regulatory mechanisms underlying human diseases, so that we can better understand why different genetic and gene expression changes in patients can lead to different disease phenotypes; ii) to identify biomarkers, and classify and diagnose various human diseases using metadata (e.g., genetic, gene expression and phenotypic data); iii) to predict outcomes of patients using metadata; and iv) to identify potential anti-disease drug targets.

Lab Website

Machine Learning Department, Lane Center for Computational Biology, Language Technologies Institute and Computer Science Department

Research synopsis: My principal research interests lie in the development of machine learning and statistical methodology, and large-scale computational system and architecture, for solving problems involving automated learning, reasoning, and decision-making in high-dimensional, multimodal, and dynamic possible worlds in artificial, biological, and social systems.

Lab Website

CAREER
SLOAN

Department of Pharmaceutical Sciences

We integrate multi-dimensional genomic data and functional approaches, aiming to better characterize human solid tumors and identify novel therapies. Currently, ongoing work involves (i) the characterization of genetic and epigenetic alterations of non-protein-coding components (ncRNAs) of the genome, such as miRNA and lncRNA genes, in solid tumors and (ii) mechanistic studies of resistance to cancer therapeutics, especially targeted therapies for the tumor pathways with the most prevalent genomic alterations in human solid tumors. Our approaches have proved successful in several instances, including the identification of miR-506 as a tumor suppressor miRNA inhibiting the epithelial-to-mesenchymal transition (EMT) and cell cycle pathways, and the discovery of BRCA2 gene mutations leading to genome instability and cisplatin response in ovarian cancer.

Lab Website