Applications and analyses of Third Generation Sequencing data in transcriptome and epigenetics research
The rapid development of Third Generation Sequencing (TGS) technologies, including PacBio and Oxford Nanopore Technologies, has greatly advanced many areas of omics research. The single-molecule long reads provided by TGS platforms open up many opportunities for innovative applications, and they also create high demand for, and challenges in, bioinformatics method development. In this presentation, I will discuss how to use the unique information in TGS data in transcriptome and epigenetics research: 1) Integrating short reads into the analysis of TGS data (termed hybrid sequencing) to improve the overall performance and resolution of transcriptome analysis (including the discovery of novel genes, fusion genes and allele-specific expression) at the gene isoform level. Proof-of-concept applications to breast cancer cells and human embryonic stem cells revealed the isoform-level complexity of fusion gene expression and allele-specific expression, and also discovered novel genes involved in pluripotency regulation. 2) Capturing the GpC-specific 5mC footprint after DNA methyltransferase treatment to identify nucleosome occupancy and chromatin accessibility on single DNA molecules. A proof-of-concept application to yeast resolved the heterogeneity and combinatorial complexity of multiple nucleosomes and chromatin accessibility status across large genomic ranges.
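To make the idea in point 2 concrete, here is a minimal sketch of how a footprint might be read off one long read. It is not the method presented in the talk. It assumes that accessible GpC sites become methylated by the methyltransferase while nucleosome-protected sites stay unmethylated, so a long run of unmethylated GpC sites on a single molecule is taken as a putative nucleosome. The input format (a list of position and methylation-call pairs per read), the function name `call_protected_regions`, and the `min_span` threshold of 100 bp are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def call_protected_regions(
    gpc_calls: List[Tuple[int, bool]],
    min_span: int = 100,  # hypothetical threshold in bp, not taken from the talk
) -> List[Tuple[int, int]]:
    """Return (start, end) spans of consecutive unmethylated GpC sites.

    gpc_calls holds (position, is_methylated) pairs for the GpC sites
    covered by ONE read. A run of unmethylated sites whose first and last
    positions are at least min_span apart is reported as a putative
    protected (nucleosome-occupied) region.
    """
    regions: List[Tuple[int, int]] = []
    run_start = None
    run_end = None
    for pos, methylated in sorted(gpc_calls):
        if not methylated:
            # Extend the current unmethylated run, or open a new one.
            if run_start is None:
                run_start = pos
            run_end = pos
        else:
            # A methylated (accessible) site closes the current run.
            if run_start is not None and run_end - run_start >= min_span:
                regions.append((run_start, run_end))
            run_start = None
    # Close a run that extends to the end of the read.
    if run_start is not None and run_end - run_start >= min_span:
        regions.append((run_start, run_end))
    return regions


# Toy read: one long unmethylated stretch (50..170) and one isolated site (220).
toy_read = [
    (10, True), (50, False), (90, False), (130, False),
    (170, False), (200, True), (220, False), (240, True),
]
protected = call_protected_regions(toy_read)
```

Because every call is made per read, the resulting spans can be compared across molecules covering the same locus, which is what allows heterogeneity between molecules to be examined rather than averaged away.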