David M. Sturgill, Ph.D.
- Center for Cancer Research
- National Cancer Institute
- Building 41, Room B622
- Bethesda, MD 20892
RESEARCH SUMMARY
Dr. Sturgill is a computational biologist with a primary interest in understanding genome function and gene regulation. Through integrative analysis of high-throughput datasets and the application of novel statistical strategies, Dr. Sturgill has made important contributions to the fields of genomics and transcriptomics.
Areas of Expertise
1) bioinformatics, 2) alternative splicing, 3) gene expression, 4) genomics, 5) computational biology
Genes can be differentially regulated to produce different proteins in different cellular contexts. The misregulation of genes is a major contributor to many diseases. The advent of RNA-Seq experiments, in which transcripts are converted to DNA and sequenced, has provided the opportunity to interrogate gene regulation at the transcriptional and post-transcriptional level on a global scale.
Sequencing technology has made it possible not just to assay transcripts, but also to identify genomic features such as chromatin marks, epigenetic features, and protein binding sites. My research has involved integrating these data to reveal insights into gene regulation. What has become clear from the aggregate of these experiments is that a host of interacting mechanisms are involved in regulating the transcriptome.
Despite the high resolution and reliability of current sequencing experiments, there remains debate about whether newly identified genomic features have a function. One powerful way to address this question is through comparative genomics. By observing which features are conserved over evolutionary time, we can infer functional significance. My recent work has used comparative transcriptomics of 20 Drosophila species to elucidate the evolutionary dynamics of the transcriptome of this model system, and has helped to define a regulatory timecourse of gene expression through development.
A major boon of RNA-Seq experiments is the identification of differential splicing, which permits the production of multiple distinct mature transcripts from a single parent transcript, vastly diversifying the transcriptome. However, methods to rigorously identify and quantify splicing differences from RNA-Seq experiments remain underdeveloped. To come closer to reaching RNA-Seq’s potential, I recently published a software package to help separate true splicing differences from noise, and showed that many putative novel splicing events may arise from technical artifacts.
Integrating high-throughput experiments has enabled insights into transcriptional regulation. Our recent work combining RNA-Seq with DNA methylation profiling has shown that splicing decisions can be influenced by epigenetic modifications. Other recent work revealed a link between splicing and the structure and function of nuclear bodies. I have also analyzed the transcriptional response to DNA damage and replication stress, and examined transcription from heterochromatic regions previously thought to be transcriptionally silent, including centromeres.
The amount of data from ENCODE and other collaborative projects is on the petabyte scale, and growing daily. These published data provide a great resource to enhance the utility of new experimental data. My major goal is to use computational methods to translate all of this information into clinically relevant knowledge about genome function.
TET-catalyzed oxidation of intragenic 5-methylcytosine regulates CTCF-dependent alternative splicing
Design of RNA splicing analysis null models for post hoc filtering of Drosophila head RNA-Seq data with the splicing analysis kit (Spanki)
Evidence for compensatory upregulation of expressed X-linked genes in mammals, Caenorhabditis elegans and Drosophila melanogaster
David M. Sturgill, Ph.D.
Dr. Sturgill began research in bioinformatics at Virginia Tech in 2001, where he worked with Cynthia Gibas to build GenoMosaic, an automated annotation pipeline for bacterial genomes. He then joined the Brian Oliver lab at the NIDDK, investigating gene regulation using sexual dimorphism in Drosophila as a model. As part of the 12 Drosophila Genomes Consortium, he published a comparative whole-transcriptome analysis across multiple eukaryotic species (Zhang and Sturgill, Nature, 2007). He completed a Ph.D. in 2013 at the University of Maryland, advised by Steve Mount, in the University's Computational Biology, Bioinformatics, and Genomics (CBBG) program. Participating in the modENCODE consortium, he published an expanded high-resolution comparative transcriptomics analysis in 20 Drosophila species (Chen and Sturgill, Genome Research, 2014), and developed software for splicing analysis of RNA-Seq data (Sturgill, BMC Bioinformatics, 2013). Pursuing interests in mammalian genetics, clinical applications of sequencing, and epigenetics, he joined the LRBGE in 2014.