Overview
Computational Biology of Gene Expression. We study mechanisms of gene regulation that occur post-transcriptionally using a combination of experimental and computational methods. A major goal is to understand the RNA splicing code: how the precise locations of exons and splice sites are identified in primary transcripts, and how this code is altered in mammalian development and differentiation. We use in vivo screens and computational methods to identify short RNA sequences that enhance or silence splicing, and study the roles of these sequences in the control of splicing decisions genome-wide. We also study the roles that microRNAs play in gene regulation, and we are developing methods to reliably predict microRNA regulatory targets and to understand their functions in gene networks. Eventually, we hope to gain a better understanding of how regulation at different steps in the expression of genes is integrated to produce the appropriate spectrum and levels of mRNA and protein isoforms under particular conditions.
Research Summary
Splicing enhancers and silencers. The RNA splicing machinery is directed to particular locations in RNA transcripts by exon and intron sequences that either enhance or silence splicing at nearby splice sites. We have developed a computational approach to identify exonic splicing enhancers (ESEs) called RESCUE-ESE (in collaboration with Phil Sharp’s lab). Applying this method to large databases of human exons identified ten sequence motifs, all of which have ESE activity in vivo. This method predicts which point mutations will disrupt ESE activity and can be used to predict the splicing phenotypes of mutations in human exons. Adapting this approach to predict ESEs and intronic splicing enhancers (ISEs) in available vertebrate genomes identified substantial differences in ISEs, but not ESEs, between mammals and fish, suggesting that the components of the splicing machinery which recognize introns and those which recognize exons are evolving at very different rates.
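As a rough illustration of how a set of enhancer motifs could be used to flag point mutations that may disrupt ESE activity, the sketch below compares motif occurrences in an exon before and after a single-base substitution. It is a minimal sketch under stated assumptions, not the RESCUE-ESE method itself: the motifs are placeholders invented for illustration (not the published motif set), the function names `motif_hits` and `disrupted_motifs` are hypothetical, and exact string matching stands in for whatever scoring the real method applies.

```python
def motif_hits(seq, motifs):
    """Return the set of (start, motif) pairs where a motif occurs in seq."""
    hits = set()
    for motif in motifs:
        k = len(motif)
        for i in range(len(seq) - k + 1):
            if seq[i:i + k] == motif:
                hits.add((i, motif))
    return hits


def disrupted_motifs(exon, pos, alt, motifs):
    """List motif occurrences present in the exon but lost after
    substituting base `alt` at 0-based position `pos`."""
    if not 0 <= pos < len(exon):
        raise ValueError("pos is outside the exon")
    mutant = exon[:pos] + alt + exon[pos + 1:]
    lost = motif_hits(exon, motifs) - motif_hits(mutant, motifs)
    return sorted(lost)


# Placeholder motifs for illustration only; NOT the published motif set.
EXAMPLE_MOTIFS = {"GAAGAA", "TGAAGA"}
EXAMPLE_EXON = "CCTGAAGAACC"

# A substitution inside both overlapping motifs removes both occurrences.
print(disrupted_motifs(EXAMPLE_EXON, 5, "C", EXAMPLE_MOTIFS))
# A substitution outside every motif removes none.
print(disrupted_motifs(EXAMPLE_EXON, 0, "A", EXAMPLE_MOTIFS))
```

Comparing the full hit sets before and after the substitution, rather than only checking windows that overlap the mutated base, keeps the sketch short at the cost of some redundant scanning.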
To identify silencers of splicing, we have developed a cell fluorescence-based screening method using a GFP-derived splicing reporter system. A large-scale screen for exonic splicing silencers (ESSs) identified over 100 ESS decamer sequences, which can be clustered into seven groups. The identified sequences can generally function in a different exon context and in a second cell line, suggesting that these ESSs act broadly to inhibit splicing. Core ESS motifs derived from these decamers can be used to substantially improve the accuracy of splicing simulation algorithms. The biased distributions of these motifs in decoy exons known as ‘pseudoexons’, in intronic regions adjacent to splice sites, and in alternatively spliced exons suggested roles for ESSs in suppression of pseudoexon splicing, in splice site definition, and in regulated exon and splice site choice. Experiments are underway to test these hypotheses. We are also adapting our splicing reporter system to screen for other splicing regulatory elements, including intronic splicing silencers (ISSs), a poorly understood class of elements.
Alternative Splicing, Polyadenylation and Evolution. Combinatorial use of exons and splice sites through alternative splicing plays a major role in expanding protein diversity and regulating gene expression in animals and plants. We are studying this process genome-wide using computational approaches to identify candidate ‘alternative conserved exons’ (ACEs), exons which are subject to evolutionarily conserved patterns of alternative splicing, in human and mouse. The splicing patterns of candidate ACEs have been verified in many cases by RT-PCR and sequencing, and are being studied on a large scale using splicing-sensitive DNA microarrays. An initial study identified over 2,000 high-confidence ACEs in the human genome and found that the functions of genes containing predicted ACEs are biased toward transcription factors, RNA binding factors, and developmental regulatory genes. A strong bias toward expression in many brain regions was also observed for the set of ACE-containing genes, consistent with alternative splicing playing important roles in the nervous system. Collaborative projects with the Tom Cooper lab (Baylor) and the Paula Grabowski lab (U. Pittsburgh) are exploring the determinants of muscle-specific and brain region-specific patterns of alternative splicing. Experiments are underway to study conserved and non-conserved alternative splicing and alternative polyadenylation events genome-wide in well-defined neuronal and immune cell lineages. We are also studying the evolution of introns, focusing on understanding the processes by which introns are gained and lost in evolution.
MicroRNA Genes. A family of genes called microRNAs (miRNAs) encodes very small (~22 nt) regulatory RNAs. In collaboration with the David Bartel lab we have developed an algorithm called MiRscan that uses aspects of RNA structure and cross-species conservation to predict miRNA genes in genomes. A characteristic conserved sequence motif, likely to play a role in transcription or processing of miRNAs, was found upstream of almost all nematode miRNA genes, and we have used this motif in a new version of MiRscan to identify additional miRNA genes in nematodes.
MicroRNA Regulatory Targets. MicroRNAs can play important gene regulatory roles in nematodes, insects, and plants by basepairing to mRNAs to specify posttranscriptional repression of these messages. To identify which specific mRNAs are regulated by particular miRNAs, we have recently developed the TargetScan and TargetScanS algorithms, again in close collaboration with the David Bartel lab. TargetScanS uses conservation across 4 or 5 vertebrate genomes to predict more than 13,000 regulatory relationships for the conserved vertebrate miRNAs by identifying mRNAs with conserved pairing to the 5´ ‘seed’ region of miRNAs, sometimes supplemented with additional sequence determinants, and evaluating the number and quality of these complementary sites. Rigorous tests using control cohorts of sequences support the validity of a large proportion of these predictions. Eleven predicted targets (out of 15 tested) have also been supported experimentally using a HeLa cell reporter system. The predicted regulatory targets of mammalian miRNAs are enriched for genes involved in transcriptional regulation, development and cell growth but also encompass a very broad range of other functions. Some individual miRNAs or miRNA clusters appear to target functionally coherent groups of mRNAs leading to specific hypotheses about function. We are working to gain a better understanding of the precise sequence requirements for miRNA-directed regulation of gene expression, and to improve our predictions of miRNA targets. We are also beginning to explore the roles of miRNAs in gene regulatory pathways and networks.
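The idea of conserved seed pairing can be sketched in a few lines of code. The example below is a simplified illustration, not the TargetScanS algorithm: it assumes the seed is miRNA nucleotides 2-8 (the text above says only "5´ seed region"), it treats a site as conserved if each species' orthologous UTR contains the site anywhere rather than at an aligned position, and it ignores the additional sequence determinants and site-quality scoring mentioned above. The miRNA and UTR sequences are made up, and all function names are hypothetical.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna, start=1, end=8):
    """Return the mRNA sequence complementary to the miRNA seed.

    The seed is assumed to be nucleotides 2-8 (0-based slice 1:8);
    this window is an assumption of the sketch.
    """
    seed = mirna[start:end]
    return seed.translate(COMPLEMENT)[::-1]  # reverse complement


def count_sites(utr, site):
    """Count (possibly overlapping) exact occurrences of site in a UTR."""
    k = len(site)
    return sum(1 for i in range(len(utr) - k + 1) if utr[i:i + k] == site)


def conserved_target(mirna, utrs_by_species):
    """True if every species' UTR contains at least one seed site."""
    site = seed_site(mirna)
    return all(
        count_sites(utr.upper().replace("T", "U"), site) > 0
        for utr in utrs_by_species.values()
    )


# Made-up sequences for illustration only.
mirna = "UACGGAUCCGUAACGGAUUCGA"
utrs = {"human": "AAAGAUCCGUAAA", "mouse": "CCGAUCCGUCC"}

print(seed_site(mirna))
print(conserved_target(mirna, utrs))
```

A fuller treatment would work from a multiple alignment of orthologous UTRs so that a site counts as conserved only when it occupies the same aligned position in each genome.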
Selected Publications
Yeo, G., Van Nostrand, E., Holste, D., Poggio, T. and Burge, C. B. Identification and analysis of alternative splicing events conserved between human and mouse. Proc. Natl. Acad. Sci. USA 102, 2850-2855 (2005).
Lewis, B. P., Burge, C. B. and Bartel, D. P. Conserved seed pairing, often flanked by adenosines, indicates thousands of human microRNA targets. Cell 120, 15-20 (2005).
Wang, Z., Rolish, M., Yeo, G., Tung, V., Mawson, M. and Burge, C. B. Systematic identification and analysis of exonic splicing silencers. Cell 119, 831-845 (2004).
Yeo, G., Hoon, S., Venkatesh, B. and Burge, C. B. Variation in sequence and organization of splicing regulatory elements in vertebrate genes. Proc. Natl. Acad. Sci. USA 101, 15700-15705 (2004).
Nielsen, C., Friedman, B., Birren, B., Burge, C. B. and Galagan, J. Patterns of intron gain and loss in fungi. PLoS Biol. 2, e422 (2004).
Fairbrother, W. G., Holste, D., Burge, C. B. and Sharp, P. A. Single nucleotide polymorphism-based validation of exonic splicing enhancers. PLoS Biol. 2, e268 (2004).
Lewis, B. P., Shih, I-h., Jones-Rhoades, M. W., Bartel, D. P. and Burge, C. B. Prediction of mammalian microRNA targets. Cell 115, 787-798 (2003).
Lim, L. P., Glasner, M., Yekta, S., Burge, C. B. and Bartel, D. P. Vertebrate microRNA genes. Science 299, 1540 (2003).
Lim, L. P., Lau, N. C., Weinstein, E. G., Abdelhakim, A., Yekta, S., Rhoades, M. W., Burge, C. B. and Bartel, D. P. The microRNAs of Caenorhabditis elegans. Genes & Dev. 17, 977-990 (2003).
Fairbrother, W., Yeh, R.-F., Sharp, P. A. and Burge, C. B. Predictive identification of exonic splicing enhancers in human genes. Science 297, 1007-1013 (2002).