The biological roles of RNA, beyond encoding proteins, have expanded in the last decade to include a diversity of important gene regulatory functions in nearly all living things. At the same time, genome sequencing efforts have produced a wealth of data that can be mined through comparative genomics to study the evolution of non-coding RNAs (ncRNAs) and to identify previously unknown ncRNAs. We use a combination of computational and experimental tools both to discover new structured ncRNAs and to examine how these RNAs and their protein partners evolve.

We are particularly interested in RNA structures that bind proteins to control gene expression. This type of interaction coordinates the synthesis of the ribosomal protein subunits in E. coli. While many of these structures are well known in E. coli, they are not always conserved in other bacteria. We expect to identify other ncRNAs present in different phyla of bacteria that perform similar biological functions. Our goals are to identify how ribosomal proteins are controlled in non-model species of bacteria, experimentally validate any putative regulatory structures discovered, and understand how such diverse mechanisms for gene regulation can evolve over time.


The predominant methodology for the discovery of ncRNAs is comparative genomics. Using the massive amounts of sequence information generated by microbial sequencing projects and various metagenomic projects, we discover new structured mRNA elements that are hypothesized to control gene expression. Using a combination of existing bioinformatic tools, we target genomic regions associated with proteins known to bind RNA as potential sites of novel ncRNA elements. The putative ncRNA structures we identify are characterized by an alignment of sequences sharing a common secondary structure and genomic location across many species of bacteria. Of special interest to us at the moment is data generated by the Human Microbiome Project. This NIH-funded project is a large sequencing effort targeted toward understanding the microbial communities associated with the human body. We are interested in what ncRNAs might be found by analyzing this data and how these might correlate with various disease states.
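As a rough illustration of what "an alignment of sequences sharing a common secondary structure" means in practice, the sketch below scores each base pair of a proposed consensus structure against a toy alignment. It is not the pipeline used in this work, which relies on existing bioinformatic tools; the function names, the dot-bracket input, and the example sequences are all hypothetical and chosen only to show the idea that a conserved structure is supported when paired columns stay complementary even as the bases themselves change.

```python
# Watson-Crick pairs plus G-U wobble pairs, treated here as "compatible".
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_pairs(structure):
    """Return (i, j) column pairs from a dot-bracket consensus structure."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def covariation_support(alignment, structure):
    """For each paired column (i, j), report the fraction of sequences that
    can form the pair and the number of distinct compatible pair types.
    More than one distinct pair type hints at compensatory mutations."""
    results = []
    for i, j in parse_pairs(structure):
        observed = [(seq[i], seq[j]) for seq in alignment]
        compatible = [p for p in observed if p in PAIRS]
        results.append((i, j, len(compatible) / len(alignment), len(set(compatible))))
    return results


# Hypothetical toy alignment: three ungapped sequences, one short hairpin.
toy_structure = "((..))"
toy_alignment = ["GCAAGC", "GGAACC", "AUAAAU"]
for i, j, fraction, distinct in covariation_support(toy_alignment, toy_structure):
    print(f"columns {i}-{j}: {fraction:.0%} compatible, {distinct} pair type(s)")
```

Real alignments contain gaps and many more sequences, and dedicated tools apply statistical models rather than simple counts, so this should be read only as a picture of the underlying signal.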


To validate the biological function of the putative ncRNA elements discovered by comparative genomics, we use a combination of in vitro and genetic approaches. In vitro approaches include filter-binding and electrophoretic mobility shift assays to demonstrate direct RNA-protein interactions, in addition to studies of the native RNA transcripts. In organisms with established genetics, genetic approaches to over-express regulatory proteins are combined with reporter genes and qRT-PCR to detect changes in gene expression.
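For readers unfamiliar with how qRT-PCR measurements become a statement about changed gene expression, the following sketch applies the widely used 2^-ΔΔCt calculation. The text above does not say which analysis method is used here, so this is an assumption made for illustration; the function name, the choice of reference gene, and the Ct values are hypothetical, and the formula assumes close to 100% amplification efficiency.

```python
def fold_change(ct_target_treated, ct_ref_treated, ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^-ddCt calculation.

    "treated" stands for the condition in which the regulatory protein is
    over-expressed; "ref" is a reference gene assumed to be unaffected.
    """
    # Normalize the target gene to the reference gene within each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the two conditions; a higher Ct means less starting transcript.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target crosses threshold two cycles later
# when the regulator is over-expressed, while the reference is unchanged.
ratio = fold_change(24.0, 18.0, 22.0, 18.0)
print(f"target transcript level relative to control: {ratio:.2f}")
```

A ratio below 1 in this setup would be consistent with the over-expressed protein repressing its target, which is the kind of change the reporter and qRT-PCR experiments are designed to detect.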


Homologous proteins in different species of bacteria have been shown to bind mRNA structures with no apparent similarity to accomplish the same biological function. In other cases, the same RNA structure appears to have arisen multiple times to bind homologous proteins. We are interested in understanding the different evolutionary paths of these RNA structures, both by examining their natural phylogenies and through laboratory evolution.