Hidden Treasures in RNA-Seq Data Sets
With the advent of high-throughput transcriptomics it has become clear that in essence the complete genomic DNA is transcribed, i.e., there is no "Junk DNA". Non-coding RNAs, however, by no means form a structurally, functionally, or mechanistically homogeneous group. Instead, a plethora of non-coding RNA classes, distinguished both by their mode of function through interactions with proteins, other RNAs, or DNA and by the processing pathways that generate them, continues to be discovered. We are only beginning to understand how these transcripts function, how they have originated, and how they evolve. These patterns differ in many respects from the textbook knowledge on protein-coding genes. Even though many long non-coding RNAs are evolutionarily very old, they experience little selection pressure at the sequence level. Multiple processing steps with distinct functions for the different stages of the same molecule appear to be the rule rather than the exception. Computational analysis methods play a central role in gaining an understanding of the transcriptional universe, in particular since analysis pipelines are more often than not built upon models of the RNA universe that are incomplete or even plain wrong and hence are blind to treasures hidden in available experimental data sets.
About the Speaker
Peter F. Stadler received his Ph.D. in Chemistry from the University of Vienna in 1990 following studies in chemistry, mathematics, physics and astronomy. After a PostDoc at the Max Planck Institute for Biophysical Chemistry in Goettingen he returned to Vienna to work in the area of theoretical biochemistry. Since 1994 he has been External Professor at the Santa Fe Institute, a research center focussed on Complex Systems. In 2002 he moved to the University of Leipzig as Full Professor of Bioinformatics. Since 2010 he has been External Scientific Member of the Max Planck Society, affiliated with the MPI for Mathematics in the Sciences. The general theme of his research is the search for a consistent understanding of biological processes, with an emphasis on molecular evolution, at the genotypic, phenotypic, and dynamical level. The techniques range from the analysis of the dynamical systems arising in chemical kinetics and population genetics, to large-scale simulations of RNA evolution and the analysis of viral sequence data, to knowledge-based protein potentials, and to algebraic combinatorics applied to the study of fitness landscapes.