Consensus shapes: an alternative to the Sankoff algorithm for RNA consensus structure prediction

Bioinformatics. 2005 Sep 1;21(17):3516-23. doi: 10.1093/bioinformatics/bti577. Epub 2005 Jul 14.

Abstract

Motivation: The well-known Sankoff algorithm for simultaneous RNA sequence alignment and folding is currently considered an ideal but computationally too expensive method. Available tools implement this algorithm under various pragmatic restrictions. They are still expensive to use, and it is difficult to judge whether the moderate quality of their results is due to the underlying model or to its imperfect implementation.

Results: We propose to redefine the consensus structure prediction problem in a way that does not imply a multiple sequence alignment step. For a family of RNA sequences, our method explicitly and independently enumerates the near-optimal abstract shape space, and predicts as the consensus an abstract shape common to all sequences. For each sequence, it delivers the thermodynamically best structure which has this common shape. Since the shape space is much smaller than the structure space, and identification of common shapes can be done in linear time (in the number of shapes considered), the method is essentially linear in the number of sequences. Our evaluation shows that the new method compares favorably with available alternatives.
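The selection step described above can be illustrated with a minimal Python sketch. It is not the RNAcast implementation. It assumes that the near-optimal shapes of each sequence have already been enumerated by some external tool and are supplied as one table per sequence, mapping a shape string to the thermodynamically best structure with that shape. The names `Shrep`, `ShapeTable` and `consensus_shape`, the shape notation, and all structures and energies in the example are hypothetical. Ranking the common shapes by the sum of free energies is also an assumption made for illustration; the abstract does not state how the common shapes are ranked.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Shrep(NamedTuple):
    """Best structure of one sequence for one shape (hypothetical record)."""
    structure: str  # dot-bracket string
    energy: float   # free energy in kcal/mol, lower is better


# One table per sequence: shape string -> best structure with that shape.
ShapeTable = Dict[str, Shrep]


def consensus_shape(tables: List[ShapeTable]) -> Optional[Tuple[str, List[Shrep]]]:
    """Return a shape common to all sequences plus one structure per sequence.

    Assumption: among the common shapes, the one with the lowest summed
    free energy is chosen (ties broken by shape string for determinism).
    Returns None if the sequences share no shape.
    """
    if not tables:
        return None
    # Intersect the shape sets; the work is linear in the number of shapes.
    common = set(tables[0])
    for table in tables[1:]:
        common &= table.keys()
    if not common:
        return None
    best = min(common, key=lambda s: (sum(t[s].energy for t in tables), s))
    return best, [t[best] for t in tables]


# Illustrative, made-up input for three sequences.
tables = [
    {"[]": Shrep("((((....))))", -5.0),
     "[[][]]": Shrep("((..((...))..((...))..))", -7.1)},
    {"[]": Shrep("(((.....)))", -4.2),
     "[[][]]": Shrep("((.((....)).((....)).))", -6.0),
     "[][]": Shrep("((....))..((....))", -3.5)},
    {"[]": Shrep("((((.....))))", -3.9),
     "[[][]]": Shrep("(((((...))...((...)))))", -6.5),
     "[][]": Shrep("(((...)))..((...))", -3.0)},
]

result = consensus_shape(tables)
if result is not None:
    shape, structures = result
    print(shape)
    for rec in structures:
        print(rec.structure, rec.energy)
```

Because each sequence contributes only a table lookup and one set intersection, adding a sequence adds work proportional to the size of its own shape table, which matches the claim that the method is essentially linear in the number of sequences.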

Availability: The new method has been implemented in the program RNAcast and is available on the Bielefeld Bioinformatics Server.

Contact: jreeder@TechFak.Uni-Bielefeld.DE, robert@TechFak.Uni-Bielefeld.DE

Supplementary information: Available at http://bibiserv.techfak.uni-bielefeld.de/rnacast/supplementary.html

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms*
  • Base Sequence
  • Computer Simulation
  • Consensus Sequence
  • Models, Chemical*
  • Models, Molecular*
  • Molecular Sequence Data
  • Nucleic Acid Conformation
  • RNA / analysis
  • RNA / chemistry*
  • Sequence Alignment / methods*
  • Sequence Analysis, RNA / methods*
  • Sequence Homology, Nucleic Acid
  • Software*

Substances

  • RNA