Rhs elements of Escherichia coli K-12: complex composites of shared and unique components that have different evolutionary histories

J Bacteriol. 1993 May;175(10):2799-808. doi: 10.1128/jb.175.10.2799-2808.1993.

Abstract

The complete sequences of the RhsB and RhsC elements of Escherichia coli K-12 have been determined. These sequence data reveal a new repeated sequence, called H-rpt (Hinc repeat), which is distinct from the Rhs core repetition that is found in all five Rhs elements. H-rpt is found in RhsB, RhsC, and RhsE. Characterization of H-rpt supports the view that the Rhs elements are composite structures assembled from components with very different evolutionary histories and that their incorporation into the E. coli genome is relatively recent. In each case, H-rpt is found downstream from the Rhs core and is separated from the core by a segment of DNA that is unique to the individual element. The H-rpt copies of RhsB and RhsE are very similar, diverging by only 2.1%. They are 1,291 bp in length, and each contains an 1,134-bp open reading frame (ORF). RhsC has three tandem copies of H-rpt, all of which appear defective in that they have large deletions and/or have the reading frame interrupted. Features of H-rpt are analogous to features typical of insertion sequences; however, no associated transposition activity has been detected. A 291-bp fragment of H-rpt is found near min 5 of the E. coli K-12 map and is not associated with any Rhs core homology. The complete core sequences of RhsB and RhsC have been compared with that of RhsA. As anticipated, the three core sequences are closely related, and each is 3,714 bp long. Like RhsA, the RhsB and RhsC cores constitute single ORFs that begin with the first core base. In each case, the core ORF extends beyond the core into the unique sequence. Of the three cores, RhsB and RhsA are the most similar, showing only 0.9% sequence divergence, while RhsB and RhsC are the least similar, diverging by 2.9%. All three cores conserve the 28 repetitions of a peptide motif noted originally for RhsA. A secondary structure is proposed for this motif, and the possibility of its having an extracellular binding function is discussed. RhsB contains one additional unique ORF, and RhsC contains two additional unique ORFs. One of these ORFs includes a signal peptide that is functional when fused to TnphoA.
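
The divergence figures quoted above (2.1% for H-rpt, 0.9% and 2.9% for the cores) can be illustrated with a minimal sketch. It assumes divergence means the percentage of mismatched positions between two pre-aligned sequences, with gapped columns skipped; the abstract does not state the exact method used. The function name `percent_divergence` and the toy sequences are hypothetical and are not taken from the GenBank entries.

```python
def percent_divergence(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of mismatched positions between two pre-aligned sequences.

    Assumption: columns where either sequence carries a gap are skipped,
    so only aligned base pairs are counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            mismatches += 1

    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Toy example: one mismatch in ten aligned positions.
toy_divergence = percent_divergence("ACGTACGTAC", "ACGTACGTAT")
```

Under this simple definition, and assuming no gaps, 0.9% divergence over a 3,714-bp core would correspond to roughly 33 differing positions, and 2.9% to roughly 108.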

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism
  • Base Composition
  • Base Sequence
  • Biological Evolution
  • Chromosome Mapping
  • Cloning, Molecular
  • DNA Transposable Elements / genetics*
  • Escherichia coli / genetics*
  • Genes, Bacterial / genetics*
  • Molecular Sequence Data
  • Open Reading Frames / genetics*
  • Protein Sorting Signals / genetics
  • Protein Structure, Secondary
  • Repetitive Sequences, Nucleic Acid / genetics*
  • Sequence Analysis, DNA
  • Sequence Deletion
  • Sequence Homology, Nucleic Acid

Substances

  • Bacterial Proteins
  • DNA Transposable Elements
  • Protein Sorting Signals

Associated data

  • GENBANK/L02370
  • GENBANK/L02371
  • GENBANK/L02372
  • GENBANK/L02373