Genome sequences of Escherichia coli B strains REL606 and BL21(DE3)

J Mol Biol. 2009 Dec 11;394(4):644-52. doi: 10.1016/j.jmb.2009.09.052. Epub 2009 Sep 26.

Abstract

Escherichia coli K-12 and B have been the subjects of classical experiments from which much of our understanding of molecular genetics has emerged. We present here complete genome sequences of two E. coli B strains, REL606, used in a long-term evolution experiment, and BL21(DE3), widely used to express recombinant proteins. The two genomes differ in length by 72,304 bp and have 426 single base pair differences, a seemingly large difference for laboratory strains having a common ancestor within the last 67 years. Transpositions by IS1 and IS150 have occurred in both lineages. Integration of the DE3 prophage in BL21(DE3) apparently displaced a defective prophage in the lambda attachment site of B. As might have been anticipated from the many genetic and biochemical experiments comparing B and K-12 over the years, the B genomes are similar in size and organization to the genome of E. coli K-12 MG1655 and have >99% sequence identity over approximately 92% of their genomes. E. coli B and K-12 differ considerably in distribution of IS elements and in location and composition of larger mobile elements. An unexpected difference is the absence of a large cluster of flagella genes in B, due to a 41 kbp IS1-mediated deletion. Gene clusters that specify the LPS core, O antigen, and restriction enzymes differ substantially, presumably because of horizontal transfer. Comparative analysis of 32 independently isolated E. coli and Shigella genomes, both commensals and pathogenic strains, identifies a minimal set of genes in common plus many strain-specific genes that constitute a large E. coli pan-genome.
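
The distinction drawn in the last sentence, between a minimal set of genes common to all strains and a larger pan-genome that includes strain-specific genes, can be illustrated with set operations. The following is a minimal Python sketch and not the authors' method. The strain names come from the abstract, but the gene identifiers (geneA, geneB, ...) and the function name core_and_pan_genome are hypothetical placeholders. Real comparative analyses group genes by sequence similarity rather than by matching names.

```python
def core_and_pan_genome(gene_sets):
    """Split per-strain gene sets into core, pan, and strain-specific genes.

    gene_sets maps a strain name to an iterable of gene identifiers.
    Returns (core, pan, strain_specific), where strain_specific maps each
    strain to the genes found in that strain and in no other.
    """
    if not gene_sets:
        return set(), set(), {}

    # Normalise every value to a plain set so the set methods below apply.
    sets = {strain: set(genes) for strain, genes in gene_sets.items()}

    core = set.intersection(*sets.values())  # genes present in every strain
    pan = set.union(*sets.values())          # genes present in any strain

    strain_specific = {}
    for strain, genes in sets.items():
        # Union of all other strains; empty when only one strain is given.
        others = set().union(*(g for s, g in sets.items() if s != strain))
        strain_specific[strain] = genes - others

    return core, pan, strain_specific


# Toy input: placeholder gene identifiers, NOT real annotation data.
toy = {
    "REL606": {"geneA", "geneB", "geneC"},
    "BL21(DE3)": {"geneA", "geneB", "geneD"},
    "K-12 MG1655": {"geneA", "geneB", "geneE", "geneF"},
}

core, pan, specific = core_and_pan_genome(toy)
print("core:", sorted(core))
print("pan-genome size:", len(pan))
for strain, genes in specific.items():
    print(strain, "specific:", sorted(genes))
```

In this toy input the core stays fixed at two genes while the pan-genome grows with each added strain, which is the pattern the abstract describes for the 32 genomes compared.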

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA, Bacterial / chemistry
  • DNA, Bacterial / genetics*
  • Escherichia coli / genetics*
  • Genome, Bacterial*
  • Interspersed Repetitive Sequences
  • Molecular Sequence Data
  • Polymorphism, Genetic
  • Prophages / genetics
  • Sequence Analysis, DNA*

Substances

  • DNA, Bacterial

Associated data

  • GENBANK/CP000819
  • GENBANK/CP001509
  • GENBANK/EU078592