An estimate of the sequencing error frequency in the DNA sequence databases

DNA Seq. 1992;2(6):343-6. doi: 10.3109/10425179209020815.

Abstract

We have examined vector sequences fortuitously present in the EMBL sequence database as contaminating parts of submitted sequences, and found a sequencing error frequency of 3.55% in this subset of release 27 of the database. We discuss the possibility that this value may be representative of the corresponding error frequency in the database as a whole.
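
The kind of estimate described above can be illustrated with a minimal sketch: a contaminating vector fragment found in a database entry is compared, position by position, with the known vector sequence, and the fraction of differing positions is taken as the error frequency. The abstract does not describe the actual procedure, so everything here is an assumption: the function name `error_frequency`, the requirement that the two sequences are already aligned (with `-` marking a gap), and the example fragments, which are made up and not taken from the study.

```python
def error_frequency(reference: str, observed: str) -> float:
    """Fraction of aligned positions where `observed` differs from `reference`.

    Assumes both strings are already aligned to the same length; '-' marks
    a gap, so insertions and deletions are counted as errors as well.
    """
    if len(reference) != len(observed):
        raise ValueError("sequences must be aligned to the same length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    # Compare case-insensitively, one aligned position at a time.
    mismatches = sum(r != o for r, o in zip(reference.upper(), observed.upper()))
    return mismatches / len(reference)


# Hypothetical fragments for illustration only (not data from the paper).
vector_reference = "GAATTCGAGCTCGGTACCCG"   # known vector sequence
database_fragment = "GAATTCGAGCTCGGTACGCG"  # same region as found in a database entry
rate = error_frequency(vector_reference, database_fragment)
print(f"{rate:.2%}")
```

In practice the per-fragment mismatch and length counts would be summed over all contaminating fragments before dividing, so that long and short fragments are weighted by their length.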

MeSH terms

  • Base Sequence
  • DNA*
  • Databases, Factual / standards*
  • Genetic Vectors
  • Reproducibility of Results

Substances

  • DNA