Automatic policing of biochemical annotations using genomic correlations

Nat Chem Biol. 2010 Jan;6(1):34-40. doi: 10.1038/nchembio.266. Epub 2009 Nov 22.

Abstract

With the increasing role of computational tools in the analysis of sequenced genomes, there is an urgent need to maintain high accuracy of functional annotations. Misannotations are easily generated and propagated through databases by functional transfer based on sequence homology. We developed and optimized an automatic policing method to detect biochemical misannotations using genomic context correlations. The method works by finding genes with unusually weak genomic correlations in their assigned network positions. We demonstrate the accuracy of the method using a cross-validated approach. In addition, we show that the method identifies a significant number of potential misannotations in Bacillus subtilis, including metabolic assignments already shown to be incorrect experimentally. The experimental analysis of the mispredicted genes forming the leucine degradation pathway in B. subtilis demonstrates that computational policing tools can generate important biological hypotheses.
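The core idea, flagging genes whose genomic correlations with their network neighbors are unusually weak, can be illustrated with a toy sketch. This is not the authors' implementation: the scoring rule (mean pairwise correlation with neighbors), the z-score cutoff, the function names, and the example data are all assumptions made for illustration only.

```python
from itertools import combinations
from statistics import mean, pstdev


def neighborhood_scores(neighbors, correlation):
    """Mean pairwise correlation between each gene and its network neighbors.

    neighbors:   dict mapping a gene to the genes adjacent to its assigned
                 network position (hypothetical input format).
    correlation: dict mapping frozenset({gene_a, gene_b}) to a context
                 correlation score; missing pairs count as 0.0.
    """
    scores = {}
    for gene, nbrs in neighbors.items():
        values = [correlation.get(frozenset((gene, n)), 0.0)
                  for n in nbrs if n != gene]
        if values:
            scores[gene] = mean(values)
    return scores


def flag_weak_genes(scores, z_cutoff=-1.5):
    """Return genes whose score lies at or below z_cutoff standard deviations
    from the mean score. The cutoff value is an illustrative assumption."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all genes fit their neighborhoods equally well
    return sorted(g for g, s in scores.items() if (s - mu) / sigma <= z_cutoff)


# Toy data: genes A-D correlate strongly with each other, while X, although
# assigned to the same neighborhood, correlates weakly with all of them.
genes = ["A", "B", "C", "D", "X"]
correlation = {
    frozenset((a, b)): (0.1 if "X" in (a, b) else 0.8)
    for a, b in combinations(genes, 2)
}
neighbors = {g: [h for h in genes if h != g] for g in genes}

flagged = flag_weak_genes(neighborhood_scores(neighbors, correlation))
print(flagged)  # ['X']
```

In this toy case gene X is the only candidate misannotation, because its average correlation with its assigned neighbors falls far below that of the other genes. A real application would need correlation scores derived from genomic context data and a cutoff calibrated by cross-validation, as the abstract describes.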

MeSH terms

  • Algorithms
  • Automation
  • Bacillus subtilis / genetics*
  • Bacillus subtilis / metabolism
  • Biochemistry / methods*
  • Computational Biology / methods*
  • Databases, Genetic
  • Electronic Data Processing
  • Genomics*
  • Leucine / chemistry
  • Models, Genetic
  • Models, Statistical
  • Phylogeny
  • Quality Control
  • ROC Curve
  • Saccharomyces cerevisiae / genetics

Substances

  • Leucine

Associated data

  • PubChem-Substance/85267074
  • PubChem-Substance/85267075
  • PubChem-Substance/85267076
  • PubChem-Substance/85267077
  • PubChem-Substance/85267078
  • PubChem-Substance/85267079
  • PubChem-Substance/85267080
  • PubChem-Substance/85267081
  • PubChem-Substance/85267082
  • PubChem-Substance/85267083
  • PubChem-Substance/85267084
  • PubChem-Substance/85267085
  • PubChem-Substance/85267086
  • PubChem-Substance/85267087
  • PubChem-Substance/85267088
  • PubChem-Substance/85267089
  • PubChem-Substance/85267090
  • PubChem-Substance/85267091
  • PubChem-Substance/85267092
  • PubChem-Substance/85267093
  • PubChem-Substance/85267094
  • PubChem-Substance/85267095
  • PubChem-Substance/85267096
  • PubChem-Substance/85267097
  • PubChem-Substance/85267098
  • PubChem-Substance/85267099
  • PubChem-Substance/85267100
  • PubChem-Substance/85267101
  • PubChem-Substance/85267102
  • PubChem-Substance/85267103