Aggregation propensity of the human proteome

PLoS Comput Biol. 2008 Oct;4(10):e1000199. doi: 10.1371/journal.pcbi.1000199. Epub 2008 Oct 17.

Abstract

Formation of amyloid-like fibrils is involved in numerous human protein deposition diseases, but is also an intrinsic property of polypeptide chains in general. Recent progress now allows the aggregation propensity of proteins to be analyzed on a large scale. In this work, we used a previously developed predictive algorithm to analyze the propensity of the 34,180 protein sequences of the human proteome to form amyloid-like fibrils. We show that long proteins have, on average, less intense aggregation peaks than short ones. Human proteins involved in protein deposition diseases do not differ extensively from the rest of the proteome, further demonstrating the generality of protein aggregation. We were also able to reproduce some of the results obtained with other algorithms, demonstrating that they do not depend on the type of computational tool employed. For example, proteins with different subcellular localizations were found to have different aggregation propensities, in relation to the various efficiencies of quality control mechanisms. Membrane proteins, intrinsically disordered proteins, and folded proteins were confirmed to have very different aggregation propensities, as a consequence of their different structures and cellular microenvironments. In addition, gatekeeper residues at strategic positions of the sequences were found to protect human proteins from aggregation. The results of these comparative analyses highlight the existence of intimate links between the propensity of proteins to form aggregates with beta-structure and their biology. In particular, they emphasize the existence of a negative selection pressure that finely modulates protein sequences in order to adapt their aggregation propensity to their biological context.
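The notions of an "aggregation peak" and of flanking "gatekeeper residues" can be illustrated with a minimal Python sketch. It is not the predictive algorithm used in the paper, which the abstract does not specify. As a stand-in score it assumes a sliding-window average of the Kyte-Doolittle hydropathy scale, and it assumes that gatekeepers are proline and charged residues (P, K, R, E, D), a common convention in the literature. The function names, window size, threshold, and flank width are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: a hydropathy-based stand-in for an aggregation
# profile, NOT the algorithm used in the paper.

# Kyte-Doolittle hydropathy scale (stand-in for a real aggregation score).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumed gatekeeper set: proline plus charged residues.
GATEKEEPERS = set("PKRED")


def profile(seq, window=5):
    """Per-residue score: mean hydropathy over a centered window."""
    half = window // 2
    scores = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(seq), i + half + 1)
        scores.append(sum(KD[a] for a in seq[lo:hi]) / (hi - lo))
    return scores


def peaks(scores, threshold=2.0):
    """Half-open (start, end) spans where the score exceeds the threshold."""
    spans, start = [], None
    for i, s in enumerate(scores):
        if s > threshold and start is None:
            start = i
        elif s <= threshold and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(scores)))
    return spans


def flanking_gatekeepers(seq, span, flank=2):
    """Gatekeeper residues within `flank` positions on either side of a span."""
    start, end = span
    left = seq[max(0, start - flank):start]
    right = seq[end:end + flank]
    return [a for a in left + right if a in GATEKEEPERS]


if __name__ == "__main__":
    seq = "DKVIVIVIKE"  # hypothetical toy sequence
    for span in peaks(profile(seq)):
        print(span, flanking_gatekeepers(seq, span))
```

In a proteome-scale analysis, the same per-sequence functions would be applied to every entry of a sequence database and the peak intensities compared across groups, for example by protein length or subcellular localization.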

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amyloid / chemistry
  • Amyloid / metabolism
  • Computational Biology
  • Databases, Protein
  • Humans
  • Multiprotein Complexes
  • Protein Folding
  • Protein Interaction Mapping
  • Proteome / chemistry*
  • Proteome / genetics
  • Proteome / metabolism*
  • Proteomics / statistics & numerical data
  • Subcellular Fractions / metabolism

Substances

  • Amyloid
  • Multiprotein Complexes
  • Proteome