TOP-IDP-scale: a new amino acid scale measuring propensity for intrinsic disorder

Protein Pept Lett. 2008;15(9):956-63. doi: 10.2174/092986608785849164.

Abstract

Intrinsically disordered proteins carry out various biological functions while lacking ordered secondary and/or tertiary structure. To find general intrinsic properties of amino acid residues that are responsible for the absence of ordered structure in intrinsically disordered proteins, we surveyed 517 amino acid scales. Each of these scales was taken as an independent attribute for the subsequent analysis. For a given attribute value x, which is averaged over a consecutive string of amino acids, and for a given data set having both ordered and disordered segments, the conditional probabilities P(s(o) | x) and P(s(d) | x) for order and disorder, respectively, can be determined for all possible values of x. Plots of the conditional probabilities P(s(o) | x) and P(s(d) | x) versus x give a pair of curves. The area between these two curves divided by the total area of the graph gives the area ratio value (ARV), which is proportional to the degree of separation of the two probability curves and therefore provides a measure of the given attribute's power to discriminate between order and disorder. The ARV falls between zero and one, and a larger ARV corresponds to better discrimination between order and disorder. Starting from the scale with the highest ARV, we applied a simulated annealing procedure to search for alternative scale values and increased the ARV by more than 10%. The ranking of the amino acids in this new TOP-IDP scale is as follows (from order promoting to disorder promoting): W, F, Y, I, M, L, V, N, C, T, A, G, R, D, H, Q, K, S, E, P. A web-based server has been created to apply the TOP-IDP scale to predict intrinsically disordered proteins (http://www.disprot.org/dev/disindex.php).
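
The ARV calculation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the attribute axis is split into equal-width bins, that the conditional probabilities are estimated from raw window counts in each bin (so the class priors are simply the data set frequencies), and that empty bins are skipped. The function names window_averages and area_ratio_value, the bins parameter, and the window length are hypothetical choices made for illustration; the abstract does not specify any of them.

```python
from typing import Dict, List, Sequence


def window_averages(seq: str, scale: Dict[str, float], window: int) -> List[float]:
    """Average a per-residue scale over every consecutive window of residues."""
    values = [scale[aa] for aa in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def area_ratio_value(
    ordered: Sequence[float],
    disordered: Sequence[float],
    bins: int = 10,  # assumption: binning resolution is not given in the abstract
) -> float:
    """Estimate the area between P(order | x) and P(disorder | x), divided by
    the total area of the plot (x-range times the probability range 0..1)."""
    lo = min(min(ordered), min(disordered))
    hi = max(max(ordered), max(disordered))
    width = (hi - lo) / bins or 1.0  # guard against a zero-width x-range

    def histogram(xs: Sequence[float]) -> List[int]:
        counts = [0] * bins
        for x in xs:
            idx = min(int((x - lo) / width), bins - 1)  # clamp x == hi
            counts[idx] += 1
        return counts

    n_order = histogram(ordered)
    n_disorder = histogram(disordered)

    gaps = []
    for o, d in zip(n_order, n_disorder):
        total = o + d
        if total == 0:
            continue  # assumption: bins with no observations are ignored
        p_order = o / total       # estimate of P(s(o) | x) in this bin
        p_disorder = d / total    # estimate of P(s(d) | x) in this bin
        gaps.append(abs(p_order - p_disorder))

    # With equal-width bins, the area ratio reduces to the mean gap per bin.
    return sum(gaps) / len(gaps)
```

Under these assumptions, window averages from ordered and disordered segments that never share a bin give an ARV of one, and identical distributions give an ARV of zero, which matches the range stated in the abstract.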

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Amino Acids / chemistry*
  • Computational Biology
  • Data Interpretation, Statistical
  • Databases, Protein*
  • Protein Conformation
  • Protein Folding
  • Proteins / chemistry*

Substances

  • Amino Acids
  • Proteins