An ensemble framework for identifying essential proteins

BMC Bioinformatics. 2016 Aug 25;17(1):322. doi: 10.1186/s12859-016-1166-7.

Abstract

Background: Many centrality measures have been proposed to mine and characterize the correlations between network topological properties and protein essentiality. However, most of them show limited prediction accuracy, and the sets of essential proteins predicted by different methods overlap very little.
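
The overlap between methods mentioned above can be made concrete with a small sketch. It assumes each method yields a score per protein and that the top-k proteins of each ranking are taken as its predictions; the function names (`top_k`, `common_predictions`) and the toy scores are hypothetical and are not taken from the paper.

```python
def top_k(scores, k):
    """Return the set of the k highest-scoring proteins (ties broken by name)."""
    ranked = sorted(scores, key=lambda protein: (-scores[protein], protein))
    return set(ranked[:k])


def common_predictions(score_tables, k):
    """Return the proteins that appear in the top-k list of every method."""
    tops = [top_k(scores, k) for scores in score_tables.values()]
    return set.intersection(*tops)


# Hypothetical toy scores for two centrality measures (illustration only).
toy_scores = {
    "degree": {"A": 4, "B": 3, "C": 2, "D": 1},
    "betweenness": {"A": 0.1, "B": 0.5, "C": 0.6, "D": 0.0},
}

shared = common_predictions(toy_scores, k=2)
```

In this toy case the two rankings agree on only one of their top two proteins, which is the kind of small overlap the Background describes.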

Results: In this paper, we propose an ensemble framework that integrates gene expression data and protein-protein interaction networks (PINs) to improve the prediction accuracy of basic centrality measures. The idea behind the framework is that different protein-protein interactions (PPIs) may contribute differently to protein essentiality. Five standard centrality measures (degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, and subgraph centrality) are each integrated into the ensemble framework in turn. We evaluated the framework using yeast PINs and gene expression data. The results show that it considerably improves the prediction accuracy of each of the five centrality measures. It also markedly increases the number of essential proteins predicted in common by the centrality measures, and it enables each centrality measure to find more low-degree essential proteins.
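
The abstract does not state how individual PPIs are weighted or how the ensemble is assembled. As a minimal sketch of the general idea that interactions can contribute unequally, the example below assumes (hypothetically) that each interaction is weighted by the absolute Pearson correlation of the two proteins' expression profiles, and then computes a weighted degree centrality. The function names, the weighting scheme, and the toy data are illustrative assumptions, not the paper's method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def weighted_degree(edges, expression):
    """Sum, for each protein, the assumed co-expression weights of its interactions."""
    scores = {}
    for u, v in edges:
        # Assumption: edge weight = |Pearson correlation| of the expression profiles.
        weight = abs(pearson(expression[u], expression[v]))
        scores[u] = scores.get(u, 0.0) + weight
        scores[v] = scores.get(v, 0.0) + weight
    return scores


# Hypothetical toy network and expression profiles (illustration only).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
expression = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with A
    "C": [1.0, 1.0, 1.0, 1.0],  # constant profile, so its edges get weight 0
    "D": [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with A
}

scores = weighted_degree(edges, expression)
```

Under plain degree centrality, B and C both have two interactions and D has one. With the assumed weights, C drops to zero and D ties with B, which shows how differentiating PPIs can reorder proteins, including low-degree ones.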

Conclusions: This paper demonstrates that, when identifying essential proteins from network topological characteristics, it is valuable to differentiate the contributions of different PPIs. The proposed ensemble framework is a successful approach to this end.

Keywords: Centrality measure; Ensemble learning; Essential protein; Gene expression; Protein-protein interaction networks.

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Gene Expression
  • Genome, Fungal
  • Open Reading Frames
  • Protein Interaction Mapping / methods*
  • Protein Interaction Maps
  • Proteins / chemistry
  • Proteins / metabolism*
  • Saccharomyces cerevisiae / genetics
  • Saccharomyces cerevisiae / metabolism
  • Saccharomyces cerevisiae Proteins / chemistry
  • Saccharomyces cerevisiae Proteins / metabolism

Substances

  • Proteins
  • Saccharomyces cerevisiae Proteins

Associated data

  • GEO/GSE3431