Translating bioinformatics in oncology: guilt-by-profiling analysis and identification of KIF18B and CDCA3 as novel driver genes in carcinogenesis

Bioinformatics. 2015 Jan 15;31(2):216-24. doi: 10.1093/bioinformatics/btu586. Epub 2014 Sep 18.

Abstract

Motivation: Co-regulated genes are not identified in traditional microarray analyses, although in theory they may be closely linked functionally [guilt-by-association (GBA), guilt-by-profiling]. Nevertheless, bioinformatics procedures for guilt-by-profiling/association analysis have yet to be applied to large-scale cancer biology. We analyzed 2158 full cancer transcriptomes from 163 diverse cancer entities with regard to the similarity of their gene expression, using Pearson's correlation coefficient (CC). Subsequently, 428 highly co-regulated genes (|CC| ≥ 0.8) were subjected to unsupervised clustering to obtain small co-regulated networks. A major subnetwork containing 61 closely co-regulated genes showed highly significant enrichment of cancer bio-functions. All genes except kinesin family member 18B (KIF18B) and cell division cycle associated 3 (CDCA3) were of confirmed relevance for tumor biology. Therefore, we independently analyzed the differential regulation of these two genes in multiple tumors and found severe deregulation in liver, breast, lung, ovarian and kidney cancers, thus proving our GBA hypothesis. Overexpression of KIF18B and CDCA3 in hepatoma cells and subsequent microarray analysis revealed significant deregulation of central cell cycle regulatory genes. Consistently, RT-PCR and proliferation assays confirmed the role of both genes in cell cycle progression. Finally, the prognostic significance of the identified KIF18B- and CDCA3-dependent predictors (P = 0.01, P = 0.04) was demonstrated in three independent HCC cohorts and several other tumors. In summary, we proved the efficacy of large-scale guilt-by-profiling/association strategies in oncology. We identified two novel oncogenes and functionally characterized them. The strong prognostic importance of downstream predictors for HCC and many other tumors indicates the clinical relevance of our findings.

Supplementary information: Supplementary data are available at Bioinformatics online.
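
The co-regulation screen described in the abstract (pairwise Pearson correlation, retaining gene pairs with |CC| ≥ 0.8, then grouping them into small networks) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the genes-by-samples matrix layout and the toy data are assumptions made for illustration, and simple connected-component grouping stands in for the unsupervised clustering, whose exact method the abstract does not specify.

```python
import numpy as np


def coregulated_pairs(expr, genes, threshold=0.8):
    """Return (gene_a, gene_b, r) for gene pairs with |Pearson r| >= threshold.

    expr: 2-D array with rows = genes and columns = samples (assumed layout).
    """
    r = np.corrcoef(expr)  # gene-by-gene Pearson correlation matrix
    pairs = []
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            if abs(r[i, j]) >= threshold:
                pairs.append((genes[i], genes[j], float(r[i, j])))
    return pairs


def connected_components(genes, pairs):
    """Group genes linked by at least one co-regulated pair (union-find).

    A simple stand-in for unsupervised clustering, not the published method.
    """
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for a, b, _ in pairs:
        parent[find(a)] = find(b)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    # Keep only groups with more than one gene
    return [sorted(members) for members in groups.values() if len(members) > 1]


# Toy expression matrix (hypothetical values): 4 genes x 5 samples
genes = ["G1", "G2", "G3", "G4"]
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],    # G1
    [2.0, 4.1, 6.0, 8.2, 10.0],   # G2: rises with G1
    [5.0, 4.0, 3.1, 2.0, 1.0],    # G3: falls as G1 rises
    [3.0, 1.0, 4.0, 1.0, 5.0],    # G4: weakly related to the others
])

pairs = coregulated_pairs(expr, genes)
modules = connected_components(genes, pairs)
```

Because the threshold applies to the absolute value of the coefficient, strongly anti-correlated genes (G3 in the toy data) fall into the same network as positively correlated ones. At the scale of full transcriptomes, the explicit double loop would likely need to be replaced by vectorized thresholding of the correlation matrix.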

Publication types

  • Research Support, N.I.H., Intramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Biomarkers, Tumor / metabolism
  • Blotting, Western
  • Carcinogenesis
  • Carcinoma, Hepatocellular / genetics
  • Carcinoma, Hepatocellular / mortality
  • Carcinoma, Hepatocellular / pathology
  • Cell Cycle
  • Cell Cycle Proteins / genetics
  • Cell Cycle Proteins / metabolism
  • Cell Proliferation
  • Computational Biology / methods*
  • Disease Progression
  • Flow Cytometry
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • Kinesins / genetics*
  • Kinesins / metabolism
  • Liver Neoplasms / genetics
  • Liver Neoplasms / mortality
  • Liver Neoplasms / pathology
  • Neoplasms / genetics*
  • Neoplasms / mortality
  • Neoplasms / pathology
  • Oligonucleotide Array Sequence Analysis
  • Prognosis
  • Survival Rate
  • Tumor Cells, Cultured
  • Tumor Stem Cell Assay

Substances

  • Biomarkers, Tumor
  • CDCA3 protein, human
  • Cell Cycle Proteins
  • KIF18B protein, human
  • Kinesins