Incorporating epistasis interaction of genetic susceptibility single nucleotide polymorphisms in a lung cancer risk prediction model

Int J Oncol. 2016 Jul;49(1):361-70. doi: 10.3892/ijo.2016.3499. Epub 2016 Apr 25.

Abstract

Incorporation of genetic variants such as single nucleotide polymorphisms (SNPs) into risk prediction models may account for a substantial fraction of attributable disease risk. We used genetic data from 2385 subjects recruited into the Liverpool Lung Project (LLP) between 2000 and 2008, consisting of 20 SNPs independently validated in a candidate-gene discovery study. Multifactor dimensionality reduction (MDR) and random forest (RF) were used to explore evidence of epistasis among the 20 replicated SNPs. Multivariable logistic regression was used to identify risk predictors for lung cancer similar to those in the LLP risk model, for both the epidemiological model and the extended model with SNPs. Both models were internally validated using the bootstrap method, and model performance was assessed using area under the curve (AUC) and net reclassification improvement (NRI). Using MDR and RF, the overall best classifier of lung cancer status comprised the SNPs rs1799732 (DRD2), rs5744256 (IL-18) and rs2306022 (ITGA11), with a training accuracy of 0.6592, a testing accuracy of 0.6572 and a cross-validation consistency of 10/10 (permutation testing P<0.0001). The apparent AUC of the epidemiological model was 0.75 (95% CI 0.73-0.77). When epistatic data were incorporated in the extended model, the AUC increased to 0.81 (95% CI 0.79-0.83), which corresponds to a relative increase of 8% in AUC (DeLong's test P=2.2e-16) and an NRI of 17.5%. After correction for optimism, the AUC was 0.73 for the epidemiological model and 0.79 for the extended model. Our results showed a modest improvement in lung cancer risk prediction when the SNP epistasis factor was added.
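The AUC figures above can be related to one another with a little arithmetic. The following is a minimal sketch, not the authors' code: it assumes the usual rank-based reading of the AUC (the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control) and takes optimism to be the apparent AUC minus the optimism-corrected AUC. The function name `auc` and the toy risk scores are hypothetical; only the AUC values in the two dictionaries come from the abstract.

```python
from itertools import product


def auc(scores_cases, scores_controls):
    """Rank-based AUC: share of case/control pairs in which the case
    has the higher score, counting ties as half a win."""
    wins = 0.0
    for case_score, control_score in product(scores_cases, scores_controls):
        if case_score > control_score:
            wins += 1.0
        elif case_score == control_score:
            wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))


# Toy, made-up predicted risks (NOT data from the study).
toy_cases = [0.9, 0.7, 0.6, 0.4]
toy_controls = [0.5, 0.3, 0.2, 0.4]
toy_auc = auc(toy_cases, toy_controls)

# AUC values as reported in the abstract.
apparent = {"epidemiological": 0.75, "extended": 0.81}
corrected = {"epidemiological": 0.73, "extended": 0.79}

# Relative gain in apparent AUC from adding the SNP epistasis factor.
relative_gain = (
    apparent["extended"] - apparent["epidemiological"]
) / apparent["epidemiological"]

# Optimism implied by the reported figures, per model.
optimism = {name: round(apparent[name] - corrected[name], 2) for name in apparent}

print(f"toy AUC: {toy_auc}")
print(f"relative AUC gain: {relative_gain:.0%}")
print(f"optimism: {optimism}")
```

Under these assumptions, the reported 8% is a relative change (0.06 on a base of 0.75), and both models lose the same 0.02 of AUC after optimism correction, so the absolute gap between them is unchanged by internal validation.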

MeSH terms

  • Adult
  • Area Under Curve
  • Case-Control Studies
  • Epistasis, Genetic*
  • Female
  • Genetic Predisposition to Disease
  • Humans
  • Integrin alpha Chains / genetics*
  • Interleukin-18 / genetics*
  • Logistic Models
  • Lung Neoplasms / epidemiology
  • Lung Neoplasms / genetics*
  • Lung Neoplasms / pathology
  • Male
  • Middle Aged
  • Polymorphism, Single Nucleotide / genetics
  • Receptors, Dopamine D2 / genetics*
  • Risk Factors

Substances

  • DRD2 protein, human
  • IL18 protein, human
  • ITGA11 protein, human
  • Integrin alpha Chains
  • Interleukin-18
  • Receptors, Dopamine D2