Class-imbalanced subsampling lasso algorithm for discovering adverse drug reactions

Stat Methods Med Res. 2018 Mar;27(3):785-797. doi: 10.1177/0962280216643116. Epub 2016 Apr 25.

Abstract

Background: All methods routinely used to generate safety signals from pharmacovigilance databases rely on disproportionality analyses of counts aggregating patients' spontaneous reports. Recently, it was proposed to analyze individual spontaneous reports directly using Bayesian lasso logistic regressions. Nevertheless, this raises the issue of choosing an adequate regularization parameter in a variable selection framework while accounting for computational constraints due to the high dimension of the data.

Purpose: Our main objective is to propose a method that exploits the subsampling idea from Stability Selection, a variable selection procedure combining subsampling with a high-dimensional selection algorithm, and adapts it to the specificities of spontaneous reporting data, which are characterized by their large size, their binary nature and their sparsity.

Materials and method: Given the large imbalance between the presence and absence of a given adverse event, we propose an alternative subsampling scheme to that of Stability Selection, resulting in an over-representation of the minority class and a drastic reduction in the number of observations in each subsample. Simulations are used to help define the detection threshold with regard to the average proportion of false signals. They are also used to compare the performance of the proposed sampling scheme with that originally proposed for Stability Selection. Finally, we compare the proposed method to the gamma Poisson shrinker, a disproportionality method, and to a lasso logistic regression approach through an empirical study conducted on the French national pharmacovigilance database and two sets of reference signals.

Results: Simulations show that the proposed sampling strategy performs better in terms of false discoveries and is faster than the equiprobable sampling of Stability Selection. The empirical evaluation illustrates the better performance of the proposed method compared with the gamma Poisson shrinker and the lasso in terms of the number of reference signals retrieved.
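
The subsampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the sampling details, so drawing with replacement, the 1:1 case-to-control ratio (`control_ratio`), the number of subsamples, and all function names are assumptions made for illustration. The variable selector is left as a pluggable callable; the paper uses a lasso logistic regression at that step, whereas the `toy_selector` below is only a crude stand-in so that the example runs with the standard library alone.

```python
import random
from collections import Counter


def imbalanced_subsample(y, n_cases, n_controls, rng):
    """Draw row indices so that the minority class is over-represented.

    Assumption: indices are drawn with replacement, separately within the
    reports with the adverse event (y == 1) and without it (y == 0).
    """
    cases = [i for i, v in enumerate(y) if v == 1]
    controls = [i for i, v in enumerate(y) if v == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present")
    return ([rng.choice(cases) for _ in range(n_cases)]
            + [rng.choice(controls) for _ in range(n_controls)])


def selection_frequencies(X, y, selector, n_subsamples=100,
                          control_ratio=1.0, seed=0):
    """Fraction of subsamples in which each drug (column of X) is selected.

    `selector(Xs, ys)` must return the set of selected column indices.
    `control_ratio` (controls per case) is a hypothetical tuning choice.
    """
    rng = random.Random(seed)
    n_cases = sum(y)
    n_controls = max(1, int(round(control_ratio * n_cases)))
    counts = Counter()
    for _ in range(n_subsamples):
        idx = imbalanced_subsample(y, n_cases, n_controls, rng)
        Xs = [X[i] for i in idx]
        ys = [y[i] for i in idx]
        counts.update(selector(Xs, ys))
    n_drugs = len(X[0])
    return [counts[j] / n_subsamples for j in range(n_drugs)]


def toy_selector(Xs, ys, margin=0.2):
    """Stand-in for the lasso step (NOT the method used in the paper).

    Keeps drug j when its exposure rate among cases exceeds its exposure
    rate among controls by more than `margin`.
    """
    n1 = sum(ys)
    n0 = len(ys) - n1
    selected = set()
    for j in range(len(Xs[0])):
        r1 = sum(x[j] for x, v in zip(Xs, ys) if v == 1) / n1
        r0 = sum(x[j] for x, v in zip(Xs, ys) if v == 0) / n0
        if r1 - r0 > margin:
            selected.add(j)
    return selected
```

A call such as `selection_frequencies(X, y, toy_selector)` takes a list of binary drug-exposure rows `X` and a binary adverse-event indicator `y`, and returns one selection frequency per drug. Those frequencies would then be compared with a detection threshold, which the abstract says is chosen with the help of simulations. Because each subsample contains only about twice as many reports as there are cases, every fit works on far fewer observations than the full database.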

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adverse Drug Reaction Reporting Systems / statistics & numerical data*
  • Algorithms*
  • Bayes Theorem*
  • Biostatistics / methods
  • Computer Simulation
  • Databases, Pharmaceutical / statistics & numerical data
  • Drug-Related Side Effects and Adverse Reactions
  • France
  • Humans
  • Logistic Models
  • Pharmacovigilance*