Towards evidence-based computational statistics: lessons from clinical research on the role and design of real-data benchmark studies

BMC Med Res Methodol. 2017 Sep 9;17(1):138. doi: 10.1186/s12874-017-0417-2.

Abstract

Background: The goal of medical research is to develop interventions that are in some sense superior, with respect to patient outcome, to interventions currently in use. Similarly, the goal of research in methodological computational statistics is to develop data analysis tools that are themselves superior to the existing tools. The methodology of the evaluation of medical interventions continues to be discussed extensively in the literature and it is now well accepted that medicine should be at least partly "evidence-based". Although we statisticians are convinced of the importance of unbiased, well-thought-out study designs and evidence-based approaches in the context of clinical research, we tend to ignore these principles when designing our own studies for evaluating statistical methods in the context of our methodological research.

Main message: In this paper, we draw an analogy between clinical trials and real-data-based benchmarking experiments in methodological statistical science, with datasets playing the role of patients and methods playing the role of medical interventions. Through this analogy, we suggest directions for improvement in the design and interpretation of studies which use real data to evaluate statistical methods, in particular with respect to dataset inclusion criteria and the reduction of various forms of bias. More generally, we discuss the concept of "evidence-based" statistical research, its limitations and its impact on the design and interpretation of real-data-based benchmark experiments.

Conclusion: We suggest that benchmark studies, which assess statistical methods using real-world datasets, might benefit from adopting (some) concepts from evidence-based medicine towards the goal of more evidence-based statistical research.

Keywords: Clinical trial; Comparison study; Good practice; Method evaluation.

MeSH terms

  • Animals
  • Benchmarking*
  • Biomedical Research*
  • Clinical Trials as Topic
  • Datasets as Topic
  • Evidence-Based Medicine*
  • Humans
  • Statistics as Topic*