Threats to the Validity of Clinical Trials Employing Enrichment Strategies for Sample Selection
Introduction
A new drug product may not be marketed legally in the United States until it is the subject of an approved New Drug Application (NDA). To gain approval, a sponsor must, among other requirements of the Federal Food, Drug, and Cosmetic Act (the Act), submit “substantial evidence” of the product's effectiveness to the Food and Drug Administration (FDA). Such evidence should allow an expert clinician to conclude that the drug will have the effects the sponsor claims it will have. To obtain the required evidence of effectiveness, sponsors typically conduct trials employing parallel group, randomized, placebo, and/or dose-comparison concurrent control designs. If such a clinical study detects a statistically significant difference that favors the investigational drug over the control condition on a valid measure of clinical outcome, the study is ordinarily accepted as a source contributing to the substantial-evidence requirement of the Act.
The Act, importantly, makes no demands in regard to the minimum size of a treatment effect or the minimum proportion of the patient population that must respond to a drug for it to be declared effective. In addition, no requirements are imposed regarding the “representativeness” of a study sample vis-à-vis the population of patients from which it is drawn. The typical commercial drug effectiveness trial, therefore, is not unlike what Schwartz and Lellouch [1] call an “explanatory” study. Its primary aim is to provide proof that a drug has a therapeutic effect in at least some patients. Such trials are contrasted with those of “pragmatic” design that seek to obtain valid estimates of the treatment's expected effect under conditions of actual use in the population.
Subject selection criteria employed in clinical trials of commercial drugs, therefore, are designed primarily to recruit cohorts containing patients who are cooperative, compliant, and likely to respond to the investigational drug if it is, in fact, efficacious. Thus, definitive effectiveness trials typically exclude patients with extremely mild or advanced disease, those at the extremes of age, those in poor health, and those with bad habits (such as smokers and those who use illicit drugs or consume excessive amounts of alcohol). These exclusions are acceptable because, despite the limitations they impose on the generalizability of a study's results, they ordinarily pose no threat to a study's internal validity. In sum, if a randomized controlled clinical trial conducted in a sample of patients reliably known to suffer from a disease detects a statistically significant between-treatment difference favorable to an investigational drug on a valid measure of clinical outcome, the study will ordinarily be accepted as a source of evidence documenting that the investigational drug has a beneficial effect in at least some patients in the population, provided that fraud and/or systematic bias can be reasonably excluded as alternative explanations for the difference [2].
Unfortunately, some methods of sample enrichment may undermine the internal validity of a study. This paper reviews some of the issues and illustrates them with findings from a large multicenter consortium trial [3] of the cholinesterase inhibitor tacrine as a treatment for dementia.
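The benign case described in the introduction, in which selection criteria limit generalizability without threatening internal validity, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `simulate_trial`, the responder fractions, the effect size, and the normally distributed outcome are all hypothetical and are not taken from any trial discussed here.

```python
import random
import statistics


def simulate_trial(n_patients, responder_fraction, effect, seed):
    """Return the drug-minus-placebo difference in mean outcome.

    Hypothetical model: each patient has a latent trait saying whether the
    drug would help them. Outcomes are standard-normal noise, plus a fixed
    benefit `effect` for responders who happen to receive the drug.
    """
    rng = random.Random(seed)
    # Latent responder status for every recruited patient.
    will_respond = [rng.random() < responder_fraction for _ in range(n_patients)]
    # Random allocation: shuffle, then split the sample into two equal arms.
    rng.shuffle(will_respond)
    half = n_patients // 2
    drug_arm, placebo_arm = will_respond[:half], will_respond[half:]
    drug = [rng.gauss(0.0, 1.0) + (effect if r else 0.0) for r in drug_arm]
    placebo = [rng.gauss(0.0, 1.0) for _ in placebo_arm]
    return statistics.mean(drug) - statistics.mean(placebo)


# Unselected sample: few patients are likely to respond (assumed 20%).
broad = simulate_trial(20000, responder_fraction=0.2, effect=1.0, seed=1)
# Sample recruited with criteria favoring likely responders (assumed 60%).
selected = simulate_trial(20000, responder_fraction=0.6, effect=1.0, seed=1)

print(f"unselected sample: {broad:.2f}")
print(f"selected sample:   {selected:.2f}")
```

In this sketch both differences are valid estimates for the sample actually randomized, because allocation within each sample is random. What changes with selection is the population to which the estimate applies. The sketch does not model designs that select subjects on their prior response to the drug in an open phase.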
Section snippets
A brief history of the enrichment/rerandomization design and its applications
Although the execution of any set of sample selection criteria technically can be considered to constitute a sample enrichment maneuver, the label “enrichment design” typically is reserved for those clinical trial designs that select subjects for participation in a randomized comparison phase of a study on the basis of their prior response, often during a preliminary open titration phase of the same study, to one or more of the investigational treatments being evaluated. The first use of the
The tacrine consortium study
The Summers et al. [13] report on tacrine's effects on dementia in 17 patients appeared in November 1986. Within days, the lay media heralded tacrine as being capable of doing for dementia patients what levodopa had done for individuals with Parkinson's disease: it was not a cure, but it was a breakthrough just short of a miracle. Consequently, the public placed enormous pressure on the FDA to release tacrine for early widespread use. In this milieu of heightened expectations, a consortium of
Concerns that the treatment blind may have been broken
If an investigational drug has a set of characteristic untoward effects, there is often concern that the treatment blind may have been broken. In the typical randomized placebo-controlled trial, the risk of blind-breaking is lessened because each subject is exposed to only one of the treatments being evaluated. Designs that expose subjects to all the treatments being compared, in contrast, would appear to increase the risk that a subject will correctly guess his/her actual treatment assignment
Conclusions
Although their use has been advocated in the archival literature [e.g., 5, 7, 18], so-called enrichment or rerandomization designs in which subjects are selected for entry into randomized, blinded, controlled phases of experiments on the basis of their prior response to the investigational drug in an open phase have limitations. In particular, exposure to an experimental treatment during an open qualification phase may invalidate drug-placebo comparisons made during a later randomized, blinded,
References (18)
- et al.
Explanatory and pragmatic attitudes in therapeutic trials
J Chron Dis
(1967)
- et al.
Randomized discontinuation trials: utility and efficiency
J Clin Epidemiol
(1993)
- et al.
Randomizing responders
Controlled Clin Trials
(1991)
- et al.
Commentary: the qualification period
J Clin Epidemiol
(1991)
- et al.
The run-in period in clinical trials: the effect of misclassification on efficiency
Controlled Clin Trials
(1990)
- Hazards of inference: the active control investigation
Epilepsia
(1989)
- et al.
A double-blind, placebo-controlled multicenter study of tacrine for Alzheimer's disease
N Engl J Med
(1992)
- et al.
A clinical trial design avoiding undue placebo treatment
J Clin Pharmacol
(1975)
- et al.
Methodological problems in studies of depressive disorder: utility of the discontinuation design
J Clin Psychopharmacol
(1981)