Relative indices of treatment effect may be constant across different definitions of response in schizophrenia trials
Introduction
In psychiatry, “hard” outcomes such as death are not readily available or appropriate as indices of treatment effectiveness. Instead, continuous outcomes based on rating scales are often employed, but the meaning of these scores is sometimes not easy to interpret (Norman et al., 2001). For example, in a hypothetical drug trial of acute phase treatment of schizophrenia, a statistically significant difference on a certain disease severity measure of 70 vs 80 may be reported for the drug and placebo arms, respectively, at the end of the trial. However, what a score of 70 or 80, or a 10-point difference, on this scale means clinically is often not transparent.
On the other hand, a categorical approach can be more interpretable, for example, if the response or remission rates are reported to be 50% vs 30% in the two arms. Trialists have therefore often included “response” rates defined as a threshold decrease on the continuous outcome (Altman and Royston, 2006). Unfortunately, for many of these continuous outcomes, there usually is no validated or even agreed-upon cutoff to define “response.” In the case of schizophrenia trials, the Brief Psychiatric Rating Scale (BPRS) (Overall and Gorham, 1962) and the Positive and Negative Syndrome Scale (PANSS) (Kay et al., 1987) are the two most frequently used scales but investigators have used various percentage improvements from 20% through 50% to define response (Beasley et al., 1996b, Marder and Meibach, 1994, Peuskens and Link, 1997, Small et al., 1997).
Such lack of consensus in the definition of response poses several related difficulties. First, there is suspicion that the trialists choose their cutoff not because it is clinically appropriate but because it is more likely to result in “statistically significant” differences. In recent reports of schizophrenia trials there is a tendency to use 20% reduction as a cutoff, apparently in the belief that a lower cutoff increases the ability to find statistically significant differences between drugs. However, 20% reduction represents something less than “minimal improvement” (Leucht et al., 2005a, Leucht et al., 2005b). A statistically significant difference in the rates of patients showing borderline or greater improvement (but not necessarily in moderate or greater improvement) would certainly not be clinically meaningful. Second, in order to obtain unbiased and generalizable estimates of the true treatment effects, we need comprehensive meta-analyses of relevant trials. However, if “response” is defined variably across different trials addressing a similar clinical question, we cannot be sure if we could safely combine them in a meta-analysis.
These problems would be greatly ameliorated if one could find a measure of effect that remained more or less constant across a range of thresholds. The odds ratio (OR), relative risk (RR) and risk difference (RD) or its inverse, the number needed to treat (NNT), are the representative indices of treatment effectiveness for dichotomous outcomes. When the outcome is dichotomous, the results of a trial can be summarized in a 2 × 2 table, where a and b are the numbers of responders and non-responders in the treatment arm and c and d are the numbers of responders and non-responders in the control arm.
The more clinically interpretable indices are RR and RD. RR is the ratio of the response rates in the treatment and control arms; it is therefore (a/(a + b))/(c/(c + d)). RD is the difference in the response rates in the treatment and control arms; it is therefore a/(a + b) − c/(c + d). NNT, which is the inverse of RD, shows the number of patients one would need to treat in order to have one more response in the treatment arm that would not have occurred on the control arm. It therefore nicely summarizes the amount of effort that both clinicians and patients need to expend in order to obtain one more response. For example, a treatment that produces a response rate of 50% in comparison with a placebo response rate of 30% would be translated into an NNT of 1/(0.5 − 0.3) = 5. In other words, one would need to treat 5 patients in order to produce one more responder over what would have happened on placebo. On the other hand, OR is intuitively difficult to understand because it is the ratio of the odds of response (responders over non-responders) in the treatment arm to the odds of response in the control arm; hence it is (a/b)/(c/d) or ad/bc. OR, however, has some strong mathematical properties: the OR of non-response is the inverse of the OR of response, whereas no such relationship holds for RR (Deeks, 2002).
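The definitions above can be checked with a short calculation. The following is a minimal Python sketch, not code from the study; the class name `TwoByTwo` and its method names are illustrative assumptions, and the counts are hypothetical numbers chosen to reproduce the 50% vs 30% example in the text.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Cell counts of a two-arm trial with a dichotomous outcome (hypothetical helper)."""
    a: int  # treatment arm, responders
    b: int  # treatment arm, non-responders
    c: int  # control arm, responders
    d: int  # control arm, non-responders

    def risk_treatment(self) -> float:
        return self.a / (self.a + self.b)

    def risk_control(self) -> float:
        return self.c / (self.c + self.d)

    def relative_risk(self) -> float:
        # RR = (a/(a + b)) / (c/(c + d)); undefined if no control responders
        return self.risk_treatment() / self.risk_control()

    def risk_difference(self) -> float:
        # RD = a/(a + b) - c/(c + d)
        return self.risk_treatment() - self.risk_control()

    def nnt(self) -> float:
        # NNT = 1/RD; undefined when RD is zero
        return 1.0 / self.risk_difference()

    def odds_ratio(self) -> float:
        # OR = (a/b) / (c/d) = ad/bc; undefined if b or c is zero
        return (self.a * self.d) / (self.b * self.c)

    def non_response(self) -> "TwoByTwo":
        # Same trial, with non-response treated as the event of interest
        return TwoByTwo(self.b, self.a, self.d, self.c)


# Hypothetical counts: 50% response on treatment, 30% on placebo
table = TwoByTwo(a=50, b=50, c=30, d=70)
print(f"RR  = {table.relative_risk():.2f}")    # RR  = 1.67
print(f"RD  = {table.risk_difference():.2f}")  # RD  = 0.20
print(f"NNT = {table.nnt():.1f}")              # NNT = 5.0
print(f"OR  = {table.odds_ratio():.2f}")       # OR  = 2.33

# The OR of non-response is the inverse of the OR of response; RR has no such symmetry
flipped = table.non_response()
print(f"OR non-response = {flipped.odds_ratio():.3f}, 1/OR = {1 / table.odds_ratio():.3f}")
print(f"RR non-response = {flipped.relative_risk():.3f}, 1/RR = {1 / table.relative_risk():.3f}")
```

In this example the last two lines show the OR of non-response matching 1/OR, while the RR of non-response (0.5/0.7) differs from 1/RR (0.3/0.5).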
In the following analyses, we examined individual patient data from several clinical trials of schizophrenia to see if any of OR, RR or RD may remain constant across different definitions of response, so that it can be used as the generalizable index of treatment effectiveness.
Database
Individual patient data from 10 trials comparing olanzapine vs haloperidol (5 comparisons, baseline n = 2974) (Beasley et al., 1996b, Beasley et al., 1997, Keefe et al., 2006, Lieberman et al., 2003, Tollefson et al., 1997), amisulpride vs haloperidol (4 comparisons, baseline n = 1198) (Carriere et al., 2000, Colonna et al., 2000, Moller et al., 1997, Puech et al., 1998), and olanzapine vs placebo (2 comparisons, baseline n = 502) (Beasley et al., 1996a, Beasley et al., 1996b) that administered
Visual inspection of the constancy of OR, RR and RD across different definitions of response
Figs. 1, 2 and 3 depict the OR, RR and RD corresponding to the various definitions of response using 10% through 90% reduction in the BPRS or PANSS total scores for the comparisons olanzapine vs haloperidol, amisulpride vs haloperidol and olanzapine vs placebo, respectively. Visual inspection of these graphs indicates that both OR and RR appear to remain relatively constant, especially for the ranges of 10% through 70% reduction.
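The kind of threshold sweep described above can be sketched as follows. This is an illustration only: the function name `response_indices` is hypothetical, and the percentage reductions are synthetic draws from assumed normal distributions, not the trial data, so the printed values say nothing about the actual findings.

```python
import random


def response_indices(treat, control, cutoff):
    """Return (OR, RR, RD) when response means a reduction of at least `cutoff` percent.

    `treat` and `control` are lists of per-patient percentage reductions.
    OR and RR are returned as None when a zero cell makes them undefined.
    """
    a = sum(x >= cutoff for x in treat)    # treatment responders
    b = len(treat) - a                     # treatment non-responders
    c = sum(x >= cutoff for x in control)  # control responders
    d = len(control) - c                   # control non-responders
    rd = a / len(treat) - c / len(control)
    rr = (a / len(treat)) / (c / len(control)) if c > 0 else None
    odds_ratio = (a * d) / (b * c) if b > 0 and c > 0 else None
    return odds_ratio, rr, rd


def fmt(value):
    """Format an index for printing, marking undefined values."""
    return "  n/a" if value is None else f"{value:5.2f}"


# Synthetic data (assumption): mean reduction 40% on drug, 30% on comparator
rng = random.Random(0)
treat = [rng.gauss(40, 30) for _ in range(500)]
control = [rng.gauss(30, 30) for _ in range(500)]

print("cutoff     OR     RR     RD")
for cutoff in range(10, 100, 10):
    odds_ratio, rr, rd = response_indices(treat, control, cutoff)
    print(f"{cutoff:5d}%  {fmt(odds_ratio)}  {fmt(rr)}  {fmt(rd)}")
```

At extreme cutoffs very few patients respond in either arm, so some cells approach zero and OR and RR become unstable or undefined; the sketch reports those cases as `n/a` rather than applying a continuity correction.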
For the extreme ranges of 80% or 90% reduction, there were
Discussion
Based on individual patient data of 4278 patients with schizophrenia participating in trials of acute phase antipsychotic treatment, we examined empirically whether OR, RR or RD remains constant across different definitions of response on the BPRS and the PANSS. We found that both OR and RR remain relatively constant across plausible ranges of definitions of response and that OR, in particular, was able to predict RR, RD and NNT very accurately using mathematical formulae and estimates of the
Role of funding source
This work required no external funding.
Contributors
TAF conceived the study. TAF, SW and SL undertook the statistical analyses. TAF wrote the first draft of the manuscript. TA, SW and SL provided essential critical comments. All the authors have approved the final manuscript.
Conflict of interest
TAF received research funds and speaking fees from Astellas, Dai-Nippon Sumitomo, Eli Lilly, GlaxoSmithKline, Janssen, Meiji, Otsuka, Pfizer and Schering–Plough. TA received research funds and speaking fees from Astellas, AstraZeneca, BMS, Daiichi-Sankyo, Dai-Nippon Sumitomo, Eisai, Eli Lilly, GlaxoSmithKline, Janssen, Kyowa-Hakko, Meiji, Otsuka, Pfizer, SanofiAventis, Shionogi and Yakult. SW has no conflict of interest to declare. SL received speaker/consultancy/advisory board honoraria from
Acknowledgments
We would like to thank David L. Streiner and Gordon H. Guyatt for their very helpful advice and comments on the earlier drafts of this paper. We would also like to thank Eli Lilly and SanofiAventis for letting us use their individual patient database without any influence on the design, conduct or reporting of this study.
References (33)
- et al. Olanzapine versus haloperidol: acute phase results of the international double-blind olanzapine trial. Eur. Neuropsychopharmacol. (1997)
- et al. Olanzapine versus placebo and haloperidol: acute phase results of the North American double-blind olanzapine trial. Neuropsychopharmacology (1996)
- et al. Amisulpride has a superior benefit/risk profile to haloperidol in schizophrenia: results of a multicentre, double-blind study (the Amisulpride Study Group). Eur. Psychiatry (2000)
- et al. Meta-analysis in clinical trials. Control. Clin. Trials (1986)
- et al. One-year double-blind study of the neurocognitive efficacy of olanzapine, risperidone, and haloperidol in schizophrenia. Schizophr. Res. (2006)
- et al. Efficacy and safety of ustekinumab, a human interleukin-12/23 monoclonal antibody, in patients with psoriasis: 76-week results from a randomised, double-blind, placebo-controlled trial (PHOENIX 1). Lancet (2008)
- et al. Second-generation versus first-generation antipsychotic drugs for schizophrenia: a meta-analysis. Lancet (2009)
- et al. What does the PANSS mean? Schizophr. Res. (2005)
- 2008. Review Manager (RevMan). The Nordic Cochrane Centre, The Cochrane Collaboration,...
- et al. The cost of dichotomising continuous variables. BMJ (2006)
- Olanzapine versus placebo: results of a double-blind, fixed-dose olanzapine trial. Psychopharmacology (Berl.)
- Long-term safety and efficacy of amisulpride in subchronic or chronic schizophrenia. Amisulpride Study Group. Int. Clin. Psychopharmacol.
- Issues in the selection of a summary statistic for meta-analysis of clinical trials with binary outcomes. Stat. Med.
- Olanzapine for schizophrenia. Cochrane Database Syst. Rev.
- Can we individualize the ‘number needed to treat’? An empirical study of summary effect measures in meta-analyses. Int. J. Epidemiol.
- Abatacept for rheumatoid arthritis refractory to tumor necrosis factor alpha inhibition. N. Engl. J. Med.