Relative indices of treatment effect may be constant across different definitions of response in schizophrenia trials

https://doi.org/10.1016/j.schres.2010.10.016

Abstract

Background

In randomized controlled trials of antipsychotics, various cutoffs have been used to define response on continuous outcome measures.

Aims

To find a summary effect measure that remains constant across different definitions of response.

Method

We conducted secondary analyses of individual patient data from 10 randomized controlled trials of second-generation antipsychotics for schizophrenia (n = 4278) by applying a meta-analytic approach to produce odds ratios (OR), risk ratios (RR) and risk differences (RD) and their 95% confidence intervals (CI) for different definitions of response, using cutoffs of 10% through 90% reduction on the symptom severity rating scales. Constancy of these indices was examined through visual inspection, by way of I-squared statistics to quantify heterogeneity, and by way of coefficients of variation. If any of these indices were found to remain reasonably constant, we next examined the concordance between the number needed to treat (NNT) predicted from them and the observed NNT.

Results

OR and RR remained reasonably constant across various definitions of response, especially for those using thresholds of 10% through 70% reduction. The NNTs predicted from OR and RR agreed well with the observed NNTs, with ANOVA intraclass correlation coefficients of 0.96 (95% CI: 0.92 to 0.98) and 0.86 (0.72 to 0.93), respectively.

Conclusions

The relative measures of treatment effectiveness remain reasonably constant across different scale-derived definitions of response and, in conjunction with varying control event rates, can give accurate estimates of NNTs for individuals with schizophrenia.

Introduction

In psychiatry, “hard” outcomes such as death are not readily available or appropriate indices of treatment effectiveness. Instead, continuous outcomes based on rating scales are often employed, but the meaning of these scores is sometimes not easy to interpret (Norman et al., 2001). For example, in a hypothetical drug trial of acute phase treatment of schizophrenia, a statistically significant difference on a certain disease severity measure of 70 vs 80 may be reported for the drug and placebo arms, respectively, at the end of the trial. However, what a score of 70 or 80, or the 10-point difference between them, means clinically is often not transparent.

On the other hand, a categorical approach can be more interpretable, for example, if the response or remission rates are reported to be 50% vs 30% in the two arms. Trialists have therefore often included “response” rates defined as a threshold decrease on the continuous outcome (Altman and Royston, 2006). Unfortunately, for many of these continuous outcomes, there usually is no validated or even agreed-upon cutoff to define “response.” In the case of schizophrenia trials, the Brief Psychiatric Rating Scale (BPRS) (Overall and Gorham, 1962) and the Positive and Negative Syndrome Scale (PANSS) (Kay et al., 1987) are the two most frequently used scales but investigators have used various percentage improvements from 20% through 50% to define response (Beasley et al., 1996b, Marder and Meibach, 1994, Peuskens and Link, 1997, Small et al., 1997).

Such lack of consensus in the definition of response poses several related difficulties. First, there is suspicion that the trialists choose their cutoff not because it is clinically appropriate but because it is more likely to result in “statistically significant” differences. In recent reports of schizophrenia trials there is a tendency to use 20% reduction as a cutoff, apparently in the belief that a lower cutoff increases the ability to find statistically significant differences between drugs. However, 20% reduction represents something less than “minimal improvement” (Leucht et al., 2005a, Leucht et al., 2005b). A statistically significant difference in the rates of patients showing borderline or greater improvement (but not necessarily in moderate or greater improvement) would certainly not be clinically meaningful. Second, in order to obtain unbiased and generalizable estimates of the true treatment effects, we need comprehensive meta-analyses of relevant trials. However, if “response” is defined variably across different trials addressing a similar clinical question, we cannot be sure if we could safely combine them in a meta-analysis.

These problems would be greatly ameliorated if one could find a measure of effect that remained more or less constant across a range of thresholds. The odds ratio (OR), relative risk (RR) and risk difference (RD) or its inverse, the number needed to treat (NNT), are the representative indices of treatment effectiveness for dichotomous outcomes. When the outcome is dichotomous, the results of a trial can be summarized in a 2 × 2 table, in which a and b denote the numbers of responders and non-responders in the treatment arm, and c and d the corresponding numbers in the control arm.

The more clinically interpretable indices are RR and RD. RR is the ratio of the response rates in the treatment and control arms; it is therefore (a/(a + b))/(c/(c + d)). RD is the difference in the response rates in the treatment and control arms; it is therefore a/(a + b) − c/(c + d). NNT, which is the inverse of RD, shows the number of patients one would need to treat in order to have one more response in the treatment arm that would not have happened if on the control arm. It therefore nicely summarizes the amount of effort that both clinicians and patients need to expend in order to obtain one more response. For example, a treatment that produces a response rate of 50% in comparison with a placebo response rate of 30% would be translated into an NNT of 1/(0.5 − 0.3) = 5. In other words, one would need to treat 5 patients in order to produce one more responder over what would have happened on placebo. On the other hand, OR is intuitively difficult to understand because it is the ratio of the odds of showing response over not showing response in the treatment and control arms; hence it is (a/b)/(c/d) or ad/bc. OR, however, has some strong mathematical properties because OR of non-response is the inverse of OR of response, whereas such a relationship does not hold for RR (Deeks, 2002).
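The definitions above can be checked with a few lines of Python. This is a minimal sketch, not code from the study: the class name `TwoByTwo` and its methods are hypothetical, and the counts are chosen only to reproduce the 50% vs 30% example in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a two-arm trial with a dichotomous outcome (hypothetical helper)."""
    a: int  # treatment arm, responders
    b: int  # treatment arm, non-responders
    c: int  # control arm, responders
    d: int  # control arm, non-responders

    def risk_ratio(self) -> float:
        # (a/(a + b)) / (c/(c + d))
        return (self.a / (self.a + self.b)) / (self.c / (self.c + self.d))

    def risk_difference(self) -> float:
        # a/(a + b) - c/(c + d)
        return self.a / (self.a + self.b) - self.c / (self.c + self.d)

    def nnt(self) -> float:
        # NNT is the inverse of RD
        return 1 / self.risk_difference()

    def odds_ratio(self) -> float:
        # (a/b) / (c/d) = ad/bc
        return (self.a * self.d) / (self.b * self.c)

    def non_response(self) -> "TwoByTwo":
        # Same trial with "non-response" treated as the event of interest.
        return TwoByTwo(self.b, self.a, self.d, self.c)


# Illustrative counts: 50% response on treatment, 30% on control.
trial = TwoByTwo(a=50, b=50, c=30, d=70)
flipped = trial.non_response()

print(f"RR = {trial.risk_ratio():.3f}, RD = {trial.risk_difference():.2f}, "
      f"NNT = {trial.nnt():.1f}, OR = {trial.odds_ratio():.3f}")
print(f"OR x OR(non-response) = {trial.odds_ratio() * flipped.odds_ratio():.3f}")
print(f"RR x RR(non-response) = {trial.risk_ratio() * flipped.risk_ratio():.3f}")
```

With these counts the product of the two ORs is 1, while the product of the two RRs is not, which is the asymmetry attributed to Deeks (2002).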

In the following analyses, we examined individual patient data from several clinical trials of schizophrenia to see if any of OR, RR or RD may remain constant across different definitions of response, so that it can be used as the generalizable index of treatment effectiveness.
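A constant relative index is useful because, combined with a control event rate (CER), it can be converted into an absolute effect for a given setting. The sketch below uses the standard algebraic conversions from OR or RR to NNT; the function names are hypothetical, and the article's own formulae are not shown in this excerpt, so this should be read as an illustration of the idea and not as the authors' exact procedure.

```python
def nnt_from_or(odds_ratio: float, cer: float) -> float:
    """NNT implied by an odds ratio at a given control event rate (CER)."""
    # Treated-arm event rate implied by the OR at this CER.
    eer = odds_ratio * cer / (1 - cer + odds_ratio * cer)
    return 1 / (eer - cer)


def nnt_from_rr(risk_ratio: float, cer: float) -> float:
    """NNT implied by a risk ratio at a given control event rate (CER)."""
    # Treated-arm event rate implied by the RR at this CER.
    eer = risk_ratio * cer
    return 1 / (eer - cer)


# Illustrative values matching a 50% vs 30% response rate:
# OR = (0.5/0.5)/(0.3/0.7) = 7/3 and RR = 0.5/0.3 = 5/3.
print(nnt_from_or(7 / 3, 0.30))   # approximately 5
print(nnt_from_rr(5 / 3, 0.30))   # approximately 5

# The same OR implies different NNTs at different control event rates.
for cer in (0.10, 0.30, 0.50):
    print(f"CER = {cer:.2f}: NNT = {nnt_from_or(7 / 3, cer):.1f}")
```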


Database

Individual patient data from 10 trials comparing olanzapine vs haloperidol (5 comparisons, baseline n = 2974) (Beasley et al., 1996b, Beasley et al., 1997, Keefe et al., 2006, Lieberman et al., 2003, Tollefson et al., 1997), amisulpride vs haloperidol (4 comparisons, baseline n = 1198) (Carriere et al., 2000, Colonna et al., 2000, Moller et al., 1997, Puech et al., 1998), and olanzapine vs placebo (2 comparisons, baseline n = 502) (Beasley et al., 1996a, Beasley et al., 1996b) that administered

Visual inspection of the constancy of OR, RR and RD across different definitions of response

Fig. 1, Fig. 2, Fig. 3 depict the OR, RR and RD corresponding to the various definitions of response using 10% through 90% reduction in the BPRS or PANSS total scores for the comparisons olanzapine vs haloperidol, amisulpride vs haloperidol and olanzapine vs placebo, respectively. Visual inspection of these graphs indicates that both OR and RR appear to remain relatively constant, especially for the ranges of 10% through 70% reduction.
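The mechanics of such a comparison can be sketched as follows: dichotomize each patient's percentage reduction at each cutoff, compute OR, RR and RD, and summarize their spread with a coefficient of variation. Everything here is an assumption made for illustration: the data are synthetic draws from normal distributions, not trial data, the function names are hypothetical, and the continuity correction for empty cells is a common convention that the article does not mention. The output says nothing about the study's actual results.

```python
import random
from statistics import mean, stdev


def indices_at_cutoff(treat, control, cutoff):
    """OR, RR and RD when response means >= `cutoff` percent reduction."""
    a = sum(x >= cutoff for x in treat)      # treatment responders
    b = len(treat) - a                       # treatment non-responders
    c = sum(x >= cutoff for x in control)    # control responders
    d = len(control) - c                     # control non-responders
    if 0 in (a, b, c, d):
        # Assumption: add 0.5 to every cell when any cell is empty
        # (a common continuity correction, not taken from the article).
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    p_t, p_c = a / (a + b), c / (c + d)
    return {"OR": (a * d) / (b * c), "RR": p_t / p_c, "RD": p_t - p_c}


def coefficient_of_variation(values):
    """Standard deviation divided by the absolute mean."""
    return stdev(values) / abs(mean(values))


# Synthetic percentage reductions on a rating scale; NOT trial data.
rng = random.Random(0)
treat = [rng.gauss(35, 30) for _ in range(2000)]
control = [rng.gauss(25, 30) for _ in range(2000)]

cutoffs = range(10, 100, 10)  # 10%, 20%, ..., 90% reduction
rows = [indices_at_cutoff(treat, control, k) for k in cutoffs]
cv = {name: coefficient_of_variation([r[name] for r in rows])
      for name in ("OR", "RR", "RD")}

for k, r in zip(cutoffs, rows):
    print(f"{k:>2}%  OR={r['OR']:.2f}  RR={r['RR']:.2f}  RD={r['RD']:.3f}")
print({name: round(v, 2) for name, v in cv.items()})
```

A smaller coefficient of variation for an index indicates that it changes less across cutoffs, which is the sense of "constancy" examined here.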

For the extreme ranges of 80% or 90% reduction, there were

Discussion

Based on individual patient data of 4278 patients with schizophrenia participating in trials of acute phase antipsychotic treatment, we examined empirically whether OR, RR or RD remains constant across different definitions of response on the BPRS and the PANSS. We found that both OR and RR remain relatively constant across plausible ranges of definitions of response and that OR, in particular, was able to predict RR, RD and NNT very accurately using mathematical formulae and estimates of the

Role of funding source

This work required no external funding.

Contributors

TAF conceived the study. TAF, SW and SL undertook the statistical analyses. TAF wrote the first draft of the manuscript. TA, SW and SL provided essential critical comments. All the authors have approved the final manuscript.

Conflict of interest

TAF received research funds and speaking fees from Astellas, Dai-Nippon Sumitomo, Eli Lilly, GlaxoSmithKline, Janssen, Meiji, Otsuka, Pfizer and Schering–Plough. TA received research funds and speaking fees from Astellas, AstraZeneca, BMS, Daiichi-Sankyo, Dai-Nippon Sumitomo, Eisai, Eli Lilly, GlaxoSmithKline, Janssen, Kyowa-Hakko, Meiji, Otsuka, Pfizer, SanofiAventis, Shionogi and Yakult. SW has no conflict of interest to declare. SL received speaker/consultancy/advisory board honoraria from

Acknowledgments

We would like to thank David L. Streiner and Gordon H. Guyatt for their very helpful advice and comments on the earlier drafts of this paper. We would also like to thank Eli Lilly and SanofiAventis for letting us use their individual patient database without any influence on the design, conduct or reporting of this study.

References (33)

  • C.M. Beasley et al. Olanzapine versus placebo: results of a double-blind, fixed-dose olanzapine trial. Psychopharmacology (Berl.) (1996)
  • L. Colonna et al. Long-term safety and efficacy of amisulpride in subchronic or chronic schizophrenia. Amisulpride Study Group. Int. Clin. Psychopharmacol. (2000)
  • J.J. Deeks. Issues in the selection of a summary statistic for meta-analysis of clinical trials with binary outcomes. Stat. Med. (2002)
  • L. Duggan et al. Olanzapine for schizophrenia. Cochrane Database Syst. Rev. (2005)
  • T.A. Furukawa et al. Can we individualize the ‘number needed to treat’? An empirical study of summary effect measures in meta-analyses. Int. J. Epidemiol. (2002)
  • M.C. Genovese et al. Abatacept for rheumatoid arthritis refractory to tumor necrosis factor alpha inhibition. N. Engl. J. Med. (2005)