Investigating the minimal important difference in ambulation in multiple sclerosis: A disconnect between performance-based and patient-reported outcomes?

https://doi.org/10.1016/j.jns.2014.10.021

Highlights

  • We estimated the Minimally Important Difference (MID) on patient-reported outcomes (PROs).

  • We anchored the MID with an objective measure of ambulation, the accelerometer.

  • Cross-sectionally, accelerometer scores and PROs were correlated.

  • Change scores over time for accelerometer and PROs were, however, not associated.

  • These findings contradict a central assumption of clinical research studies.

Abstract

Objective

We sought to estimate the minimally important difference (MID) on two patient-reported outcome (PRO) measures that are frequently used in multiple sclerosis (MS) clinical research: the MS Walking Scale and the MS Impact Scale-29. We anchored the MIDs with an objective measure of ambulation, the accelerometer.

Methods

This secondary analysis used longitudinal data from an observational study of symptoms and physical activity in 269 people with Relapsing–Remitting Multiple Sclerosis. Participants completed a battery of PRO questionnaires, and then wore an accelerometer for seven days at each data collection time point every six months for 2.5 years. Statistical analysis first defined Change Groups on the basis of the performance-based accelerometer scores, anchored to 0.5 standard deviation change; then change was defined on the basis of published and linked MIDs for the PROs.

Results

The performance-based (accelerometer) and PRO-based change distributions were stable over time. Raw scores among the accelerometer and PRO measures were associated with large effect sizes, and PRO change scores were associated with each other but not with accelerometer change scores.

Conclusions

These findings contradict a central assumption that may underlie clinical research studies: that a cross-sectional correlation implies that change in PROs will correspond with change in behavior/performance. Possible explanations, related to the accuracy of the performance-based measure as well as response shift effects on the PROs, are discussed.

Introduction

The use of patient-reported outcomes (PROs) in medical outcome research has grown in prominence and sophistication in the past two decades. Increasingly recognized as a source of important information that is not redundant with information reported by clinicians [1] or family-member caregivers [2], PROs provide the patient's perspective on symptom experience, symptom impact, and quality of life. Often using evaluative measurement tools which emphasize the subjective and idiographic nature of the variable human experience of health and illness, PRO tools face an increasingly rigorous validation process that characterizes and quantifies their reliability, validity, and responsiveness [3], [4]. Technological advances in statistical software have facilitated these psychometric analyses, enabling the implementation of both classical- and item-response theory-based analyses that quantify aspects of reliability, validity and responsiveness in highly specific ways [5], [6].

With this growth in technological prowess, the field of PRO research has developed thoughtful methods for evaluating the responsiveness of measurement tools to facilitate the interpretation of these measures [7]. Responsiveness is a key aspect of validity, and recent guidelines for assessing responsiveness are useful in distinguishing types of responsiveness and how to evaluate each [8]. This growing research base on responsiveness has suggested that responsiveness is a highly contextual characteristic, affected by who is being measured for what outcomes in what research or clinical context (where) using what mode of data collection (how) and at what stage of the disease trajectory (when) [9]. Work has focused on understanding how much change is large enough to be discernible and regarded as important [10]. This threshold, referred to as the Minimally Important Difference (MID), has been defined as “the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side effects and excessive cost, a change in the patient's management” [11]. The MID may be estimated by taking an initial or baseline assessment and a follow-up assessment, and at follow-up asking the patient how much their condition has changed (i.e., a transition rating or global rating of change) [10]. Using this transition rating as an anchor, one can estimate the mean change in the assessment that corresponds to getting worse or getting better. The methodological challenge of using such patient-reported transition ratings is the potential for bias due to response shift, recall bias, and implicit theories of change [12], [13], [14].
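The anchor-based estimate described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function name `anchor_based_mid`, the three transition categories, and all scores are hypothetical, and the sketch assumes that higher scores indicate greater limitation.

```python
from statistics import mean


def anchor_based_mid(baseline, follow_up, transition):
    """Return the mean score change (follow-up minus baseline) for each
    transition-rating category. Hypothetical helper for illustration only."""
    changes_by_rating = {}
    for before, after, rating in zip(baseline, follow_up, transition):
        changes_by_rating.setdefault(rating, []).append(after - before)
    return {rating: mean(changes) for rating, changes in changes_by_rating.items()}


# Made-up scores on a 0-100 scale (assumption: higher = more limitation).
baseline = [40, 55, 30, 62, 48, 35]
follow_up = [48, 54, 39, 61, 57, 28]
# Made-up patient transition ratings collected at follow-up.
transition = ["worse", "same", "worse", "same", "worse", "better"]

mid_by_group = anchor_based_mid(baseline, follow_up, transition)
for rating, change in sorted(mid_by_group.items()):
    print(f"{rating}: mean change = {change:.2f}")
```

In this framing, the mean change in the "worse" or "better" group serves as the candidate MID, and the mean change in the "same" group gives a sense of the noise around zero. Because the anchor is itself patient-reported, the estimate inherits the biases noted above.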

These potential biases may have prompted investigators to examine the consistency of MIDs across studies and to note variability and inconsistency in meaningful-change metrics. Even in measures of relatively concrete behaviors, such as ambulation, there seems to be variability in the amount of change that corresponds to a person's impression of clinically important change [15]. For example, past research on the MID of the Multiple Sclerosis Walking Scale-12 (MSWS) [16] has yielded varying MID estimates, ranging from 4 to 10 points on a 100-point scale [15], [17], [18]. Differences between patient groups or studies in what constitutes an important change could impair the comparability of PRO data on the same instrument(s) across studies [19].

In response to the challenge of ‘moving goal posts’ [20], [21], we sought to estimate the MID on two PRO measures that are frequently used in multiple sclerosis (MS) clinical research: the MSWS [16] and the MS Impact Scale-29 (MSIS) [22]. We anchored the MIDs with an objective measure of ambulation, the accelerometer [23]. We used the well-documented robustness of the half-standard deviation of the accelerometer change score as a benchmark for clinically important change [24] to estimate the MID of the MSWS and the MSIS. We then investigated relationships between accelerometer change and PRO change over time, and examined self-efficacy as a psychosocial factor that may explain discrepancies between objective and patient-reported change.
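As a rough illustration of the half-standard-deviation benchmark, the sketch below assigns participants to change groups from their accelerometer change scores and then averages the PRO change within each group. It is a sketch under stated assumptions rather than the study's procedure: the function names, the group labels, the choice to take the SD of the change scores themselves, the convention that a drop in accelerometer output counts as "worse", and all data values are hypothetical.

```python
from statistics import mean, stdev


def change_groups(acc_change, threshold_sd=0.5):
    """Label each accelerometer change score as 'worse', 'same', or 'better'
    using a cut-off of threshold_sd times the SD of the change scores.
    Assumption: a decrease in accelerometer output means worse ambulation."""
    cutoff = threshold_sd * stdev(acc_change)
    labels = []
    for change in acc_change:
        if change <= -cutoff:
            labels.append("worse")
        elif change >= cutoff:
            labels.append("better")
        else:
            labels.append("same")
    return labels


def mean_pro_change_by_group(labels, pro_change):
    """Average the PRO change score within each accelerometer change group."""
    grouped = {}
    for label, change in zip(labels, pro_change):
        grouped.setdefault(label, []).append(change)
    return {label: mean(values) for label, values in grouped.items()}


# Made-up change scores (follow-up minus baseline) for eight participants.
acc_change = [-1500, -200, 100, 1400, -1300, 50, 1600, -150]  # e.g. steps/day
pro_change = [6, 1, -2, 3, -4, 0, 5, 2]                       # e.g. PRO points

labels = change_groups(acc_change)
pro_by_group = mean_pro_change_by_group(labels, pro_change)
print(labels)
print(pro_by_group)
```

If PRO change tracked accelerometer change, the group means would be ordered consistently across "worse", "same", and "better". The made-up PRO values here are deliberately unrelated to the accelerometer groups, so no such ordering appears.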

Section snippets

Sample

This secondary analysis used data from an observational study of symptoms and physical activity over 2.5 years in people with Relapsing–Remitting Multiple Sclerosis (RRMS) [25]. The procedures were approved by an Institutional Review Board and all participants who volunteered provided written informed consent. The sample was recruited through a research advertisement posted on the National MS Society (NMSS) website and distributed through 12 mid-western chapters of the NMSS. Those who were

Sample

The baseline sample consisted of 223 women and 46 men. The participants were mostly Caucasian (91%), well educated (83% had some college education or were college graduates), and reported a median household income that exceeded $40,000/year (68%). The mean age was 45.9 years (standard deviation [SD] 9.6), and the mean MS disease duration was 8.8 years (SD 7.0). The median Patient Determined Disease Steps (PDDS) score was 2 (interquartile range 3.0), and the mean MSWS score was 36.0 (SD 28.2). Those scores indicated that the sample, on

Conflict of interest

The authors have no conflict of interest to disclose related to this scientific work.

Acknowledgments

This work was funded in part by a grant from the National Multiple Sclerosis Society (PI: RW Motl; RG 3926A2/1). We are grateful for helpful input from Bruce H. Dobkin, MD, FRCP.

References (49)

  • C.E. Schwartz et al.

    Methodological approaches for assessing response shift in longitudinal health-related quality-of-life research

    Soc Sci Med

    (1999)
  • Y. Li et al.

    Classification and regression tree analysis to identify complex cognitive paths underlying quality of life response shifts: a study of individuals living with HIV/AIDS

    J Clin Epidemiol

    (2009)
  • J.M. Sonder et al.

    Do patient and proxy agree? Long-term changes in multiple sclerosis physical impact and walking ability on patient-reported outcome scales

    Mult Scler

    (Apr 7 2014)
  • FDA

    Guidance for Industry

  • B.B. Reeve et al.

    ISOQOL recommends minimum standards for patient-reported outcome measures used in patient-centered outcomes and comparative effectiveness research

    Qual Life Res

    (Oct 2013)
  • S.E. Embretson et al.

    Item response theory for psychologists

    (2000)
  • R.K. Hambleton

    Principles and selected applications of item response theory

  • C.B. Terwee et al.

    On assessing responsiveness of health-related quality of life instruments: guidelines for instrument evaluation

    Qual Life Res

    (2003)
  • P.M. Fayers et al.

    Don't middle your MIDs: regression to the mean shrinks estimates of minimally important differences

    Qual Life Res

    (Feb 2014)
  • G. Norman

    Hi! How are you? Response shift, implicit theories and differing epistemologies

    Qual Life Res

    (2003)
  • A.K. Kvam et al.

    Minimal important differences and response shift in health-related quality of life; a longitudinal study in patients with multiple myeloma

    Health Qual Life Outcomes

    (2010)
  • N. Schwartz et al.

    Autobiographical memory and the validity of retrospective reports

    (1994)
  • R.W. Motl et al.

    Validity of minimal clinically important difference values for the multiple sclerosis walking scale-12?

    Eur Neurol

    (2014)
  • J.C. Hobart et al.

    Measuring the impact of MS on walking ability: the 12-item MS Walking Scale (MSWS-12)

    Neurology

    (2003)
