To the Editor:
The Medical Outcome Health Survey Short-form 36 (SF-36) has been validated in psoriatic arthritis (PsA)1 and its psychometric assumptions tested2,3, but no data on its minimal important difference (MID) exist. MID is the smallest change in a patient-reported outcome (PRO) score that patients perceive as a meaningful change4. It differentiates treatment response and may serve as a threshold for therapy change. We examined the MID and responsiveness to change of SF-36 scales in patients with PsA undergoing treatment with tumor necrosis factor-α (TNF-α) blockers.
Twenty consecutive patients with active PsA fulfilling the CASPAR criteria5 were recruited to receive TNF-α blockers. After 12 weeks, 9 patients continued TNF-α blockers (Group 1); 11 patients discontinued them because of financial constraints (Group 2). PRO including pain, patient’s global health assessment (PGA), Health Assessment Questionnaire (HAQ)6, and the SF-36 instrument7 were collected at baseline and Weeks 12, 24, 36, and 52. At each visit, patients answered an anchor question on general health status, “How do you rate your current health status as compared to last visit: (1) much better, (2) slightly better, (3) similar, (4) slightly worse, and (5) much worse.” The differences in PRO scores between 2 consecutive visits (e.g., Week 12 – Week 0, Week 24 – Week 12) were calculated. Across the 4 followups, 78 sets of PRO scores were analyzed. MID estimates were calculated as the mean change of PRO variables in those who rated their health status as “slightly better” or “slightly worse” in the anchor question, which represented a minimal change that is perceived by patients as relevant8. The change in the anchor and the change in PRO scores should have a correlation coefficient of at least 0.3, as recommended previously9. The responsiveness of the PRO was compared using the effect size (ES) and the standardized response mean (SRM) in patients who rated “slightly better” and “better” in the anchor question. The ES is small if < 0.2, medium if 0.3–0.5, and large if > 0.5, as described previously10. Exploratory analyses of the MID for the various PRO, stratified by sex and severity of disease (HAQ > 1.0), were performed.
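As an illustration only, the anchor-based calculation described above can be sketched in Python. The records, their layout, and the function names are hypothetical and are not the study data. The ES and SRM formulas are the commonly used definitions (mean change divided by the SD of the score at the earlier visit, and mean change divided by the SD of the change, respectively), which are an assumption here because the letter does not spell them out.

```python
from statistics import mean, stdev

# Hypothetical records, NOT study data:
# (anchor rating, PRO score at previous visit, PRO score at current visit)
visits = [
    ("slightly better", 40.0, 46.0),
    ("slightly better", 50.0, 54.0),
    ("slightly better", 45.0, 50.0),
    ("slightly worse", 50.0, 46.0),
    ("slightly worse", 60.0, 58.0),
    ("much better", 30.0, 45.0),
    ("similar", 55.0, 55.0),
]


def select(records, ratings):
    """Keep only the visit pairs whose anchor rating is in `ratings`."""
    return [r for r in records if r[0] in ratings]


def score_changes(records):
    """Change in PRO score between 2 consecutive visits (current - previous)."""
    return [current - previous for _, previous, current in records]


def mid_estimate(records, rating):
    """Anchor-based MID: mean change among those giving one minimal-change rating."""
    return mean(score_changes(select(records, {rating})))


def effect_size(records, ratings):
    """ES (assumed definition): mean change / SD of the earlier-visit score."""
    subset = select(records, ratings)
    earlier_scores = [previous for _, previous, _ in subset]
    return mean(score_changes(subset)) / stdev(earlier_scores)


def standardized_response_mean(records, ratings):
    """SRM (assumed definition): mean change / SD of the change."""
    changes = score_changes(select(records, ratings))
    return mean(changes) / stdev(changes)


mid_improvement = mid_estimate(visits, "slightly better")
mid_worsening = mid_estimate(visits, "slightly worse")
es_improved = effect_size(visits, {"slightly better"})
srm_improved = standardized_response_mean(visits, {"slightly better"})
```

Estimating improvement and worsening separately, as in this sketch, is what allows an asymmetry between the 2 directions to be seen. With samples this small the SD, and hence the ES and SRM, would be unstable.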
Table 1 shows the demographic characteristics of patients at baseline and at Weeks 12 and 52. PRO in both groups improved to a similar extent at Week 12, but worsened in Group 2 after Week 12. We found that 34.6%, 21.3%, 10.3%, 26.9%, and 6.4% of patients rated their general health status as “much better,” “slightly better,” “similar,” “slightly worse,” and “much worse,” respectively, compared to last visit in the anchor question. The changes in SF-36 scales and PRO between the “slightly better” and “slightly worse” groups were significantly different (Table 2). Pain, PGA, HAQ, and the SF-36 scales for physical function (PF), bodily pain (BP), general health (GH), and physical component summary score (PCS) had the desirable Spearman’s rho of > 0.3. The correlations were weaker for vitality (VT), role emotional (RE), mental health (MH), and mental component summary score (MCS). The ES and SRM for PF, BP, GH, and PCS were moderate (range 0.35 to 0.59). In the exploratory analyses, no difference in MID values was noted when stratified by sex, except in the social functioning (SF) scale, for which the MID estimates for improvement were larger in males (data not shown). There was no difference in MID when stratifying by HAQ > 1.0.
MID has been defined as “the smallest difference in score in the domain of interest which patients perceive as beneficial and which would mandate, in the absence of troublesome side-effects and excessive cost, a change in the patient’s management”4. Determining MID is especially important because statistical significance is not equivalent to clinical significance. A change of therapy may be necessary when patients are not improving to a meaningful extent. Our study is the first to determine MID and responsiveness for SF-36 in patients with PsA as classified by the CASPAR criteria undergoing TNF-α blocker therapy. This provides additional evidence of the validity of the SF-36 in PsA. Improvement after TNF-α blockers and deterioration on stopping them were expected, and we were therefore able to observe MID estimates for both improvement and worsening. Desirable correlations were observed between the anchor health status and PF, BP, GH, PCS, and HAQ. Although the correlations between the anchor question and some PRO variables were < 0.3, they were still significantly associated with the outcome of interest. This illustrates that changes in pain, PGA, and physical function may be more accurate at detecting small changes in the patient’s perceived status. Vitality and mental health may be affected by other factors not exclusively related to PsA activity. Fatigue (the opposite of vitality) was shown to vary with time and was associated with other factors like sex, psychological distress, and medical comorbidities in PsA11,12. The SF-36 MCS was also weaker in distinguishing drug or placebo effect in a phase III drug trial13. The MID scores were asymmetrical in worsening and improvement for pain, PGA, and HAQ. These asymmetrical changes in MID have been reported in other studies14,15. A smaller change in scores was needed for patients to perceive deterioration than to perceive improvement, which may be related to the high expectation of treatment effect from patients undergoing a drug trial.
For the responsiveness analyses, moderate ES and SRM were seen for SF-36 BP and PCS. Comparing measurements for physical function, HAQ, and SF-36 PCS, the latter had better ES and SRM. The SF-36 was also reported by Husted, et al to be more responsive than the HAQ and the Arthritis Impact Measurement Scale16.
There are several limitations to our study. The small sample size yielded large SD in the MID estimates and made them less reliable. The results were subject to recall bias when patients were asked about their change compared to last visit8,9. MID results may change when a different anchor is used. For example, participants may have selected “slightly better” even if they experienced a substantial change because they were unwilling to call their change “much better.” Although this health status anchor question has not been validated in PsA, similar general health status anchors have been employed in studies in other connective tissue diseases15,17,18,19. Further, MID may change with baseline disease severity15 and in different clinical settings17. Our study was performed in Han Chinese patients with long disease duration and severe peripheral arthritis who were recruited from a tertiary referral center. These results may not be generalizable to PsA patients from other ethnic groups or to those with milder disease severity, shorter disease duration, or predominant axial manifestation. Despite all these limitations, the MID estimates for pain, PGA, and HAQ were similar to those of other studies15,17,18,19,20. The SF-36 PCS, pain, and PGA proved responsive to short-term changes in PsA.
Acknowledgment
The authors thank their research assistants Lorraine Tseung and Raymond Lau for their contributions in data collection and entry.