Abstract
Objective. To assess patient-reported variables as predictors of change in disease activity and disability in early rheumatoid arthritis (RA).
Methods. Cases were recruited to the Yorkshire Early Arthritis Register (YEAR) between 1997 and 2009 (n = 1415). Predictors of the 28-joint Disease Activity Score (DAS28) and the Health Assessment Questionnaire-Disability Index (HAQ-DI) at baseline and change over 12 months were identified using multilevel models. Baseline predictors were sex, age, symptom duration, autoantibody status, pain and fatigue visual analog scales (VAS), duration of early morning stiffness (EMS), DAS28, and HAQ-DI.
Results. Rates of change were slower in women than men: DAS28 fell by 0.19 and 0.17 units/month, and HAQ-DI by 0.028 and 0.023 units/month in men and women, respectively. Baseline pain and EMS had small effects on rates of change, whereas fatigue VAS was only associated with DAS28 and HAQ-DI at baseline. In patients recruited up to 2002, DAS28 reduced more quickly in those with greater pain at baseline (by 0.01 units/mo of DAS28 per cm pain VAS, p = 0.024); in patients recruited after 2002, the effect for pain was stronger (by 0.01 units/mo, p = 0.087). In those recruited after 2002, DAS28 reduction was greater with longer EMS. In both cohorts, fall in HAQ-DI (p = 0.006) was greater in patients with longer EMS duration, but pain and fatigue were not significant predictors of change in HAQ-DI.
Conclusion. Patient-reported fatigue, pain, and stiffness at baseline are of limited value for predicting change in RA disease activity (DAS28) and activity limitation (HAQ-DI).
The use of patient-reported outcomes (PRO) to assess treatment response in rheumatoid arthritis (RA) is well established. The core set of outcomes recommended for assessment of RA treatment by the Outcome Measures in Rheumatology Clinical Trials (OMERACT) group includes patient-reported variables such as pain and fatigue1,2. Measurement of these subjective indicators of health status can aid clinical assessment3 and there is evidence that they can be useful to help predict RA remission. For example, a study involving 103 patients with RA from Japan found an inverse association between remission and greater pain and fatigue at baseline after a 7-year followup4. Similarly, greater baseline pain was associated with reduced odds of remission at 6 and/or 12 months in the French early inflammatory arthritis ESPOIR cohort5. Thus, patient-reported measures may be rapid and cost-effective tools for the prediction of outcome in RA. However, before these variables can be useful in a clinical setting, further evidence to support their application is needed. Our present study evaluated patient-reported measures [fatigue, pain, and early morning stiffness (EMS)] alongside traditional predictors of outcome to investigate their value in predicting the rate of change in disease activity and disability in an early RA cohort.
MATERIALS AND METHODS
Subjects
The Yorkshire Early Arthritis Register (YEAR) is an observational inception cohort whose subjects were aged over 18 years with a consultant-made diagnosis of recent-onset RA. Our present study used data from 1415 participants recruited to the YEAR between 1997 and 2009 with inflammatory symptom durations of ≤ 24 months. Details of the YEAR were published previously6. Briefly, data on patients with RA were collated from 14 rheumatology outpatient centers across Yorkshire, UK. Participants were treated according to a regionally agreed protocol that recommended sequential escalation of treatment with disease-modifying antirheumatic drugs (DMARD). When data collection began in 1997, the first-line DMARD was sulfasalazine, but this changed to methotrexate (MTX), with a one-off dose of intramuscular methylprednisolone (120 mg) given at baseline, when the data collection and treatment protocols were altered in 2002. Deviations from the treatment protocol were made at the discretion of the treating rheumatologist. For our present analysis, patient data were not included if the symptom duration exceeded 24 months, or was missing. All patients provided written consent for inclusion into the study and ethical approval was granted by the Northern and Yorkshire Research Ethics Committee (MREC /99/3/48).
Data collection
Data were collected at baseline, 3, 6, 9 (after 2002), and 12 months by a clinician or research nurse. Recorded details included sex, date of birth, date of symptom onset, swollen and tender joint counts out of 28 joints (SJC and TJC), and duration of EMS in minutes. Participants completed self-assessment tools, which included visual analog scales (VAS) to indicate their assessment of pain (0–100 mm scale, where 0 = no pain and 100 = pain as bad as it can be) and fatigue (0 = no abnormal fatigue and 100 = fatigue as bad as it can be). The disability index component of the Health Assessment Questionnaire (HAQ) was completed at each visit and is referred to as the “HAQ-DI” from here onward. The SJC and TJC of 28 joints and C-reactive protein (CRP) were used to calculate the 3-variable Disease Activity Score (DAS28-CRP)7 for each visit. Laboratory analyses undertaken at individual recruitment centers included CRP at all visits and IgM rheumatoid factor (RF) at baseline. RF was measured using standard nephelometric assays, and anticitrullinated protein antibodies (ACPA) were determined retrospectively on stored samples using previously described methods8.
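As an illustration of the score described above, the following is a minimal sketch of the 3-variable DAS28-CRP calculation. It assumes the commonly published formula (square-root-transformed joint counts, log-transformed CRP in mg/l, then a rescaling step); the formula is not printed in this article, and the function name and input checks are hypothetical.

```python
import math


def das28_crp3(tjc28: int, sjc28: int, crp_mg_per_l: float) -> float:
    """3-variable DAS28-CRP.

    Assumption: uses the commonly published formula, which this article
    cites but does not reproduce:
        [0.56*sqrt(TJC28) + 0.28*sqrt(SJC28) + 0.36*ln(CRP + 1)] * 1.10 + 1.15
    """
    if not (0 <= tjc28 <= 28 and 0 <= sjc28 <= 28):
        raise ValueError("joint counts must be between 0 and 28")
    if crp_mg_per_l < 0:
        raise ValueError("CRP must be non-negative")
    core = (
        0.56 * math.sqrt(tjc28)
        + 0.28 * math.sqrt(sjc28)
        + 0.36 * math.log(crp_mg_per_l + 1)
    )
    return core * 1.10 + 1.15


# Hypothetical patient: 4 tender joints, 4 swollen joints, CRP 0 mg/l.
example_score = das28_crp3(4, 4, 0.0)
```

With no tender or swollen joints and a CRP of zero, the function returns the constant 1.15, which is the floor of the scale under this formula.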
Data analysis
Baseline demographic and disease characteristics were summarized in terms of means and SD (continuous variables) and percentages (categorical variables). Multilevel models (random intercepts, fixed slopes) were constructed to evaluate baseline predictors of DAS28 and HAQ-DI measured at baseline, 6 months, and 12 months. These were 2-level models in which repeated measurements over time (level 1) were nested within patients (level 2). These models included an indicator for “cohort” (before or after 2002, when the treatment protocol changed) and a variable indicating month, which was treated as a continuous covariate. Interactions were added between each predictor and cohort to show whether associations with baseline DAS28 or HAQ-DI differed by cohort, and between each predictor and month, to show whether the predictor was associated with change in DAS28 or HAQ-DI over time. Additionally, 3-way interaction terms between each predictor, month, and cohort were added to analyze whether changes over time differed by cohort. The interaction terms were sequentially discarded in order of least significance until only significant terms (where p ≤ 0.1) remained in the model; 2-way interactions were retained irrespective of significance if both variables were included in a significant 3-way interaction. Linear change was assumed over time. Pseudo-adjusted R2 was calculated as the adjusted R2 between observed and predicted values of each outcome. R2 estimates obtained in each imputed dataset were averaged after using Fisher’s r-to-z transformation. We considered whether random slopes were more suitable than fixed slopes. Formally testing for random slopes using the standard likelihood ratio approach is not currently supported for multiply-imputed datasets in our chosen analysis package.
We compared the coefficients between models that included fixed or random slopes for time and found them to be very similar, and the conclusions regarding which main effects and interactions were statistically significant remained unaffected; therefore, we opted to retain the simpler model.
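The averaging of R2 across imputed datasets described above can be sketched as follows. This is an illustrative reading of the procedure, assuming that each R2 is converted to a correlation by taking its square root, transformed with Fisher's r-to-z (the inverse hyperbolic tangent), averaged, and back-transformed; the function name and the example values are hypothetical and are not the study's results.

```python
import math


def pool_r2_fisher(r2_values):
    """Average R^2 values across imputed datasets on Fisher's z scale.

    Steps: r = sqrt(R^2) -> z = atanh(r) -> mean(z) -> tanh -> square.
    """
    if not r2_values:
        raise ValueError("at least one R^2 value is required")
    z_values = []
    for r2 in r2_values:
        if not 0.0 <= r2 < 1.0:
            raise ValueError("each R^2 must lie in [0, 1)")
        z_values.append(math.atanh(math.sqrt(r2)))
    mean_z = sum(z_values) / len(z_values)
    return math.tanh(mean_z) ** 2


# Hypothetical R^2 values from three imputed datasets.
pooled_r2 = pool_r2_fisher([0.28, 0.30, 0.32])
```

Averaging on the z scale rather than on the raw R2 scale is the usual motivation for this transformation, because correlations close to 1 have a skewed sampling distribution.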
Continuous rather than dichotomous outcomes (e.g., remission or non-remission, HAQ above or below a threshold value) were used to retain statistical power. To this end, because HAQ-DI represents an ordinal scale9, this variable was transformed using Rasch analysis so that it could be analyzed as an interval-scaled variable10. As well as traditionally reported predictors of RA outcome (including sex, antibody status, and age), patient-reported pain, fatigue, and duration of EMS were also included as predictors in the models. Continuous variables were centered at the mean prior to analysis. EMS was not normally distributed and was therefore divided into 5 approximately equal-sized groups: < 30, 30–59, 60–119, 120–179, and ≥ 180 min. Correlation between RF and ACPA status was 0.56 and considered low enough for both variables to be included in the models simultaneously.
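The two preprocessing steps described above, grouping EMS duration into 5 categories and centering continuous predictors at their mean, can be sketched as follows. The cut points are those given in the text; the function names, category labels, and example values are hypothetical.

```python
def ems_category(minutes: float) -> str:
    """Assign an EMS duration (in minutes) to one of the 5 groups used in the text."""
    if minutes < 0:
        raise ValueError("EMS duration cannot be negative")
    if minutes < 30:
        return "<30"
    if minutes < 60:
        return "30-59"
    if minutes < 120:
        return "60-119"
    if minutes < 180:
        return "120-179"
    return ">=180"


def center_at_mean(values):
    """Subtract the sample mean so that 0 corresponds to an average patient."""
    mean_value = sum(values) / len(values)
    return [v - mean_value for v in values]


# Hypothetical baseline pain VAS values in cm.
centered_pain = center_at_mean([4.0, 6.0, 8.0])
```

After centering, the model intercept and the main effect of month refer to a patient with average values of the continuous predictors, which makes the coefficients easier to interpret.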
Missing data
Missing data were accounted for using multiple imputation (MI) by chained equations and 50 imputed datasets, the results from which were combined according to Rubin’s rules11. Predictive mean matching with 10 nearest neighbors was used to impute continuous variables; for RF and ACPA, logistic regression was used. Fifty imputations were chosen for our analysis to achieve ≥ 95% relative efficiency of the MI estimates11,12,13, given the amount of missing data (40% missing and 43% missing for the 6- and 12-month analyses, respectively). Auxiliary variables were selected from the dataset and included in the imputation models if they correlated with predictor or outcome variables (Pearson correlation ≥ 0.7), or predicted missingness (significant predictors in logistic regression analyses). The order of imputation (which included auxiliary variables) was TJC28; SJC28; CRP at baseline, 3, 6, and 12 months; HAQ at baseline, 6, and 12 months; and baseline pain VAS, fatigue VAS, EMS, age, sex, symptom duration, RF, and ACPA. Summary statistics of the imputed datasets were examined and compared with those of the complete dataset to check that imputed values were reasonable.
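The choice of 50 imputations and the pooling step can be illustrated with a short sketch. It assumes Rubin's standard expressions: relative efficiency 1 / (1 + γ/m), where γ is the fraction of missing information and m is the number of imputations, and total variance equal to the within-imputation variance plus (1 + 1/m) times the between-imputation variance. Using the proportion of cases with missing data as a stand-in for γ is a simplifying assumption made here, and the function names and example numbers are hypothetical.

```python
import math
from statistics import mean, variance


def relative_efficiency(fraction_missing_info: float, m: int) -> float:
    """Rubin's relative efficiency of m imputations versus infinitely many."""
    return 1.0 / (1.0 + fraction_missing_info / m)


def pool_rubin(estimates, std_errors):
    """Combine one coefficient across imputed datasets using Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least 2 estimates with matching standard errors")
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance across imputations (m - 1 denominator)
    total = within + (1.0 + 1.0 / m) * between
    return pooled_estimate, math.sqrt(total)


# Assumption: 0.43 (proportion missing at 12 months) used as a proxy for gamma.
efficiency_50 = relative_efficiency(0.43, 50)
efficiency_5 = relative_efficiency(0.43, 5)
```

Under this proxy, 50 imputations give a relative efficiency above 0.99, whereas 5 imputations would fall below 0.95, which is consistent with the stated rationale for using a larger number of imputed datasets.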
All analyses were conducted using Stata (Stata Statistical Software: Release 14.1).
RESULTS
Baseline characteristics and missing data
Numbers of cases recruited to the YEAR and included in the final analysis are shown in Figure 1. From a total of 1415 cases, 690 were recruited between 1997 and 2002 and 725 were recruited after 2002. Baseline characteristics and rates of missingness for variables included in the analysis are given in Table 1. The YEAR may be considered consistent with other early RA cohorts, with 66% of patients being female, an average age at onset of 58 years, and 71% RF-positive. These summary statistics were similar for cases recruited before and after 2002; however, mean baseline DAS28 was lower for cases recruited after 2002 (4.8 compared with 5.4) and similar differences were seen in baseline HAQ-DI (1.18 compared with 1.28). Baseline pain and fatigue VAS were also slightly higher in the earlier cohort, with mean pain VAS 6.3 cm pre-2002 and 5.3 cm post-2002, and fatigue VAS 4.8 cm and 4.5 cm, respectively. In 21% of cases, some variables were missing at baseline. Cases with no missing data were slightly older (58.6 vs 57.7 yrs) and reported slightly more baseline fatigue with higher DAS28 and HAQ-DI values.
Change in DAS28
Table 2 gives the results of the multilevel model of change in DAS28. Baseline DAS28 was higher in patients recruited prior to 2002, older patients, and those with longer disease duration, greater pain, fatigue, and longer duration of EMS. On average, DAS28 reduced by 0.19 units/month in men and the rate of reduction was 0.02 units/month slower in women. Reduction in DAS28 per month was slightly faster in older patients (by 0.01 units per decade of baseline age). All of the statistically significant effects of baseline variables on change in DAS28 were small. At 12 months, the estimated differences between patients according to sex, age (80 yrs compared with 50 yrs), and cohort (for values of pain VAS ranging from 4 cm to 8 cm) did not exceed 0.6 DAS28 units. Pseudo-adjusted R2 for the DAS28 model was 0.30 (95% CI 0.28–0.32).
The association of baseline pain and stiffness with change in DAS28 differed depending on whether patients were recruited before or after 2002 (overall test of significance for stiffness was p = 0.022 and pain, p = 0.087). In both cohorts, greater pain VAS at baseline was associated with a slightly greater fall in DAS28 per month; this trend was stronger for patients recruited after 2002 (Figures 2A and 2B). In the earlier cohort, baseline EMS was not associated with rate of change in DAS28, but in the later cohort, longer duration of EMS was associated with greater reduction in DAS28 (Figures 2C and 2D).
Repeating the final model using only cases with complete data yielded similar results to those obtained through MI, although with lower power. The effect of symptom duration on change over time was reduced in the MI analysis compared with complete cases, whereas the interaction between baseline pain, cohort, and change over time was more apparent in the MI analysis.
Change in HAQ-DI
Table 3 shows the results of the multilevel model of HAQ-DI. As shown in Figure 3, higher baseline DAS28 and longer EMS duration were associated with slightly greater reduction in HAQ-DI and the effect of pain varied with cohort. Baseline HAQ-DI was higher in women than men by 0.217 units in cases recruited after 2002 and 0.091 units in cases recruited pre-2002, but the rate of change in HAQ-DI by sex was consistent between cohorts: average reduction was 0.028 units/month in men and 0.023 in women. As Figure 3C illustrates, reduction in HAQ-DI was between 0.006 and 0.012 units/month faster in patients with EMS ≥ 30 mins compared with < 30 mins (combined test of significance for all EMS categories p = 0.023), and was 0.004 units/month faster per unit of baseline DAS28. Baseline pain was not associated with reduction in HAQ-DI in patients recruited up to 2002 (0.001 HAQ units/cm), but there was a slightly stronger trend in the later cohort (0.003 HAQ units/cm). Pseudo-adjusted R2 for the HAQ model was 0.24 (95% CI 0.22–0.26).
DISCUSSION
Our study examined predictors of change in DAS28 and HAQ–DI in early RA, including patient-reported measures (pain, fatigue, and EMS) alongside traditional predictors of prognosis: sex, age, and antibody status. The rate of reduction in DAS28 was greater with increased age at baseline and slower in women than men. It was also faster in those with greater pain or EMS at baseline, especially for our patients recruited after 2002. However, effects attributable to statistically significant variables were small. The measurement error of DAS is 0.6, and therefore a reduction from baseline of twice this (> 1.20) is considered a good response14. In comparison, our present analyses predicted a reduction in DAS28 of 0.05 units/month in cases recruited after 2002 with EMS duration of ≥ 180 compared with < 30 min (about 0.3 units after 6 mos and 0.6 units after 12 mos). Further, fall in DAS28 was only 0.02 units/month faster per cm of baseline pain VAS in the later cohort where the effect was strongest. The effects of predictor variables on decrease in HAQ-DI were also small: the rate of change in HAQ-DI was 0.005 units/month slower in women than men, and 0.004 units/month faster per unit of baseline DAS28. For those who reported ≥ 180 compared with < 30 min of baseline EMS, decrease in HAQ-DI was 0.012 units greater per month. In our present analyses, fatigue did not significantly affect the rate of change in HAQ-DI, and pain had only a limited effect, restricted to the later cohort. Pain, fatigue, and EMS as predictors of change in disease activity and disability are therefore unlikely to have direct clinical applications.
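The comparison made above between the modeled effects and the threshold for a good response can be checked with simple arithmetic. The sketch below takes the monthly rates and the 1.2-unit threshold from the text and assumes linear change over time, as the models did; the function and variable names are hypothetical.

```python
GOOD_RESPONSE_THRESHOLD = 1.2  # twice the DAS measurement error of 0.6 quoted in the text


def cumulative_difference(rate_per_month: float, months: int) -> float:
    """Difference in DAS28 accumulated over a period, assuming linear change."""
    return rate_per_month * months


# Monthly differences in DAS28 reduction reported for the post-2002 cohort.
monthly_effects = {
    "EMS >= 180 vs < 30 min": 0.05,
    "per cm of baseline pain VAS": 0.02,
}

for label, rate in monthly_effects.items():
    at_6 = cumulative_difference(rate, 6)
    at_12 = cumulative_difference(rate, 12)
    below = at_12 < GOOD_RESPONSE_THRESHOLD
    print(f"{label}: {at_6:.2f} units at 6 mo, {at_12:.2f} units at 12 mo, "
          f"below good-response threshold: {below}")
```

Even the largest effect, accumulated over 12 months, amounts to about half of the change regarded as a good response.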
These findings are consistent with previously reported associations of patient-reported symptoms and other outcomes. Recent data from the ESPOIR cohort found only a moderate correlation of fatigue and pain VAS with simultaneous DAS28 measurement, among other PRO15. Female sex is frequently identified as an independent predictor of adverse outcome in RA, including non-remission16 and lesser reduction in DAS2817, and the results from our present study were consistent with this. Although some studies have found an association between increasing age at baseline and non-remission, this effect is not consistent between studies16 and therefore our findings of only slightly faster reduction in DAS28 with increasing age at baseline were not surprising. There have also been several reported associations of increased age at RA onset with less favorable HAQ-DI18,19,20, and although we did not find an association between rate of change in HAQ-DI and age, baseline HAQ-DI was higher for older patients.
Whether the findings of our study can be applied in the context of modern RA management is influenced by contemporary treatment approaches. Current treat-to-target recommendations for RA management were published in 201021, after recruitment to the YEAR ended. However, our findings may still be applicable for certain patients, for example, those who cannot take full doses of MTX or other DMARD because of comorbidities or intolerance. The effect of treat-to-target on change in DAS28 and HAQ is an area for further study.
We are not aware of any other studies that have analyzed the use of PRO to predict change in DAS28 and HAQ-DI in RA. However, several studies have highlighted the contribution of noninflammatory pain to overall disease activity scores. The pain index of DAS28 (DAS28-P), described by researchers from the Early Rheumatoid Arthritis Network22, is the proportion of overall DAS28 derived from its subjective components. Improvement in pain measured using the Medical Outcomes Study Short Form-36 (SF-36) questionnaire after 1 year was less likely in patients with higher baseline DAS28-P22. Recently, in patients from a Danish cohort with RA completing the painDETECT questionnaire (designed to classify pain into low, medium, or high likelihood of being non-nociceptive), those whose scores indicated non-nociceptive pain had greater overall DAS28 and DAS28-P, measured at the time of questionnaire completion23. Therefore, any association of baseline pain with subsequent change in DAS28 (as seen predominantly in our post-2002 cohort) may reflect an association with the subjective DAS28 components rather than inflammation alone.
EMS is a disabling symptom that fluctuates with RA disease activity24, and helps to differentiate patients with RA from those with noninflammatory arthralgia25. In a prospective study that examined the effect of severity of EMS on early retirement, greater EMS at baseline was correlated with simultaneous measurements of DAS28, pain, and function, and those with severe stiffness at baseline were more likely to retire from employment within 3 years of followup26. Further findings from that study included an absence of association between EMS and radiographic progression, which was later supported by evidence from the Leiden Early Arthritis Clinic and ESPOIR cohorts in which prolonged EMS (> 60 min) was not associated with poor prognosis in terms of radiographic outcome after 3–7 years, or failure to achieve remission after 5–10 years of followup27. Although our study reported an association of greater EMS at baseline with greater rate of reduction in DAS28 in some patients, in conflict with previous reports, the size of the effect was small (up to 0.05 units fall in DAS28 per month) and is unlikely to be clinically significant.
Data on fatigue and pain were collected in the form of VAS. Other methods of assessment are available to measure these variables, but the VAS was chosen because it was simple and quick for patients to complete alongside the other questionnaires that formed part of our study. A systematic review of scales to measure fatigue in RA identified 23 different scales, of which 6, including the VAS, had reasonable evidence of validation28. The review found evidence that a VAS performs reasonably well in terms of construct validity and discrimination, but there was little evidence to demonstrate reliability and a lack of a standardized format. However, although the VAS has its limitations, no other measures are superior in terms of validation, and further, the single-item VAS likely performs as well as other, more detailed measures of fatigue29. Therefore, we feel that the use of VAS was justified.
A significant limitation of our study was the quantity of missing data: 40% and 43% of cases had missing values for the 6- and 12-month analyses, respectively. Despite clear evidence that modern missing data management techniques such as MI are superior, traditional approaches such as analysis restricted to cases with no missing data (complete case analysis and weighted complete case analysis) are still reported. Not only does this technique lead to a loss of statistical power when cases with missing data are dropped, but complete case analysis is also more likely to give biased estimates30,31. The MI models created for our present analyses were carefully constructed, which involved scrutiny of the dataset to identify auxiliary variables, inclusion of all variables in the analysis model within the imputation model, and comparison of results to a complete case model. Because of the large quantity of missing data, we cannot rule out bias in the results of the analyses because of missingness; however, simulation studies have demonstrated that MI is superior to complete case analysis, even when the quantity of missing data is large32. Nevertheless, potential bias because of missing data should be considered when interpreting our findings. For example, we found no relationship between RF and ACPA positivity and adverse outcome, in contrast to previous reports that indicated an inverse association between autoantibodies and future remission33,34. Evidence for the relationship between autoantibodies and HAQ has been mixed, with some evidence of an association between autoantibodies and worse disability20,35, and some evidence to indicate there is no relationship between antibodies and HAQ36,37. The quantity of missing data was large for ACPA (39% of cases), so this is a potential source of bias.
An additional strength is the use of DAS28 and HAQ-DI as continuous rather than categorical or dichotomous (remission/non-remission) outcomes, thus improving statistical power. Although our study considered 3 separate patient-reported measures as predictors of outcome, it was not possible to assess the predictive value of several other similar variables. These include the RAPID338 and SF-3639, which were not collected in the YEAR, and the VAS of global health status, which was collected in the YEAR but was not included in the statistical models because it was strongly correlated with pain VAS. Our present study was also limited to examining the predictive value of PRO collected at the baseline visit. It is possible that trends in the change of these variables would be more useful as predictors of outcome and therefore could be an area of interest for future study.
Our study showed that PRO at baseline, such as pain, fatigue, and stiffness, are of limited value for the prediction of rate of change in disease activity and disability.
APPENDIX 1
List of study collaborators. Yorkshire Early Arthritis Register consortium membership: Management Team: Professor Paul Emery, Professor Philip Conaghan, Professor Ann W. Morgan [Leeds Institute of Rheumatic and Musculoskeletal Medicine (LIRMM), Leeds Teaching Hospitals National Health Service (NHS) Trust], Professor Anne-Maree Keenan, and Dr. Elizabeth M.A. Hensor (LIRMM). Medical Staff: Dr. Mark Quinn (York District Hospital), Dr. Andrew Gough (Harrogate District Hospital), Dr. Michael Green (York District Hospital, Harrogate District Hospital), Dr. Richard Reece (Huddersfield Royal Infirmary), Dr. Lesley Hordon (Dewsbury District and General Hospital), Dr. Philip S. Helliwell (LIRMM, St. Luke’s Hospital), Dr. Richard Melsom (St. Luke’s Hospital), Dr. Sheelagh Doherty (Hull Royal Infirmary), Dr. Ade Adebajo (Barnsley District General Hospital), Dr. Andrew Harvey, Dr. Steve Jarrett (Pinderfields Hospital), Dr. Gareth Huston (LIRMM), Dr. Amanda Isdale (York District Hospital), Dr. Mike Martin (Leeds Teaching Hospitals NHS Trust), Dr. Zunaid Karim (Pinderfields Hospital), Professor Dennis McGonagle (LIRMM, Calderdale Royal Hospital), Dr. Colin Pease, Dr. Sally Cox (Leeds Teaching Hospitals NHS Trust), Dr. Victoria Bejarano (LIRMM), Dr. Jackie Nam, Dr. Edith Villeneuve, and Dr. Sarah Twigg (LIRMM, Leeds Teaching Hospitals NHS Trust). Nursing Staff: Claire Brown, Christine Thomas, David Pickles, Alison Hammond (LIRMM), Beverley Nevill (York District Hospital, Harrogate District Hospital), Alan Fairclough, Caroline Nunns (Huddersfield Royal Infirmary), Anne Gill, Julie Green (York District Hospital), Belinda Rhys-Evans, Barbara Padwell (Leeds Teaching Hospitals NHS Trust), Julie Madden, Lynda Taylor (Calderdale Royal Hospital), Sally Smith, Heather King (Leeds Teaching Hospitals NHS Trust), Jill Firth (St. Luke’s Hospital), Jayne Heard (Hull Royal Infirmary), and Linda Sigsworth (St. Luke’s Hospital). 
Support Staff: Diane Corscadden, Karen Henshaw, Lubna-Haroon Rashid, Stephen G. Martin, Dr. James I. Robinson, Dr. Lukasz Kozera, Dr. Agata Burska, Sarah Fahy, and Andrea Paterson (LIRMM).
Footnotes
Supported by the Arthritis Research Campaign (now Arthritis Research UK), National Institute for Health Research. The work of Dr. S. Twigg is supported by a National Institute for Health Research (NIHR) clinical lectureship and this project is supported by the NIHR Leeds Musculoskeletal Biomedical Research Unit. The Yorkshire Early Arthritis Register (YEAR) was in part supported by a program grant from Arthritis Research UK and the NIHR-Leeds Musculoskeletal Biomedical Research Unit.
- Accepted for publication April 26, 2017.