Abstract
Objective. The Health Assessment Questionnaire Disability Index (HAQ) is a widely used outcome measure in rheumatoid arthritis (RA), whereas the SF-12v2 Health Survey (SF-12) was introduced recently. We investigated how the HAQ and SF-12 were associated with socio-demographic, lifestyle, and disease- and treatment-related factors in patients with RA.
Methods. In RA patients from 11 Danish centers, clinical and patient-reported data, including the HAQ and SF-12, were collected. Three multiple linear regression models were estimated, with the HAQ, SF-12 physical component score (PCS), and SF-12 mental component score (MCS) as outcome and sociodemographic, lifestyle, and RA-related treatment and comorbidity characteristics as explanatory variables.
Results. In total, 3156 (85%) of 3704 invited patients participated: 75% women, 76% rheumatoid factor-positive, median age 61 years (range 15–93 yrs), disease duration 7 years (range 0–68 yrs), Disease Activity Score on 28 joints (DAS28) 2.97 (range 0.96–8.61), HAQ score 0.63 (range 0–3), SF-12 PCS 56 (range 6–99), and SF-12 MCS 57 (range 16–99). Variation in HAQ was associated with 12 of 15 possible variables (R² 0.41), and variation in PCS and MCS with 6 of 15 variables each (R² 0.02 and 0.05). Patients with moderate to high DAS28 and ≥ 3 comorbid conditions had consistently worse HAQ and SF-12 scores compared to the reference groups, while weekly exercise was associated with better scores compared to no exercise.
Conclusion. The HAQ was more sensitive to differences in demographic, lifestyle, and disease- and treatment-related factors than the SF-12. The established clinical value and feasibility of the HAQ highlight its advantages over the SF-12 in describing health status in RA.
- RHEUMATOID ARTHRITIS
- SF-12v2 HEALTH SURVEY
- HEALTH ASSESSMENT QUESTIONNAIRE DISABILITY INDEX
- HEALTH STATUS
Rheumatoid arthritis (RA) is a chronic disabling disease affecting physical, mental, and social aspects of patients’ lives. Traditional clinical disease markers such as joint counts and serum C-reactive protein (s-CRP) quantify some of the objective aspects of RA, but do not embrace the full spectrum of disease consequences. In order to comply with the demand for a broader assessment of RA, patient-reported outcome (PRO) instruments, such as the SF-36 Health Survey (SF-36) and the Health Assessment Questionnaire Disability Index (HAQ), are frequently used to supplement the clinical disease markers. In a clinical trial of antiinflammatory agents, it was concluded that currently used efficacy endpoints were only weak predictors of change in HAQ and SF-36 scores1, which indicates that these measures provide important supplementary information.
The challenge of interpreting PRO relates to their association with various factors that vary between individuals, some directly measurable and others less measurable. Much work has thus been directed at investigating the associations between various PRO and physical and mental (disease and treatment, lifestyle), social (demography, socioeconomic status), psychosocial (coping strategies), and societal (culture, religion) factors to gain a better understanding of PRO in RA.
The HAQ is widely accepted as a patient-reported measure of physical function in RA, while the SF-12v2 Health Survey (SF-12), a shorter and less time-consuming version of the SF-36, has recently been introduced as a generic measure of physical and mental health status. In contrast to the extensive SF-36 literature that warrants its use as a PRO measure of health status in various diseases as well as in healthy populations, the SF-12 has been less well studied.
Studies have investigated the association of various factors with the HAQ and SF-36, and found these measures to be related to many different factors2–8. However, such findings may not be transferable to the SF-12.
Our aim was to investigate the association of an established measure of physical function (the HAQ) and a less studied measure of health status (SF-12) with sociodemographic, lifestyle, and disease- and treatment-related factors in RA.
MATERIALS AND METHODS
Patients and data collection
A cross-sectional study involving 11 Danish outpatient rheumatology clinics was undertaken from July 2006 to July 2007. All patients with a diagnosis of RA as defined by the American College of Rheumatology 1987 criteria9 were eligible for inclusion, while no exclusion criteria were specified. Clinical and patient-reported data were recorded by the physicians and patients on separate forms during routine visits in the clinic. Reasons for nonparticipation were logged. Clinical data included disease duration, disease activity [swollen and tender joint counts, s-CRP, physician’s global RA assessment on a visual analog scale (VAS)], disease severity [IgM-rheumatoid factor (IgM-RF) status, presence of joint erosions on conventional radiographs, and rheumatoid nodules], and treatment (disease modifying antirheumatic drugs, biologicals, glucocorticoids). Patient-reported data included sociodemography (sex, age, marital status, education), lifestyle factors [smoking, body mass index (BMI), exercise habits], and disease-related factors (patient global VAS for RA, extraarticular features, joint surgery, and presence of comorbidity from a list of 17 chronic diseases). Finally, health status and daily functioning according to the validated Danish SF-12 questionnaire and the HAQ were recorded.
Missing data in the RA-related variables were replaced by predicted values based on regression models with sex, age, disease duration, and s-CRP as explanatory variables.
Questionnaires
The SF-12 (QualityMetric, Lincoln, RI, USA) is a multi-dimensional health status profile instrument covering both physical and mental aspects of health10,11. It includes 12 questions with predefined answer categories, from which 8 dimensions of health can be derived: physical functioning (PF), physical role limitations (RP), bodily pain (BP), general health perceptions (GH), vitality (VT), social functioning (SF), emotional role limitations (RE), and mental health (MH). The physical dimensions (PF, RP, BP, and GH) can be summed into a physical component score (PCS) and the mental dimensions (VT, SF, RE, and MH) into a mental component score (MCS), each ranging from 0 (poor health) to 100 (perfect health). The minimum clinically important difference (MCID) has not been determined. We applied the Danish 4-week recall version 2 and employed the SF Health Outcomes™ Scoring Software in both scoring and handling of missing data. A fee applies for use of the SF-12 (reference 12).
The HAQ is primarily used as a measure of functional disability in RA and has also been used across medical disciplines and normal aging populations. It includes 20 questions in 8 categories of functioning (dressing, arising, eating, walking, hygiene, reach, grip, and usual activities), and the response options range from 0 (no difficulty) to 3 (unable to do). The highest score in each category is taken, and these 8 scores are summed and divided by 8, resulting in a possible range of total scores (HAQ score) from 0 (no difficulty) to 3 (unable to do)13. The HAQ is in the public domain, and the score can be calculated by the clinician during the visit. An improvement of 0.22 is generally accepted as the MCID14,15. We applied the Danish translation16 and computed the score without including aids or help from other people. Missing items were replaced by predicted values when a minimum of 16 (80%) of the questions were answered; otherwise the observation was excluded. The predicted item values were based on an ordinal logistic regression model with sex, age, disease duration, and s-CRP as explanatory variables before the total score was calculated.
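As an illustration of the scoring rule described above, the following Python sketch takes the highest item score within each of the 8 categories, sums these, and divides by 8. It is not the scoring code used in the study. It assumes that any missing items have already been imputed, and the function names, category keys, and example responses (including how the 20 items are allocated to categories) are hypothetical.

```python
# Minimal sketch of the HAQ scoring rule described in the text
# (no adjustment for aids or help from other people).
CATEGORIES = ("dressing", "arising", "eating", "walking",
              "hygiene", "reach", "grip", "activities")


def enough_items_answered(item_scores, minimum=16):
    """Return True if at least `minimum` items are non-missing (None = missing)."""
    return sum(score is not None for score in item_scores) >= minimum


def haq_score(responses):
    """Compute the HAQ score from a dict mapping category -> list of item scores (0-3)."""
    total = 0
    for category in CATEGORIES:
        items = responses[category]
        if not items or any(s not in (0, 1, 2, 3) for s in items):
            raise ValueError(f"invalid item scores for category {category!r}")
        total += max(items)          # highest item score represents the category
    return total / len(CATEGORIES)   # divide by 8 -> range 0 to 3


# Hypothetical patient; the grouping of items is for illustration only.
example = {
    "dressing": [0, 1], "arising": [1, 1], "eating": [0, 0, 2],
    "walking": [0, 0], "hygiene": [1, 0, 0], "reach": [2, 3],
    "grip": [0, 0, 0], "activities": [1, 1, 0],
}
print(haq_score(example))  # 1.125
```

The category maxima in the example are 1, 1, 2, 0, 1, 3, 0, and 1, which sum to 9 and give a score of 9/8 = 1.125.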
Data quality
The physicians’ and patients’ questionnaires were scanned using Cardiff Teleform (v 6.0; Cardiff Software, San Marcos, CA, USA) and exported into the DANBIO registry via an Access database17. Quality assessment was initially performed on 200 randomly selected questionnaires by manually comparing all responses with the recorded scores. A 1% data error on each single data item had been determined as the upper limit, and items exceeding the limit were to be checked in all questionnaires. Data validation including identification of outliers and assessment of distributions was performed by inspection of graphics and frequency tables. All alterations from the scanned data set were documented.
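The 1% error limit described above can be sketched as a simple check on the audit sample. The snippet below is a hypothetical illustration, not the procedure actually implemented for the study: the item names and error counts are invented, and it assumes that "exceeding the limit" means strictly more than 1% of the 200 sampled questionnaires.

```python
# Sketch: flag data items whose error rate in the audit sample exceeds the limit.
def items_exceeding_limit(error_counts, n_sampled=200, limit_percent=1):
    """Return item names with an error rate strictly above `limit_percent`.

    `error_counts` maps an item name to the number of sampled questionnaires
    in which the scanned value differed from the manually checked response.
    Integer arithmetic avoids floating-point edge cases at the threshold.
    """
    return sorted(
        item for item, errors in error_counts.items()
        if errors * 100 > limit_percent * n_sampled
    )


# Hypothetical audit result for three items.
audit = {"haq_item_01": 0, "sf12_item_03": 2, "patient_global_vas": 3}
print(items_exceeding_limit(audit))  # ['patient_global_vas']
```

With 200 sampled questionnaires, 2 errors correspond to exactly 1% and do not exceed the limit, whereas 3 errors (1.5%) would trigger a check of that item in all questionnaires.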
Statistical analyses
Statistical analyses were performed with Stata version 9.0, and a p value ≤ 0.05 was chosen as the level of statistical significance. We employed multiple linear regression analyses; missing observation analyses comprised demographic, disease, and treatment-related comparisons between patients included in the analyses and those excluded due to incomplete HAQ and SF-12 data. Correlation between outcome variables (HAQ and SF-12) was determined by Spearman’s rho.
The possible explanatory variables were arranged in 5 groups: sociodemography (sex, age, marital status, education), lifestyle (BMI, smoking, exercise habits), RA-related [disease duration, Disease Activity Score for 28 joints based on s-CRP (DAS28), IgM-RF, joint erosions, rheumatoid nodules, VAS scores for physician’s global RA assessment, joint surgery, extraarticular features], treatment [use of methotrexate (MTX), biologicals, and glucocorticoids], and the number of comorbidities. We assumed nonlinearity in the continuous explanatory variables, and these were categorized based on graphical examination and formal testing. Most of the variables were categorical (e.g., sex, marital status, education, smoking) or had well accepted clinical categories (BMI, DAS28). Thus, we transformed only age and disease duration; had we kept them continuous, the coefficients would have reflected a 1-year change, which would have been less intuitive and less precise across the range of the variable. Collinearity between explanatory variables was assessed in a Spearman rank correlation matrix prior to estimating the models; in pairs with correlation coefficients > 0.30, one or both variables were excluded from further analysis based on clinical judgment. This led to the exclusion of joint erosions, rheumatoid nodules, and physician’s global VAS score (coefficients 0.31–0.48). Sex, age, and disease duration were included regardless of the level of association with other variables because of an a priori hypothesis of an independent influence on health status. Therefore, joint surgery and disease duration were both retained despite an intercorrelation of 0.42. Within each of the 5 groups of explanatory variables, multiple linear regression models were estimated with the HAQ and SF-12 (PCS and MCS) as outcome, and ultimately, the 3 models were fitted by entering the significant variables from each group.
A variable was retained in the model if the estimated parameter for one of its categories was significant. The variable selection was verified by simultaneous entry of all explanatory variables, manual backwards selection followed by reentry of previously removed variables.
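The collinearity screen described above (a Spearman rank correlation matrix with a cutoff of 0.30) can be sketched as follows. The analyses in this study were performed in Stata; this Python version is only an illustration, and the function names, variable names, and toy data are hypothetical. It computes Spearman's rho as the Pearson correlation of average ranks and lists the pairs whose absolute coefficient exceeds the threshold, leaving the decision about which variable to drop to clinical judgment.

```python
from itertools import combinations
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average rank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises if either variable is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for a constant variable")
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho = Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def flag_collinear_pairs(data, threshold=0.30):
    """Return (var_a, var_b, rho) for pairs with |rho| above the threshold."""
    flagged = []
    for a, b in combinations(sorted(data), 2):
        rho = spearman_rho(data[a], data[b])
        if abs(rho) > threshold:
            flagged.append((a, b, round(rho, 2)))
    return flagged


# Toy data for five hypothetical patients.
toy = {
    "disease_duration": [1, 5, 10, 20, 30],
    "joint_surgery": [0, 0, 1, 2, 3],
    "bmi": [27, 22, 30, 24, 26],
}
print(flag_collinear_pairs(toy))
```

In the toy data only the pair of disease duration and joint surgery is flagged; in the study, such a pair could still be retained when an a priori hypothesis justified it.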
According to Danish law, no ethical approval was needed for this study. Oral and written information was passed to the participants; written consent is not required in questionnaire surveys, as a returned questionnaire is assumed to indicate consent. The DANBIO registry is approved by the National Board of Health and the Danish Data Protection Agency.
RESULTS
Patients
In total, 3704 patients were recruited; 3156 (85%) completed the questionnaire part, while 548 did not respond due to reluctance (29%), physical or mental difficulties (13%), language barriers (5%), or other reasons (53%). Demographic and disease- and treatment-related characteristics for respondents versus nonrespondents are shown in Table 1. The nonrespondents were older and had longer disease duration; they were less likely to receive MTX and biologicals, but more likely to receive glucocorticoids.
Table 1. Patient characteristics by responder status (n = 3704). Values are median (interquartile range) unless otherwise stated.
Questionnaire results
HAQ and SF-12 scores are shown in Table 2. The mean HAQ score was slightly higher than the median, and the graphic inspection showed that the data were right-skewed. Seventeen percent of respondents scored 0 (best possible score). Twenty-two percent missed a few items of the HAQ. The mean SF-12 scores were 54 for PCS and 58 for MCS and were similar to the median scores. The small standard deviations (PCS 17, MCS 18) and narrow interquartile ranges (PCS 44–67, MCS 47–74) indicated limited variation of the SF-12 scores. There were no ceiling or floor effects, but 15% and 16% of the respondents missed at least one item of the PCS and MCS, respectively.
Table 2. HAQ and SF-12 scores after imputation of missing items.
A total of 380 patients were excluded from the regression analyses due to incomplete HAQ and SF-12 data (missing > 20% of the items). They were older (7 yrs), more of them had radiographic erosions (70% vs 63%), and more had received glucocorticoids (26% vs 19%) (Table 1).
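The participant flow reported in the text can be checked with simple arithmetic. The short Python sketch below uses only counts stated in this article; the variable names are hypothetical.

```python
# Participant flow, using the counts reported in the text.
invited = 3704               # patients recruited
respondents = 3156           # completed the questionnaire part
excluded_incomplete = 380    # missing > 20% of HAQ/SF-12 items

nonrespondents = invited - respondents        # 548
response_rate = respondents / invited         # about 0.852
analysed = respondents - excluded_incomplete  # 2776

print(nonrespondents, round(response_rate * 100), analysed)  # 548 85 2776
```

The resulting 2776 patients correspond to the number of observations reported for the final regression models.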
The HAQ score was significantly but weakly associated with the PCS and MCS scores (Spearman’s rho −0.15 and −0.25), while the correlation between MCS and PCS was −0.40.
The Spearman rank correlation matrix revealed no correlation coefficients above 0.50, and all values except 5 were below 0.30 (data not shown). In 4 of these 5 pairs (correlation coefficients 0.31–0.48), one variable in the pair was excluded from the regression analysis due to insignificance. In the remaining pair (disease duration and joint surgery, correlation coefficient 0.42), both variables remained in the model.
In univariable analyses of the 15 possible explanatory variables, the HAQ was able to significantly discriminate across all except smoking, whereas the PCS and MCS could discriminate across 5 and 12 variables, respectively. The discriminative ability of the PCS and MCS was present in 4 and 5 groups of explanatory variables, respectively (data not shown). Generally, the HAQ score differences across the categorized explanatory variables were larger than the SF-12 score differences (Figure 1).
Figure 1. HAQ and SF-12 median score differences across sex, age, treatment, joint surgery, exercise, and comorbidity groups.
Regression analyses
The final regression models with the HAQ, PCS, and MCS as outcomes are presented in Table 3. Patients with moderate to high DAS28 and more than 3 comorbid conditions had consistently worse HAQ and SF-12 scores compared to the reference group. Joint surgery (≥ 2 procedures) worsened the HAQ (0.55) and MCS (−4) scores, whereas it improved the PCS scores (3) compared to the reference group. Weekly exercise was associated with a better outcome across instruments, compared to not exercising regularly.
Table 3. Final multiple linear regression models with the HAQ, SF-12 physical component score (PCS), and SF-12 mental component score (MCS) as outcomes. Explanatory variables that did not reach statistical significance in any of the models were excluded from the table. Number of observations = 2776.
Twelve of the 15 possible explanatory variables were significantly associated with the HAQ score (adjusted R² 0.41). Women and older patients (age > 75 yrs) had worse HAQ scores compared to the reference group.
Six of the 15 explanatory variables were significantly associated with the PCS score (adjusted R² 0.02). Glucocorticoid use within the last month and extraarticular features were associated with a worse outcome compared to the references.
Six of the 15 explanatory variables distributed across all 5 groups were significantly associated with the MCS score (adjusted R² 0.05). Women had worse MCS scores than men.
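For readers less familiar with the adjusted R² values quoted for the three models, the standard definition is given below as a reminder. Here n denotes the number of observations (2776 in the final models) and p the number of estimated slope parameters; p is not reported in the text and is therefore left symbolic.

```latex
R^2_{\mathrm{adj}} \;=\; 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

Because n is large relative to any plausible p in these models, the adjustment factor is close to 1, so the adjusted and unadjusted values would be expected to differ only slightly.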
DISCUSSION
We investigated the association between the HAQ and SF-12 and a wide range of potential explanatory variables to gain insight into the interpretation of the instruments. We conducted a large, nationwide cross-sectional study on a population of RA outpatients with a highly satisfactory response rate of 85%.
We found that the HAQ was strongly associated with the majority of the investigated demographic, lifestyle, and disease- and treatment-related factors, while the SF-12 was less associated with such factors.
Strengths of our study include the large sample size (roughly 20% of the estimated total number of RA out-patients in Denmark), a high response rate, and good data quality. Moreover, the patients were recruited from clinics in different geographic areas and environments (university vs general hospitals), which indicates that they adequately represent an RA outpatient population.
Limitations of the study relate to the transferability of results and unequal distribution of missing observations. Our study population had a relatively low disease activity and HAQ score, which to some degree prevents conclusions being transferred to RA patients with more active disease. However, since the markers of disease severity indicate comparability with RA populations presented in clinical trials, the low disease activity and HAQ score in our study may reflect a modern RA treatment strategy, with fairly good disease control in many patients.
The respondents differed from the nonrespondents in age, disease duration, and treatment. Nonrespondents were older, had longer disease duration, and received less intensive treatment with MTX and biologicals. A similar pattern characterized the patients excluded from the regression analyses due to missing HAQ and SF-12 data, although this was less strong. In univariable linear models, the HAQ score increased by 0.01 and 0.02 per year of age and disease duration, respectively, while the SF-12 score changes were insignificant. The small differences in age and disease duration seen in our study are not likely to directly affect the results; however, possible indirect effects should be considered. These might include systematic differences in disease severity, comorbidity, and lifestyle.
A number of different methods have been used to identify possible predictors of health outcomes in RA, and the results vary accordingly. Literature reports regarding the SF-12 in inflammatory rheumatic disease mainly concern the validity of the instrument11,18 and how the score is influenced by comorbid conditions19,20. To our knowledge, no studies have determined the MCID or investigated the association of the SF-12 with demographic, socioeconomic, lifestyle, and disease-related factors. However, Kosinski, et al observed that improvements of 3 to 4.4 points (PCS) and 2.2 to 4.7 points (MCS) represent the MCID for the SF-3615, and other studies on the SF-36 have shown that severe disease and low socioeconomic status are associated with worse scores, while the influence of demographic factors is unclear. These studies provided adjusted R² values of 0.44–0.62 in the physical subscales and 0.25–0.56 in the mental subscales of the SF-36 when including different disease-related, socioeconomic, and psychosocial factors and comorbidity2,5,21,22 in linear regression models. We found that the SF-12 was surprisingly unaffected by the majority of disease-related, sociodemographic, and lifestyle factors, as illustrated by very low R² values of 0.02–0.05. The majority of the variation in the SF-12 thus remained unexplained, which may be an indication that important sensitivity is lost compared to the SF-36. Moreover, when aggregating the 8 dimensions into 2 component scores (the PCS and MCS), the potential variation in the dimensions with a direct relation to RA may have been evened out, and the aggregation may also explain the low observed correlation with the HAQ.
The most consistent results regarding the HAQ have shown that female sex, older age, longer disease duration, lower socioeconomic status, comorbidity, and possibly inappropriate lifestyle choices are associated with a worse score2–4,6,8,23–27. The studies using multiple linear regression models revealed R² values of 0.22–0.60 when including different disease-related, socioeconomic, and demographic factors, which is comparable to our result of 0.41. Our study has thus confirmed most of these findings, although we did not find socioeconomic status to be significantly associated with the HAQ. We used the education level as an indicator for socioeconomic status, whereas other studies have used specific measures, such as the Carstairs score, and this may have contributed to the different results.
For clarity it should be noted that the HAQ is often incorrectly described as “disease-specific” and the SF-6/12/36 as “generic,” whereas both instruments are “generic” and each has been used in many and diverse disease areas. The HAQ measures the single dimension of physical function and can be accurately described as “unidimensional,” having a single principal component. The SF-12/36 assesses 8 dimensions of health status and can best be described as “multidimensional” or as a “profile.”
An unexpected difference between the HAQ and PCS should be noted: the effect of joint surgery was inverted. This phenomenon gives rise to further questions: does the HAQ measure physical function given a surgical intervention, while the SF-12 measures health status in spite of an intervention? To our knowledge, such effects have not been described previously and need to be studied further.
The clinical value of the HAQ is established in this study, whereas the SF-12 has some important shortcomings: (1) the scores do not seem to be sensitive to variation between patient groups; (2) the scores are not easily interpretable; and (3) the scores are not easily calculated during a clinic visit.
In summary, the HAQ appeared to be more sensitive to differences in demographic, lifestyle, and disease- and treatment-related factors than the SF-12, which possibly reflects the different focus of the instruments and a loss in sensitivity when reducing the number of questions from 36 (SF-36) to 12 (SF-12). The established clinical value and feasibility of the HAQ highlight its advantages over the SF-12 in describing health status in RA.
Acknowledgments
We thank the Departments of Rheumatology at Gråsten, Hvidovre, Hjørring, Hørsholm, Vejle, Bispebjerg, Holbæk, Odense, Slagelse, Herlev, and Frederiksberg Hospitals for their vital contribution to the study.
Footnotes
- Supported by an unrestricted research grant from Schering-Plough A/S. Dr. Sørensen was supported by a grant from the National Board of Health.
- Accepted for publication May 12, 2009.