Abstract
Objective. Studies of pain in systemic sclerosis (SSc) have used a variety of measures, including single-item measures and the 15-item short-form McGill Pain Questionnaire (MPQ-SF). The objective of our study was to compare the performance of the MPQ-SF to a single-item pain numerical rating scale (NRS) and determine whether the MPQ-SF effectively differentiates between sensory and affective components of pain in SSc.
Methods. A cross-sectional, multicenter study of 1091 patients from the Canadian Scleroderma Research Group Registry who completed the MPQ-SF and pain NRS. Correlations of MPQ-SF total scores and pain NRS scores with relevant outcome measures (disability, quality of life, depressive symptoms) were compared. To assess whether the MPQ-SF differentiated between sensory and affective factors, confirmatory factor analysis modeling was used, and correlations of sensory and affective factor scores with other outcome measures were compared.
Results. MPQ-SF total score and the pain NRS correlated similarly with other outcome measures, as did the sensory and affective scores. MPQ-SF sensory and affective factors were highly correlated (0.92), and a single-factor model fit as well as a 2-factor (sensory and affective) model.
Conclusion. The substantial overlap between sensory and affective subscales of the MPQ-SF and the similarity of the MPQ-SF and NRS pain measures in their associations with other patient-reported outcomes suggest that the 15-item MPQ-SF does not provide tangible advantages compared to the single-item pain NRS. These findings support recommendations to use a single-item NRS pain measure in SSc as it is less burdensome to patients than the MPQ-SF.
Systemic sclerosis (SSc), or scleroderma, is a rare chronic autoimmune connective tissue disease that affects multiple organ systems. SSc is characterized by thickening and fibrosis of the skin and internal organs due to an increase in the production and deposits of collagen1,2. Patients are typically classified as having either limited or diffuse cutaneous SSc. Diffuse SSc is characterized by more widespread organ system involvement, and thus patients with diffuse SSc generally have a worse prognosis than those with limited SSc3. Disease onset most commonly occurs between the ages of 30 and 50 years, and women are at 4–5 times increased risk compared to men4. Median survival time from diagnosis is about 11 years, and patients are 3.7 times more likely to die within 10 years of diagnosis than are individuals without SSc, even after controlling for sex, age, and race4. Overall, SSc has a substantial negative effect on quality of life4,5,6,7,8,9,10,11,12,13, and patients with SSc report numerous painful symptoms, including skin breakdown, digital ulcers, Raynaud’s phenomenon, musculoskeletal pain, and gastrointestinal symptoms14,15.
Existing studies of pain in SSc have used a variety of pain assessment tools, including single-item pain visual analog scales (VAS)14,16, pain numerical rating scales (NRS)15, and the 15-item short-form McGill Pain Questionnaire (MPQ-SF)1,17. The 2003 OMERACT workshop on SSc (OMERACT 6) recommended the single-item VAS to measure painful symptoms of Raynaud’s phenomenon and digital ulcers in SSc16. The MPQ-SF has not been specifically validated in SSc. Research from other settings, however, has suggested that the MPQ-SF may provide a more informative assessment than single-item measures of pain. Longer measures are often more reliable than short measures, and thus have stronger validity characteristics, which would suggest a potential advantage of the MPQ-SF compared to single-item pain measures. Another possible advantage is that the MPQ-SF provides separate factor scores for sensory and affective pain dimensions18,19,20. Indeed, psychophysical studies show that the affective and sensory dimensions of pain relate differently to nociceptive stimulus intensity and are separately influenced by various psychological factors21. However, questions have been raised about the degree to which the MPQ-SF factor scores substantively differentiate these 2 dimensions18. Studies that have used confirmatory factor analysis (CFA) to assess the factor structure of the MPQ-SF have not directly compared 2-factor models to a single-factor model, but they have reported very high correlations between the sensory and affective factors (0.77 to 0.92)22,23,24, which would suggest that they may be too highly associated to be substantively differentiated.
No studies have compared the performance of single-item measures of pain to the MPQ-SF among patients with SSc. In terms of feasibility and patient burden, single-item measures are clearly advantageous compared to the 15-item MPQ-SF. On the other hand, using the MPQ-SF would be a potentially better option if it performed differently as a measure of general pain intensity than single-item measures or if there were evidence that it produces substantively different sensory and affective pain ratings.
Rare diseases, such as SSc, are increasingly investigated in large, multicenter cohort studies with many different measures administered to patients. In this context, it is important to reduce the burden on patients as much as possible without compromising measurement. Thus, our study had 2 main objectives. The first was to compare the MPQ-SF to a single-item measure of pain among patients with SSc. To determine whether the MPQ-SF may be a more robust measure of pain than single-item methods, we compared correlations of the MPQ-SF total score versus correlations of NRS pain ratings to related outcome measures of depressive symptoms, disability, and mental and physical functioning. The second objective was to assess the degree to which the MPQ-SF provides substantively distinguishable measures of sensory and affective pain. To do this, we assessed whether a 2-factor CFA model of pain (sensory and affective) provided a substantively better fit to the data than a 1-factor model among patients with SSc. In addition, we compared the correlations of sensory and affective subscale scores to other related outcome measures to determine the degree that they produced similar versus substantively different results.
MATERIALS AND METHODS
Patient sample
The study sample consisted of patients enrolled in the Canadian Scleroderma Research Group (CSRG) Registry. Patients in the registry were recruited from 15 centers across Canada. To be eligible for the Registry, patients must have a diagnosis of SSc made by the referring rheumatologist, be 18 years of age or older, and be fluent in English or French. At each annual registry visit, patients undergo an extensive clinical history, physical evaluation, and laboratory investigations and complete a series of self-report questionnaires. Patients from all centers provided informed consent, and the research ethics board of each center approved the data collection protocol. Only data from patients’ initial Registry visit were included in our study.
Measures
Pain was assessed with the MPQ-SF and a pain NRS. Outcome measures used for comparisons included the Center for Epidemiologic Studies Depression scale (CES-D), the Health Assessment Questionnaire–Disability Index (HAQ-DI), and the Mental Component Summary (MCS) and Physical Component Summary (PCS) scores of the Short-form 36 Health Survey Questionnaire. These measures were chosen because they are common outcomes of pain in patients with chronic diseases, including SSc. In addition, we used the Present Pain Intensity Scale (PPI) of the MPQ-SF to define a subsample of patients with substantial pain, for sensitivity analyses.
Demographic and disease variables
Demographic information was based on self-report. Patients’ medical histories and disease characteristics were obtained through clinical histories and examinations by study physicians. Limited skin disease was defined as skin involvement distal to the elbows and knees with or without face involvement25. SSc disease duration was determined as the time from onset of non-Raynaud’s symptoms based on a clinical history obtained by study rheumatologists. Skin involvement was assessed using the modified Rodnan total body skin score26, with scores ranging from 0 to 51.
The Short-form McGill Pain Questionnaire (MPQ-SF)
The MPQ-SF20 consists of a checklist of 15 adjectives, including 11 sensory and 4 affective descriptors of pain. Each descriptor is rated on a 4-point Likert scale for pain intensity that ranges from 0 (no pain) to 3 (severe pain), providing a total score ranging from 0 to 45, as well as subscale scores ranging from 0 to 33 for the sensory subscale and from 0 to 12 for the affective subscale. The MPQ-SF has been shown to have excellent psychometric properties across a wide range of patient groups20,22,23,24, but has not been evaluated in SSc.
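The scoring rule described above can be sketched as follows. This is a minimal illustration, not the instrument's official scoring code: the function name is hypothetical, and the sketch assumes the 11 sensory ratings are listed before the 4 affective ratings.

```python
from typing import Dict, Sequence

N_SENSORY = 11    # sensory descriptors, per the text
N_AFFECTIVE = 4   # affective descriptors, per the text


def score_mpq_sf(items: Sequence[int]) -> Dict[str, int]:
    """Sum 15 item ratings (each 0-3) into sensory, affective, and total scores.

    Assumption: the first 11 ratings are the sensory items and the
    last 4 are the affective items.
    """
    if len(items) != N_SENSORY + N_AFFECTIVE:
        raise ValueError("expected 15 item ratings")
    if any(r not in (0, 1, 2, 3) for r in items):
        raise ValueError("each rating must be 0, 1, 2, or 3")
    sensory = sum(items[:N_SENSORY])        # possible range 0-33
    affective = sum(items[N_SENSORY:])      # possible range 0-12
    return {"sensory": sensory, "affective": affective,
            "total": sensory + affective}   # possible range 0-45
```

Rating every descriptor as severe (3) reproduces the maximum scores stated above: 33, 12, and 45.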
Present Pain Intensity scale (PPI)
The PPI is a single-item measure of general pain intensity. Patients rate their present level of pain on a 6-point Likert scale that ranges from 0 (no pain) to 5 (excruciating). The PPI is administered as part of the MPQ-SF20, although scored separately.
Pain numerical rating scale (NRS)
The study used an 11-point pain NRS. Patients rated their level of pain intensity in response to the question, “In the past week, how much pain have you had due to illness?” Scores on the 11-point pain NRS range from 0 (no pain) to 10 (very severe pain). The pain NRS has strong validity and reliability across patient groups and performs similarly to the pain VAS27. An advantage of the pain NRS is its increased ease of scoring compared to a VAS27.
The Center for Epidemiologic Studies Depression Scale (CES-D)
The CES-D is a 20-item scale that measures the frequency of depressive symptoms over the past week on a Likert scale that ranges from 0 (rarely or none of the time) to 3 (most or all of the time). Total scores range from 0 to 60, with standard cutoffs of > 16 for “possible depression” and > 23 for “probable depression”28. Among a sample of 470 patients with SSc from the CSRG, the CES-D had good reliability and convergent validity with other self-report measures29.
The Health Assessment Questionnaire–Disability Index (HAQ-DI)
The HAQ-DI is a 20-item self-report questionnaire designed to assess functional ability in patients with arthritis6. The HAQ is the most widely used instrument among patient-reported measures of functional status in SSc. It has been shown to have good face and construct validity, good reliability, and high sensitivity to change, and to predict survival in SSc12,30,31,32. A higher score on the HAQ-DI indicates a greater level of disability, with a total score range of 0 (no disability) to 3 (severe disability).
The Short-form 36 Health Survey Questionnaire (SF-36)
The SF-36 is the most widely used and evaluated health outcomes measure33,34. It has been shown to be valid and reliable in multiple populations, including SSc9,10,30. The SF-36 comprises 8 domains: physical functioning, social functioning, role limitations related to physical problems, role limitations related to emotional problems, mental health, vitality, bodily pain, and general health perceptions. Each domain can be scored separately, with scores that can range from 0 (worst health state) to 100 (best health state). Domain scores are summarized into a Physical Component Summary (PCS) score and a Mental Component Summary (MCS) score. The PCS and MCS are scored with norm-based scoring based on a general population sample in order to produce T scores for each individual patient (mean of 50 and SD of 10). Our study used version 2 of the SF-36.
Data analyses
To determine whether the MPQ-SF total score was more strongly associated with the set of outcome measures than the single-item NRS was, we computed Pearson’s bivariate correlations with 95% CI. We also calculated 95% CI for the difference between the MPQ-SF and NRS correlations (Δr) with each of the other outcome measures.
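The text does not state how the 95% CI for each correlation were obtained. As a minimal sketch, assuming the widely used Fisher z-transformation approach, a Pearson correlation and its interval could be computed as below. The function names are hypothetical. The interval for Δr between two correlations that share a variable needs a method that accounts for their dependence, which is not shown here.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, level=0.95):
    """Confidence interval for a correlation via Fisher's z transformation.

    Assumption: this is one standard approach; the source does not
    specify which method the authors used.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                      # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)            # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)
```

With a sample of roughly 1000 patients, the standard error on the z scale is about 0.03, so intervals around moderate correlations are narrow.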
To assess the degree to which a 2-factor model (sensory and affective) of the MPQ-SF more accurately describes the data than a single-factor model, and thus to assess the degree to which the sensory and affective factors measure distinct components of pain among patients with SSc, we conducted CFA using Mplus35. CFA is well suited to estimate true interfactor correlations because item error variances are explicitly modeled. In exploratory factor analysis, for example, the use of raw item scores would be expected to reduce interfactor correlations due to measurement error in the items. Because MPQ-SF item responses are measured on an ordinal Likert scale, the weighted least-squares estimator with a diagonal weight matrix and robust standard errors and a mean- and variance-adjusted chi-square statistic was used with delta parameterization35. Modification indices were used to identify pairs of items within scales for which model fit would improve if error estimates were freed to covary and for which there appeared to be theoretically justifiable shared method effects36. Chi-square goodness-of-fit and 3 fit indices were used to assess model fit, including the Tucker-Lewis Index (TLI)37, the comparative fit index (CFI)38, and the root mean-square error of approximation (RMSEA)39. Since the chi-square test is highly sensitive to sample size and can lead to the rejection of well-fitting models, practical fit indices were emphasized40. Guidelines proposed by Hu and Bentler41 have suggested that models with TLI and CFI close to 0.95 or higher and the RMSEA close to 0.06 or lower are representative of good-fitting models. A CFI42 ≥ 0.90 and RMSEA43 ≤ 0.08 may also be considered to represent reasonably acceptable model fit.
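The fit guidelines above can be written as a simple decision rule. This sketch is illustrative only: the function name and the labels are hypothetical, it treats "close to" as hard cutoffs, and it takes lower RMSEA values as indicating better fit.

```python
def classify_fit(cfi: float, tli: float, rmsea: float) -> str:
    """Label model fit using the guideline values quoted in the text.

    Assumption: the guidelines say "close to" these values; treating
    them as strict cutoffs is a simplification for illustration.
    """
    if cfi >= 0.95 and tli >= 0.95 and rmsea <= 0.06:
        return "good"
    if cfi >= 0.90 and rmsea <= 0.08:
        return "acceptable"
    return "poor"
```

For example, a model with CFI = 0.97, TLI = 0.99, and RMSEA = 0.05 would be labeled "good" under this rule.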
We used the Mplus DIFFTEST procedure to assess the differences in fit between 2-factor and single-factor MPQ-SF models. A negligible difference between the fit of the 2 models would indicate that the 2-factor model does not improve substantively upon the single-factor model. The DIFFTEST is a chi-square-based procedure, however, so it is sensitive to sample size. Cheung and Rensvold44 recommended comparing the change in goodness-of-fit indices, which are not affected by sample size, between 2 models to determine whether there are substantive differences in model fit. Consistent with Cheung and Rensvold’s recommendations, we compared the CFI between the single-factor and 2-factor models, with a difference of ≤ 0.01 indicative of substantively similar models44.
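The ΔCFI comparison can be sketched as a one-line check. The function name and the small numerical tolerance are assumptions added for illustration; the tolerance only guards against floating-point rounding when the difference is exactly at the threshold.

```python
def substantively_similar(cfi_a: float, cfi_b: float,
                          threshold: float = 0.01) -> bool:
    """Return True when two models differ in CFI by no more than the threshold.

    The 1e-9 tolerance is an implementation detail: in binary floating
    point, 0.97 - 0.96 is slightly larger than 0.01.
    """
    return abs(cfi_a - cfi_b) <= threshold + 1e-9
```

Under this rule, two models with CFI values of 0.97 and 0.96 count as substantively similar, whereas values of 0.97 and 0.94 do not.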
Pearson’s bivariate correlations with 95% CI were also computed to assess whether the sensory and affective subscales of the MPQ-SF were differentially associated with other relevant outcome measures, including measures of depressive symptoms (CES-D), mental and physical functioning (SF-36-MCS, SF-36-PCS), and disability (HAQ-DI). In addition, 95% CI for the difference between the sensory and the affective subscale correlations (Δr) with each of the other outcome measures were calculated.
Many patients in the sample had low pain scores, and their inclusion in analyses could potentially affect results. Therefore, we conducted a sensitivity analysis (available from the authors upon request) by repeating all analyses using only data from patients who both reported a discomforting to excruciating level of pain on the PPI (score 2–5) and endorsed at least 1 item on the MPQ-SF.
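The inclusion rule for the sensitivity analysis can be expressed as a small predicate. This is a sketch with a hypothetical function name; it assumes that "endorsed" means a rating above 0 on at least 1 MPQ-SF descriptor.

```python
from typing import Sequence


def in_sensitivity_subsample(ppi: int, mpq_items: Sequence[int]) -> bool:
    """True if PPI is 2-5 and at least one MPQ-SF item is rated above 0.

    Assumption: an item counts as endorsed when its rating is 1, 2, or 3.
    """
    return 2 <= ppi <= 5 and any(r > 0 for r in mpq_items)
```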
RESULTS
Sample characteristics
A total of 1091 patients completed all items of the MPQ-SF. Mean age was 55.3 years (SD 7.4, range 18–88, n = 1089 with age data). About 92% (n = 908 of 986) were white, 86% (n = 938 of 1091) were women, 70% (n = 704 of 1012) were married or living as married, 27% (n = 274 of 1008) had college education or higher, and 41% (n = 414 of 1003) had a full-time or part-time job. Mean duration since diagnosis of SSc was 10.9 years (SD 9.5, range 0–53.8, n = 1047), and 37.6% of patients (n = 402 of 1068) had diffuse SSc. Mean total body skin score was 10.2 (SD 9.6, range 0–48.0, n = 1052). Patients reported a mean total score of 6.1 on the MPQ-SF (SD 7.4, range 0–42), including 4.7 on the MPQ-SF sensory subscale (SD 5.8, range 0–33) and 1.4 on the MPQ-SF affective subscale (SD 2.1, range 0–12). Mean score was 3.6 (SD 2.8, range 0–10, n = 1025) on the pain NRS. Means and SD for other outcome variables are presented in Table 1. Of 1091 patients who completed the MPQ-SF, 985 also had complete data for the NRS. A total of 459 patients with PPI score ≥ 2 endorsed at least 1 MPQ-SF item and were included in the sensitivity analysis (supplementary data available from the authors upon request).
Comparison of MPQ-SF correlations with other outcomes versus NRS correlations
As shown in Table 1, correlations of the MPQ-SF and the NRS with the CES-D, HAQ, and SF-36 component scores were moderate, ranging from 0.32 to 0.59. The differences (Δr) between the MPQ-SF and NRS correlations with each of the other measures were small, varying between −0.05 and 0.15. The NRS correlated significantly more strongly than the MPQ-SF with the physical component of the SF-36 (Δr = 0.15, 95% CI 0.12 to 0.29). Results were similar for the sensitivity analysis (available from the authors upon request).
Confirmatory factor analysis
CFA was used to test the hypothesized 2-factor (sensory and affective) structure for the MPQ-SF. Four pairs of item error covariances were freed based on modification indices. In each case, both members of the pair demonstrated shared method or format features. Error variances were freed to covary for (1) throbbing with shooting, (2) shooting with sharp, (3) shooting with stabbing, and (4) stabbing with sharp. Model fit for the 2-factor model in the total sample was good [chi-square (55) = 198.3, p < 0.001, CFI = 0.97, TLI = 0.99, RMSEA = 0.05]. The correlation between the sensory and affective factors was 0.92. Model fit was also good for the single-factor model [chi-square (54) = 234.2, p < 0.001, CFI = 0.96, TLI = 0.99, RMSEA = 0.06]. As expected, given the large sample size, the chi-square-based DIFFTEST procedure that compared the fit of the 2 models was statistically significant [chi-square (1) = 26.1, p < 0.001]. The differences in goodness-of-fit indices between the 2-factor and single-factor models, however, were negligible (ΔCFI = 0.01, ΔTLI = 0.00, ΔRMSEA = −0.01; Table 2). For patients with significant levels of pain who were included in the sensitivity analysis, results were similar, and the 2 factors correlated at 0.86 with nonsubstantive differences in fit between 1- and 2-factor models (supplementary data available from the authors upon request).
Correlations of sensory and affective subscale scores with NRS and other outcome measures
The raw correlation between the sensory and affective subscale scores was 0.70 (95% CI 0.68 to 0.74, p < 0.001). Similar to analyses with the MPQ-SF total score, correlations between the sensory and affective scores and the CES-D, HAQ, and SF-36 component scores were moderate (0.30 to 0.49). Bivariate correlations between the sensory and affective subscales, the NRS, and other outcome measures were similar (Table 3), and all differences between sensory and affective subscale correlations (Δr) with other outcome measures were < 0.10. A statistically significant difference was found only for the MCS (Δr = 0.09, 95% CI 0.01 to 0.18). Results from sensitivity analyses were similar (available from the authors upon request).
DISCUSSION
We found that the sensory and affective factors of the MPQ-SF were highly correlated in a model that included all patients (r = 0.92) and a model of data only from patients who reported at least “discomforting” pain and endorsed at least 1 item on the MPQ-SF (r = 0.86). Both the very high intercorrelations of these 2 factors and the virtually identical fit indices that were obtained when 2-factor and single-factor models of the MPQ-SF were compared provide evidence that the sensory and affective factors of the MPQ-SF are not sufficiently distinguishable to make a practical difference among patients with SSc. When correlations of the sensory and affective subscales with measures of mental health, physical health, and disability were compared, the affective subscale of the MPQ-SF tended to be somewhat more closely associated with mental health outcome measures, but did not differ from the sensory subscale in comparisons to measures of physical function or disability.
When the correlations of the MPQ-SF total score with mental health, physical health, and disability outcome measures were compared to the corresponding correlations of the pain NRS, there were few significant or substantively meaningful differences. Notably, when differences were detected, the single-item measure tended to produce stronger correlations, which is not what one would expect if the single-item measure were less valid due to poor reliability compared to the 15-item MPQ-SF.
Three previous studies used CFA methods to assess the fit of the standard 2-factor model of the MPQ-SF22,23,24 in nonrheumatology samples. Two of these22,24 were studies of 188 and 373 patients with chronic back pain, and each found that the sensory and affective subscales were highly intercorrelated (0.88 and 0.89), similar to the results of our study. A third study23, conducted in a sample of 338 patients with major burn injury, reported a somewhat lower correlation of 0.77 and somewhat better fit indices for the 2-factor model compared to the single-factor model. Although the burn patients in that study had higher mean MPQ-SF scores (15.1) than those in our study (6.1 in total sample, 11.1 in sample of patients with pain), mean scores were similar to the 2 chronic back pain samples (14.9 and 16.5). This suggests that different intensities of pain across the samples would not likely explain the somewhat different model fit among burn patients. Regardless, the consistency of findings across samples of patients with chronic pain, including patients with SSc in our study and patients with chronic back pain from previous studies22,24, suggests that the sensory and affective subscales are not sufficiently different in these patient groups as to produce incremental benefit compared to the use of a single-item pain NRS.
On the whole, there were no consistent patterns to suggest that the MPQ-SF would be substantively advantageous for measuring pain among patients with SSc compared to a single-item measure of pain. This finding supports recommendations for the use of single-item measures of pain in SSc16. Indeed, without evidence for additional measurement benefits of the MPQ-SF versus single-item measures, single-item measures are preferable because they limit the response burden of patients and are more cost- and time-effective for researchers and clinicians. It is possible that the adjectives used to describe pain in the MPQ-SF may be useful descriptively or could be helpful in determining pain treatment for patients with SSc, but this should be examined in future research. It is also possible that despite their high correlations, the MPQ-SF and NRS may measure somewhat different components of pain, but this also would need to be demonstrated.
A number of limitations should be considered in evaluating these results. Our study was based on a convenience sample of patients with SSc. The sample included here tended to have a stable pattern of disease, as indicated by the mean duration of 10.9 years since time of diagnosis. Patients who are not treated by a rheumatologist, who have more severe SSc so as to limit participation in the CSRG Registry, or who die early in the course of their disease may be undersampled in the CSRG Registry. Therefore, our sample may have included an over-representation of healthier patients with SSc. On the other hand, our patient sample was drawn from a large number of centers across Canada, and the demographic and disease characteristics are consistent with other outpatient samples4.
Another limitation is that the MPQ-SF has not been validated in SSc. Related to this, given the cross-sectional design of this study, we were unable to conduct test-retest measures of reliability or to test for responsiveness to treatment of the MPQ-SF, its subscales, and the single-item pain measures. Other measures of pain or sleep, for instance, which may have been informative, were not available. This study did not assess the relative reliability, extent of measurement error, or responsiveness of the MPQ-SF compared to a single-item measure of pain in SSc. The study was observational and did not include a pain treatment component, so data could not be stratified based on treatment offered. Finally, the study did not account for different sources of pain, and it is possible that patients with pain caused by reflux, joint pain, or other sources might have different profiles.
The results of our study showed that the MPQ-SF did not perform substantively differently from a single-item pain NRS in its associations with other important patient-reported outcomes. Further, the sensory and affective factors of the MPQ-SF were too highly correlated to meaningfully distinguish separate constructs among patients with SSc. These results support recommendations for the use of single-item pain measures in SSc and suggest that the MPQ-SF would add to patient burden without improving measurement substantively. These results have implications for data collection in cohort studies and clinical trials, as well as for routine pain assessment in clinical settings.
APPENDIX
List of study collaborators. Canadian Scleroderma Research Group recruiting rheumatologists: J. Pope, London, Ontario; M. Baron, Montreal, Quebec; J. Markland, Saskatoon, Saskatchewan; N.A. Khalidi, Hamilton, Ontario; A. Masetto, Sherbrooke, Quebec; E. Sutton, Halifax, Nova Scotia; N. Jones, Edmonton, Alberta; D. Robinson, Winnipeg, Manitoba; E. Kaminska, Hamilton, Ontario; P. Docherty, Moncton, New Brunswick; C.D. Smith, Ottawa, Ontario; J-P. Mathieu, Montreal, Quebec; S. LeClercq, Calgary, Alberta; M. Hudson, Montreal, Quebec; S. Ligier, Montreal, Quebec; T. Grodzicky, Montreal, Quebec; C. Thorne, Newmarket, Ontario; S. Mittoo, Winnipeg, Manitoba; M. Fritzler, Advanced Diagnostics Laboratory, Calgary, Alberta.
Footnotes
- Supported by a research grant from the Fonds de la Recherche en Santé Québec (B.D. Thombs) and by an American College of Rheumatology Research and Education Rheumatology Investigator Award (B.D. Thombs). Dr. Baron is the director of the Canadian Scleroderma Research Group, which receives grant funding from the Canadian Institutes of Health Research, the Scleroderma Society of Canada and its provincial chapters, Scleroderma Society of Ontario, Sclérodermie Québec, and the Ontario Arthritis Society, and education grants from Actelion Pharmaceuticals and Pfizer Inc. Drs. Thombs and Hudson are supported by New Investigator Awards from the Canadian Institutes of Health Research and Établissement de Jeunes Chercheurs awards from the Fonds de la Recherche en Santé Québec.
- Accepted for publication July 6, 2011.