Abstract
Objective. An OMERACT consensus process recommended domains for investigation in fibromyalgia (FM) clinical trials. We used patient data to investigate variable importance in the determination of patient global and health-related quality of life (HRQOL) in FM and non-FM patients to determine whether variables were valued differently in FM compared with non-FM states.
Methods. We used ACR 2010 diagnostic FM criteria modified for epidemiological and clinical research to identify patients with rheumatoid arthritis (RA; N = 5884) with and without FM, and also characterized previously diagnosed patients with FM (N = 808) as to current criteria status. We measured variable importance by multivariable regression, decomposing regression variance by averaging over model orderings. We examined the distributions of key variables in the various disorders, and the distributions as a function of a FM severity index (fibromyalgianess).
Results. Out of 9 measures, pain, Health Assessment Questionnaire disability index, and fatigue explained more than 50% of explainable variance (50.49%–56.59%). Explained variance was similar across all disorders and diagnostic groups. In addition, the SF-36 physical component summary score varied across disorders as a function of fibromyalgianess.
Conclusion. The main determinants of global severity and HRQOL in FM are pain, function, and fatigue. But these variables are also the main determinants in RA and other rheumatic diseases. The content and impact of FM, whether measured by discrete variables or a fibromyalgianess scale, seem to be independent of diagnosis. These data argue for a common set of variables rather than disease-specific variables. Clinical use is supported and enhanced by simple measures.
Fibromyalgia (FM) is a disorder that can be conceived of as either a distinct syndrome, with its own criteria, or as the representation of the end of a spectrum of polysymptomatic distress1 and, therefore, not a distinct disorder2,3. Although clinical and epidemiological evidence favors the second conceptualization4, there are a number of situations where treating FM as a separate disorder has been thought to be helpful, such as in clinical care or in research investigations of severe pain and distress. Even so, available data suggest that patients may move along the continuum of polysymptomatic distress and, in doing so, go from criteria-positive FM to a state in which they no longer satisfy FM criteria5, a condition in which a spectrum disorder may be a more fitting conceptualization. In the American College of Rheumatology (ACR) 2010 diagnostic criteria study, 25% of patients previously diagnosed by rheumatologists did not satisfy ACR criteria at the followup study examination5.
How should the 25% be evaluated? Should they be considered to be patients with FM or not? Should there be one set of evaluations for those with FM and another set for all other patients? More generally, how should patients be evaluated along the entire spectrum of severity that we have elsewhere6 called “fibromyalgianess”?
In 2009, a series of publications emerged from a 5-year OMERACT process that evaluated “domains” in FM7,8, and treated the syndrome as a distinct disorder. Representatives of industry, FM experts, clinical trialists, attendees (N = 23), and patients (N = 4) went through a Delphi consensus process and identified and ranked FM syndrome domain constructs, an endpoint that was later voted on by 121 OMERACT attendees, including those with limited expertise in FM. In a separate OMERACT publication a “preliminary core dataset for clinical trials in fibromyalgia syndrome,” based on the domain deliberations, was identified. The core set included pain, tenderness, fatigue, patient global, multidimensional function, and sleep disturbance, a representation of the domains that were ranked as important by at least 70% of participants7.
The OMERACT process, which addressed only clinical trials, largely validated measures (called domains) that were already being used, and also recommended comprehensive scales such as the Short-Form 36 (SF-36)9,10 and the Fibromyalgia Impact Questionnaire (FIQ)11,12, among others. The latter questionnaires were characterized as evaluating domains of “multidimensional function.” The OMERACT process stated that the identified “domain outcome measures have generally proven to be reliable, discriminative, and feasible”7.
From a clinical standpoint, the questionnaire assessments studied for use in FM clinical trials are surprising. Almost none of them are suitable for use in the evaluation and care of clinical patients because of length and complexity, with the exception of the FIQ, which is not suitable for patients who do not have FM.
In this report we use actual patient data to address several questions raised by the OMERACT committee report and recommendations. But we do this in the context of variables used in clinical care rather than clinical trials, and we examine variables used in FM and non-FM patients more generally using the survey-modified ACR diagnostic criteria6. As these criteria do not require tender point examination, they allow large numbers of primary and non-primary FM patients to be studied. First, we ask which variables are most important to patients with FM in overall health-related quality of life (HRQOL) and global severity (patient global). In addition, we ask what is the ranking of the importance of these variables. We assess whether FM variables and results are different in primary FM compared with FM in rheumatoid arthritis (RA). Finally, we ask if the importance ranking of variables is similar in non-FM patients. In the end, our objective is to determine what variables can be recommended for clinicians and clinical studies.
Our primary tool in approaching the above questions is the determination of variable importance in multivariable regression analysis, an approach that can quantify the value of variables as they are used in a typical clinical setting. Importance, however, is not simple to determine when variables are correlated. We used tools developed by Grömping to address variable importance13. She observed that, “Assigning shares of ‘relative importance’ to each of a set of regressors is one of the key goals of researchers applying linear regression, particularly in sciences that work with observational data” and that, “...advances in computational capabilities have led to increased applications of computer-intensive methods like averaging over orderings that enable a reasonable decomposition of the model variance”14. We used this method13 to discover importance and rankings among FM variables in patients with FM and RA. We included RA patients in these analyses as they provide a common substrate (RA), and then examined FM-positive and FM-negative patients with RA.
MATERIALS AND METHODS
Patients and diagnoses
We studied participants in the National Data Bank for Rheumatic Diseases (NDB) longitudinal study of rheumatic disease outcomes15. Participants are volunteers, recruited from the practices of US rheumatologists, who complete mailed or Internet questionnaires about their health at 6-month intervals. They are not compensated for their participation. Diagnoses are made by the patient’s rheumatologist or confirmed by the patient’s physician in the small number of cases that are self-referred. Enrollment in the NDB began in 1998. The NDB utilizes an open-cohort design in which patients are enrolled continuously. Patients in this report completed at least one detailed semiannual questionnaire beginning in July 2009, the date the FM survey criteria variables became available. In the event more than one questionnaire was completed (e.g., assessments of July 2009 and January 2010), we randomly selected one of the 2 questionnaires for analysis.
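The selection of one questionnaire per patient can be illustrated with a minimal sketch. The record layout (patient identifier paired with a questionnaire label), the function name, and the seed are assumptions made for illustration; the report does not describe how the random selection was implemented.

```python
import random


def select_one_per_patient(records, seed=None):
    """Pick one questionnaire at random for each patient.

    `records` is an iterable of (patient_id, questionnaire) pairs; this
    layout is hypothetical and not taken from the NDB data files.
    """
    rng = random.Random(seed)  # seeded generator so the draw is reproducible
    by_patient = {}
    for patient_id, questionnaire in records:
        by_patient.setdefault(patient_id, []).append(questionnaire)
    # Patients with a single questionnaire keep it; others get one at random.
    return {pid: rng.choice(items) for pid, items in by_patient.items()}
```

Selecting a single observation per patient keeps the observations independent, which ordinary regression assumes.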
Study variables
Variables included 21-point visual analog scales (VAS) for pain, patient global assessment (“Considering all of the ways your illness affects you...”), fatigue and sleep problems [“How much of a problem has sleep (i.e., resting at night) been for you in the past week?”], and a vertical 101-point HRQOL VAS (similar to a thermometer) for recording an individual’s rating for their current HRQOL state that was derived from the EuroQol questionnaire16,17, and was anchored at its top and bottom with “perfect health” (100) and “dead” (0). For display in Table 1, it was recoded to 0–10, higher scores indicating better quality of life states. Patients also completed the Short-Form 36 (SF-36) version 1 from which the physical component summary (PCS) score was calculated9,10. The primary time period of the SF-36 questionnaire was 4 weeks.
To measure functional status, we used the Health Assessment Questionnaire disability index (HAQ). Patients also reported on the presence or absence of somatic symptoms, similar to those reported in the ACR 2010 diagnostic criteria study5, and a count of somatic symptoms (0–37) was obtained. We assessed morning stiffness by a 7-category scale: No stiffness, stiffness < 30 min, 30 min to 1 hour, 1 to 2 hours, 2 to 4 hours, 4 to 8 hours, and > 8 hours.
The Widespread Pain Index (WPI), a count of 19 self-reported painful regions, and part of the ACR 2010 FM diagnostic criteria, was obtained as described5,6. We also assessed cognitive impairment (trouble thinking or remembering), fatigue, and waking up tired (unrefreshed) on a 4-point scale using categories of 0: no problem; 1: slight or mild problems; generally mild or intermittent; 2: moderate; considerable problems; often present and/or at a moderate level; and 3: severe; continuous, life-disturbing problems.
RA and FM patients were classified as FM criteria-positive if they satisfied the modified ACR diagnostic criteria for clinical and epidemiologic studies6. Patients originally diagnosed by rheumatologists as having FM are described and analyzed separately according to their FM criteria status at the time of the study. It is possible for patients diagnosed by their rheumatologist at entry not to satisfy the modified ACR diagnostic criteria. The ACR 2010 diagnostic criteria are satisfied when the WPI is ≥ 7 and the Symptom Severity Score (SS) is ≥ 5, or when the WPI is between 3 and 6 and the SS is ≥ 9, provided symptoms have been present at a similar level for at least 3 months and the patient does not have a disorder that would otherwise explain the pain5. The survey criteria6 modified the symptom severity scale by substituting for the somatic symptoms item a 0–3 item that represented the sum of the presence or absence of headaches, abdominal pain, or depression symptoms occurring during the previous 6 months. The modified SS score was the sum of the severity of the 3 symptoms (fatigue, waking unrefreshed, cognitive symptoms) plus the sum of the number of the following symptoms occurring during the previous 6 months: headaches, abdominal pain, and depression (0–3). We created a 0–31 fibromyalgianess scale (FS) by summing the WPI and the SS scale, as described6. This scale provides a continuous measure of central characteristics of the FM definition.
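The scoring rules described above can be sketched as follows. This is an illustrative reading of the text, not the code used in the study: the class and field names are hypothetical, and the duration requirement (3 months) and the exclusion of other explanatory disorders are not modeled.

```python
from dataclasses import dataclass


@dataclass
class SurveyResponse:
    # Field names are hypothetical; ranges follow the text.
    wpi: int                 # Widespread Pain Index, 0-19 painful regions
    fatigue: int             # 0-3 severity
    waking_unrefreshed: int  # 0-3 severity
    cognitive: int           # 0-3 severity
    headaches: bool          # present during the previous 6 months
    abdominal_pain: bool
    depression: bool


def modified_ss(r):
    """Modified Symptom Severity score, 0-12."""
    severity = r.fatigue + r.waking_unrefreshed + r.cognitive       # 0-9
    symptoms = int(r.headaches) + int(r.abdominal_pain) + int(r.depression)  # 0-3
    return severity + symptoms


def fibromyalgianess(r):
    """Fibromyalgianess scale (FS), 0-31: WPI plus modified SS."""
    return r.wpi + modified_ss(r)


def meets_survey_criteria(r):
    """WPI >= 7 and SS >= 5, or WPI 3-6 and SS >= 9.

    Symptom duration and exclusion of other disorders are not checked here.
    """
    ss = modified_ss(r)
    return (r.wpi >= 7 and ss >= 5) or (3 <= r.wpi <= 6 and ss >= 9)
```

For example, a patient with a WPI of 8, severity items of 2, 2, and 1, and headaches only has a modified SS of 6 and an FS of 14, and satisfies the first branch of the rule.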
To distinguish FM in RA from patients carrying only a general diagnosis of FM, we use the term primary fibromyalgia to characterize this latter group.
Statistical analyses
The primary outcomes of this study include the importance rankings of the predictor variables of HRQOL and patient global. Because of the degree of printed detail associated with these analyses, we show the point estimates for variable importance in Tables 2 and 3 and the confidence intervals separately in Figures 1 and 2. These figures also offer a more accessible view of the relative and comparative group and variable rankings. A simple way of thinking of importance is to consider the amount of “explained variance” contributed by each variable, and then ranking the variables by that amount. We used the method of Grömping to calculate importance statistics13,14. Confidence intervals were based on 1000 bootstrap replicates. Grömping points out that the confidence intervals “can be somewhat liberal”13. The analyses also provide data on the statistical differences between predictor variable importances that we do not show in this report. These data are available in a log file from the first author.
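The idea of decomposing explained variance by averaging over model orderings can be sketched in a few lines. The sketch below is a generic pure-Python illustration, assuming an ordinary least-squares model with an intercept; it is not the software used in this study, the function names are hypothetical, and the bootstrap confidence intervals are omitted. Each predictor's share is its gain in R² averaged over all orders in which the predictors could enter the model, computed here through the equivalent weighted sum over subsets.

```python
import itertools
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(rows, y, cols):
    """R^2 of a least-squares fit of y on an intercept plus columns `cols`."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - sum(b * v for b, v in zip(beta, d))) ** 2
                 for d, yi in zip(design, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def ordering_averaged_importance(rows, y):
    """Share of R^2 per predictor, averaged over all entry orders."""
    p = len(rows[0])
    cache = {}

    def r2(cols):
        key = tuple(sorted(cols))
        if key not in cache:
            cache[key] = r_squared(rows, y, list(key))
        return cache[key]

    shares = [0.0] * p
    for k in range(p):
        others = [j for j in range(p) if j != k]
        for size in range(p):
            # Fraction of orderings in which exactly `size` given predictors precede k.
            weight = (math.factorial(size) * math.factorial(p - size - 1)
                      / math.factorial(p))
            for subset in itertools.combinations(others, size):
                shares[k] += weight * (r2(subset + (k,)) - r2(subset))
    return shares
```

The shares sum to the R² of the full model, so dividing each share by that total gives the percentage of explained variance attributed to each variable. With 9 predictors the subset enumeration covers 512 models, which is why results are cached.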
We also describe the distribution of the key variables in Figures 3 and 4 using standard histogram methods. This report does not contain formal statistical testing of hypotheses except as may be implied by confidence intervals or standard deviation. We avoided statistical testing, as the goal of the report was to describe graphically the variable relationships and commonalities. In that sense we were more interested in similarities than differences.
In the graphs of Figure 4 we made some changes to the data to enhance readability. For the right panel we used case weights of 2 (instead of 1) to enhance visibility of the shape of the distribution. In the lower left panel we multiplied the fibromyalgianess score by 8 so that it could more easily be seen against the histogram. The actual range of the scale is 0–31; in this figure the range therefore appears as 0 to 248. The fibromyalgianess line represents a running-line smooth of the relation between the fibromyalgianess scale and PCS18.
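The display rescaling and the smoothing step can be illustrated with a simplified sketch. The smoother below fits a least-squares line in a symmetric neighborhood of each point; the window size, function names, and the choice of PCS as the horizontal variable are assumptions for illustration and do not reproduce the statistical package routine used for the figure.

```python
def rescale_for_display(fs_scores, factor=8):
    """Multiply FS scores (0-31) by a display factor; 8 maps 31 to 248."""
    return [factor * s for s in fs_scores]


def running_line_smooth(x, y, half_window=2):
    """Simplified running-line smoother.

    Points are sorted by x (for example, PCS). For each point, a straight
    line is fitted by least squares to the points within `half_window`
    positions on either side, and the fitted value at that point is kept.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    fitted = []
    for i in range(len(xs)):
        lo = max(0, i - half_window)
        hi = min(len(xs), i + half_window + 1)
        wx, wy = xs[lo:hi], ys[lo:hi]
        mx = sum(wx) / len(wx)
        my = sum(wy) / len(wy)
        sxx = sum((v - mx) ** 2 for v in wx)
        sxy = sum((v - mx) * (w - my) for v, w in zip(wx, wy))
        slope = sxy / sxx if sxx > 0 else 0.0  # flat line if all x are equal
        fitted.append(my + slope * (xs[i] - mx))
    return xs, fitted
```

Rescaling changes only the plotted height of the line, not its shape, so the relation between fibromyalgianess and PCS is unaffected.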
We used Stata version 11.0 and the R statistical package in the analyses19,20.
RESULTS
The mean age was 63.3 (SD 12.2) years for all RA patients and 58.9 (SD 12.0) years for all FM patients. For gender, 19.8% of RA patients were male and 3.4% of FM patients were male. The median disease duration for RA was 15.3 years and for FM 16.6 years. As expected, patients with FM, with or without RA, had more severe symptoms than those not satisfying FM criteria (Table 1). This may be seen by noting that all variables were more abnormal in the criteria-positive primary FM and the RA FM groups than in the non-FM RA groups, and that, similarly, the primary criteria-positive FM group was more abnormal than the complete FM group. Symptom levels were generally similar in the RA FM group compared with criteria-positive primary FM with the exception that HAQ scores and morning (AM) stiffness were more severe in RA than in non-RA patients.
The most important multivariable predictor of HRQOL was HAQ, followed by pain and fatigue (Table 2, Figure 1). Of the 9 predictor variables, these 3 variables explained more than 50% of the explained variance (50.4%–56.5%). Within variables, there was little difference in importance among the groups except for cognitive symptoms, which appeared to be increased in primary FM compared with RA groups. Overall, the explained variance for HRQOL within the groups was small, ranging from 17% to 34%. The variance was expected to be smaller in the subset groups because of the constricted range of symptom severity. The explained variance for RA FM was 19.7% compared with 17.1% for criteria-positive primary FM. The maximum explained variance in FM, obtained after adding the SF-36 PCS to the model, was 24.7% for both RA FM and criteria-positive primary FM. In general, when the SF-36 PCS is added to the model (Table 2), the explained variance increases by about 5%.
There were several differences noted when we examined the prediction of patient global assessment (Figure 2, Table 3). First, the explained variance more than doubled compared with the HRQOL variable, indicating that VAS HRQOL is not as influenced by clinical variable severity as is patient global. Once again, explained variance was similar in the RA and criteria-positive FM groups. In these analyses the order of variable importance was pain, fatigue, and HAQ, and as with the previous analyses, explained variance was similar for the variables across the different diagnosis groups. There is also an increase in explained variance of up to 5% when SF-36 PCS is added to the model, as shown in Table 3.
We next examined whether the distribution of values was similar in RA FM and in criteria-positive primary FM. Figures 3a and 3b show a high degree of similarity between the key study variable distributions between the 2 groups. The levels of the study variables (Table 1) and distribution of scores (Figures 3a and 3b) suggest strong similarity between RA FM and primary FM.
In Figure 4 we explore the relationships between PCS, discrete FM, and a continuous fibromyalgianess scale. The right panel shows the distribution of PCS scores in primary criteria-positive FM. This distribution is very similar to the distribution of RA FM patients (in black) in the upper left panel. FM can be seen in that panel to occupy the end of the PCS continuum. In the lower left figure, we superimpose a line that describes the relationship between the fibromyalgianess scale and the PCS (see Materials and Methods for details). Overall, the data show the similarity of FM in RA and non-RA, and the continuous relationship of fibromyalgianess and PCS.
DISCUSSION
The issues raised in the OMERACT recommendations provided the impetus to analyze the clinical data of this report. It should be observed that with the possible exception of tenderness, no variable or domain identified in the OMERACT reports is unimportant in other diseases, rheumatic or not. Pain, function, fatigue, sleep, global, and quality of life (multidimensional function) are domains that are measured in all rheumatic disorders. This raises the question of why we might need separate measures for FM. Data from our study show that pain, function, and fatigue are not more important in FM than in RA. Although for reasons of space we did not report osteoarthritis data, we found results in osteoarthritis similar to those reported in this study. In addition, Figure 4 demonstrates the continuous nature of FM content (fibromyalgianess) in RA by showing how fibromyalgianess scores follow the SF-36 PCS scores. As indicated above, we used RA as a common substrate for these analyses to explore the issues of variables and fibromyalgianess, but similar results could be found in other rheumatic diseases. Sleep disturbance has been shown to be prevalent and important in RA21, though it is only sometimes evaluated in clinical trials.
In contrast to the OMERACT report that asked experts to rank variables for selection as core items in clinical trials, we used multivariable methods to describe the level of importance that correlated variables have in predicting global severity in patients with FM. We found that pain, function, and fatigue are the central variables in RA, with or without FM, as well as in patients with criteria-positive and criteria-negative FM. The only important difference among diagnostic groups appears to be that scores are more abnormal in those with FM. Although we did not collect data on tenderness, the very strong correlation between tenderness and the WPI (r = 0.773) and the revision of the FM criteria to include the WPI5 suggest that WPI and tenderness can be substituted. The quality of distress identified by the fibromyalgianess index has been shown to be relevant across multiple rheumatic diseases6.
We found that HAQ, as a measure of physical function, was very important to patients (Tables 2 and 3, Figures 1 and 2). But the OMERACT criteria do not recommend a measure of physical function. The HAQ has the advantage of being used extensively in all rheumatic diseases22. In our work, the HAQ and the function scale of the SF-36 perform equally well in predicting work disability, hospitalization, costs, and mortality. But the HAQ, and its congeners, have singular advantages: feasibility, usefulness in the clinic, and a common metric across rheumatic disease. By contrast, scoring of the SF-36 is complex and the questionnaire is not suitable for use in the clinic. While the FIQ has a function scale and some degree of validation, it is not usable across diseases, and its psychometric properties are largely unknown. Wolfe, et al reported that the FIQ “systematically underestimated functional impairment by its handling of activities not usually performed”23. The FIQ may also have gender bias11, an issue that can be important as the ACR 2010 criteria appear to identify a far greater proportion of men with FM compared with the 1990 classification criteria24.
The OMERACT reports advocated the use of 2 measures that they called multidimensional function. Specifically, they endorsed the multidimensional SF-36 and the FIQ, although it is hard to understand why multidimensional questionnaires should be considered as a separate individual dimension rather than a multidimension. For reasons stated above, both the SF-36 and the FIQ have important limitations for clinicians.
All these issues bring us to a central result of our study, the commonality of response across diagnostic groups, and raise the issue of whether there is any need for a separate FM core set. We suggest that common assessments already in use in other rheumatic diseases, although not necessarily core measures in those disorders, are sufficient to assess patients with FM, perhaps with the addition of the WPI, symptom severity scale, or fibromyalgianess index, variables that became available with the publication of the 2010 criteria5,6. On a practical level, patients who have been diagnosed with FM and then improve substantially create the puzzling problem of using a questionnaire for a condition they no longer have. In addition, if FM is considered part of a continuum (see Figure 4), it seems reasonable that assessments suitable for persons at any position along the continuum should be used.
Finally, we think it is appropriate to raise the issue of identifying assessments for use in clinical care. Clinicians struggle in assessing patients with FM-like conditions accurately and usefully. We submit that the promulgation of simple methods of assessment is more useful for clinicians than a core set for FM clinical trials. In RA, it is common for clinicians to collect pain, global, and HAQ data, and some clinicians also collect fatigue and sleep measures. We believe that understanding and clinical care of FM will be enhanced with a set of common rheumatic disease variables. In addition, the revised ACR 2010 diagnostic criteria provide mechanisms for detailed FM assessments.
We used a VAS HRQOL scale because we thought it might be more detached from clinical values than the EQ-5D itself. We found, in agreement with Harrison, et al25, that the VAS global and VAS HRQOL were not the same. The correlation between these variables was 0.52 in our study and 0.56 in the Harrison study. In addition, in results not shown, the EQ-5D had explained variance similar to that of the patient global. Our analyses are not meant to suggest that the VAS HRQOL should be used. As noted, the purpose was to examine a “less clinical variable.”
Among the limitations of this report are the following. We used a “problem with sleep” scale instead of a “problem with unrefreshed sleep” scale. The latter measure is slightly better with respect to FM characteristics than the more common “problem with sleep” scale. However, the differences are small. The correlation between fatigue and sleep problem was 0.544 and between fatigue and unrefreshed sleep (4-point scale) 0.616. In addition, the 4-point cognitive severity scale, while useful for clinic work, may not be sufficient to identify the full quality of cognitive problems.
In summary, the main determinants of global severity and quality of life in FM are pain, function, and fatigue. But these variables are also the main determinants in RA and other rheumatic diseases. The content and impact of FM, whether measured by discrete variables or by a fibromyalgianess scale, seem to be independent of diagnosis. These data argue for a common set of variables rather than disease-specific variables. Clinical use is supported and enhanced by simple measures.
- Accepted for publication September 13, 2010.