Abstract
Objective. The American College of Rheumatology (ACR) tender point (TP) criterion is used in diagnosing fibromyalgia syndrome (FM). There has been little research investigating patterns of positive TP. We investigated response patterns of TP in a sample of patients with FM.
Methods. Manual TP survey data were available on 1433 patients with FM. Factor analysis was conducted on ACR TP and control (CON) points. Factor scores were cluster analyzed to identify subgroups based on TP scores. Subgroups were compared on demographic and psychosocial variables.
Results. Factor analysis resulted in 4 TP groupings: neck/shoulder girdle, gluteal/trochanteric, and upper extremity regions, and a set of CON TP. Cluster analysis revealed 3 clusters. Group 1 was high on all 3 TP regions and the CON set; Group 2 moderate on the 3 TP regions, low on the CON set; and Group 3 was relatively low on all 3 TP regions and the CON set. The group highest on the CON and TP regions reported the greatest pain (7.58 ± 1.23; p < 0.001), sleep disturbance (7.05 ± 1.61; p < 0.001), anxiety (10.14 ± 4.57; p < 0.001), and depression (8.42 ± 4.4; p < 0.001).
Conclusion. TP severity ratings varied among cluster groups, suggesting patients with FM are not homogeneous. Variations in TP severity provide information regarding the degree to which FM affects patients’ quality of life. Patients with elevated scores on the CON TP demonstrated a general pattern reflecting lower thresholds for symptom reporting and, perhaps, disease severity. Research is needed to elucidate mechanisms underlying heterogeneity among the FM population.
- FIBROMYALGIA
- TENDER POINTS
- SUBGROUPS
- DIAGNOSIS
Although fibromyalgia syndrome (FM) has been formally recognized for more than 15 years, has a history that long precedes formal recognition, and is believed to be a prevalent disorder with an extensive literature, there is a great deal of concern about the diagnosis itself.
Prior to 1990 various investigators proposed a wide range of criteria for classifying FM. In 1990 a multicenter study was performed to create consistency in the classification criteria1. The authors concluded that the classification of FM could be based on 2 criteria: (1) widespread pain (i.e., 3 of 4 quadrants of the body and axial) of at least 3 months’ duration; and (2) pain reported following palpation of at least 11 of 18 (9 bilateral) specific locations throughout the body, the so-called tender points (TP). The American College of Rheumatology (ACR) criteria had a sensitivity of 88.4% and specificity of 81.1% when differentiating patients with FM from those with low back pain and rheumatoid arthritis1. Thus the criteria appeared to offer an acceptable level of accuracy in diagnosis, and more important, a consistent set of criteria to be used in both research and clinical practice.
Although the ACR criteria provided a basis for the standardization of FM classification, a number of issues have been raised challenging their validity. These can be grouped into 2 general sets, namely, methodological and conceptual. Several of the major methodological and conceptual issues are discussed below.
From a methodological standpoint, concerns have been raised that the ACR criteria depend on binary responses by patients during palpation of each TP: a patient’s score consists of the number of points (out of a total of 18) that he or she identifies as painful in response to pressure of 4 kg. The ACR suggests a minimum of 11 of 18 TP to meet the criteria for FM. However, this process creates a truncated distribution that may mask potentially important differences in ratings of TP severity.
An alternative strategy recommended to provide more sensitive assessments of TP is the Manual Tender Point Survey (MTPS) that consists of the mean of patients’ ratings of the severity of pain on an 11-point scale (0 = no pain; 10 = worst pain possible) during palpation of each point2. Data from the MTPS indicate that when patients use an 11-point scale to rate their pain, it is possible to identify effects of treatments and make distinctions among different groups of FM patients that are masked when binary ratings are used3.
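The contrast between binary TP counts and mean severity ratings can be illustrated with a minimal sketch. The two patients below are hypothetical, the function names are ours, and the cutoff for a positive point (`positive_threshold=1`) is an assumption made only for this example.

```python
from statistics import mean

def acr_tender_point_count(ratings, positive_threshold=1):
    """Binary scoring: count points rated at or above the (assumed) cutoff."""
    return sum(1 for r in ratings if r >= positive_threshold)

def mtps_severity(ratings):
    """Severity scoring: mean of the 0-10 ratings across palpated points."""
    return mean(ratings)

# Two hypothetical patients, each rated on the 18 ACR points.
patient_a = [2] * 18  # every point mildly painful
patient_b = [9] * 18  # every point severely painful

count_a = acr_tender_point_count(patient_a)
count_b = acr_tender_point_count(patient_b)
severity_a = mtps_severity(patient_a)
severity_b = mtps_severity(patient_b)

print(count_a, count_b)        # same binary count for both patients
print(severity_a, severity_b)  # mean severity separates them
```

Both hypothetical patients exceed the 11-point threshold with identical counts, whereas the mean rating distinguishes them; this is the masking effect described above.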
At a conceptual level, fundamental questions have been raised about the biological significance of TP. Early theories suggested that FM was a soft-tissue disorder with multiple areas of muscle tenderness and chronic widespread pain4,5. In contrast, myofascial trigger points, which were considered distinct from FM tender points, resulted in referred pain to a distant but specific site. Accordingly, the 18 survey sites (standard TP) designated in the ACR criteria ranged across muscle bellies or tendon insertions. Support for this theory was provided by a widely cited study that demonstrated control sites not located over muscle bellies or tendons did not differ between FM and control groups4.
Since 1990, many FM investigators have postulated that central nervous system (CNS) sensitization plays a fundamental role in the pathogenesis of FM, so that patients with FM are more sensitive than others to a wide range of sensory stimulation6,7. A corollary of this hypothesis is that FM patients should show generalized tenderness, rather than selective tenderness over specific muscles. Thus, the CNS sensitization model implies that both control and test sites should be more sensitive, whereas the earlier model stressed the importance of test sites.
Both the CNS sensitization and the soft-tissue disorder models postulate high covariation among responses to the 18 survey sites. However, some researchers have found evidence for patterns in sensitivity among these sites. The original multicenter study that examined many locations found that 18 specific sites reliably differentiated patients with FM from patients with low back pain and rheumatoid arthritis1. Moreover, Turk, et al studied tenderness among individuals with recent whiplash injuries8. They found that 46% of the cohort met the ACR tender point criterion for a diagnosis of FM. However, the muscular sensitivity of these individuals was not randomly distributed, but rather was clustered in neck and shoulder girdle muscles. These results support the view that FM patients might show clustering in their sensitivity to palpation of different survey sites, and that these clusters are meaningfully related to disorders that affect specific regions of their bodies.
Given the conceptual and methodological issues surrounding FM TP, it is noteworthy that very little research has been done to explore the relationship among TP within FM patients.
The purpose of our study was to use data from 2 large multicenter trials to explore patterns of variability within the MTPS. Specifically, factor analysis was used to explore the dimensional structure within the 18 ACR TP and 3 control TP. Then cluster analysis was used to determine whether there are differing patterns of responses based on the dimensions identified within the MTPS. We hypothesized that FM patients would show clustering in their sensitivity to palpation of different survey TP sites.
MATERIALS AND METHODS
Sample
Demographic and baseline data from 2 randomized, double-blind, multicenter, placebo-controlled pharmaceutical trials were used in our current study. The first study was conducted at 79 research sites throughout the United States and consisted of 748 FM patients9, and the second was conducted at 84 research sites throughout the United States and consisted of 731 FM patients10. All participants met ACR criteria for FM. MTPS data were available on 1433 (96.8%) of the combined 1479 patients, and factor analysis was conducted on this set of patients. Following factor analysis, the 1433 participants were randomly divided into 2 groups utilizing the SPSS random sample generator. The first sample (n = 690) was utilized in the initial cluster analysis, and the second sample (n = 743) was utilized to cross-validate the cluster solution.
Measures
The Manual TP Survey (MTPS)
The MTPS was developed as a standardized TP procedure to identify positive TP and determine individual TP pain severity2. Patients report TP pain severity on a scale of 0 (no pain) to 10 (most pain possible). In addition to the standard 18 TP included in the ACR criteria, 3 control TP (i.e., mid-forehead, left thumbnail, right dorsum forearm) are assessed.
The Hospital Anxiety and Depression Scale (HADS)
The HADS11 was developed to measure symptoms of anxiety and depression among nonpsychiatric medical inpatients. This self-report questionnaire contains 14 items that generate scores for depressive (0–21) and anxious (0–21) symptoms.
The Fibromyalgia Impact Questionnaire (FIQ)
The FIQ12 is a self-report instrument designed to measure the overall influence of FM over multiple dimensions, such as function, pain, and psychological distress. The FIQ consists of 20 items and is scored from 0 to 100, with higher scores associated with greater effect of FM. The first section consists of 11 questions related to physical functioning, and the average of these 11 items is known as the physical functioning scale of the FIQ, and is reported in the current study.
Mean pain score
Patients were asked to select the number that best described their FM pain during the past 24 hours, on an 11-point scale with 0 being “no pain” and 10 being “worst possible pain.”
Mean sleep score
Patients were asked to select the number that best described the quality of their sleep during the past 24 hours, on a scale from 0 to 10, 0 being the “best possible sleep” and 10 the “worst possible sleep.”
Analyses
All data analyses were conducted utilizing SPSS for Windows, version 14 (SPSS, Chicago, IL, USA). Principal components analysis, with oblique rotation, was conducted on the 21 TP [18 ACR and 3 controls (i.e., mid-forehead, left thumbnail, right dorsum forearm)] at baseline. A TP was considered to load significantly on a factor if it met 2 criteria: (1) the loading on the factor was > 0.39; (2) the point did not cross-load on a second factor, where cross-loading was defined as a difference of less than 0.15 between the loading of a TP on 2 different factors.
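The two loading criteria can be expressed as a short decision rule. The sketch below is illustrative only: the loading values are invented, the function name is ours, and comparing absolute loadings is an assumption (the text does not state how negative loadings were handled).

```python
def assign_items(loadings, min_loading=0.39, min_gap=0.15):
    """Map each item to a factor index, or None if it fails either criterion.

    loadings: dict of item name -> list of loadings, one per factor.
    Criterion 1: the largest loading must exceed min_loading.
    Criterion 2: the gap to the second-largest loading must be at least
    min_gap; a smaller gap counts as cross-loading.
    """
    assignment = {}
    for item, row in loadings.items():
        # Rank factors by absolute loading (assumption: sign is ignored).
        ranked = sorted(((abs(v), i) for i, v in enumerate(row)), reverse=True)
        (top, top_idx), (second, _) = ranked[0], ranked[1]
        if top > min_loading and (top - second) >= min_gap:
            assignment[item] = top_idx
        else:
            assignment[item] = None
    return assignment

# Hypothetical loadings on 4 factors (not the values in Table 3).
example_loadings = {
    "trapezius": [0.72, 0.10, 0.05, 0.08],
    "knee": [0.12, 0.45, 0.41, 0.03],       # cross-loads on factors 1 and 2
    "forehead": [0.05, 0.02, 0.11, 0.81],
    "weak_item": [0.30, 0.05, 0.02, 0.01],  # no loading above 0.39
}

assignment = assign_items(example_loadings)
print(assignment)
```

Items mapped to `None` would be dropped from the factor solution under these rules.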
Cluster analysis was performed to identify TP profiles among the set of patients. The k-means clustering procedure, which allocates each data point to one of a specified number of clusters based on its distance to the cluster centroids, was used to classify patients into unique clusters13. The number of clusters retained was based on 2 criteria: stability (i.e., reproducibility) and interpretability. A solution was considered stable if the centroids produced in the second sample were within one-half SD of the centroids produced in the first sample. The cluster groups must also be interpretable, which refers to the alignment of the clusters with clinical reports and experience of working with FM patients.
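For readers unfamiliar with k-means, the following is a minimal sketch of the standard assign-and-update loop (Lloyd's algorithm). It is not the SPSS implementation used in the study; the function names are ours, the data points are invented, and the initial centroids are passed in explicitly so that the example is deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, initial_centroids, max_iterations=100):
    """Alternate nearest-centroid assignment and centroid update until stable."""
    centroids = [tuple(c) for c in initial_centroids]
    k = len(centroids)
    labels = [0] * len(points)
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical factor scores (NS, GT, UE, CON) for 6 patients.
points = [
    (8.0, 8.0, 8.0, 4.0), (7.0, 8.0, 7.0, 3.0),
    (5.0, 6.0, 6.0, 1.0), (6.0, 6.0, 6.0, 1.0),
    (4.0, 4.0, 2.0, 1.0), (4.0, 5.0, 3.0, 1.0),
]
centroids, labels = k_means(points, [points[0], points[2], points[4]])
print(labels)
print(centroids)
```

Because k-means can converge to different solutions from different starting centroids, replication in a second sample, as described above, is a useful check on the stability of a solution.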
Cluster analysis was first performed on Sample 1 (n = 690), and then cross-validated on Sample 2 (n = 743). Following determination of cluster groups, data from Sample 1 were used to determine the external validity of the cluster analysis through significance tests that compared groups defined by the cluster solution on a set of relevant clinical variables14. Chi-square tests of significance were utilized on categorical variables and analysis of variance (ANOVA) on continuous variables.
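The one-way ANOVA comparison of cluster groups on a continuous variable can be sketched as follows. This is a plain-Python illustration with invented scores, not the SPSS procedure; it returns the F statistic, its degrees of freedom, and eta squared (between-group sum of squares divided by total sum of squares) as an effect size. It does not compute a p value, which would require the F distribution.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared

# Hypothetical pain scores for 3 cluster groups (3 patients each).
example_groups = [[7, 8, 9], [5, 6, 7], [3, 4, 5]]
f_stat, df_b, df_w, eta_sq = one_way_anova(example_groups)
print(f_stat, df_b, df_w, eta_sq)
```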
RESULTS
Demographic data for the total sample (n = 1433) and the 2 subsamples are presented in Table 1. The total cohort was predominantly female (94.6%), overweight [mean body mass index (BMI) = 30.75 ± 7.31], and middle-aged (age 49.39 ± 11.24 yrs). On average, patients had been experiencing FM symptoms for 95.64 (± 2.51) months. Statistical analyses comparing the 2 subsamples revealed no significant differences on any of the variables, suggesting samples were each representative of the total cohort.
The frequency and average TP scores for the total cohort are presented in Table 2. Paired t-tests revealed no significant differences between left and right TP of each respective pair of TP, thus only right TP are presented. Each of the 18 TP was positive more than 90% of the time when utilizing a minimum score of 1.
Factor analysis
Principal components analysis was conducted on 21 TP, first forcing a one-factor solution. The total amount of variance accounted for by one factor was 40.17%. Since the one-factor solution did not account for a majority of the variance, a principal components analysis, with oblique rotation, was conducted on the 21 TP (18 ACR and 3 controls). The number of factors was selected based on evaluation of the scree plot, size of the eigenvalues (> 1 considered acceptable for factor retention), and interpretability. Although a 5-factor solution was supported by the eigenvalue > 1 criterion, evaluation of the scree plot, in addition to clinical interpretation, led us to retain the 4-factor solution. The first 3 groups are associated with body regions: the neck and shoulder TP formed Factor 1 (NS), the gluteal/trochanteric TP formed Factor 2 (GT), and the upper extremity TP (lateral epicondyle) formed Factor 3 (UE); Factor 4 consisted of all 3 control points (CON). The 2 knee TP cross-loaded on both the GT and the UE factor, and thus were deleted. All respective right and left TP loaded on the same factor (Table 3).
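The eigenvalue > 1 rule and the proportion of variance explained can be illustrated with a short sketch. It assumes the components are extracted from a correlation matrix, in which case the eigenvalues sum to the number of variables; under that assumption, 40.17% of the variance in 21 TP would correspond to a first eigenvalue of about 8.4. The eigenvalues below are invented for a hypothetical 6-variable problem and the function names are ours.

```python
def variance_explained(eigenvalues):
    """Proportion of total variance attributable to each component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]

def kaiser_count(eigenvalues):
    """Number of components with an eigenvalue greater than 1."""
    return sum(1 for ev in eigenvalues if ev > 1.0)

# Hypothetical eigenvalues for 6 standardized variables (they sum to 6).
example_eigenvalues = [2.7, 1.2, 1.05, 0.5, 0.35, 0.2]

proportions = variance_explained(example_eigenvalues)
n_retained = kaiser_count(example_eigenvalues)
print([round(p, 3) for p in proportions])
print(n_retained)
```

The eigenvalue rule gives only a count of candidate components; as in the analysis above, the scree plot and interpretability can justify retaining fewer.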
Evaluation of the component correlation matrix revealed that although the factors were well delineated using oblique rotation, some were moderately correlated with one another. The highest correlations were between the NS and GT (r = 0.42), UE (r = 0.39), and CON factors (r = 0.37). Remaining correlations were all less than 0.3 (GT/UE, r = 0.29; GT/CON, r = 0.25; UE/CON factor, r = 0.25).
Based on factor analysis, the MTPS (including the 18 ACR TP and 3 control TP) revealed 4 modestly correlated factors. This result supports the possibility that there is some underlying structure within the MTPS, such that NS, GT, and UE TP tend to group somewhat independently. The correlations among the NS and the remaining 3 factors ranged from 0.37 to 0.42, however, suggesting that although the TP formed unique dimensions, they shared some common variance. To further explore the relationship among TP, cluster analysis, which aims to maximize the homogeneity of FM patients within clusters and the heterogeneity between clusters, was conducted utilizing the factor scores from the factor analysis.
Development of TP profiles
The k-means clustering procedure was conducted with 4 factor mean scores (NS, GT, UE, and CON) as the clustering variables, with iterations of 2, 3, 4, 5, and 6-cluster solutions. The cluster analyses were repeated on the second sample for cross-validation, with successful replication with up to 3 groups. Comparisons between centroids indicated the mean factor scores of Sample 2 were within one-half SD of the mean factor scores for 11 of 12 comparisons for the 3-cluster solution in Sample 1. In addition, meaningful differences were identified on external variables of interest (Table 4), and thus the 3-cluster solution was retained (Figure 1).
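The stability comparison between samples can be sketched as a count of centroid comparisons that fall within one-half SD. All values below are invented for illustration, the function name is ours, and two assumptions are made: the clusters in the two samples have already been matched in the same order, and the SD used is that of each factor in the first sample.

```python
def within_half_sd(centroids_1, centroids_2, sds, tolerance=0.5):
    """Return (number of comparisons within tolerance * SD, total comparisons)."""
    hits = 0
    total = 0
    for c1, c2 in zip(centroids_1, centroids_2):  # matched clusters
        for m1, m2, sd in zip(c1, c2, sds):       # one comparison per factor
            total += 1
            if abs(m1 - m2) <= tolerance * sd:
                hits += 1
    return hits, total

# Hypothetical centroids on 4 factors for a 3-cluster solution.
sample_1 = [(7.5, 8.1, 7.6, 3.8), (5.3, 6.3, 6.0, 1.1), (4.3, 4.2, 2.3, 1.2)]
sample_2 = [(7.4, 8.0, 7.7, 3.0), (5.4, 6.1, 6.1, 1.2), (4.1, 4.4, 2.5, 1.0)]
factor_sds = (1.5, 1.8, 1.6, 1.4)  # assumed SD of each factor in sample 1

hits, total = within_half_sd(sample_1, sample_2, factor_sds)
print(hits, "of", total, "comparisons within one-half SD")
```

With 3 clusters and 4 factors there are 12 comparisons, which is the denominator reported above for the 3-cluster solution.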
Cluster 1: High on NS, GT, UE and high on CON (HH)
Factor means for each of the clusters are presented in Figure 1. Examination of Figure 1 suggests that Cluster 1, which included 223 respondents (32.3%), is highest on all 3 TP regions (NS, GT, UE) and the CON set. All average factor scores were greater than one-half SD from their respective means (NS = 7.53 ± 1.25, GT = 8.12 ± 1.44, UE = 7.59 ± 1.70, CON = 3.76 ± 1.91), suggesting they were different from the average TP scores. With respect to all other clusters, the first cluster scored at least 25% higher than the other clusters on all standard TP regions (NS, GT, and UE), and over 2.5 times higher on the control set, and was thus labeled the high standard TP/high control factor (HH) group.
Cluster 2: Moderate on NS, GT, UE, and low on CON (ML)
The second cluster comprised 38.8% (n = 268) of the sample, and scores on the standard TP factors were within one-half SD of the means on all standard TP factors (NS = 5.3 ± 1.39, GT = 6.29 ± 1.83, UE = 6.02 ± 1.50), and lower than one-half SD on the control factor (CON = 1.12 ± 1.16). Thus, the second cluster was considered moderate on the TP standard regions, and low on the control set (ML).
Cluster 3: Low on NS, GT, UE, and CON (LL)
About 28.8% of respondents (n = 199) were included in the third cluster group. The scores on both the standard TP regions and CON factor for this cluster were more than one-half SD lower than the mean (NS = 4.27 ± 1.74, GT = 4.19 ± 2.19, UE = 2.34 ± 1.55, CON = 1.15 ± 1.28), and this cluster was thus labeled the low standard TP/low control TP (LL) group.
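The labeling rule used for the three clusters (high, moderate, or low relative to one-half SD around the overall mean) can be written as a small function. This is a sketch under stated assumptions: the overall mean and SD below are hypothetical, since the text reports only the cluster-level values, and the function name is ours.

```python
def label_level(cluster_mean, overall_mean, overall_sd, band=0.5):
    """Classify a cluster mean by its distance from the overall mean in SD units."""
    deviation = (cluster_mean - overall_mean) / overall_sd
    if deviation > band:
        return "high"
    if deviation < -band:
        return "low"
    return "moderate"

# Hypothetical overall mean and SD for one TP factor.
overall_mean, overall_sd = 5.7, 2.0

levels = [label_level(m, overall_mean, overall_sd) for m in (7.5, 5.3, 4.3)]
print(levels)
```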
Validation of cluster solution
In addition to cross-validating the cluster solution on a separate sample, significance tests were conducted that compared the clusters obtained in Sample 1 on variables of clinical importance14. Results are described below.
Demographic variables
Table 4 presents demographic characteristics for the 3 cluster profiles in Sample 1. Statistical comparisons indicated there were no significant differences among the 3 clusters in duration of FM [F(2,684) = 1.29, p = 0.28] or BMI [F(2,687) = 0.50, p = 0.61]. Age was statistically different among groups [F(2,684) = 5.68, p = 0.004, η² = 0.02], with post-hoc tests indicating that the LL cluster (46.7 ± 13.16 yrs) was slightly younger than the ML (49.59 ± 10.81 yrs) and the HH clusters (50.27 ± 10.69 yrs).
Psychosocial variables
Psychosocial characteristics for the cluster groups are presented in Table 5. Overall, the HH group was characterized by the greatest degree of pain and psychosocial impairment, whereas the LL group was characterized by the least amount of all 3 groups.
FIQ physical impact scores differed significantly among groups [F(2,685) = 7.57, p = 0.001, η² = 0.02], with post-hoc results indicating that the HH group (1.35 ± 0.69) reported being significantly more functionally disabled compared to the LL group (1.09 ± 0.64; p < 0.001). Mean pain [F(2,684) = 79.01, p < 0.001, η² = 0.13] and sleep scores [F(2,684) = 60.48, p < 0.001, η² = 0.07] also varied among clusters, with post-hoc tests indicating the HH group differed from both the ML and the LL groups (HH > ML ≥ LL, all p values < 0.001). A main effect for HADS anxiety [F(2,685) = 6.96, p < 0.001, η² = 0.02] was also detected, with post-hoc tests indicating that the HH group (10.14 ± 4.57) was more anxious, compared to both the ML (8.72 ± 4.36) and LL (8.96 ± 4.23) groups (all p values < 0.001). The HADS depression subscale also varied significantly [F(2,685) = 7.42, p < 0.001, η² = 0.02], with the HH group (8.42 ± 4.4) reporting more depressed mood compared to both the ML (7.53 ± 4.37) and LL groups (6.83 ± 3.84; HH > ML ≥ LL, all significant p values < 0.001).
DISCUSSION
The factor analysis of the MTPS revealed 4 groups of TP: neck/shoulder region (NS), gluteal/trochanteric region (GT), upper extremity region (UE), and a set of control (CON) TP. To consider individual response patterns to the MTPS, cluster analysis, a classification analysis that groups individual cases (in this study patients with FM) rather than individual variables (in this study TP severity ratings), was conducted. Three unique response patterns were identified, such that compared to other FM patients in the sample: (1) one group was high on all 3 standard TP regions and the control set of TP (HH); (2) one group was moderate on 3 standard TP regions and low on the control TP set (ML); and (3) one group was relatively low on all 3 sets of standard TP regions and the control set (LL) (Figure 1).
The theory that patients with FM might show clustering in their sensitivity to palpation of different TP sites was not supported by the cluster analysis, as indicated by the relatively parallel lines between the NS, GT, and UE factors. Notably, however, 3 groups were identified based on severity differences among the standard TP factors (NS, GT, UE). The HH group rated the standard TP regions an average of 7.52 ± 1.46, the ML group averaged 5.87 ± 1.57, while the LL group averaged 3.65 ± 1.83. This supports the notion that there are varying degrees of severity in FM15–17.
The CON factor was generally less tender than other TP factors, confirming the results of the 1990 ACR criteria1. It is notable that the scores for the ML (1.12 ± 1.16) and LL groups (1.15 ± 1.28) on the control factor are within the range of scores reported for patients with chronic headache (control point scores averaged between 0.17 and 1.39 on each of the 3 control points)2, suggesting that for 2 of the 3 cluster groups, sensitivity at control sites is comparable to a chronic pain population, without chronic widespread pain as a defining characteristic.
The HH group, however, averaged scores 3 times as high on the control factor (3.76 ± 1.91). This group also had the highest scores on all self-report measures administered, including the anxiety and depression measure, pain and sleep analog scales, and the FIQ physical functioning scale. This is consistent with previous reports of increased muscle sensitivity and increased pain and distress18,19. One possibility to account for this finding is that these patients have the generalized sensitivity postulated by those who view FM as driven by CNS sensitization. However, another possibility is that responses of participants were influenced, at least to some extent, by a “response set” such that what is really being measured is the degree to which a person will agree or disagree with an item, regardless of content20. It has been suggested that acquiescence is associated with item ambiguity21, so that when respondents are unsure of an item they will tend to answer in the affirmative. A study that alters the scale presentation and balances the number of positive and negative symptom items may provide insight into the role cognitive bias may play in response to self-report questionnaires in general.
The ML group scored moderately on the 3 standard TP regions and low on the control set. This differential response between the ACR TP and control TP is more compatible with the early theory that FM patients have some special sensitivity in muscles. The LL group, however, reported similar severity scores among the standard and control TP. This pattern of responses is congruent with the generalized theory of sensitivity but with a lower degree of severity compared to the HH group.
Overall, our results provide partial support for the general sensitivity and muscle tenderness theories. Both theories may be valid, as FM may not consist of a homogeneous group of patients but rather there may be subgroups based on important characteristics22. Given the differing patterns of standard and control TP response between the HH and the ML/LL groups, it may be postulated that different mechanisms are involved in the etiology and maintenance of FM symptoms within the FM population. This notion is consistent with the view of researchers that FM may be a broader condition with specific subsets23–26. Potential subsets may include people with regional or generalized symptoms, FM secondary to reactive disease, or FM coexisting with other diseases. Yunus describes evidence that supports the interaction of central sensitivity and psychosocial characteristics to account for the wide and variable symptom presentation among FM patients25. Additional research comparing these subgroups on various psychophysical and biological markers may provide insight into the unique mechanisms of FM, and further explore the hypothesis that different mechanisms are involved in subsets of patients diagnosed with FM.
Our study has several limitations. Most notably, no data on healthy controls or non-FM rheumatology patients were available, thus the TP scores cannot be compared with those of other groups. Important information could be gleaned from a study design that included both healthy controls and patients with arthritis. The datasets analyzed were obtained from 2 large-scale pharmaceutical trials, which excluded patients with any unstable psychological or medical disorders, other painful disorders, and any evidence of inflammatory rheumatic disease. Thus, the sample may not be representative of the complete FM clinical population. Additionally, although the dataset analyzed in our current study was large and the cluster analysis solution was replicated in the reserved sample, the clusters should be replicated in a second independent sample to verify the findings. Multiple analyses were performed to identify associations between the 3 clusters described in our study and variables such as sleep and pain severity. As a result, it is possible that some of the associations described above may be spurious. Since this was an exploratory study, we decided not to make any adjustment for the performance of multiple analyses. However, the results indicate that most significant associations reached p < 0.001, and those may be considered reasonably valid.
The variation in TP severity will provide information on how severely a patient’s condition is affecting their quality of life. Additional research exploring differences among these groups on other variables of interest, including variability in diffuse noxious inhibitory control, genotype, and psychological variables (e.g., self-efficacy and coping styles) may provide insight into treatment options for different subgroups of patients with FM.
Acknowledgment
The authors express appreciation to Pfizer Inc. for making the data available.
Footnotes
Supported by NIH (NIAMS) grant AR 44724 awarded to D.C. Turk.
- Accepted for publication July 9, 2009.
REFERENCES
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.
- 16.
- 17.
- 18.
- 19.
- 20.
- 21.
- 22.
- 23.
- 24.
- 25.
- 26.