To the Editor:
We thank Dr. Kirwan for his usual thoughtful and imaginative observations. Considerable variation is seen not only in physician global assessment but in all rheumatoid arthritis (RA) core data set measures. Swollen and tender joint counts performed by a health professional vary widely1–4. Patient questionnaire scores for physical function, pain, global status, and fatigue vary in their ratios in different patients5,6, with patterns according to ethnic group7, indicating that patient perceptions also may vary considerably. Indeed, the different ratios have been suggested to provide clues to recognition of somatization5,6. All measurement is associated with error and variation.
Our report8 was not designed to assess measurement variation and error, but rather to assess the relative efficiency of each of the 7 core data set measures in distinguishing active from control treatments in clinical trials. This matter was of interest because most rheumatologists do not perform formal quantitative swollen and tender joint counts at most visits outside clinical trials9 and other situations in which a count is required (for biological therapies, for example). A joint count requires 90 seconds to perform, while a routine assessment of patient index data 3 (i.e., the RAPID3), an index of only patient questionnaire measures, requires only 10 seconds to score10, and is highly correlated with the Disease Activity Score (DAS28) in clinical trials11 and clinical care12.
The joint count is weighted as more important than patient questionnaire or global measures in the RA core data set13, the DAS2814, and the Clinical Disease Activity Index15. This higher weighting does not appear to be supported by statistical evidence in our studies8,11,12 and others16–18. However, it may well be argued that the specificity and comprehensiveness of joint counts would justify higher weighting in documenting differences between active and control treatment in clinical trials and changes in clinical care.
The goal “to develop a policy on judging severity of RA based on the collective opinions of many rheumatologists, as applied to substantial management decisions, such as when to change DMARD due to perceived lack of efficacy,” as suggested by Dr. Kirwan, is laudable. However, in view of the considerable variation in swollen and tender joint counts, despite extensive guidelines regarding their performance, such a policy might not be effective. Few studies in the literature define the measurement and psychometric properties of a physician global estimate compared to all other RA core data set measures on questionnaires, laboratory tests, or joint counts. Perhaps that scarcity suggests that the approach to a “policy” may be limited.
One reason that patient questionnaire scores appear quite robust is that a patient is a single observer, and any interaction with a health professional necessarily introduces a second observer and variation in measurement. The patient serves as his or her own control, thus reducing measurement error and variation. Management can be effective even without patient visits, as documented in elegant studies by Hewlett and Kirwan19,20.
More knowledge concerning variation and efforts to reduce this variation is desirable. Nonetheless, the primary point of our report was that the joint counts did not have greater value than other core data set measures to distinguish active from control treatments.