The definition of “arthritis” is varied and often contextual. Webster’s Dictionary defines arthritis as “inflammation of joints due to infectious, metabolic, or constitutional causes; also: a specific arthritic condition”1, whereas Wikipedia defines arthritis less specifically as “a group of conditions involving damage to the joints of the body”2. The lay public often thinks of arthritis as a painful condition of the joints or their surrounding structures, a definition based primarily on symptoms. For physicians who treat arthritis or conduct clinical trials to evaluate the efficacy of new therapies, existing clinical and radiographic criteria for arthritis can be applied to an individual patient at a specific point in time. In order to study large numbers of patients, epidemiologists strive for more convenient and accessible definitions of disease such as diagnosis codes or patient self-report, although both these approaches have the inevitable limitation of low, and usually undetermined, sensitivity and specificity relative to a clinical examination.
The manuscript by Dr. Singh in this issue of The Journal3 clearly compares 2 commonly used sources for assembling cohorts for epidemiologic analysis: administrative databases that incorporate International Classification of Diseases (ICD-9) codes, and patient self-report. This study evaluated 34,400 veterans who received care within a large veterans’ service network and responded to a mailed survey about quality of life4. Among other questions about demographics, health care insurance status and utilization, comorbid conditions, and standardized quality of life measures, the survey asked participants “Has your doctor ever told you that you have arthritis (including rheumatoid or osteoarthritis)?” Responses to this question were compared with ICD-9 codes for any type of arthritis found in the administrative record and with nonsteroidal antiinflammatory drug or disease modifying antirheumatic drug prescriptions recorded in the US Veterans Affairs (VA) pharmacy database in the year before or after the survey questionnaire. Not surprisingly, there was very low concordance between the 2 methods of case ascertainment: kappa statistics ranged from 0.19 to 0.32, depending on which administrative definition was used, which corresponds to at best “fair” agreement between the 2 methods. In contrast, a kappa statistic of 0.61–0.80 would correspond to “substantial” agreement, and a kappa of 0.81–1.0 to “almost perfect” agreement5. The low concordance between administrative coding and self-report persisted regardless of hospitalization status, employment status, education, age, overall health status, and disability.
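For readers less familiar with the kappa statistic, the following minimal sketch shows how Cohen's kappa is obtained from a 2 x 2 cross-classification of self-report against administrative record. The counts are hypothetical and chosen only for illustration; they are not taken from Dr. Singh's study, which does not report the cell counts. The function names are likewise illustrative, and the descriptive bands assume the commonly cited Landis and Koch scale, of which the “fair,” “substantial,” and “almost perfect” ranges quoted above are part.

```python
def cohen_kappa(both_yes, self_only, record_only, both_no):
    """Cohen's kappa for two binary case definitions.

    both_yes    : arthritis by self-report AND administrative record
    self_only   : self-report only (positive discordance)
    record_only : administrative record only (negative discordance)
    both_no     : arthritis by neither source
    """
    n = both_yes + self_only + record_only + both_no
    if n == 0:
        raise ValueError("the table contains no participants")

    # Observed agreement: proportion of participants on the diagonal.
    observed = (both_yes + both_no) / n

    # Agreement expected by chance, from the marginal proportions.
    self_yes = (both_yes + self_only) / n
    record_yes = (both_yes + record_only) / n
    expected = self_yes * record_yes + (1 - self_yes) * (1 - record_yes)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Descriptive band for kappa (assumed Landis and Koch scale)."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical counts for 1,000 survey respondents (illustration only).
kappa = cohen_kappa(both_yes=300, self_only=250, record_only=100, both_no=350)
print(f"kappa = {kappa:.2f} ({landis_koch_label(kappa)})")
```

In this hypothetical table, 65% of respondents are classified identically by the 2 sources, yet kappa falls only in the “fair” band, because about half of that agreement would be expected by chance alone given the marginal frequencies.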
It is important to recognize that there is no gold standard definition of a term as general as “arthritis.” In cases of specific arthritides such as rheumatoid arthritis, gouty arthritis, and osteoarthritis, standard definitions in the form of clinical criteria are available and could conceivably be considered gold standards to which patient self-reports are compared. Such studies have been performed, and concordance rates are still low for specific rheumatic diagnoses relative to patient self-report6–9. Dr. Singh should be commended for addressing the discordance rates between 2 different case definitions of arthritis rather than commenting exclusively on the accuracy of self-report compared to the database. He does not presume the database definition to be the “true” determinant of the presence of arthritis in an individual. Indeed, the dependent variable in the multivariable regression analyses is the discordance rate itself, rather than either definition of arthritis. The fact that the author does not give the numbers of participants in the concordance or discordance groups further underscores that the focus of the study is not on the accuracy of either case definition, but rather on the rate of agreement between the two. Of all the potential predictors, the association between fewer outpatient visits and higher rates of discordance makes the most sense. One can speculate that, with more physician encounters, patients have a greater chance to discuss all their symptoms and concerns with their care providers. In addition, a better patient-physician rapport, developed over multiple visits, likely results in better adherence to followup testing and medications, and a better understanding by the patient of his diagnoses.
Of note, the definition of “discordance” used by Dr. Singh is 2-fold, since there are 2 possible directions for discordance, which we can call “positive” (under-documentation/over-reporting) and “negative” (over-documentation/under-reporting). Perhaps the expected direction of discordance is positive, in which the patient reports the presence of arthritis but the providers do not. While some of these patients may not meet criteria for any specific type of arthritis, they might have arthralgia or periarticular soft tissue disorders and believe this to be a form of arthritis. The patient-oriented definition of arthritis need not be discounted, as it has clear value in several types of epidemiologic studies, including those of pain, physical function, and quality of life, as well as studies of the indirect costs of disease such as loss of work, need for assistance with daily activities, and reduced social participation. Discordance between patient self-report of arthritis and database documentation of “arthralgia” or related terms was not examined in this study; we expect that these may have had higher levels of agreement. Alternatively, positive discordance might not reflect true disagreement but instead a failure of the provider to document an agreed-upon diagnosis in the medical record. This could occur if patients received care for arthritis outside of the VA hospitals, if they managed arthritis symptoms using non-prescription medications, if they were seen predominantly by specialists who focused on other comorbid conditions, or if their arthritis was not the primary indication for health care.
In contrast, negative discordance, in which patients report no arthritis when at least one type of arthritis is documented in the health care record, is somewhat more difficult to understand. How could a patient with physician-documented arthritis believe that he does not carry the diagnosis? Given the phrasing of the specific question posed to patients in this study, “Has your doctor ever told you that you have arthritis (including rheumatoid or osteoarthritis)?”, participants may have reported only the 2 specific forms of arthritis explicitly referenced in the question. More generally, perhaps patients whose symptoms are well-controlled do not feel that they currently have the disease; or maybe certain forms of monophasic or intermittent arthritis (gout or septic arthritis) are not considered to be arthritis by some patients. It is also possible that many subjects, in this case elderly male veterans, feel that their symptoms are due to old age, a natural consequence of trauma to the joints, or “wear and tear” of the joints rather than arthritis per se. Indeed, increasing difficulty with activities of daily living was associated with higher odds of patient under-reporting, suggesting that patients minimize or disregard diagnoses of arthritis when other conditions that could account for disability are present, such as cerebrovascular accidents, complications of diabetes, or advanced pulmonary disease. It would be interesting to investigate the other comorbidities present in the group of participants who under-reported a diagnosis of arthritis.
As with any large epidemiologic study, there are limitations in this particular dataset that could draw criticism. The population was almost exclusively male, Caucasian, and elderly; therefore, the results may not be generalizable to a broader population-based study that includes a wider spectrum of ages, genders, ethnicities, and geographic locations. The VA system does not always provide comprehensive health care for its patients, and therefore may not capture all health care interactions and diagnoses if patients seek health care outside of the VA system. Forty percent of potential respondents did not return the completed survey and thus could not be included in the analysis. This all-too-common problem with self-administered questionnaires is one of the most troubling, as respondents and non-respondents are likely to differ in numerous respects, and the results of the study might have been vastly different if all potential participants had been reached (nonresponse bias). The results of this study would be stronger if any available characteristics of non-respondents (presumably from the VA dataset) had been compared to those of the respondents to better understand the inherent differences between the 2 groups. Several of these limitations might have been addressed, and the causes of positive and negative discordance better explained, if self-reports and administrative data had been compared to direct chart review (to capture diagnoses not included in the billing codes) or to in-person interviews and physical examination of study participants (so that followup questions could be asked and physical findings confirmed). Clearly, these tasks present logistical and financial complications that limit the size of the population under study.
As it stands, this study defines the roles of patient self-report and administrative data for sourcing arthritis cases in epidemiologic research and the differences we might expect with each. We agree with Dr. Singh’s basic premise that neither source of information is intrinsically superior to the other, but rather that each may be suited to different types of investigation. Patient-reported outcomes may be best studied by focusing on the patients who believe they have arthritis. Studies of the natural history of disease, of association with comorbid conditions, and of direct health care costs, including resource utilization or provider adherence to clinical guidelines, are best served when the cohort of arthritis patients is determined by providers rather than by the patients. We should be cautious when interpreting and comparing conclusions of studies on health care utilization or physician behavior if disparate methods of case ascertainment were used.