Abstract
Objective. Case ascertainment through self-report is a convenient but often inaccurate method to collect information. The purposes of this study were to develop a tool to identify cases of systemic autoimmune rheumatic diseases (SARD) in the outpatient setting, to assess its sensibility, and to validate it.
Methods. The SARD tool was administered to subjects sampled from specialty clinics. Determinants of sensibility — comprehensibility, feasibility, validity, and acceptability — were evaluated using a numeric rating scale from 1–7. Comprehensibility was evaluated using the Flesch Reading Ease and the Flesch-Kincaid Grade Level. Self-reported diagnoses were validated against medical records using Cohen’s κ statistic.
Results. There were 141 participants [systemic lupus erythematosus (SLE), systemic sclerosis (SSc), rheumatoid arthritis, Sjögren syndrome (SS), inflammatory myositis (polymyositis/dermatomyositis; PM/DM), and controls] who completed the questionnaire. The Flesch Reading Ease score was 77.1 and the Flesch-Kincaid Grade Level was 4.4. Respondents endorsed (mean ± SD) comprehensibility (6.12 ± 0.92), feasibility (5.94 ± 0.81), validity (5.35 ± 1.10), and acceptability (3.10 ± 2.03). The SARD tool had a sensitivity of 0.91 (95% CI 0.88–0.94) and a specificity of 0.99 (95% CI 0.96–1.00). The agreement between the SARD tool and medical record was κ = 0.82 (95% CI 0.77–0.88). Subgroup analysis by SARD found κ coefficients for SLE to be κ = 0.88 (95% CI 0.79–0.97), SSc κ = 1.0 (95% CI 1.0–1.0), PM/DM κ = 0.72 (95% CI 0.49–0.95), and SS κ = 0.85 (95% CI 0.71–0.99). The screening questions had sensitivity ranging from 0.96 to 1.0 and specificity ranging from 0.88 to 1.0.
Conclusion. This SARD case ascertainment tool has demonstrable sensibility and validity. The use of both screening and confirmatory questions confers added accuracy.
- SYSTEMIC AUTOIMMUNE RHEUMATIC DISEASES
- SCLERODERMA
- SYSTEMIC LUPUS ERYTHEMATOSUS
- SYSTEMIC SCLEROSIS
- SJÖGREN SYNDROME
- RHEUMATOID ARTHRITIS
Self-reported instruments may be helpful for case ascertainment in the population and are widely used for collecting information regarding a patient’s health status. However, their utility in research is limited by problems with accuracy, impeding their case-finding ability1,2,3. This limitation can greatly undermine the findings of studies that rely on self-report. Thus, availability of valid case-finding instruments is important.
Systemic autoimmune rheumatic diseases (SARD) such as systemic lupus erythematosus (SLE), systemic sclerosis (SSc), inflammatory myositis [polymyositis (PM), dermatomyositis (DM)], Sjögren syndrome (SS), and rheumatoid arthritis (RA) are multisystem, chronic illnesses associated with high morbidity and mortality. Self-reported instruments are frequently used in studies of SARD, but inconsistencies between self-reported and physician-reported diagnoses have been described2,3,4,5. In a study by Cooper, et al3 assessing whether the healthy relatives of patients with a known SARD were at a higher risk of developing an autoimmune disease than the general population, a questionnaire was administered to 893 family members of patients with SARD asking whether they had ever been diagnosed with any of 11 autoimmune diseases. Although 178 subjects reported an autoimmune disease, the authors were able to corroborate fewer than half these diagnoses.
The aim of our study was to develop a tool that would more accurately identify individuals with a SARD. Using the questionnaire of Cooper, et al3 as a template, several modifications were made with the objectives of improving the operating characteristics of the questionnaire and enhancing readability. We then evaluated the sensibility and validity of this new SARD tool.
MATERIALS AND METHODS
Instrument design
The questionnaire from the study of Cooper, et al3 was used as a template with 11 questions asking “Has a doctor ever told you that you had…” followed by the diseases SLE, RA, SSc, PM or DM, SS, antiphospholipid antibody syndrome (APS), hemolytic anemia, multiple sclerosis, thyroid disease, Type 1 diabetes mellitus (DM), and idiopathic thrombocytopenia purpura (ITP). This was followed by a dichotomous response option of “yes” or “no”. Because the terms “idiopathic thrombocytopenia purpura” and “antiphospholipid antibody syndrome” are less widely known to the general public, the question of whether a participant had the disease was preceded by a clinical question. For example, the ITP question was preceded by a question asking whether the participant had low platelets (yes or no). For APS, the preceding question asked participants if they had “recurrent blood clots or at least 3 pregnancy losses” with a response option of “yes” or “no”. These 11 questions were considered the screening questions.
The questionnaire was modified by the addition of “confirmatory questions” after the screening question. The use of confirmatory questions has previously been reported to improve the positive predictive value and agreement in patients who have been asked whether they had SLE or RA1. The confirmatory questions inquired about medication taken for the condition, symptoms of the SARD, or treatment by a particular specialist. Each confirmatory question was followed by the response options “yes,” “no,” and “I don’t know” (the SARD Participant Questionnaire is available online at jrheum.org as a data supplement). In consultation with an expert in health instrument design and evaluation (AMD) and with application of the Dillman methods6, the visual format and readability of the tool were enhanced by the addition of white space, blocking and shading of alternating questions, and increased font size. These methods have been shown to improve the readability of tools, particularly in the classification of SARD7. Because this questionnaire was designed for research purposes to identify patients who have a SARD, it was further reviewed by a community rheumatologist and an academic rheumatologist-scientist (AMB, JW) to assess its applicability for this intended use.
Subjects
Participants with SARD and disease controls (hemolytic anemia, multiple sclerosis, thyroid disease, and Type 1 DM) were recruited from hospital-based specialty clinics at the Toronto Western Hospital in Toronto, Ontario, Canada, including the Scleroderma Clinic, the Sjögren’s Disease Clinic, the STAT Clinic (an urgent rheumatology referral clinic), and the Early Autoimmune Rheumatic Disease Clinic. Consecutive participants were approached in the waiting room. Patients were excluded if they were under the age of 18 or had difficulty reading and conversing in English. Healthy controls were recruited from hospital staff and associates. The participants were provided with a hard copy of the questionnaire and were allowed to complete it at their own pace. Study personnel were available to answer any questions that might arise during completion of the questionnaire. Ethics approval was obtained from the Research Ethics Board of the University Health Network (12-5189-BE). Participants provided written, informed consent.
Sensibility assessment
Sensibility is an important measurement property of an instrument, evaluating attributes of its usefulness. Determinants of sensibility include its clinical function, justification, applicability, format (comprehensibility and readability), face and content validity, and feasibility8. Comprehensibility of the SARD tool was evaluated using the Flesch Reading Ease and the Flesch-Kincaid Grade Level tools (Microsoft Office Professional Plus 2013). An acceptable Flesch Reading Ease score of 60.0–70.0 indicates that the material would be easily understood by 13- to 15-year-old students, and a score of 90.0–100.0 indicates that the material would be easily understood by an average 11-year-old9. Based on the sensibility assessment guide of Rowe and Oxman8, which has been used to evaluate other self-reported health instruments10,11, comprehensibility, feasibility, face validity, and acceptability were evaluated using a numeric rating scale from 1–7. The anchors varied based on the question stem. The sensibility assessment questionnaire included space for subjects to provide specific comments related to the SARD tool questions and space for general feedback.
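For readers unfamiliar with the 2 readability indices, the following minimal sketch shows the standard published formulas on which they are based. It is an illustration only and not the procedure used in this study, where the scores were obtained from Microsoft Office. The function names and the word, sentence, and syllable counts are hypothetical, and word processors may count syllables and sentences differently.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate text that is easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for illustration only (not taken from the SARD tool).
words, sentences, syllables = 100, 10, 130

fre = flesch_reading_ease(words, sentences, syllables)
fkgl = flesch_kincaid_grade(words, sentences, syllables)
print(f"Flesch Reading Ease: {fre:.1f}")
print(f"Flesch-Kincaid Grade Level: {fkgl:.2f}")
```

Both indices depend only on the average sentence length and the average number of syllables per word, so short sentences and short words raise the Reading Ease score and lower the Grade Level.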
Validation
Concurrent validity evaluates the agreement between 2 measures of a construct administered at about the same time12. Participants’ self-reported diagnoses using the SARD tool were compared to their diagnosis recorded by their physician(s) in their medical record. In SARD, there is no gold standard diagnostic test. A diagnosis is made through the physician’s judgment, using an aggregation of symptoms, signs, and test results that conform to the construct of the disease as learned in physicians’ specialized training and years of experience13. An investigator blinded to the self-reported diagnoses reviewed the medical records. Data collected included patient age, sex, and SARD diagnosis.
Statistical analyses
The data were summarized using descriptive statistics. For dichotomous screening questions, an answer of “no” was recorded as 0 while “yes” was recorded as 1. The response option of “I don’t know” was recorded as a 0. Based on a sample size assessment, 80 participants were required to achieve 80% (95% CI 0.71–0.88) power with a score of 5 or higher on a given item on the sensibility questionnaire14,15. To determine whether the confirmatory questions conferred added value, the sensitivity, specificity, and 95% CI were calculated for the screening questions with and without the confirmatory questions separately. Concurrent validity of the diagnosis reported using the SARD tool compared with the medical record was evaluated using Cohen’s κ statistic. The κ statistic may be interpreted as less than 0 indicating no agreement, 0.00–0.20 as slight, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as substantial, and 0.81–1.00 as an almost perfect agreement16. A priori, we hypothesized that moderate agreement (κ > 0.41) would be acceptable because additional factors may affect self-reported and physician-reported diagnoses. Statistical analysis was done using RStudio.
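The coding rule and the agreement statistic described above can be sketched as follows. This is a hedged illustration in Python, not the RStudio analysis performed in this study. The function names and the example responses are hypothetical, and the confidence interval for κ is not computed here.

```python
from collections import Counter


def code_response(answer: str) -> int:
    """Code 'yes' as 1; 'no' and "I don't know" are both coded as 0."""
    return 1 if answer.strip().lower() == "yes" else 0


def cohen_kappa(rater_a, rater_b) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("Both ratings must be non-empty and of equal length.")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa: float) -> str:
    """Map kappa to the agreement bands quoted in the text."""
    if kappa < 0:
        return "no agreement"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical data for illustration only (not study data).
self_report = ["yes", "yes", "no", "I don't know", "yes",
               "no", "no", "yes", "no", "no"]
medical_record = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]

coded = [code_response(answer) for answer in self_report]
kappa = cohen_kappa(coded, medical_record)
print(f"kappa = {kappa:.2f}")
```

In this hypothetical example, 9 of 10 pairs agree and the agreement expected by chance is 0.50, which gives κ = 0.80.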
RESULTS
Participants
The SARD tool was administered to 141 participants whose characteristics are summarized in Table 1. Our response rate was 100% with all subjects approached agreeing to participate. The majority of the participants (87.2%) were women. The number of participants varied by SARD with the highest number in the SLE group (n = 30). There were no patients with ITP. The healthy controls were 87.2% women. They were of similar age to the sample population with a mean ± SD age of 41.9 ± 15.6 years compared with 44 ± 14 years in the SARD group. Fourteen subjects had overlapping SARD conditions (SLE and SSc, n = 5; SLE and SS, n = 3; RA and SS, n = 4; SSc and SS, n = 2). A subset of 87 participants (including 18 healthy controls) were separately recruited to complete the sensibility assessment.
Table 1. Participant characteristics. Values are n (%) unless otherwise specified.
Sensibility assessment
The Flesch Reading Ease score for the SARD tool was 77.1 and the Flesch-Kincaid Grade Level was 4.4. As shown in Table 2, respondents endorsed comprehensibility (mean rating 6.12 ± 0.92), feasibility (mean rating 5.94 ± 0.81), face validity (mean rating 5.35 ± 1.10), and acceptability (mean rating 3.10 ± 2.03 where 1 = very unlikely).
Table 2. Sensibility assessment. Values are mean ± SD.
Validity
Participants’ self-reported diagnoses were compared with the diagnoses found in their medical record (Table 3). For the total sample population, the agreement between the SARD tool and medical record was κ = 0.82 (95% CI 0.77–0.88). Subgroup analysis by SARD found κ coefficients for SLE to be κ = 0.88 (95% CI 0.79–0.97), SSc κ = 1.0 (95% CI 1.0–1.0), PM/DM κ = 0.72 (95% CI 0.49–0.95), SS κ = 0.85 (95% CI 0.71–0.99), and RA κ = 0.52 (95% CI 0.32–0.72) improving to κ = 0.61 (95% CI 0.34–0.87) with the confirmatory question. All diseases in the questionnaire have been included in the table for completeness. Those with sample sizes of fewer than 10 patients have been indicated with an asterisk to aid in data interpretation.
Table 3. Validation.
The screening questions had high sensitivity ranging between 0.96 and 1.0, with the exception of the screening question for hemolytic anemia (sensitivity 0.33, n = 3; Table 3). The specificity of the screening questions was high, ranging between 0.88 and 1.0. Modest increases to near perfect specificity were achieved through the addition of the confirmatory question in the SLE, RA, PM/DM, SS, and hemolytic anemia groups. In the subset of patients with overlapping conditions, the SARD tool correctly identified those with SLE and SSc overlap (5/5 for SLE and 5/5 for SSc). Among those subjects with a SARD overlapping with SS, there were 4 subjects who did not have the term “Sjögren’s disease” documented in the clinical chart. However, 3 of these subjects had documentation of sicca symptoms of dry eyes or dry mouth in their chart.
DISCUSSION
Case ascertainment tools are important instruments in clinical research. We have developed a SARD case ascertainment tool with demonstrable readability, comprehensibility, feasibility, and face and concurrent validity. This SARD tool is low cost and does not require specialized personnel, nomograms, or computation. It can be easily implemented in the outpatient setting to identify participants for research studies or by mail for family studies examining the development of SARD in the first-degree relatives of patients with this diagnosis. The use of both screening and confirmatory questions has resulted in good sensitivity and specificity.
The published accuracy of self-reported SARD diagnoses is highly variable depending on the population being studied and the method used to validate the self-report1,3,17,18. Our SARD tool had demonstrable validity with “almost perfect agreement”16 with the medical record for our sample population. For each of the SARD (SLE, RA, SSc, PM/DM, SS, and APS), the κ values indicated moderate to almost perfect agreement16. Among the disease controls, the κ values were moderate to substantial for multiple sclerosis and thyroid disease. For SSc, the screening question alone was sufficient. The addition of confirmatory questions following a screening question is a method that has been used previously to enhance questionnaire validity1. Indeed, in our study, the addition of the confirmatory question improved specificity. For RA and hemolytic anemia, the confirmatory questions inquired about the use of methotrexate and prednisone, respectively. These medications are frequently used in these conditions and are easily recognizable, thereby improving the specificity of the SARD tool for these conditions.
In comparison to the original version of this tool, the new version has improved visual formatting (addition of white space, blocking and shading of alternating questions, and increased font size). These methods have been shown to improve the readability of tools, particularly in the classification of SARD7. This SARD tool retains the original screening questions. The addition of 1–2 confirmatory questions improves specificity. In some cases, the confirmatory questions result in reduced sensitivity. Because we report the operating characteristics of both the screening and confirmatory questions by SARD in the same sample, future investigators may make an informed choice regarding the use of only the screening questions or both screening and 1 or more confirmatory questions, depending on their needs and preferences.
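The trade-off between the screening question alone and the screening question combined with a confirmatory question can be illustrated with a small sketch. It assumes, for illustration, that a participant is classified as positive only when both answers are "yes"; this combination rule, the function names, and the data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(predicted, truth):
    """Return (sensitivity, specificity) for binary predictions against a reference."""
    tp = sum(p == 1 and t == 1 for p, t in zip(predicted, truth))
    fn = sum(p == 0 and t == 1 for p, t in zip(predicted, truth))
    tn = sum(p == 0 and t == 0 for p, t in zip(predicted, truth))
    fp = sum(p == 1 and t == 0 for p, t in zip(predicted, truth))
    return tp / (tp + fn), tn / (tn + fp)


def screen_and_confirm(screen, confirm):
    """Assumed rule: positive only if both screening and confirmatory answers are 1."""
    return [s & c for s, c in zip(screen, confirm)]


# Hypothetical data for illustration only (1 = yes, 0 = no or "I don't know").
record = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # diagnosis in the medical record
screen = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]   # screening question
confirm = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # confirmatory question

screen_only = sensitivity_specificity(screen, record)
combined = sensitivity_specificity(screen_and_confirm(screen, confirm), record)
print("screening only:        sens=%.2f spec=%.2f" % screen_only)
print("screening + confirmed: sens=%.2f spec=%.2f" % combined)
```

In this hypothetical sample, requiring the confirmatory answer removes the single false positive, so specificity rises from 0.83 to 1.00. It also misses 1 true case, so sensitivity falls from 1.00 to 0.75. This is the pattern described above.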
One potential limitation of using medication use as a confirmatory question for a particular disease is that the response may be susceptible to recall bias or may bias the sample toward those individuals with more severe disease. Second, it should be noted that reliance on confirmatory questions might reduce sensitivity because the medication(s) may not be indicated exclusively for 1 disease. The question of the use of artificial tears for treatment of SS is limited by the fact that these are commonly used for people with dry eyes who do not have a diagnosis of SS19,20. The use of symptoms as confirmatory questions is reasonable in diseases such as SS and SSc with classic and symptom-centered diagnoses21,22. The high sensitivity of the screening questions makes it reasonable to use them to rule out a particular SARD before asking confirmatory questions to increase specificity. Several of the diseases (SSc, PM/DM, and SS) had 2 confirmatory questions. Because patients may vary in their presentation as well as in their method of treatment, these particular questions should be interpreted separately. For example, a patient with SS may experience only 1 sicca symptom or may have dry eyes and mouth together23. It should be noted that the recruiting of participants from specialty clinics may affect the performance of the questionnaire. Diagnostic test accuracy is dependent on the prevalence of disease, which may be the primary driver of setting-dependence. Application of this questionnaire to a more general population may alter the performance characteristics. Finally, our cases and controls were comparable with regard to age and sex. We did not collect information on education level, socioeconomic status, or health literacy. These could affect the findings.
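The influence of prevalence on test performance can be illustrated with the positive predictive value, which follows from Bayes' theorem. The sketch below uses the overall sensitivity (0.91) and specificity (0.99) reported in this study. The 2 prevalence values are hypothetical and chosen only to contrast a specialty clinic with a general population, and the calculation assumes that sensitivity and specificity are unchanged across settings, which may not hold in practice.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Probability that a participant who screens positive truly has the disease."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


SENSITIVITY = 0.91  # overall value reported in this study
SPECIFICITY = 0.99  # overall value reported in this study

# Hypothetical prevalences for illustration only.
ppv_clinic = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.50)
ppv_general = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.01)
print(f"PPV at 50% prevalence: {ppv_clinic:.3f}")
print(f"PPV at 1% prevalence:  {ppv_general:.3f}")
```

Under these assumptions, roughly 99% of positive responses would be true cases at a prevalence of 50%, compared with fewer than half at a prevalence of 1%.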
Overall, the aspects of sensibility were scored highly by the participants, with scores over 5 in each of the areas assessed. It should be noted that the question pertaining to being uncomfortable with the SARD tool had the most variable response. This may be because the anchors for the response scale were reversed, as compared to the other questions in the questionnaire. Some participants may have completed the tool quickly and responded highly in all categories, not realizing that lower scores for this domain were more positive. Indeed, 1 subject who marked a 7 on this scale (indicating that people would likely find the questionnaire very uncomfortable) wrote “none” in the space provided to indicate which particular questions people would find uncomfortable. The branching questions may have been another source of confusion for some of the participants. Some participants answered branching questions (particularly when the answer was affirmative) after they had answered the screening question in the negative. For example, a participant may have indicated that they did not have SSc, but said that they did have Raynaud phenomenon. Modifying the design of the tool by use of arrows or clearer instructions may eliminate this confusion24.
Moderate correlations were observed for RA, PM/DM, and SS. This was not surprising because subjects may have been told that they have these diagnoses as a means of explaining their symptoms. However, clinicians may consider inflammatory arthritis, inflammatory myositis, or sicca symptoms to be manifestations of an underlying SARD such as SSc or SLE. Indeed, it has been recognized that 15% of SSc subjects will have a symmetrical, inflammatory arthritis25. As a point of comparison, the Connective Tissue Screening Questionnaire is a 30-item questionnaire with additional demographic, socioeconomic, and comorbidity questions26. This questionnaire was validated among SARD cases and controls attending a tertiary, academic rheumatology center in the United States. The specificity of our SARD tool for SSc and PM/DM compares favorably with that questionnaire, for which specificities of 92% (95% CI 88–96) for SSc and 83% (95% CI 78–88) for PM/DM were reported. The point estimates for specificity of our tool are also better for SLE and RA, but the CI overlap.
We have developed a case ascertainment tool for the identification of individuals with a SARD for use in the outpatient setting. This tool has demonstrable sensibility (readability, comprehensibility, acceptability, face validity, feasibility). This tool has demonstrable validity compared with medical records. The use of both screening and confirmatory questions confers added accuracy, with a sensitivity of 91% and a specificity of 99%.
ONLINE SUPPLEMENT
Supplementary data for this article are available online at jrheum.org.
Footnotes
Funded by a Strategic Operating Grant from The Arthritis Society of Canada to Dr. J. Wither.
- Accepted for publication September 21, 2016.