Abstract
Objective We aimed to determine whether specific respiratory tract diseases are associated with increased rheumatoid arthritis (RA) risk.
Methods This case-control study within the Mass General Brigham Biobank matched newly diagnosed RA cases to 3 controls on age, sex, and electronic health record history. We identified RA using a validated algorithm and confirmed by medical record review. Respiratory tract disease exposure required 1 inpatient or 2 outpatient codes at least 2 years before the index date of RA clinical diagnosis or matched date. Logistic regression models calculated ORs for RA with 95% CIs, adjusting for confounders. We then stratified by serostatus (“seropositive” was positive rheumatoid factor and/or anticitrullinated protein antibodies) and smoking.
Results We identified 741 RA cases and 2223 controls (both median age 55, 76% female). Acute sinusitis (OR 1.61, 95% CI 1.05–2.45), chronic sinusitis (OR 2.16, 95% CI 1.39–3.35), and asthma (OR 1.39, 95% CI 1.03–1.87) were associated with increased risk of RA. Acute respiratory tract disease burden during the preindex exposure period was also associated with increased RA risk (OR 1.30 per 10 codes, 95% CI 1.08–1.55). Acute pharyngitis was associated with seronegative (OR 1.68, 95% CI 1.02–2.74) but not seropositive RA; chronic rhinitis/pharyngitis was associated with seropositive (OR 2.46, 95% CI 1.01–5.99) but not seronegative RA. Respiratory tract diseases tended towards higher associations in smokers, especially > 10 pack-years (OR 1.52, 95% CI 1.02–2.27, P = 0.10 for interaction).
Conclusion Acute and chronic sinusitis, pharyngitis, and acute respiratory burden were associated with increased RA risk. The mucosal paradigm of RA pathogenesis may involve the upper respiratory tract.
Increasing evidence suggests rheumatoid arthritis (RA) may originate at sites of inflamed mucosa such as the lungs. Smoking is the strongest known risk factor for seropositive RA, increasing the odds of disease by 2- to 3-fold.1,2 Exposure to toxins like silica and coal has also been established as a risk for RA.1,3,4 Chronic lower respiratory tract disease such as asthma5–12 and chronic obstructive pulmonary disease (COPD) are also established risk factors for RA.6,13,14 A recent case-control study from the Epidemiological Investigation of Rheumatoid Arthritis (EIRA) in Sweden suggested that not only chronic lower respiratory tract diseases such as asthma and COPD but also acute respiratory tract diseases and chronic upper respiratory tract diseases were associated with increased risk of incident RA.15 However, specific acute and/or upper respiratory tract diseases were not studied.
Some studies suggest a possible relationship between certain respiratory tract infections and RA. For example, circulating levels of influenza16 and measles17,18 antibodies were higher in patients with prevalent RA compared to controls. However, these studies were cross-sectional with small sample sizes. Two recent, large cohort studies investigated the association of respiratory viral infections19 and mycoplasma pneumonia20 with RA risk. However, both only identified hospital-based infections. Further, the former studied only infections 8 weeks to 2 years prior to RA onset even though the RA disease process, including RA-related autoantibodies, typically develops 2–5 years prior to RA21,22; the latter did not adjust for key confounders including smoking and BMI.
Accordingly, we aimed to determine the association between specific acute and/or upper respiratory tract diseases and RA, including sinusitis, pharyngitis, and pneumonia, using a large cohort of patients with incident RA. We hypothesized that similar to chronic lower respiratory tract diseases, these would be associated with increased risk of RA.
METHODS
Study design. This case-control study used data from the Mass General Brigham (MGB) Biobank, a research repository from Massachusetts General Hospital, Brigham and Women’s Hospital, and their affiliated sites in the greater Boston, Massachusetts area that launched in 2010.23 As of June 2020, 117,248 participants had consented to provide blood and agreed to link their electronic health record (EHR) data, and were thus eligible for this study. Participants completed an MGB Biobank health survey that has a 40% response rate.23 This substudy received approval from the MGB institutional review board (protocol #2019P000264), followed the STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) guidelines, and complied with the Declaration of Helsinki.
Incident RA cases. We initially identified 2017 adult RA cases in the MGB Biobank using a previously published algorithm for RA that incorporates diagnosis codes, laboratory results, and natural language processing.24,25 This algorithm has a 95% positive predictive value (PPV) at 97% specificity for RA defined by the 2010 American College of Rheumatology/European League Against Rheumatism (ACR/EULAR) criteria.24 After the initial identification of RA by the algorithm, we verified that all patients in this analysis met 2010 ACR/EULAR RA criteria by medical record review.26 We defined index date as the time of RA clinical diagnosis (or matched date for controls) as indicated by the first interaction with a clinician related to RA symptoms or diagnosis by medical record review. To allow ample time for exposure accrual and to improve comparability to the EIRA respiratory study,15 we restricted this analysis to the 741 RA cases with at least 5 years of preceding medical record data. We identified serostatus in the EHR from clinical laboratory tests using structured data supplemented by manual medical record review. We defined seropositive RA as RA with positive rheumatoid factor (RF) and/or anticitrullinated protein antibodies (ACPA). We defined shared epitope positivity by presence of HLA-DRB1*04 or HLA-DRB1*01.27
Controls. We matched each RA case to 3 MGB Biobank controls based on age at index date of RA clinical diagnosis (± 5 yrs), sex, and length of prior EHR history (± 3 yrs, as most cases could not be fully matched with ± 2 yrs). We required that controls have no RA by the algorithm, no billing codes for RA or any other systemic rheumatic disease, and smoking data available from the MGB Biobank questionnaire, since we considered smoking a key covariate.
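The matching rule above can be illustrated with a minimal sketch. The text does not describe the matching algorithm itself, so the greedy nearest-match strategy below is an assumption, as are the function name `match_controls` and the record fields `id`, `age`, `sex`, and `ehr_years`. The tolerances (± 5 years of age, ± 3 years of EHR history, 3 controls per case) come from the text.

```python
def match_controls(case, pool, n_controls=3, age_tol=5, ehr_tol=3):
    """Pick up to n_controls controls for one case and remove them from pool.

    case and each pool entry are dicts with hypothetical keys
    "id", "age", "sex", and "ehr_years".
    """
    # Keep only controls of the same sex within both tolerances.
    eligible = [
        c for c in pool
        if c["sex"] == case["sex"]
        and abs(c["age"] - case["age"]) <= age_tol
        and abs(c["ehr_years"] - case["ehr_years"]) <= ehr_tol
    ]
    # Assumed tie-breaking: closest on age first, then on EHR history length.
    eligible.sort(
        key=lambda c: (abs(c["age"] - case["age"]),
                       abs(c["ehr_years"] - case["ehr_years"]))
    )
    chosen = eligible[:n_controls]
    # Matching without replacement: a control serves only one case.
    for c in chosen:
        pool.remove(c)
    return chosen


case = {"id": "case1", "age": 55, "sex": "F", "ehr_years": 10}
pool = [
    {"id": "a", "age": 57, "sex": "F", "ehr_years": 9},
    {"id": "b", "age": 61, "sex": "F", "ehr_years": 10},  # age outside tolerance
    {"id": "c", "age": 55, "sex": "M", "ehr_years": 10},  # different sex
    {"id": "d", "age": 54, "sex": "F", "ehr_years": 14},  # EHR history outside tolerance
    {"id": "e", "age": 55, "sex": "F", "ehr_years": 12},
    {"id": "f", "age": 50, "sex": "F", "ehr_years": 7},
    {"id": "g", "age": 56, "sex": "F", "ehr_years": 10},
]
matched = match_controls(case, pool)
```

Matching without replacement is also an assumption here. The eligibility filters for controls (no RA by the algorithm, no rheumatic disease billing codes, smoking data available) would be applied to the pool before this step.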
Respiratory tract disease exposures. The primary exposure was presence of any respiratory tract disease, as ascertained by International Classification of Diseases, 9th revision (ICD-9) or ICD-10 diagnosis codes. We also subdivided respiratory exposure by location and acuity of respiratory tract disease (upper and lower, acute and chronic) as well as by specific respiratory tract diseases including acute sinusitis, acute pharyngitis, chronic rhinitis/pharyngitis, chronic sinusitis, pneumonia, asthma, and COPD (Table 1). For all these respiratory tract exposures, we required at least 1 code in an inpatient or emergency department setting, or at least 2 outpatient diagnosis codes at least 30 days apart for chronic respiratory tract diseases but without any time requirement for diagnosis of acute respiratory tract diseases.
We also required all these codes to occur at least 2 years before the index date of RA onset or matched date for controls. This requirement reduced the possibility of reverse causation. We selected 2 years because RA-related autoantibodies appear several years before clinical onset of RA.21,22 Participants with at least 1 respiratory tract disease code before index date but not meeting both criteria were classified as “indeterminate” for that particular disease. The reference or “unexposed” group consisted of the participants with no respiratory tract disease code of any kind before index date of RA diagnosis.
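The exposure rules described above can be sketched as a small classification function for one participant and one disease. This is an illustration under stated assumptions, not the study's code: the function name, the setting labels ("inpatient", "emergency", "outpatient"), the approximation of 2 years as 730 days, and the reading of "30 days apart" as the span between the earliest and latest qualifying outpatient codes are all assumptions.

```python
from datetime import date, timedelta

LAG_DAYS = 2 * 365      # assumption: the 2-year lag approximated as 730 days
CHRONIC_GAP_DAYS = 30   # minimum spacing of outpatient codes for chronic diseases


def classify_exposure(codes, index_date, chronic):
    """Classify one participant for one respiratory tract disease.

    codes: list of (code_date, setting) tuples; the setting labels are
    hypothetical. Returns "exposed", "indeterminate", or "no_code".
    """
    prior = [(d, s) for d, s in codes if d < index_date]
    if not prior:
        return "no_code"

    # Only codes at least 2 years before the index date can qualify.
    cutoff = index_date - timedelta(days=LAG_DAYS)
    lagged = [(d, s) for d, s in prior if d <= cutoff]

    # One inpatient or emergency department code is sufficient.
    if any(s in ("inpatient", "emergency") for _, s in lagged):
        return "exposed"

    outpatient = sorted(d for d, s in lagged if s == "outpatient")
    if len(outpatient) >= 2:
        if not chronic:
            return "exposed"  # acute diseases: no spacing requirement
        if (outpatient[-1] - outpatient[0]).days >= CHRONIC_GAP_DAYS:
            return "exposed"

    # At least one code before the index date, but the criteria are not met.
    return "indeterminate"


idx = date(2015, 6, 1)
close_pair = [(date(2012, 1, 1), "outpatient"), (date(2012, 1, 10), "outpatient")]
far_pair = [(date(2012, 1, 1), "outpatient"), (date(2012, 3, 1), "outpatient")]
```

Note that "no_code" applies to a single disease. The unexposed reference group in the text requires no respiratory tract disease code of any kind before the index date, so it would be derived by checking this result across all diseases for a participant.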
A secondary exposure of interest was respiratory tract disease burden. We defined this as the total number of respiratory tract disease codes (continuous), both overall and for acute respiratory tract diseases alone, and studied it in increments of 10 codes. We presumed that the number of codes for chronic respiratory tract diseases would be less useful as it might reflect healthcare utilization more than true respiratory burden.
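Reporting burden "in increments of 10 codes" amounts to rescaling a per-code effect. If a logistic model estimates a log-odds coefficient β per single code, the OR per 10 codes is exp(10β), and the Wald confidence limits scale the same way. The sketch below illustrates this arithmetic only; the function name and the input values are hypothetical and are not estimates from this study.

```python
import math


def or_per_increment(beta_per_code, se_per_code, increment=10, z=1.96):
    """Rescale a per-code log-odds coefficient to an OR per `increment` codes.

    Returns (odds_ratio, lower, upper) with a Wald confidence interval.
    """
    log_or = increment * beta_per_code
    half_width = z * increment * se_per_code
    return (
        math.exp(log_or),
        math.exp(log_or - half_width),
        math.exp(log_or + half_width),
    )


# Hypothetical coefficient and standard error, for illustration only.
or_10, lower, upper = or_per_increment(beta_per_code=0.02, se_per_code=0.01)
print(f"OR per 10 codes: {or_10:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Dividing the code count by 10 before fitting the model gives the same result, since the coefficient then directly represents a 10-code change.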
To determine the accuracy of using respiratory tract diagnosis codes as the exposures in this study, we performed a manual medical record review of 50 cases of each respiratory tract disease. We defined the gold standard for each respiratory tract disease based on published criteria for each disease.28–32 More specifically, acute sinusitis required up to 4 weeks of purulent nasal drainage and nasal obstruction, facial pain, or both,28 whereas acute pharyngitis required documented symptoms of throat pain, dysphagia, or nasopharyngitis. Diagnosis of chronic sinusitis required documentation of inflammation plus at least 2 of 4 sinusitis symptoms for ≥ 12 weeks.28 Chronic rhinitis/pharyngitis required consistent symptoms including paroxysms of sneezing, rhinorrhea, nasal obstruction, nasal itching, postnasal drip, cough, throat pain, and/or dysphagia for ≥ 12 weeks.29 Pneumonia required demonstrable infiltrate by chest radiograph or other imaging technique.30 Diagnosis of asthma required all 3 of the following: (1) history of cough with wheezing or dyspnea, (2) variability in symptoms, and (3) at least 2 supportive measures as defined previously.31 Finally, COPD required the forced expiratory volume in 1 second to forced vital capacity (FEV1/FVC) ratio to be < 0.7 or less than the lower limit of normal and incompletely reversible after the administration of an inhaled bronchodilator, or a computed tomography scan showing emphysema.32 The PPVs by the gold standard definitions and by physician diagnosis were all high (means 72% and 86%, respectively; Supplementary Table 1, available from the authors on request).
Covariates. We chose covariates and potential confounders that were known risk factors for presence of respiratory tract diseases and RA, including age at index date, sex, length of EHR history in the Partners HealthCare system before index date of RA or matched date in years, biobank enrollment year, race/ethnicity (White, non-Hispanic vs other), education (4-year college or Master’s/Doctoral/professional degree vs less), BMI (< 20, 20 to < 25, 25 to < 30, ≥ 30 kg/m2), and smoking (current, past, or never, as well as continuous pack-yrs; all as of MGB Biobank enrollment on the health survey). Length of EHR history, sex, race/ethnicity, and BMI came from EHR data. We obtained enrollment year, education, and smoking data from MGB Biobank enrollment survey data. To fill in missing BMI and smoking data, we performed manual medical record review.
Statistical analysis. We used chi-square tests to compare proportions and Wilcoxon rank-sum tests with medians and IQRs to compare continuous variables between cases and controls. For participants with missing education (15%), we used logistic regression imputation to predict high or low educational level as a function of sex, case/control status, and race. All other covariates had no missing data. For our primary analysis, we used multivariable conditional logistic regression models to calculate ORs with 95% CIs for the association between each respiratory tract disease and incident RA. We examined any respiratory tract disease, specific respiratory tract diseases, and respiratory tract disease burden, as detailed above. The models adjusted for age, sex, length of EHR history, MGB Biobank enrollment year, race/ethnicity, education, BMI, smoking status, and smoking pack-years.
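To make the conditional logistic regression step concrete, the sketch below fits a conditional logit with a single exposure covariate by Newton's method. Each matched set contributes exp(βx_case) / Σ exp(βx_j) to the conditional likelihood, where the sum runs over the case and its matched controls. This is a simplified illustration, not the study's analysis: the actual models were multivariable and fitted in SAS, and the function name and data here are hypothetical.

```python
import math


def fit_conditional_logit(matched_sets, n_iter=25):
    """Fit a one-covariate conditional logistic model by Newton's method.

    matched_sets: list of (case_x, [control_x, ...]) with one case per set.
    Returns (beta, se), where exp(beta) is the estimated odds ratio.
    """
    beta = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for case_x, control_xs in matched_sets:
            xs = [case_x] + list(control_xs)
            weights = [math.exp(beta * x) for x in xs]
            total = sum(weights)
            # Weighted mean and second moment of x within the matched set.
            mean_x = sum(w * x for w, x in zip(weights, xs)) / total
            mean_x2 = sum(w * x * x for w, x in zip(weights, xs)) / total
            score += case_x - mean_x          # first derivative of log-likelihood
            info += mean_x2 - mean_x ** 2     # negative second derivative
        if info == 0:
            raise ValueError("no informative matched sets")
        step = score / info
        beta += step
        if abs(step) < 1e-10:
            break
    return beta, 1.0 / math.sqrt(info)


# Hypothetical 1:1 matched data (1 = exposed, 0 = unexposed).
pairs = (
    [(1, [0])] * 6    # case exposed, control unexposed
    + [(0, [1])] * 3  # case unexposed, control exposed
    + [(1, [1])] * 4  # concordant pairs carry no information
)
beta, se = fit_conditional_logit(pairs)
odds_ratio = math.exp(beta)
ci = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
```

With 1:1 matching the estimate reduces to the ratio of discordant pairs (6/3 here), which gives a convenient check. The same function accepts 1:3 sets such as `(1, [0, 0, 1])`, matching the design described in the text.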
We also conducted stratified analysis by RA serostatus. We compared seropositive RA cases only to their matched comparators so that the matching was preserved. Patients with missing RA-related autoantibodies were excluded. We also performed stratified analyses based on smoking status (ever or never) and smoking pack-years (never smoker, > 0–10 pack-yrs, and > 10 pack-yrs). To investigate whether the relationship between respiratory tract diseases and RA varied by smoking status, we tested for multiplicative interactions between each respiratory tract disease and smoking for risk of RA. All the above analyses were prespecified in our protocol and performed using SAS version 9.4 (SAS Institute) except for logistic regression imputation, for which we used R Core Team (2020) MICE package (R Foundation for Statistical Computing). We considered 2-sided P < 0.05 as statistically significant.
RESULTS
Study sample characteristics. This study included 741 incident RA cases and 2223 matched controls (median age 55 yrs, 76% female). Compared to controls, RA cases were more likely to have slightly longer EHR history, non-White or Hispanic race, lower education, higher BMI, and higher smoking exposure (Table 2). Median time from index date to Biobank enrollment was 3.15 years for RA cases (IQR 0.55–8.12) and 4.04 years for controls (IQR 0.54–8.55, P = 0.26). Among the 741 RA cases, 426 (57%) were seropositive, 303 (41%) were seronegative, and 12 (1.6%) had no available test results for RF or ACPA. Within the 346 (47%) of RA cases with genotyping data available, 200 (58%) were positive for the shared epitope (data not shown).
Respiratory tract diseases and RA. Any respiratory disease exposure occurred in 346 (47%) of the RA cases and 931 (42%) of the controls (P = 0.007). After adjusting for covariates, specific respiratory tract diseases including acute sinusitis (OR 1.61, 95% CI 1.05–2.45), chronic sinusitis (OR 2.16, 95% CI 1.39–3.35), and asthma (OR 1.39, 95% CI 1.03–1.87) were associated with increased odds of RA (Table 3).
Acute respiratory tract disease burden was associated with increased odds of newly diagnosed RA (OR 1.30 per 10 codes, 95% CI 1.08–1.55). Overall burden of respiratory tract disease, as assessed by total number of respiratory tract disease diagnosis codes, was not associated with increased risk of RA (OR 1.01 per 10 codes, 95% CI 0.98–1.05). Of note, indeterminate respiratory tract disease exposures were not associated with RA (data not shown).
Results by RA serostatus. Overall, the association between respiratory tract diseases and RA was similar for seropositive and seronegative RA (P for interaction 0.60 for any respiratory disease, and P > 0.05 for all individual respiratory tract exposures; Table 3). Some respiratory tract diseases, including acute sinusitis and chronic rhinitis/pharyngitis, were associated only with seropositive RA (OR 1.89, 95% CI 1.03–3.49 and OR 2.46, 95% CI 1.01–5.99, respectively; Table 3). Others, including acute pharyngitis and chronic sinusitis, were associated only with seronegative RA (OR 1.68, 95% CI 1.02–2.74 and OR 3.23, 95% CI 1.65–6.32, respectively; Table 3). In a posthoc analysis, the association between respiratory diseases and seropositive RA did not differ substantially when restricting to cases with ACPA greater than twice the upper limit of normal (Supplementary Table 2, available from the authors on request).
Stratification by smoking. Similar to serostatus, the interaction between respiratory exposure and smoking was not statistically significant (P = 0.10 for any respiratory disease, and P > 0.05 for all individual respiratory tract exposures). By point estimates, however, nearly every respiratory exposure was more strongly associated with RA in smokers than in nonsmokers (Table 4). Stratifying by smoking pack-years revealed that respiratory tract diseases tended to be associated with increased risk of RA only in individuals who had smoked > 10 pack-years, but P was > 0.05 for all interactions (Table 5).
DISCUSSION
This large case-control study identified acute and chronic sinusitis and pharyngitis as potential risk factors for incident RA. Acute respiratory tract disease burden was associated with increased risk of RA as well. The association between respiratory tract diseases and RA occurred for both seropositive and seronegative RA and tended to be stronger in smokers. These results extend the mucosal paradigm for RA pathogenesis to include the upper respiratory tract and warrant further epidemiologic, microbial, and genetic studies of this novel anatomic site for RA-related autoimmunity.
The association between specific upper airway diseases and RA has received little prior study. Although no prior studies have directly examined pharyngitis and risk of RA, a recent study did show that 3 types of upper respiratory viruses (coronavirus, parainfluenza, and metapneumovirus) increased risk of incident RA.19 This study therefore provides further rationale to investigate whether SARS-CoV-2 could affect RA risk even after the resolution of acute infection.33,34 A metaanalysis restricted to high-quality studies of allergic rhinitis also showed an association with RA,35 supporting our findings for chronic pharyngitis/rhinitis. The single previous study of sinusitis showed no association with RA, but it used self-reported diseases only within 2 years of RA onset.36 Overall, these results suggest that the upper airway may be a novel mucosal site of RA origination but will require replication.
The total burden of acute respiratory tract diseases was also associated with increased risk of RA. While the point estimate was modest, many acute respiratory tract diseases do not come to medical attention, which could bias results toward the null. No prior studies have examined respiratory tract burden and RA risk. Although these novel findings will require replication, they suggest that interventions to reduce respiratory tract diseases, including masking, handwashing, and vaccination, could affect RA risk.
The finding that so many different respiratory tract disease types are now associated with RA raises the question of whether the relationship is truly causal or confounded as an epiphenomenon. For example, broader immune system dysregulation could predispose one to both respiratory tract diseases and RA, with the respiratory tract diseases simply manifesting first. However, even noninfectious respiratory tract diseases such as asthma5–12 and COPD6,13,14 have been consistently shown to be associated with RA. Further, respiratory tract diseases including asthma,37 COPD,38 and interstitial lung disease (ILD)39 have been shown to generate ACPA. Overall, our results support a growing body of evidence that implicates the oral and respiratory mucosa in RA pathogenesis.40 Future studies examining the timing between respiratory tract diseases and RA onset could also help determine the likelihood of causality and optimal window for prevention.
In this study, the association between respiratory tract diseases and RA was stronger in heavy smokers than in nonsmokers for nearly all respiratory tract disease types. A similar study using EIRA suggested the reverse association,15 whereas another study of asthma alone showed no difference in association by smoking status.6 Besides chance, a reason for this discrepancy could be that this study selected controls with presence of smoking data. As a result, more controls might have been nonsmokers, giving the illusion that the association between respiratory tract disease and RA occurred more often in smokers. Unlike the EIRA study, our study also required a 2-year lag between exposure and RA. The interaction between smoking and respiratory tract disease may act differently for recent vs distant respiratory tract diseases in relation to timing of RA onset. Other possibilities include population differences. For example, the gene–environment interaction between smoking and the shared epitope is strong in Sweden,27,41,42 but this has not been consistently replicated in North America.43,44 Population genetic differences may underlie this discrepancy, and could contribute to different mechanisms of RA pathogenesis. Further studies of respiratory tract diseases and smoking in combination with RA genetics are needed.
Strengths of this study include its large sample size, incident RA cases, manual verification of RA, the timing of RA onset, data on RA serological status, and adjustment for many important confounders including smoking. Limitations of this study first include the potential for selection bias among MGB Biobank participants, especially those who answered the health survey, who may differ from the surrounding general population. In addition, this population of participants may not generalize to other populations outside of the greater Boston area and/or White, non-Hispanic race. These population differences could bias results either toward or away from the null. Second, the sample size precluded study of certain diseases of interest such as influenza and ILD. Third, misclassification of respiratory tract disease exposures is possible. Although the PPVs were high, these codes did not capture events that occurred outside the MGB system or early life exposures. However, this form of misclassification would likely bias observed effects toward the null. Fourth, residual confounding is possible. For example, we did not adjust for pollution or other inhalational agents, and smoking status measured at time of MGB Biobank enrollment may not reflect the precise smoking history prior to RA, although this would be more likely to affect smoking pack-years than smoking status. Fifth, reverse causation, where RA increases risk of respiratory tract diseases, is possible. Although we required respiratory tract diseases to occur at least 2 years before date of clinical onset of RA, further investigation of the association of timing of respiratory tract exposures and RA is warranted. Finally, our analyses did not adjust for multiple comparisons and thus require replication.
In summary, we identified sinusitis, pharyngitis, and acute respiratory tract burden as potential novel risk factors for RA. These findings extend the mucosal hypothesis of RA pathogenesis to the upper airway. Future studies should investigate whether additional upper respiratory tract diseases such as SARS-CoV-2 infection are associated with RA and define the timing, microbiology, and genetic underpinnings of these associations.
ACKNOWLEDGMENT
We would like to thank the participants and staff of the Mass General Brigham Biobank who made this study possible.
Footnotes
This study was supported by the National Institute of Arthritis and Musculoskeletal and Skin Diseases awards NIH-P30-AR072577 (VERITY Pilot & Feasibility award to VLK), K23 AR069688 (JAS), R03 AR075886 (JAS), L30 AR066953 (JAS), R03 HL148484 (TJD), and P30 AR070253 (Joint Biology Consortium). It was also supported by the R Bridge Award (JAS) from the Rheumatology Research Foundation. The funders had no role in the decision to publish or preparation of this manuscript. The content is solely the responsibility of the authors and does not necessarily represent the official views of Harvard University, its affiliated academic health care centers, or the National Institutes of Health.
The authors declare no conflicts of interest related to this article.
- Accepted for publication October 5, 2021.
- Copyright © 2022 by the Journal of Rheumatology
This is an Open Access article, which permits use, distribution, and reproduction, without modification, provided the original article is correctly cited and is not used for commercial purposes.