Abstract
Objective. Previous studies suggested that distinct phenotypes of eosinophilic granulomatosis with polyangiitis (EGPA; formerly known as Churg-Strauss syndrome) could be determined by the presence or absence of antineutrophil cytoplasmic antibodies (ANCA), reflecting predominant vasculitic or eosinophilic processes, respectively. This study explored whether ANCA-based clusters or other clusters can be identified in EGPA.
Methods. This study used standardized data of 15 European centers for patients with EGPA fulfilling widely accepted classification criteria. We used multiple correspondence analysis, hierarchical cluster analysis, and a decision tree model. The main model included 10 clinical variables (musculoskeletal [MSK], mucocutaneous, ophthalmological, ENT, cardiovascular, pulmonary, gastrointestinal, renal, central, or peripheral neurological involvement); a second model also included ANCA results.
Results. The analyses included 489 patients diagnosed between 1984 and 2015. ANCA were detected in 37.2% of patients, mostly perinuclear ANCA (85.4%) and/or antimyeloperoxidase (87%). Compared with ANCA-negative patients, those with ANCA had more renal (P < 0.001) and peripheral neurological involvement (P = 0.04), fewer cardiovascular signs (P < 0.001), and fewer biopsies with eosinophilic tissue infiltrates (P = 0.001). The cluster analyses generated 4 (model without ANCA) and 5 clusters (model with ANCA). Both models identified 3 identical clusters of 34, 39, and 40 patients according to the presence or absence of ENT, central nervous system, and ophthalmological involvement. Peripheral neurological and cardiovascular involvement were not predictive characteristics.
Conclusion. Although reinforcing the known association of ANCA status with clinical manifestations, cluster analysis does not support a complete separation of EGPA in ANCA-positive and -negative subsets. Collectively, these data indicate that EGPA should be regarded as a phenotypic spectrum rather than a dichotomous disease.
- antineutrophil cytoplasmic antibody–associated vasculitis
- eosinophilic granulomatous vasculitis
- vasculitis
Eosinophilic granulomatosis with polyangiitis (EGPA; formerly Churg-Strauss syndrome) is a rare multisystemic disease of unknown etiology that occurs mainly in middle-aged people with late-onset asthma, frequent nasal polyposis, and involvement of various organs.1-4 Laboratory hallmarks of EGPA are eosinophilia and positive antineutrophil cytoplasmic antibodies (ANCA), which are detectable in 30% to 40% of patients.5-13 Histological findings include granuloma, eosinophilic infiltration, and vasculitis, and EGPA can be seen as a condition at the crossroads of ANCA-associated vasculitides and hypereosinophilic disorders.14 Indeed, EGPA shares features of other types of ANCA-associated small-vessel vasculitis (eg, purpura, pauci-immune glomerulonephritis, and alveolar hemorrhage), and of conditions characterized by eosinophilic proliferation, such as cardiomyopathy or eosinophilic pneumonia.3,15
Recently, the concept has emerged that EGPA is a syndromic entity that might comprise 2 distinct vasculitis- and eosinophilia-driven subsets. This concept is mainly substantiated by observations indicating that ANCA-positive and ANCA-negative EGPA are associated with distinct clinical features. Available data suggest that renal disease is almost exclusively found in ANCA-positive cases, which also show peripheral neuropathy more commonly. In contrast, ANCA-negative EGPA is characterized by more frequent heart involvement.5-13 Whether EGPA exists as 2 distinct disease subsets is still uncertain, but this question has potentially important implications as it might help to understand its etiopathogenesis and possibly lead to more effective phenotype-targeted treatment approaches. However, it remains unproven whether such a clustering of clinical patterns linked with vasculitis or eosinophilia does occur at the patient level.
These considerations gave impetus to the present study, which aimed to explore whether cluster analysis can subclassify a large European cohort of patients with EGPA into clinically distinct subgroups.
METHODS
Selection of patients and case definition. This retrospective study considered patients who were clinically diagnosed with EGPA and who were followed between 1984 and 2015 in 1 of 15 tertiary referral centers for internal medicine, rheumatology, pulmonology, or nephrology in France, Germany, Italy, or Poland. Centers with known expertise in the field of vasculitis were asked to participate in this study.
We included patients who had a diagnosis of vasculitis and fulfilled the American College of Rheumatology (ACR) classification criteria for EGPA and/or the criteria used by Lanham et al.1,2 As asthma is a prominent feature in EGPA, we did not retain patients for whom this characteristic was absent. We also excluded patients for whom the investigator retained a different diagnosis in the end. In light of the high diagnostic value of ANCA, we additionally defined “modified ACR criteria” by including the presence of ANCA as a seventh classification item, with the requirement that at least 4 classification items were fulfilled. In addition, for Lanham criteria, we also accepted biopsy demonstrating extravascular eosinophils as a surrogate for blood eosinophilia. Thus, patients satisfying none of these sets of criteria were excluded from the analyses. In 2022, the ACR/European Alliance of Associations for Rheumatology (EULAR) classification criteria for EGPA were published.4 The fulfillment of these criteria was also checked among the selected patients. For the purpose of the cluster analysis, we excluded the patients who had incomplete information for the input variables (see below). The inclusion criteria are summarized in Supplementary Table S1 (available from the authors upon request).
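The "modified ACR criteria" rule described above (ANCA added as a seventh item, at least 4 items required) can be sketched as a simple count. This is a minimal Python illustration, not the authors' code; the item keys are our own paraphrase of the 1990 ACR items and should be checked against the cited criteria, and the function name and example patient are hypothetical.

```python
# Item keys are illustrative paraphrases of the 1990 ACR items (assumption),
# plus ANCA as the seventh item described in the text.
ACR_1990_ITEMS = (
    "asthma",
    "eosinophilia_gt_10pct",
    "neuropathy",
    "nonfixed_pulmonary_infiltrates",
    "paranasal_sinus_abnormality",
    "extravascular_eosinophils",
)
MODIFIED_ACR_ITEMS = ACR_1990_ITEMS + ("anca_positive",)


def fulfills_modified_acr(findings, threshold=4):
    """Return True when at least `threshold` of the 7 items are present."""
    n_present = sum(bool(findings.get(item, False)) for item in MODIFIED_ACR_ITEMS)
    return n_present >= threshold


# Hypothetical patient (NOT study data): 3 classic items plus ANCA.
example = {
    "asthma": True,
    "eosinophilia_gt_10pct": True,
    "paranasal_sinus_abnormality": True,
    "anca_positive": True,
}
print(fulfills_modified_acr(example))
```

Missing keys are treated as absent here; a real implementation would need to distinguish "absent" from "not assessed".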
Data collection. Clinical data were collected retrospectively using a standardized case report form in Microsoft Excel. The spreadsheet included prefilled items as scroll-down menus in order to permit homogeneous data collection, with a precise definition given for each item. All variables were cumulative data that included signs or symptoms at diagnosis or at any time during the follow-up.
Clinical variables were divided into the following 10 domains and, if relevant, further described by the specific pattern of involvement. These domains were musculoskeletal (MSK) symptoms, mucocutaneous symptoms, ophthalmologic symptoms, ENT symptoms, pulmonary involvement, cardiovascular symptoms, gastrointestinal (GI) symptoms, renal involvement, peripheral neuropathy, and central nervous system (CNS) or psychiatric symptoms. Pulmonary involvement was defined as the presence of nonfixed pulmonary infiltrates on radiographs and/or pleural effusion confirmed by images and/or alveolar hemorrhage signs; asthma was excluded from that definition and listed separately. Renal involvement was defined as the presence of impaired renal function, and/or a proteinuria ≥ 1+ on urinalysis or > 0.2 g in a 24-hour sample or > 0.2 on a protein-to-creatinine ratio (spot urine sample), and/or hematuria ≥ 1+ on urinalysis or an active urinary sediment. Other symptoms (MSK, mucocutaneous, ophthalmologic, ENT, cardiovascular, GI, neurological) were defined as signs that were considered by the investigator to be directly linked with the diagnosis of EGPA.
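The definition of renal involvement given above combines several alternative findings and can be written as a single boolean rule. The following Python sketch is illustrative only: the function and parameter names are hypothetical, and "impaired renal function" is passed in as an investigator judgment because the text gives no numeric threshold for it.

```python
def has_renal_involvement(
    impaired_function=False,        # investigator-assessed (no cutoff given in the text)
    dipstick_protein=0,             # urinalysis grade: 0, 1 (= 1+), 2, ...
    protein_24h_g=None,             # grams in a 24-hour sample, if measured
    protein_creatinine_ratio=None,  # spot urine ratio, if measured
    dipstick_blood=0,               # urinalysis grade for hematuria
    active_sediment=False,
):
    """Apply the renal involvement definition: any one criterion is sufficient."""
    proteinuria = (
        dipstick_protein >= 1
        or (protein_24h_g is not None and protein_24h_g > 0.2)
        or (protein_creatinine_ratio is not None and protein_creatinine_ratio > 0.2)
    )
    hematuria = dipstick_blood >= 1 or active_sediment
    return impaired_function or proteinuria or hematuria


# Hypothetical values (NOT study data).
print(has_renal_involvement(protein_24h_g=0.3))
print(has_renal_involvement(protein_24h_g=0.2))
```

The comparisons for the 24-hour sample and the protein-to-creatinine ratio are strict, matching the "> 0.2" wording, whereas the urinalysis grades use "≥ 1+".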
Laboratory data included blood eosinophil count at diagnosis and at the time of relapses, as well as the presence of ANCA, which was defined as a positive ANCA (by immunofluorescence or ELISA), at any time. If available, data on biopsy findings were recorded with respect to the site of biopsy and their findings which, if abnormal, were categorized into 3 main findings of eosinophilic infiltrate, vasculitis, and extravascular granuloma.
Outcomes were described by duration of follow-up and in terms of death or relapses. Medication was recorded with respect to the entire course of disease, independent from the sequence or length of treatment regimen.
Ethical approval was obtained by local and national ethics committees in accordance with local regulatory requirements for noninterventional retrospective studies.
Statistical analyses. Descriptive data were presented as percentages or means (with SD). Between-group comparisons were made by chi-square statistics (or Fisher exact test when necessary) for categorical variables and t test for continuous variables, and the multivariate analyses used a logistic regression. Estimations were provided with a 95% CI. Statistical tests were 2-tailed and statistical significance was defined as P < 0.05.
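As an illustration of the univariate comparisons described above, the following Python sketch computes a Pearson chi-square test for a 2 × 2 table using only the standard library. It is not the authors' code (the analyses were run in R). The counts are invented for illustration and are not study data, and no continuity correction is applied, which is an assumption because the text does not specify one.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test for a 2 x 2 table (1 degree of freedom, uncorrected)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts (NOT study data):
# rows = ANCA-positive / ANCA-negative, columns = feature present / absent.
toy_table = ((30, 10), (20, 40))
chi2, p = chi_square_2x2(toy_table)
print(f"chi-square = {chi2:.2f}, P = {p:.5f}")
```

When expected cell counts are small, the Fisher exact test mentioned in the text would be used instead; it is not implemented in this sketch.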
Cluster analysis is a statistical approach based on multidimensional modeling that assesses the relationship between given characteristics and creates homogeneous subgroups of patients. We first performed a multiple correspondence analysis to optimize the dimensions of the dataset. We then used a hierarchical cluster analysis based on Euclidean distance and the Ward method to identify subgroups of patients according to their characteristics.16 The main cluster analysis considered 10 categorical input variables, at diagnosis or at relapse: MSK symptoms, mucocutaneous symptoms, ophthalmologic symptoms, ENT symptoms, pulmonary involvement, cardiovascular symptoms, GI symptoms, renal involvement, peripheral neuropathy, and CNS or psychiatric symptoms. A second model included ANCA positivity as an additional input variable, to explore whether ANCA had an effect on the formation of subgroups. As a result of the analysis, dendrograms were computed and the number of clusters was determined by visual inspection (horizontal cut at the level of higher dissimilarity between clusters) and by checking the gain in within-cluster inertia achieved at each clustering step. The generated clusters were described by their most prominent summary characteristics.
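The Ward criterion used in the hierarchical step can be illustrated with a small pure-Python sketch. It is a simplified stand-in and not the authors' R pipeline: the multiple correspondence analysis step is omitted, the clustering runs directly on binary indicators, and the toy patients are hypothetical. At each step, the pair of clusters whose merger causes the smallest increase in within-cluster sum of squares (squared Euclidean distance between centroids, weighted by cluster sizes) is merged.

```python
from itertools import combinations


def centroid(points):
    """Mean vector of a list of equal-length numeric tuples."""
    return [sum(column) / len(points) for column in zip(*points)]


def ward_cost(cluster_a, cluster_b):
    """Increase in within-cluster sum of squares caused by merging two clusters."""
    ca, cb = centroid(cluster_a), centroid(cluster_b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    na, nb = len(cluster_a), len(cluster_b)
    return na * nb / (na + nb) * squared_distance


def ward_clustering(points, n_clusters):
    """Merge the cheapest pair (Ward criterion) until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: ward_cost(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical patients (NOT study data).
# Each tuple is (ENT, CNS, ophthalmological), with 1 = involvement present.
toy_patients = [(1, 0, 0)] * 4 + [(1, 1, 0)] * 2 + [(0, 0, 0)] * 2
toy_clusters = ward_clustering(toy_patients, n_clusters=3)
for members in toy_clusters:
    print(len(members), sorted(set(members)))
```

This naive version recomputes every pairwise cost at each step and is only suitable for small examples; in practice the number of clusters is chosen from the dendrogram rather than fixed in advance, as described above.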
Decision tree models were constructed to assess if there were specific variables that predicted the partition into clusters. For this analysis, we used a classification and regression tree (CART) algorithm based on the Gini impurity index.17 Again, 2 models were computed, with and without ANCA positivity as an input variable.
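To make the splitting criterion concrete, the following Python sketch computes the Gini impurity of a set of cluster labels and picks the binary variable whose split gives the lowest weighted impurity, which is the core step of a CART-style tree. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names are ours and the toy rows and labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(rows, labels, feature):
    """Weighted Gini impurity after splitting on one binary feature."""
    absent = [y for x, y in zip(rows, labels) if x[feature] == 0]
    present = [y for x, y in zip(rows, labels) if x[feature] == 1]
    n = len(labels)
    return len(absent) / n * gini(absent) + len(present) / n * gini(present)


def best_feature(rows, labels):
    """Index of the feature whose split yields the lowest weighted impurity."""
    return min(range(len(rows[0])), key=lambda f: split_impurity(rows, labels, f))


# Hypothetical data (NOT study data): columns are (ENT, CNS), labels are cluster names.
toy_rows = [(1, 0), (1, 0), (0, 0), (0, 1)]
toy_labels = ["a", "a", "b", "b"]
print(gini(toy_labels))
print(best_feature(toy_rows, toy_labels))
```

A full tree applies this choice recursively to each resulting subset until the nodes are pure or a stopping rule is met.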
All statistical analyses were conducted with R, version 4.2.0 (www.r-project.org).
RESULTS
Patient selection and characteristics. A total of 15 centers contributed to the dataset, including departments of nephrology, internal medicine, rheumatology, and pulmonology. The centers were located in France (2 centers), Germany (4 centers), Italy (8 centers), and Poland (1 center). Patients without asthma (36 patients) or with an alternative final diagnosis (other systemic vasculitis, allergic bronchopulmonary aspergillosis, or other hypereosinophilic syndromes; 12 patients) were excluded. Among the remaining 618 patients, 466 (75.4%) fulfilled the modified ACR criteria and 486 (78.6%) fulfilled Lanham criteria. All patients included in the analysis fulfilled the 2022 ACR/EULAR classification criteria. Only complete cases could be analyzed; 45 patients with missing data (most often regarding MSK symptoms [33.3%] and ANCA [44.4%]) were excluded. The analyzed dataset included 489 EGPA cases diagnosed between 1984 and 2015. Figure 1 shows the flow chart of patient selection.
Figure 1. Flow chart of patient selection. ACR: American College of Rheumatology; DCVAS: Diagnostic and Classification Criteria in Vasculitis Study.
Table 1 shows the cumulative characteristics of the analyzed patients, both overall and according to ANCA status. The mean age at diagnosis was 50.8 (SD 15.0) years and 53.8% of the patients were female. The most frequently observed manifestations were ENT symptoms (91%), peripheral neuropathy (71%), pulmonary involvement (58.9%), and MSK symptoms (53.8%). The blood eosinophil count was > 1 × 10⁹ cells/L and/or > 10% of white blood cells in 81% of patients. ANCAs were detected in 37.2% of patients and were mostly perinuclear ANCA (85.4%) and/or antimyeloperoxidase (87%) according to the assay used. Other ELISA test results were ANCA with no antimyeloperoxidase or antiproteinase 3 specificity (8.4%) and ANCA with antiproteinase 3 specificity (4.5%). Biopsy results were available for 317 patients (64.8%) and showed anomalies consistent with EGPA in 271 of these cases (85.5%). The most frequent biopsy sites were cutaneous (28.4%), bronchopulmonary (27.4%), neuromuscular (18.3%), GI (17.4%), sinonasal (11.4%), renal (7.6%), and endomyocardial (5.4%). Findings included vasculitis in 52.4% of biopsies, eosinophilic infiltration in 72.9%, and granulomas in 14.2%. Vasculitis and eosinophilic infiltration were concurrent in 118 biopsies (37.2%). All patients received systemic glucocorticoids. The main immunosuppressive treatments reported were cyclophosphamide in 230 patients (47.0%), methotrexate in 172 patients (35.2%), and azathioprine in 145 patients (29.7%). Six patients received omalizumab (1.2%); mepolizumab was used in only 1 patient, as it had not yet been introduced for EGPA at the time of patient inclusion in this study. The median follow-up duration was 4.8 (IQR 2.1-8.9) years. Relapses were recorded in 257 patients (52.6%), and 28 patients (5.7%) died during follow-up.
Table 1. Main demographic, clinical, and histological characteristics of patients (overall and stratified by presence and absence of ANCA).
Associations between disease characteristics and ANCA status. Stratification of the disease characteristics by ANCA status found several statistically significant differences. Patients were more often men and older in the ANCA-positive group, although these differences were not statistically significant in the multivariate analysis. For ANCA-positive EGPA, there was a significant association with renal involvement (P < 0.001 in univariate and multivariate analysis) and peripheral neuropathy (P < 0.001 in univariate analysis; P = 0.04 in multivariate analysis). The absence of ANCA was positively associated with more frequent cardiovascular involvement (P < 0.001 in univariate analysis and multivariate analysis). With respect to histological data, vasculitis was more commonly reported for ANCA-positive cases (P = 0.01 in univariate analysis; P = 0.05 in multivariate analysis), and eosinophil infiltration was more frequent in ANCA-negative cases (P < 0.001 in univariate analysis; P = 0.001 in multivariate analysis). There was no difference in terms of blood eosinophil count between ANCA-positive and -negative patients.
Cluster analysis and decision tree analysis. The dendrograms of the cluster analyses are shown in Figure 2. On the graphical representation of the multiple correspondence analysis, displaying the most representative dimensions, there were no clearly individualized subgroups and the spatial position of individuals from each cluster overlapped (Figure 2).
Figure 2. Dendrograms of cluster analysis and graphical representation of multiple correspondence analysis in the most representative 2 dimensions. (A,B) Upper panels: the model without ANCA shows 4 clusters; (C,D) lower panels: the model with ANCA shows 5 clusters. ANCA: antineutrophil cytoplasmic antibody.
The first model, which did not include ANCA positivity as an input variable, generated 4 clusters. The characteristics of these clusters, of, respectively, 34 (cluster 1), 376 (cluster 2), 39 (cluster 3), and 40 patients (cluster 4), are displayed in Table 2. The comparisons showed that the frequencies of ophthalmological, CNS, and ENT involvement significantly differed across the clusters. Figure 3 shows the results of the decision tree model carried out to identify the variables that allowed assigning patients to each of the 4 clusters. The combination of the 3 variables—ophthalmological, CNS, and ENT involvement—assigned patients to their respective clusters with 100% accuracy. Based on this tree analysis, the largest cluster (cluster 2) was characterized by 100% ENT involvement with no CNS and no ophthalmologic involvement, and the other 3 smaller clusters by 100% ENT and ophthalmological involvement (cluster 1), 100% CNS involvement and no ophthalmological involvement (cluster 3), and no ENT and no CNS involvement (cluster 4).
Table 2. Patient characteristics of 4 groups identified by cluster analysis (model without ANCA).
Figure 3. Decision tree (model without ANCA), showing the implication of ENT, CNS, and ophthalmologic involvement. ANCA: antineutrophil cytoplasmic antibodies; CNS: central nervous system; Opht: ophthalmologic.
With the addition of ANCA positivity to the model, cluster analysis resulted in 5 clusters, including 3 (named cluster 1, 4, and 5 with 34, 39, and 40 patients, respectively) which remained unchanged compared to the model not including ANCA positivity as an input variable (Table 3). The other 2 clusters (clusters 2 and 3) resulted from the division of the largest cluster (identified in the model not including ANCA positivity), with 298 and 78 patients, respectively. Comparisons of the characteristics across the 5 clusters additionally showed a statistically significant difference with regard to the frequency of renal involvement (P < 0.001) and ANCA positivity (P < 0.001). Supplementary Figure S1 (available from the authors upon request) shows the results of the corresponding decision tree analysis, which additionally used renal involvement for the categorization, yielding a 99.6% accuracy for classification (ie, 487/489 patients were correctly categorized). Compared to the decision tree for the 4 clusters generated by the cluster analysis not including ANCA positivity as an input variable, the 2 additional clusters were characterized by 100% ENT involvement, with no renal, no CNS, and no ophthalmological involvement (cluster 2), and 100% ENT and renal involvement, with no CNS and no ophthalmological involvement (cluster 3). Even though cluster 3 had 100% positivity for renal involvement, it included only 78 of 104 patients (75%) with renal involvement in the total patient sample, showing that this variable was not exclusive to this cluster.
Table 3. Patient characteristics of 5 groups identified by cluster analysis (model with ANCA).
For both cluster analysis models, it is notable that the generated clusters did not differ with regard to peripheral neuropathy or cardiovascular involvement. There was also no significant difference among clusters in terms of relapses (P = 0.70 in the model without ANCA; P = 0.50 in the model with ANCA) and deaths (P = 0.10 in both models).
DISCUSSION
This study analyzed the cumulative data of 489 patients with EGPA fulfilling widely accepted classification criteria, with the specific aim of identifying homogeneous subgroups using cluster analysis. Although reinforcing the link between ANCA status and clinical signs, the cluster analyses did not reveal clinically meaningful subgroups based on the most relevant clinical features, such as kidney disease, peripheral neuropathy, or cardiovascular involvement. In particular, the results of our cluster analyses did not support the concept that EGPA can be divided into only 2 distinct phenotypes characterized by ANCA positivity and negativity.
However, our data confirm the well-known associations between ANCA status and clinical features, and they also add new insights with respect to this point. For the patients with ANCA-positive EGPA, we found a significantly higher percentage of peripheral neuropathy and renal disease, whereas for the patients with ANCA-negative EGPA, there was a higher frequency of cardiac involvement, in accordance with previous studies.5-13 Pertinently, the analysis of histological data further supports the idea that the presence or absence of ANCA is associated with a more vasculitic or eosinophilic pattern, respectively. Indeed, eosinophilic infiltrates were more common in ANCA-negative cases and, although not reaching statistical significance in the multivariate analysis, histological evidence of vasculitis was more common in ANCA-positive cases. We acknowledge that these data need to be interpreted bearing in mind that the likelihood of observing the individual histological findings may vary across biopsy sites and that the sites of sampling were not equally distributed between these patient subgroups.
To the best of our knowledge, this study is the first to perform a cluster analysis in the context of EGPA. For this multidimensional approach, we used multiple correspondence analysis to optimize the dimensions of the dataset displaying the relationships between the variables and identify, through hierarchical cluster analysis, subgroups of patients according to their characteristics.16 To assess whether ANCA positivity was an important element contributing to the formation of clusters, we defined 2 models, with and without the inclusion of ANCA as input variable. Decision tree analysis was then undertaken in an attempt to highlight the most distinctive characteristics of the clusters generated by the cluster analyses.17
With the premise that EGPA might have 2 distinct phenotypes driven by either systemic vasculitis or hypereosinophilia, and where ANCA positivity is closely linked with the vasculitic pattern, the cluster analyses produced apparently incongruous results. Most notably, neither cardiovascular involvement, peripheral neuropathy, nor ANCA positivity substantially affected the characteristics of the 4 and 5 clusters formed by the 2 cluster analysis models. Both cluster analyses generated 3 identical clusters, which accounted for no more than 23.1% of the full patient population and were essentially characterized by the presence or absence of ENT, CNS, and ophthalmological involvement. With the addition of ANCA positivity as an input variable, an additional small cluster characterized by 100% renal disease was created, but this cluster did not include all patients with renal involvement. Because the lack of ENT involvement and the presence of CNS, ophthalmological, and renal involvement represent the less common features of EGPA, and because cluster analysis may assign unduly high weights to uncommon features (with high variance), the partition was likely driven by the rarity of these manifestations and did not result in clinically relevant subclasses. This is also supported by the fact that both cluster models retained a large cluster, comprising 60.9% or 76.9% of all patients, with no particular distinctive clinical features.
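The proportions quoted above follow directly from the cluster sizes reported in the Results. The short Python check below reproduces them; the cluster sizes and the total of 489 patients are taken from the text, while the variable names are ours.

```python
TOTAL_PATIENTS = 489

# Cluster sizes as reported in the Results.
model_without_anca = {1: 34, 2: 376, 3: 39, 4: 40}
model_with_anca = {1: 34, 2: 298, 3: 78, 4: 39, 5: 40}


def share(n, total=TOTAL_PATIENTS):
    """Percentage of the analyzed cohort, rounded to 1 decimal."""
    return round(100 * n / total, 1)


# The 3 clusters common to both models contain 34, 39, and 40 patients.
shared_small_clusters = 34 + 39 + 40

print("Shared small clusters:", share(shared_small_clusters))
print("Largest cluster, model with ANCA:", share(max(model_with_anca.values())))
print("Largest cluster, model without ANCA:", share(max(model_without_anca.values())))
```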
We believe that the failure of cluster analysis to reproduce a separation into ANCA-positive and ANCA-negative EGPA reflects the substantial overlap of clinical features between these 2 entities. There is no clinical hallmark unique to ANCA-positive or ANCA-negative EGPA, thus highlighting that clinical criteria do not allow a clear-cut separation into subgroups. As suggested by our pathology findings, which indicate that many affected samples display both findings of vascular inflammation and eosinophilic infiltration, this could imply that clinical features are unreliable proxies for one particular underlying pathological process. For example, in patients with ANCA-negative EGPA and severe cardiomyopathy requiring heart transplantation, histological examination of the explanted hearts showed active vasculitis in most patients (78%).18 Similarly, observations regarding the efficacy of mepolizumab, an interleukin 5 inhibitor, in strongly reducing the number of circulating eosinophils and suppressing eosinophilic activation, show that it does provide benefits for both ANCA-negative and ANCA-positive EGPA.19
Hence, the present data support the understanding of EGPA as a spectrum intertwining eosinophilic proliferation and vasculitic processes to various extents rather than a disease with clear-cut subsets. The causes that may drive the eosinophilic proliferation or vasculitic inflammation are not well understood but may in part be due to genetic factors.20 This concept also counters the recent line of thought that the term EGPA should be reserved for the subset of cases with vasculitis findings, and that clinical forms lacking features of vasculitis should be viewed as a different category. Such cases, for which the term “hypereosinophilic asthma with systemic manifestations” was coined,21,22 cannot be reliably delineated at present, and it is unproven that these patients do not have EGPA. Also, a previous report of another entity in patients presenting with biopsy-proven vasculitis with eosinophilic infiltration but without any manifestation of asthma23 highlights the fact that the full range of EGPA-type conditions may not have been identified yet.
Our study has strengths and limitations. The main strengths are that it was based on a carefully selected cohort of cases diagnosed with EGPA from multiple countries and with homogeneously collected data. With close to 500 patients, this study represents one of the largest EGPA cohorts reported so far. Nevertheless, we acknowledge that this sample size may still be insufficient, as it has been suggested that the number of individuals studied in a cluster analysis should be 60 to 70 times the number of descriptor variables.24 Even though we cannot exclude this possibility, we believe it unlikely that use of a 1.5- to 2-times larger sample size would have resulted in substantially different findings. Replicating our findings in an independent cohort of similar size would have been desirable but seems unrealistic given the rarity of EGPA. We further recognize that our findings need to be interpreted within the boundaries of the definitions used to categorize symptoms, which may lack granularity, and that the presence of treatment at the time of diagnosis of EGPA, such as glucocorticoids for asthma, could have led to the underestimation of subsequent symptoms or eosinophilia. However, we believe that we used clinically sound and widely accepted definitions for symptoms of EGPA and that these issues are inherent to any EGPA cohort due to the nature of the disease. Finally, our results indicating that there were no significant between-cluster differences in terms of relapses and deaths should be viewed keeping in mind that the retrospective data collection prevented us from collecting detailed longitudinal information to perform more powerful time-to-event analyses.
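The sample size rule of thumb mentioned above (60 to 70 individuals per descriptor variable) can be turned into a quick calculation for the 2 models. This Python sketch only applies the multipliers quoted in the text to the 10 and 11 input variables; the function name is hypothetical.

```python
ANALYZED_PATIENTS = 489


def recommended_sample_range(n_variables, low=60, high=70):
    """Sample size range implied by the 60-to-70-per-variable rule of thumb."""
    return n_variables * low, n_variables * high


for label, n_variables in (("model without ANCA", 10), ("model with ANCA", 11)):
    lower, upper = recommended_sample_range(n_variables)
    print(
        f"{label}: {lower}-{upper} patients suggested, "
        f"i.e. {lower / ANALYZED_PATIENTS:.2f}x to {upper / ANALYZED_PATIENTS:.2f}x "
        f"the {ANALYZED_PATIENTS} analyzed"
    )
```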
The classification and subgrouping of EGPA continue to be a challenging area. It remains to be seen whether future research will provide new biomarkers (eg, etiological, immunological, or immunogenetic signatures) based on which clinically relevant subsets of EGPA could be defined more stringently than with ANCA status or clinical criteria. Thus, for the time being, EGPA should still be viewed as a heterogeneous but single entity.
Footnotes
The authors declare no conflicts of interest relevant to this article.
- Accepted for publication August 15, 2023.
- Copyright © 2023 by the Journal of Rheumatology