Abstract
Objective. Several studies have suggested that patients with rheumatoid arthritis (RA) presenting with ultrasound (US) synovitis despite clinical remission have more subsequent flares than those who show both clinical and sonographic remission. The objective of our study was to investigate whether these results could be translated to a real-life setting.
Methods. Within the Swiss Clinical Quality Management cohort of patients with RA, we compared the time from the first US performed in clinical remission to loss of remission (defined by a DAS28 > 2.6 or the need for stepping up treatment with disease-modifying antirheumatic drugs) between patients with and without US-detected synovitis. Analyses were repeated for different definitions of US-detected synovitis (US+) using greyscale, Doppler, and combined modes based on previously validated scores, and they were adjusted for relevant confounders.
Results. There were 318 RA patients with 378 remission phases included. Loss of clinical remission was observed in 60% of remission phases. Residual US synovitis was associated with a shorter duration of clinical remission (median 2–5 mos) and a moderately increased hazard ratio (HR) for loss of remission (HR 1.2–1.5), with the highest HR for the combined US score. The association between US+ and loss of remission was strongest when the US measurement had taken place early in remission (shorter median duration of 6–20 mos) and when followup time was limited to the first 3 or 6 months (most HR between 2–4).
Conclusion. US-detected synovitis, particularly when US is performed early in clinical remission, has a moderate predictive power for loss of remission in a real-life setting.
Subclinical synovial inflammation can be detected by ultrasonography in patients with rheumatoid arthritis (RA) in clinical remission, as demonstrated in several studies1,2,3, including observational cohort studies4,5. There is evidence that patients in clinical remission presenting with ultrasound (US)-detected residual synovitis might flare more often and have a shorter duration of remission than patients without6,7. These latter studies, based essentially on the use of the Doppler mode, have mostly been performed in single centers by a few highly skilled operators, on a limited number of patients8,9,10. It is currently unknown whether the predictive value of US regarding flares and the duration of remission remains valid in the context of a large group of US assessors using different US machines, as encountered in a real-life setting. Moreover, it remains unclear whether US should be recommended in everyday clinical practice to support therapeutic decisions, and to monitor clinical remission.
The Swiss Sonography in Arthritis and Rheumatism (SONAR) synovitis score is a US score that has been validated in real-life practice among many different operators11. Further, since 2009 it has been used as a tool for the sonographic followup of patients with RA in the Swiss Clinical Quality Management in Rheumatic Diseases (SCQM), an online database of a national RA cohort. The SCQM cohort allows the physician to correlate clinical data and US results in real time, and therefore to quickly adapt the treatment by consulting both evaluations.
Central to our study is the definition of US-detected residual synovitis, which has been validated among patients with RA and controls using the SONAR score4. The cutoffs for residual synovitis for B-mode, Doppler, and combined modes have been defined and published in previous studies4,12. Using these cutoffs, it has been shown that significant residual synovitis detected by US was present in more than one-third of patients considered to be in clinical remission, according to the 28-joint Disease Activity Score (DAS28) and the American College of Rheumatology/European League Against Rheumatism criteria12.
The objective of our study was to investigate the predictive value of US-detected residual synovitis on loss of remission in real-life settings using the SONAR score for the definition of US residual synovitis. Because US examinations may have been performed at any time during the remission period, we also evaluated whether the timepoint of US examination (early or late during remission) might affect the predictive power for the duration of remission. Moreover, we analyzed whether the predictive power of US-detected residual synovitis was higher when focusing on the short-term risk for loss of remission.
MATERIALS AND METHODS
Inclusion criteria
Included in our study were all disease-modifying antirheumatic drug (DMARD)–experienced patients diagnosed by the rheumatologist as having RA in the SCQM cohort, with ≥ 1 US score performed during clinical remission and ≥ 1 followup visit after US in clinical remission. Clinical remission was defined as DAS28-erythrocyte sedimentation rate (ESR) < 2.6. SONAR collection started in 2009 and data up to the end of January 2017 were included; a patient could contribute more than 1 remission phase. Patients in the SCQM signed an informed consent for contributing to the registry, and the design of the study received approval from the local ethics committee (CER-VD Switzerland: 89/14).
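For readers unfamiliar with the remission criterion, the following is a minimal sketch of how a DAS28-ESR value can be computed and compared with the 2.6 cutoff. It uses the commonly published DAS28-ESR formula; the function and parameter names are illustrative assumptions and do not come from the SCQM database.

```python
import math

REMISSION_CUTOFF = 2.6  # DAS28-ESR threshold for clinical remission used in the text


def das28_esr(tender28: int, swollen28: int, esr_mm_h: float, global_health_mm: float) -> float:
    """Commonly published DAS28-ESR formula.

    tender28 / swollen28: tender and swollen joint counts (0-28).
    esr_mm_h: erythrocyte sedimentation rate in mm/h (must be > 0).
    global_health_mm: patient global health on a 0-100 mm visual analog scale.
    """
    if esr_mm_h <= 0:
        raise ValueError("ESR must be positive to take its logarithm")
    return (
        0.56 * math.sqrt(tender28)
        + 0.28 * math.sqrt(swollen28)
        + 0.70 * math.log(esr_mm_h)
        + 0.014 * global_health_mm
    )


def in_clinical_remission(score: float) -> bool:
    """Remission as defined in the inclusion criteria: DAS28-ESR < 2.6."""
    return score < REMISSION_CUTOFF
```

The input values in any call are hypothetical; the sketch only illustrates how the threshold is applied to a single visit.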
US operators
More than 100 rheumatologists who were board-certified for performing US have been trained to measure the SONAR score. In the analysis dataset, 96% of US evaluations were by operators who had entered ≥ 5 US scores in the SCQM database, and a total of 63 operators contributed data to our study. Because most operators also represented the treating rheumatologist, they were not blinded to the clinical data in most cases.
US assessment
The score comprises 22 of the 28 joints included in DAS28 (shoulder and thumb were not assessed). Synovitis was graded on B-mode and on Doppler mode from 0 to 3 according to the Outcome Measures in Rheumatology criteria13. Moderate to substantial interrater agreement was reported for the SONAR score among rheumatologists who had performed ≥ 5 SONAR scores11. According to a previous study that evaluated 50 healthy controls and 300 patients with RA, the cutoffs for significant US-detected residual synovitis were the following: B-mode of ≥ 2 joints with synovitis grade ≥ 2 or a total B-mode score > 8 points (maximum score of 66 points), and any Doppler activity inside the joint4. A combined US synovitis score was defined as B-mode > 8 and/or any Doppler residual synovitis. Only complete US assessments, where all 22 joints had been examined, were considered. Patients in clinical remission with a baseline US above these cutoffs were considered to have significant US-detected residual synovitis (US+) and compared to those without (US−). Moreover, a reliability study with the same score among 12 physicians who had participated in the SONAR training has been published with satisfactory results11.
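The cutoffs described above can be written as a small classification routine. This is a sketch under stated assumptions: the data layout (two sequences of 22 per-joint grades) and all names are hypothetical, and it is not the code used for the published analysis.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

N_JOINTS = 22  # joints covered by the score, as stated in the text


@dataclass(frozen=True)
class UsFlags:
    bmode_joint_rule: bool  # >= 2 joints with B-mode grade >= 2
    bmode_total_rule: bool  # total B-mode score > 8 (maximum 66)
    doppler: bool           # any Doppler activity in any joint
    combined: bool          # B-mode total > 8 and/or any Doppler activity


def classify_us(bmode: Sequence[int], doppler: Sequence[int]) -> Optional[UsFlags]:
    """Return the US+ flags, or None when the assessment is incomplete."""
    if len(bmode) != N_JOINTS or len(doppler) != N_JOINTS:
        return None  # only complete 22-joint assessments were considered
    if any(g not in (0, 1, 2, 3) for g in (*bmode, *doppler)):
        raise ValueError("grades must be integers from 0 to 3")
    joint_rule = sum(1 for g in bmode if g >= 2) >= 2
    total_rule = sum(bmode) > 8
    any_doppler = any(g > 0 for g in doppler)
    return UsFlags(joint_rule, total_rule, any_doppler, total_rule or any_doppler)
```

Keeping the four definitions as separate flags mirrors the way the analyses are repeated for each definition of US+.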
Definition of residual time of remission
To evaluate the predictive value of US, the remaining time of clinical remission was calculated, starting from the first visit with a sonographic examination (baseline) performed in clinical remission (Figure 1, post-US time in remission).
The timepoint of loss of remission was defined as follows: (1) the first visit with DAS28-ESR > 2.6; (2) drug discontinuation because of insufficient effectiveness; or (3) indirect evidence of nonsatisfactory disease activity state by start of a synthetic (s-) or biologic (b-) DMARD. Only DMARD starts > 60 days after the baseline US were interpreted as a sign of loss of remission, because DMARD starts in close chronological vicinity of US are more likely to represent treatment adaptations based on US, rather than being indicative of a loss of remission. In the absence of all 3 indicators for loss of remission, the post-US time in remission was censored at the last clinical visit in remission (DAS28 ≤ 2.6). It is plausible that the residual time in remission depends on the duration the patient already spent in remission at the time of the US evaluation. Because the date of clinical examination did not necessarily correspond with the start of remission, left and right imputation were used to approximate the duration of remission at the time of the US examination (Figure 1). Left mode and right mode imputation are over- and underestimates, respectively, of the true duration in remission. For the left mode imputation, both clinical and medication data were evaluated to estimate the timing of start of remission.
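The three indicators and the censoring rule can be sketched as follows. The record layout and names are hypothetical, and the sketch assumes that only visits and drug discontinuations dated after the baseline US are relevant, which the text implies but does not state explicitly.

```python
from datetime import date
from typing import List, Tuple

DAS28_CUTOFF = 2.6
DMARD_START_LAG_DAYS = 60  # DMARD starts within this window are ignored


def post_us_time_in_remission(
    baseline: date,
    visits: List[Tuple[date, float]],   # (visit date, DAS28-ESR)
    stops_for_inefficacy: List[date],   # drug stopped for insufficient effectiveness
    dmard_starts: List[date],           # start of an sDMARD or bDMARD
) -> Tuple[int, bool]:
    """Return (days since baseline US, loss_of_remission_observed)."""
    later_visits = sorted(v for v in visits if v[0] > baseline)
    candidates: List[date] = []
    # (1) first visit with DAS28-ESR above the cutoff
    for visit_date, das28 in later_visits:
        if das28 > DAS28_CUTOFF:
            candidates.append(visit_date)
            break
    # (2) discontinuation because of insufficient effectiveness
    candidates += [d for d in stops_for_inefficacy if d > baseline]
    # (3) DMARD start more than 60 days after the baseline US
    candidates += [d for d in dmard_starts if (d - baseline).days > DMARD_START_LAG_DAYS]
    if candidates:
        return (min(candidates) - baseline).days, True
    # No indicator: censor at the last visit still in remission
    in_remission = [d for d, das28 in later_visits if das28 <= DAS28_CUTOFF]
    last_seen = max(in_remission) if in_remission else baseline
    return (last_seen - baseline).days, False
```

The earliest of the three indicators defines the event time; a phase with none of them contributes a censored observation.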
Subanalyses were performed for patients in whom the baseline US evaluation was performed “early” in remission, which was defined as occurring ≤ 6 months after the start of clinical remission. To analyze whether US+ might be more predictive in the short term than in the long term, subanalyses were also performed using followup time until 3, 6, 12, and 24 months after baseline US. In these analyses, all remission phases for which loss of remission was not observed up to the time of interest were censored at the time of interest.
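The time-limited subanalyses amount to administrative censoring at a fixed horizon. A minimal sketch, with hypothetical function names and times expressed in months, is given below; it is an illustration of the rule, not the study code.

```python
from typing import Tuple

EARLY_THRESHOLD_MONTHS = 6.0  # "early" US: within 6 months of the start of remission


def is_early_us(months_in_remission_at_us: float) -> bool:
    """Flag baseline US examinations performed early in remission."""
    return months_in_remission_at_us <= EARLY_THRESHOLD_MONTHS


def censor_at_horizon(time_months: float, event: bool, horizon_months: float) -> Tuple[float, bool]:
    """Truncate followup at the horizon.

    A phase whose loss of remission (or censoring) occurs after the horizon
    is treated as censored at the horizon; earlier outcomes are unchanged.
    """
    if time_months > horizon_months:
        return horizon_months, False
    return time_months, event
```

Applying `censor_at_horizon` with horizons of 3, 6, 12, and 24 months to every remission phase yields the datasets for the short-term and longterm models.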
Confounding factors
We considered the following variables as potential confounding factors: age, sex, disease duration, seropositivity (rheumatoid factor or anticyclic citrullinated proteins), type of treatment (no sDMARD or bDMARD), and duration of remission at baseline.
Tapering DMARD treatment during remission may be considered a time-changing variable that can influence loss of remission. In patients without residual synovitis, DMARD may more often be tapered, which may lead to loss of remission. Because the focus of our study was to evaluate baseline predictive factors for loss of remission, the tapering of DMARD treatments is presented in a descriptive manner and was not used in multiple adjusted models.
Statistics
We compared baseline covariates of US+ versus US− remission phases using the Kruskal-Wallis test for continuous or discrete variables and the Fisher exact test for categorical variables. All tests were 2-sided, with a significance level set at 0.05.
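As an illustration of the categorical comparison, the 2-sided Fisher exact test for a 2 × 2 table can be computed directly from the hypergeometric distribution. The sketch below uses the common convention of summing the probabilities of all tables that are no more likely than the observed one; the statistical software used in the study may apply a different convention, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)
```

For example, rows could be US+ and US− remission phases and columns seropositive and seronegative status; any counts passed in are the caller's own data.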
The median time in remission from the first US examination in remission was analyzed using Kaplan-Meier plots, and p values for differences between US+ and US− were assessed using the log-rank test. HR and 95% Wald CI between US+ and US− for the time from baseline US examinations to loss of remission were estimated using Cox proportional hazard models. The models were adjusted for the baseline levels of the potential confounders.
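To make the quantities above concrete, the following sketch computes a Kaplan-Meier curve and its median from (time, event) pairs, and converts a Cox log-hazard coefficient with its standard error into an HR with a 95% Wald CI. Fitting the Cox model itself is not shown; the coefficient and standard error are assumed to come from standard statistical software, and all names are illustrative.

```python
import math
from typing import List, Optional, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return [(event_time, estimated probability of still being in remission)]."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        losses = 0   # losses of remission observed at time t
        leaving = 0  # all phases leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            losses += int(data[i][1])
            leaving += 1
            i += 1
        if losses:
            survival *= 1.0 - losses / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def km_median(curve: Sequence[Tuple[float, float]]) -> Optional[float]:
    """First time at which the estimate drops to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


def hazard_ratio_wald_ci(beta: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """HR = exp(beta); 95% Wald CI = exp(beta -/+ z * se)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

Running `kaplan_meier` separately on US+ and US− phases gives the two curves whose medians are compared; the log-rank test and the adjusted Cox fit would be delegated to a statistics package.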
RESULTS
The number of eligible remission phases (no. of patients with RA) with available US data for the combined mode, Doppler, and B-mode scores were 321 (276), 341 (292), and 378 (318), respectively (Table 1 and Supplementary Data, available from the authors upon request).
Table 1 summarizes the baseline data between the 2 groups using the combined score (data with Doppler alone and the 2 different B-mode definitions for residual US-detected synovitis are available in Supplementary Tables 1–3, available from the authors upon request). All but 19 patients were still taking a DMARD at time of first US assessment in remission (baseline). The frequency of US-detected residual synovitis varied depending on the definition of significant US synovitis (US+), ranging from 32% for B-mode ≥ 2 joints with synovitis grade ≥ 2, to 58% for the combined score of B-mode > 8/66 and any Doppler residual synovitis. There were very few differences in the baseline covariates between US+ and US− patients (Table 1). The duration of remission prior to baseline US when using left mode imputation was 4 to 7 months longer in the US− than in the US+ group, depending on the definition of significant US-detected synovitis. US+ patients were, in some comparisons, more frequently seropositive than US− patients. The difference ranged from 3% to 17% using the combined score and Doppler score, respectively, for the definition of residual synovitis. Finally, in all comparisons, US+ patients were a median of 1.5 to 4 years older than US− patients. Around the baseline US examination, the median (interquartile range; IQR) time interval between clinical visits was 7.6 months (3–12).
Loss of remission was observed in 60% of remission phases. DAS28 > 2.6 was the main cause of loss of remission (66% of cases in both the US+ and the US− groups). In this subgroup, an increase in DAS28 of > 0.6 and > 1.2 was observed in 85% and 46%, respectively. The percentages were similar in both US+ and US− groups. About 11% of the losses of remission were a result of stopping DMARD because of insufficient effectiveness, and the remaining 23% were because of the start of a bDMARD or sDMARD (only DMARD starts > 60 days after baseline US were considered). Tapering antirheumatic treatment, defined as stopping a bDMARD or sDMARD for reasons other than insufficient effectiveness, occurred in 11% of remission phases in both the US+ and US− groups.
The median duration of clinical remission after the first US examination in remission was about 2–5 months longer for US− than for US+, but this difference was only statistically significant when using the combined US synovitis score (US+ 2.1 95% CI 1.5–2.4 vs US− 2.5 95% CI 2.1–3.7, p = 0.03; Table 2). In the subgroup of patients with US early in clinical remission (i.e., < 6 mos after start of remission), the difference in duration of remission between US+ and US− varied by the type of US mode used for defining residual synovitis. Results were also influenced by the way the duration in remission before baseline was imputed (left vs right imputation), with larger crude differences between US+ and US− for left mode imputation. Differences varied from 8 to 18 months in most comparisons (Table 2). Figure 2 depicts the Kaplan-Meier survival curves for loss of remission in the subgroup of remission phases where the US evaluation was performed early in remission.
After adjusting for potential confounders, HR for loss of remission for all remission phases using the complete followup time (all times) was higher for patients with US-detected synovitis at baseline compared to those without (combined US score HR 1.4, 95% CI 1.03–2 and HR 1.5, 95% CI 1.1–2.1 for left and right imputation, respectively; Figure 3, and Supplementary Table 4, available from the authors upon request). Regarding the other residual US-detected synovitis scores, the HR were generally lower and the differences between US+ and US− did not reach statistical significance, independent of using left or right mode imputation. HR showed a 2- to 3-fold increase when considering only observation times shortly after the baseline US (i.e., 3 and 6 months, Figure 3), indicating that the short-term predictive power of US-detected residual synovitis may be greater than its longterm predictive power.
In the early US subgroup, the HR using the complete followup time were slightly higher than in the overall cohort, with HR for the combined US score of 2.3 (95% CI 1.1–4.8) and 1.6 (95% CI 1.1–2.2) for left and right imputation, respectively (Figure 3, all times, and Supplementary Table 4, available from the authors upon request). Again, the short-term predictive power was more substantial than the longterm predictive power (Figure 3).
Among the potential confounders, longer duration of remission prior to the baseline US was consistently associated with a lower probability of loss of remission after the baseline US evaluation (Supplementary Table 5, available from the authors upon request).
DISCUSSION
Our study aimed to evaluate the practical usefulness of US performed in real-life conditions, with physicians who have been trained in US examinations and who use echography as a supplementary tool for the evaluation of disease activity in RA. Our data suggest that the presence of synovitis as detected by US in a patient in clinical remission has a moderate but significant independent predictive value for the loss of remission. The predictive value of US appears to be less striking than in previous studies6,7 but remained significant. This could be due to different definitions of flare, different US scoring methods, and the fact that all previous studies were performed in single centers with selected patients, operators, and machines.
Multicenter studies using US as a tool for RA management are rare. Two recent randomized multicenter studies showing no difference between using US and clinical measures in a treat-to-target approach introduced some doubts about the usefulness of US to monitor therapy14,15. However, both these studies were performed in patients with active disease, whereas our study focused on residual signs of synovitis in patients in clinical remission. The results from the randomized studies may be interpreted thus: in active disease and alongside the classical measures of disease activity, the sonographic examination may add little information for the treating rheumatologist16.
Two multicenter studies have been published recently on the predictive value of US in patients with RA who are in remission. The first is available only as an abstract17. The second evaluated a subgroup of patients included in a nationwide randomized controlled trial for whom information was available on US evaluation at time of stopping the tumor necrosis factor (TNF) agent while in low disease activity18. Although the definitions of remission and of loss of remission in these 2 studies were slightly different from ours, the predictive effect of US on the duration of remission was similar to our findings, with an HR of about 1.7 in favor of US.
The predictive value of US was most striking during the first 6 months of followup and when US was performed early after the start of remission. This finding is in line with the conclusion of a recent metaanalysis of 9 monocentric studies performed on the same subject7. This suggests that US should be done as soon as remission is reached, and may be repeated after 3 or 6 months of remission.
The majority of studies for the evaluation of disease activity and the prediction of flares6 used Doppler mode only. However, for comparable high-quality Doppler data, the use of standardized machines, which are usually not available in a real-life setting, is important. Doppler assessments are technically and manually more difficult than B-mode measurements. Moreover, since the beginning of our study in 2009, the sensitivity of Doppler in the new machines has been greatly enhanced. The combined mode definition of residual US synovitis both in the crude analyses and in the adjusted models performed better than Doppler when the complete observation times of remission were used, whereas Doppler appeared to be more predictive in the short term (Figure 3). Both modes appeared useful, in line with recommendations that propose to use combined B-mode and Doppler scores in real life and in multicenter studies19,20.
The definition of flare remains debatable and no consensus has yet emerged21. Our definition of a DAS28 > 2.6 or intensification of treatment may have led to a high level of loss of remission that does not necessarily reflect a real change in response to therapy6. However, with regard to the HR, the predictive value of US for the occurrence of flare is similar to that obtained with other definitions10,17,18, such as an increase in DAS28 of 0.6 or a DAS28 > 3.2. Notably, the percentages of the patients with such increases of DAS28 were well balanced in the 2 groups of our cohort.
After adjusting for potential confounders, the only significant confounding predictive factors for the duration of remission were bDMARD treatment, US residual synovitis, and the prior time in remission. Tapering the treatments during remission occurred during only 10% of remission phases and was balanced between the US− and US+ groups. A sensitivity analysis including tapering in the multiple adjusted models did not influence the results for the HR for loss of remission comparing US− and US+ (data not shown).
The strengths of our work are the large number of patients, a validated US score for significant US synovitis, and the real-life setting that allows conclusions applicable to everyday practice. Our results therefore may have some importance in promoting the widespread use of US in real-life conditions for the followup of RA, especially when in remission.
The limitations of our study have already been partially mentioned above. The operators were not blinded to the clinical data and vice versa. Usually the clinical evaluation was done first, followed by US examination, most of the time by the same physician. This might have resulted in rheumatologists overassessing the US synovitis in patients with a slightly higher DAS28 (but still in remission). However, the DAS28 in patients with US+ and US− was very similar, which suggests that there was no relevant bias from the lack of blinding. We cannot exclude that knowledge of the baseline US synovitis score might have influenced the rheumatologists’ decision to alter treatment after US examination (which would be counted as loss of remission). However, we only interpreted DMARD starts > 60 days after the baseline US as a sign of loss of remission. Also, the reasons for loss of remission were well balanced between US+ and US−.
Heterogeneity in the quality of the machine certainly could influence whether the patient was considered as US+ or US−, and may even affect durability of remission, because the rheumatologists with more sensitive equipment may be more aggressive in their escalation of DMARD. In our definition of loss of remission, the initiation or change of the bDMARD < 60 days after the baseline US was not considered as loss of remission. Therefore, we consider this potentially confounding factor as less important.
The exact times of start and end of remission are particularly difficult to determine in an observational cohort, because the schedule of visits is not predefined and timepoints between visits may be quite variable among patients. In our cohort, the median (IQR) time interval between clinical visits around baseline US examinations was 7.6 months (3–12). To address this limitation, we applied sensitivity analyses (with left and right imputations) and the results were largely consistent, whatever mode of imputation was used.
Our study shows that US can also be useful in predicting the duration of remission in real-life conditions, but the question remains whether this tool should be recommended for this purpose at an individual level. According to our results, in particular regarding the moderate HR, a single US done in remission cannot be used as the sole predictor of flare. The next step would be to evaluate whether previous US performed prior to reaching remission and repeated US performed while in remission, particularly after 3 to 6 months, could provide additional information that is useful to the clinician20. Indeed, the kinetics of the US score before and after reaching remission could be relevant, especially for B-mode evaluation, allowing better differentiation of fibrous residual synovitis from active synovitis in patients with US-detected residual synovitis and long remission. Another question not answered by our study is whether US can be used to choose in which patients the treatments can be stopped or tapered. Our study was not designed to answer this question, and the data in the literature on this particular point are scarce20. However, the recent multicenter US POET-Dutch study18 showed that the additional value of US as a predictor of flare after stopping TNF inhibitors was limited. More prospective randomized studies evaluating the use of US for the monitoring of treatments in patients in remission are still needed.
Acknowledgment
We thank all patients and their rheumatologists who have included data in the SCQM registry (a list of rheumatology offices and hospitals that are contributing to the SCQM registries can be found at www.scqm.ch/institutions) and the members of the SCQM team for their contribution to data management and for statistical expertise. We also acknowledge the referees for their helpful input on this paper.
Footnotes
The SCQM Foundation is supported by the Swiss Society of Rheumatology and by Abbvie, BMS Bristol-Myers-Squibb, Celgene, Janssen, MSD Merck-Sharp & Dohme, Novartis, Pfizer, Pfizer PFE, Roche, and UCB.
- Accepted for publication October 18, 2017.