Abstract
Objective. Clinical joint examination (CJE) is less time-consuming than ultrasound (US) in rheumatoid arthritis (RA). Low concordance between CJE and US would indicate that the 2 tests provide different types of information. Knowledge of factors associated with CJE/US concordance would help to select patients and joints for US. Our objective was to identify factors associated with CJE/US concordance.
Methods. Seventy-six patients with RA requiring tumor necrosis factor-α (TNF-α) antagonist therapy were included in a prospective, multicenter cohort. In each patient, 38 joints were evaluated. Synovitis was scored using CJE, B-mode US (B-US), and power Doppler US (PDUS). Joints whose kappa coefficient (κ) for CJE/US agreement was < 0.1 were considered discordant. Multivariate analysis was performed to identify factors independently associated with CJE/US concordance, defined as factors yielding p < 0.05 and OR > 2.
Results. Concordance before TNF-α antagonist therapy varied across joints for CJE/US (κ = −0.08 to 0.51) and B-US/PDUS (κ = 0.30 to 0.67). CJE/US concordance was low at the metatarsophalangeal joints and shoulders (κ < 0.1). Before TNF-α antagonist therapy, a low 28-joint Disease Activity Score (DAS28) was associated with good CJE/B-US concordance, and no factors were associated with CJE/PDUS concordance. After TNF-α antagonist therapy, only the joint site was associated with CJE/B-US concordance; joint site and short disease duration were associated with CJE/PDUS concordance.
Conclusion. Concordance between CJE and US is poor overall. US adds information to CJE, most notably at the metatarsophalangeal joints and shoulders. Usefulness is decreased for B-US when DAS28 is low and for PDUS when disease duration is short.
Rheumatoid arthritis (RA) is a systemic inflammatory disease that results in cartilage and bone destruction. The number of joints with synovitis by clinical joint examination (CJE) is a relevant measure of disease activity. However, CJE may fail to detect all joints with synovitis1,2. Patients in clinical remission may have subclinical synovitis associated with a risk of structural disease progression3,4,5.
Numerous studies have shown that ultrasonography (US) using both B-mode and power-Doppler (PD) mode is more sensitive than CJE for detecting synovitis and that PD ultrasonography (PDUS) provides information on the degree of inflammation6,7,8,9,10,11,12. In a previous study, findings by B-mode US (B-US), PDUS, and CJE independently predicted radiographic progression13. Recently, Dougados, et al confirmed the validity of clinical and/or sonographic synovitis for predicting 2-year structural deterioration in RA14. Further, synovial hypervascularization regresses under combined biologic and disease-modifying antirheumatic drug (DMARD) therapy, suggesting that US may assist in evaluating the response to treatment over time15,16,17,18,19,20.
The use of US has been criticized because the results are heavily operator-dependent. However, the OMERACT group (Outcome Measures in Rheumatoid Arthritis Clinical Trials) developed definitions of US abnormalities in various joints, with the goal of improving the reliability and other metrological properties of joint US21,22,23. The main problem now is that concordance between CJE and US may vary across joints. Such variability may explain why studies of US findings in a limited number of joints showed good agreement with CJE24,25,26.
CJE is less time-consuming than US. The time needed for US can be decreased by limiting the evaluation to joints where discordances with CJE findings are most likely to occur. Therefore, it would help to identify factors that influence CJE/US concordance at specific joints in the individual patient.
The objectives of our study were to evaluate concordance between CJE and US (B-US, PDUS, and both modes in combination) for detecting synovitis in patients with active RA before and after tumor necrosis factor-α (TNF-α) antagonist therapy and to identify factors associated with good concordance.
MATERIALS AND METHODS
We conducted a prospective, multicenter, 4-month study of patients with RA referred to the study centers by their rheumatologists for TNF-α antagonist therapy. The study was approved by the appropriate ethics committees. All patients gave their written informed consent.
Patients
Patients older than 18 years who met the 1987 American College of Rheumatology (ACR) criteria for RA27 were eligible if they were referred to the study centers for TNF-α antagonist therapy in 2007 or 2008. A swollen joint count of 6 or more as assessed by CJE was required. Patients were evaluated at baseline and after 4 months of TNF-α antagonist therapy. For each patient, 2 investigators (1 clinical investigator and 1 US investigator) worked as a pair throughout the 4-month study.
Clinical joint evaluation
In each of the 9 study centers, a single investigator (research nurse or rheumatologist) with experience in clinical metrology in RA, blinded to the US data, was in charge of monitoring the patients. Demographics were collected at baseline including sex, age, disease duration, history of surgery related to RA, and previous RA treatments. In each patient, the investigator determined the swollen joint count on 66 joints according to ACR recommendations. At each joint, synovitis was scored semiquantitatively (0, definitely no synovitis; 1, doubtful; 2, moderate; 3, obvious and important; clinical synovitis was defined by a score of at least 2) and with a binary score (presence of synovitis, yes/no); only the binary score was used for the present study.
The following data were collected at the baseline and Month 4 visits: tender joint count (on 68 joints), patient’s global assessment using a 0–100 visual analog scale (VAS), functional impairment using the Health Assessment Questionnaire-Disability Index (HAQ-DI), and physician’s global assessment of disease activity using a 0–100 VAS.
The US evaluation
The US evaluation was performed on 38 joints, including the 28 joints of the DAS28 [shoulders, elbows, wrists, metacarpophalangeal (MCP) joints, proximal interphalangeal (PIP) joints, and knees] and the metatarsophalangeal (MTP) joints. US was performed in a dimly lit room. In each of the 9 study centers, a single experienced sonographer (radiologist in 1 center or rheumatologist in others) who was blinded to the CJE data performed all the US evaluations for the study. Their intraobserver and interobserver reproducibility were fair to good (0.37 to 0.75).
Multiplanar greyscale (B-mode) and PD images were obtained using commercial real-time scanners (Esaote Technos MPX, Toshiba Aplio, Esaote MyLab, Philips HD11, or BK Mini Focus) and multifrequency linear transducers (7–12 MHz). US scanning techniques, greyscale (B-mode) and PD machine settings, and definitions of abnormalities were standardized before the study during a 1.5-day meeting of all 9 study sonographers28,29. PD measurements were adjusted to the lowest permissible pulse repetition frequency (PRF) to maximize sensitivity, which led to PRF values as low as 750 Hz. Low-wall filters were used. Color gain was set just below the level at which color noise appeared in the underlying bone. Synovitis was defined according to OMERACT definitions as a grade of at least 1 for B-mode (hypoechogenic thickening of the synovial membrane that was nondisplaceable and poorly compressible) and PD mode independently21,22. B-mode and PD mode measure different aspects of inflammation that can be combined to define synovitis, but we considered each of them separately for statistical analysis. Both B-US and PDUS were recorded for each joint. On B-US images, synovitis was scored using a 0 to 3 scale with these subjective definitions for each grade: 0, no synovial thickening; 1, mild synovial thickening; 2, moderate synovial thickening; and 3, marked synovial thickening. For PDUS images, a 0 to 3 scale was also used, with these definitions: 0, no signal and no intraarticular flow; 1, mild, signal from 1–2 vessels (including 1 confluent vessel) for small joints and 2–3 vessels (including 2 confluent vessels) for large joints; 2, moderate vessel confluence (> grade 1) occupying < 50% of the normal synovial surface area; and 3, marked vessel confluence occupying > 50% of the normal synovial surface area.
Statistics
Concordance between synovitis by CJE and synovitis (grade 1 or higher) by B-US, PDUS, or B-US + PDUS at baseline was assessed by computing the kappa coefficient (κ) for each of the 38 joints. Concordance between B-US and PDUS for the presence of synovitis (grade 1 or higher) was also assessed by computing κ. Joints with κ values < 0.1 were considered discordant and removed from the assessment of factors associated with CJE/US concordance. For the remaining joints, univariate logistic regression was used to look for an effect on CJE/US concordance of age, sex, body mass index (BMI, kg/m²), disease duration, history of RA-related surgery, HAQ score, DAS28, erythrocyte sedimentation rate (ESR), physician’s and patient’s global assessments, and rheumatoid factor (RF). Factors yielding p values < 0.05 by univariate analysis were entered into a multivariate logistic regression model. Factors that yielded p values < 0.05 in the multivariate model and that had OR > 2 were classified as significantly affecting CJE/US concordance.
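The per-joint κ computation and the < 0.1 discordance threshold described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names, the record layout `(joint, cje_positive, us_grade)`, and the example records are hypothetical, and the records are made-up values rather than study data.

```python
from collections import defaultdict


def cohen_kappa(pairs):
    """Cohen's kappa for paired binary ratings [(cje_positive, us_positive), ...]."""
    n = len(pairs)
    if n == 0:
        raise ValueError("no paired ratings")
    both_pos = sum(1 for c, u in pairs if c and u)
    both_neg = sum(1 for c, u in pairs if not c and not u)
    cje_pos = sum(1 for c, _ in pairs if c)
    us_pos = sum(1 for _, u in pairs if u)
    p_obs = (both_pos + both_neg) / n
    # Agreement expected by chance from the marginal frequencies of each method.
    p_exp = (cje_pos * us_pos + (n - cje_pos) * (n - us_pos)) / (n * n)
    if p_exp == 1.0:
        return float("nan")  # kappa is undefined when both methods give one constant rating
    return (p_obs - p_exp) / (1.0 - p_exp)


def kappa_by_joint(records, us_cutoff=1):
    """Group (joint, cje_positive, us_grade) records by joint and compute kappa per joint.

    US synovitis is taken as grade >= us_cutoff (grade 1 or higher by default).
    """
    grouped = defaultdict(list)
    for joint, cje_positive, us_grade in records:
        grouped[joint].append((bool(cje_positive), us_grade >= us_cutoff))
    return {joint: cohen_kappa(pairs) for joint, pairs in grouped.items()}


def discordant_joints(kappas, threshold=0.1):
    """Joints whose kappa falls below the threshold (undefined kappas are not flagged)."""
    return sorted(joint for joint, k in kappas.items() if k < threshold)


# Hypothetical example records (not study data): (joint, CJE synovitis yes/no, US grade 0-3).
records = [
    ("knee", True, 2), ("knee", True, 1), ("knee", False, 0), ("knee", False, 0),
    ("MTP1", True, 0), ("MTP1", False, 1), ("MTP1", False, 2), ("MTP1", True, 1),
]
kappas = kappa_by_joint(records)
excluded = discordant_joints(kappas)
```

With these made-up records, the knee shows perfect agreement and the first MTP joint falls below the threshold, so only the latter would be excluded from the regression step.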
RESULTS
Patients
During the study period, 76 patients met our study selection criteria and were included. After the baseline CJE and US evaluation, 66 patients returned for the second visit at Month 4.
Of the 76 patients, 64 were female (84%); mean age was 55 ± 13 years and mean disease duration 10 ± 9 years. Only 16 patients (21%) had a disease duration < 2 years. Tests for RF were positive in 59 patients (78%). A history of RA-related surgery was noted in 21 patients (27.5%). All patients received at least 1 DMARD. The mean number of previous DMARDs was 3 ± 2. Of the 76 patients, 52 were naive to TNF-α antagonist therapy, 15 were taking TNF-α antagonist therapy with unsatisfactory results, and 8 had taken TNF-α antagonist therapy in the past; information on prior TNF-α antagonist therapy was missing for 1 patient.
At baseline, mean DAS28 ESR was 5.12 ± 1.31 (5.22 ± 1.23 in the group with disease duration < 2 yrs and 5.19 ± 1.34 in the group with disease duration > 2 yrs), mean C-reactive protein (CRP) was 18 ± 19 mg/l, and mean HAQ score was 1.41 ± 0.68. At Month 4, among the 66 patients who returned for the second visit, mean DAS28 ESR was 3.47 ± 1.37, mean CRP was 8 ± 13 mg/l, and mean HAQ score was 1.0 ± 0.7. Baseline characteristics did not differ between the disease duration subgroups.
Concordance between the CJE and US
For CJE versus B-US of all joints, concordance was 63.2% at baseline before TNF-α antagonist therapy and 69.5% after 4 months of TNF-α antagonist therapy. Corresponding values for CJE versus PDUS were 75.2% and 84.0%.
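Raw percentage agreement can be high while κ remains low, because κ discounts the agreement expected by chance, which is large when synovitis is uncommon at a site. As an illustration only, assume a hypothetical table of 100 joint assessments (not study data) with 5 joints positive by both methods, 10 positive by CJE only, 15 positive by US only, and 70 negative by both:

```latex
\[
\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad
p_e = p_{\mathrm{CJE}+}\,p_{\mathrm{US}+} + p_{\mathrm{CJE}-}\,p_{\mathrm{US}-}
\]
\[
p_o = \frac{5 + 70}{100} = 0.75, \qquad
p_e = (0.15)(0.20) + (0.85)(0.80) = 0.71, \qquad
\kappa = \frac{0.75 - 0.71}{1 - 0.71} \approx 0.14
\]
```

In this hypothetical case, 75% raw agreement corresponds to a κ of only about 0.14, which is why the overall percentages and the per-joint κ values are read together.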
CJE versus US concordance rates at baseline (Table 1) varied across joints (κ = −0.08 to 0.51). Concordance was lowest at the MTP joints (κ = −0.08 to 0.28) and shoulders (κ = −0.08 to 0.05), which were excluded from the evaluation of factors associated with concordance.
At all joints, fewer cases of synovitis were detected by PDUS than by B-US, indicating a greater sensitivity of B-US. Results of the analysis using B-US or PDUS findings were similar to those of the analysis using only B-US findings. Consequently, we assessed concordance of CJE with B-US and with PDUS but not with both B-US and PDUS or with either B-US or PDUS.
Concordance between B-US and PDUS
At baseline, concordance between B-mode and PD findings was fair to moderate and varied across joints (Table 2): κ ranged from 0.30 for the first MTP joint to 0.67 for the fourth PIP joint. In each joint, PDUS was positive only when B-US also showed synovitis.
Factors associated with concordance between the CJE and US
By multivariate analysis, factors associated with good concordance (p < 0.05) at baseline were as follows: for CJE versus B-US, DAS28 and RF; and for CJE versus PDUS, RA duration, DAS28, age, and sex (Table 3). After 4 months of TNF-α antagonist therapy, the following differences were noted: for CJE vs B-US the significant factors were joint site, sex, BMI, RA duration, HAQ score, and DAS28; and for CJE/PDUS they were joint site, BMI, RA duration, HAQ score, DAS28, and RF.
The OR values for each factor at baseline versus the reference value (Table 4) showed that factors associated with good CJE/B-US concordance were low DAS28 and positive RF. However, the OR was > 2 only for low DAS28 (compared with DAS28 > 5.1, which had the lowest concordance). Factors associated with good CJE/PDUS concordance were age > 50 years, being female, and DAS28 indicating moderate disease activity, whereas semi-recent RA (2–5 years) was associated with a low concordance. However, none of these factors was associated with an OR > 2.
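The two-part criterion used here (p < 0.05 and OR > 2) can be illustrated with a short sketch that converts a logistic-regression coefficient and its standard error into an OR, a 95% CI, and a p value. This is an assumption-laden illustration: the source does not state which test produced the p values, so a two-sided Wald test is assumed, and the function names and example coefficients are hypothetical, not values from Table 4.

```python
import math


def odds_ratio_summary(beta, se, z_crit=1.959964):
    """Summarize a logistic-regression coefficient as OR, 95% CI, and Wald p value.

    beta is the log odds ratio and se its standard error; the two-sided Wald
    p value is an assumption of this sketch, not a detail given in the source.
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = beta / se
    # Two-sided p value for a standard normal statistic: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return {
        "or": math.exp(beta),
        "ci_low": math.exp(beta - z_crit * se),
        "ci_high": math.exp(beta + z_crit * se),
        "p": p_value,
    }


def meets_criterion(summary, alpha=0.05, min_or=2.0):
    """Apply the stated rule: p < alpha and OR > min_or."""
    return summary["p"] < alpha and summary["or"] > min_or


# Hypothetical coefficients (not study results).
strong = odds_ratio_summary(beta=math.log(2.5), se=0.30)  # OR 2.5
weak = odds_ratio_summary(beta=math.log(1.5), se=0.15)    # OR 1.5, still p < 0.05

strong_retained = meets_criterion(strong)
weak_retained = meets_criterion(weak)
```

The second hypothetical factor is statistically significant but has an OR below 2, so it would not be retained, which mirrors how factors such as positive RF were handled above.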
After 4 months of TNF-α antagonist therapy (Table 5), factors associated with good CJE/B-US concordance were joint site, being male, low BMI, disease duration < 5 years, HAQ score indicating moderate disability, and low DAS28. However, OR > 2 were found only for joint site (elbow, MCP 4 and 5, PIP 2, 3, 4, and knee). Factors associated with good CJE/PDUS concordance were joint site, low BMI, short disease duration, HAQ score indicating moderate disability, low DAS28, and positive RF; OR > 2 occurred for joint site (elbow, MCP 4 and 5, PIP 1, 2, 3, 4, and 5, and knee; compared with wrist, which had the lowest concordance) and short disease duration (compared with > 10 yrs disease duration).
Table 6 summarizes the main data. CJE/US concordance was poor for the MTP joints and shoulders and for the wrists after 4 months of TNF-α antagonist therapy. At baseline, only low DAS28 was associated with good CJE/B-US agreement, and no factors were associated with CJE/PDUS concordance. After 4 months of TNF-α antagonist therapy, only joint site was associated with CJE/B-US concordance; and joint site and short disease duration were associated with good CJE/PDUS concordance.
DISCUSSION
Although US may be considered a gold standard for synovitis detection, many studies have demonstrated the usefulness of CJE in defining RA activity and in predicting RA outcome, whereas data remain scarce for US. Our goal was to identify the joints with the highest and the lowest CJE/US concordance. In this study, concordance between CJE and US was poor overall. Similarly, previous studies7,29,30,31,32,33,34,35,36,37,38,39 reported only modest correlations between CJE and US findings in patients with RA.
Our study suggests that CJE and US findings are clearly discordant at the shoulder and MTP joints and that, among the other joints, some patient characteristics and some joint sites are associated with good concordance. However, the factors associated with good concordance differ before and after TNF-α antagonist therapy.
At baseline, i.e., at a time of high disease activity (defined by the clinician on the basis of elevated DAS28, elevated corticosteroid dosage, or rapid radiological progression), only low DAS28 (≤ 3.2) affected CJE/US concordance. After 4 months of TNF-α antagonist therapy, joint site (for both B-US and PDUS) and disease duration (for PDUS only) were associated with CJE/US concordance. Concordance at M4 was highest for MCP 4 and 5, the PIP joints, the elbows, and the knees; and lowest for the MTP joints and shoulders. Luukkainen, et al also reported poor correlations between CJE and US at the MTP joints (κ = 0.165)32 and shoulders33. Concerning the relatively good concordance in the elbow, studies reported similar results using a posterior view, and that may be explained by the superficial position of the joint in its posterior part34. CJE/US concordance was not significantly affected by sex, BMI, HAQ score, or RF positivity. Short disease duration was associated with better CJE/PDUS concordance. Acute synovitis is probably easier to detect clinically in the early stages of RA than in longstanding disease, when joint damage and periarticular fibrosis without active inflammation may be mistaken for acute synovitis.
B-US and PDUS are known to provide different types of information on the same joints in patients with RA, and the US definition of synovitis includes findings from both modes21,22. B-US is important to detect the main abnormalities produced by synovitis and to determine whether the definition of synovitis is met. Jousse-Joulin, et al28 showed that US can perform better or worse than CJE depending on the cutoff used to define synovitis. B-US was more sensitive than CJE with a cutoff of 1, but not with a cutoff of 2 or higher. The number of joints with synovitis by PDUS was consistently lower than by CJE, regardless of the cutoff used.
OMERACT recommendations define synovitis as a B-US or PDUS grade ≥ 1. In our study, all joints with PDUS grades ≥ 1 also had B-US grades ≥ 1. This finding seems to support the use of B-US without PDUS in patients with RA. However, in longstanding RA, synovitis by B-US without a positive PDUS signal is taken to indicate remission, whereas synovitis by B-US with a PDUS signal is classified as acute synovitis. In the study by Jousse-Joulin, et al28, a positive PDUS signal showing subclinical synovitis was more common in younger patients with recent-onset RA, whereas synovitis by B-US was more common in older patients with longstanding RA. Thus, both B-US and PDUS are useful for evaluating RA synovitis, although PDUS more specifically detects active inflammation3,4,5.
A major strength of our study is that CJE and US were performed by different physicians, who were blinded to the results of the other investigations. In addition, all physicians were experienced in the investigation they performed, and all sonographers attended a 1.5-day training session on criteria for synovitis outcome measures28,29. The main weakness of our study is that the data were collected by many different physicians and on different machines with different PD settings and performances (sensitivity), but it is the best way to evaluate agreement between the sonographers and clinicians. Previous studies have documented interobserver variability in CJE and US findings2,12.
There is no consensus about the number and location of the joints that are most relevant for monitoring patients with RA. In several studies, monitoring a limited number of joints was reliable25,26 and in others the treatment response evaluation was unaffected by the number of joints included in the US scores35. A study of several US scoring systems for RA synovitis established that US was at least as relevant as CJE36. In our study, as in others32,33, concordance between CJE and US was low at the MTP joints and shoulders. In future studies, MTP joints and shoulders should be evaluated routinely by US, even in the absence of clinical abnormalities. In patients with RA considered in remission, US often showed persistent active inflammation, which predominated at the second and third MCP joints and correlated with the DAS2837. Ankles were not included because our goal was to evaluate CJE/US concordance on the joints used for the DAS evaluation. It would be of interest to evaluate all joints in future studies.
An unexpected finding was that some synovitis found clinically was not confirmed by sonography (less often in shoulders and MTP joints). This may be due to lack of experience or to tenosynovitis, which was not evaluated by US because it is not part of the US definition of synovitis. In a knee evaluation, US was found to be more sensitive than CJE in the detection of suprapatellar bursitis, knee effusion, and Baker’s cyst7.
The US definition of synovitis, the mode used, and joints assessed have changed over time38. The optimal number and location of joints to be assessed in patients with RA need to be determined. The development of a global score at the patient level would be helpful. The OMERACT group is developing the Global OMERACT Synovitis Score as a tool for obtaining additional information about CJE in a way that is consistent with the constraints of everyday practice.
Achieving clinical remission is now a realistic objective in patients with RA. However, patients in clinical remission may have persistent synovitis as seen by US and magnetic resonance imaging (MRI) and may therefore be at risk for further structural damage. Thus, patient followup should probably rely not only on physical evaluations, laboratory tests, and radiographs, but also on US evaluations37. US and MRI have been found to be more sensitive and more specific than CJE and radiographs for assessing synovial inflammation and structural damage39,40. Consequently, they are of interest for monitoring patients with RA in remission. In a comparison of US and MRI that used OMERACT definitions to evaluate remission, the baseline B-US synovitis count predicted relapse and the baseline PDUS synovitis count predicted erosions41,42.
Our study of 76 patients with active RA documented discordances between CJE and US (B-US and/or PDUS) in their detection of synovitis of grade 1 or higher. B-US was more sensitive than PDUS for detecting synovitis. Many factors were associated with CJE/US concordance, both before and after TNF-α antagonist therapy. These factors differed between B-US and PDUS. It is difficult to select patients or joint sites justifying CJE or US evaluation, except for MTP joints, shoulders, and wrists, which had the lowest CJE/US concordance. The performance of a combination of CJE (for MCP, PIP, elbow, and knee) and B-mode US (for MTP joints, shoulders, and wrists) in predicting RA outcomes by comparison with the swollen joint count used in the DAS28 should be further evaluated. It would be useful in clinical practice and clinical trials to have a global scoring system before initiating or stopping treatment. A future study could compare 3 groups for predicting outcomes: 1 with CJE only, 1 with US evaluation only, and 1 with a combination of CJE (used for the joints with good concordance) and US (used for the joints with lower concordance).
Acknowledgment
We thank the investigators who recruited and/or monitored patients: Pierre Bourgeois, Maxime Breban, Françoise Carbonnelle, Tiffen Couchouron, Pascal Guggenbuhl, Rachida Inaoui, Catherine Le Bourlout, Xavier Mariette, Jean-Marcel Meadeb, Anne Miquel and Valerie Devauchelle-Pensec.
Footnotes
- Supported by an unrestricted grant from Abbott France.
- Accepted for publication November 27, 2012.