Abstract
Objective. To extend the magnetic resonance imaging (MRI) score for assessment of wrist synovitis in juvenile idiopathic arthritis (JIA) by inclusion of the metacarpophalangeal (MCP) joints, and to compare the metric properties of the original and the extended score.
Methods. Wrist MRI of 70 patients with JIA were scored by 3 independent readers according to (1) the wrist component of the rheumatoid arthritis MRI synovitis score (comprising distal radioulnar, radiocarpal, and combined midcarpal and carpometacarpal joints); and (2) an extended score including the MCP joints. Thirty-eight patients had a 1-year MRI followup. The concordance between the readers [intraclass correlation coefficient (ICC), 95% limits of agreement (LOA), and weighted Cohen’s κ], correlations with clinical variables (Spearman’s ϱ), and the sensitivity to change [standardized response mean (SRM)] were calculated for both scores.
Results. The interreader agreement was moderate for the original score (ICC 0.77; 95% CI 0.68–0.84) and good for the extended score (ICC 0.86; 95% CI 0.80–0.91). Using 95% LOA, the aggregate score variability was less favorable with relatively wide LOA. Weighted Cohen’s κ of the individual joints indicated good agreement for the original score and good to excellent agreement for the extended score. Correlations with clinical variables reflecting disease activity improved for the extended score and its SRM was higher compared to that of the original score.
Conclusion. The extended score showed better reliability, construct validity, and sensitivity to change than the original. Inclusion of the MCP joints should be considered for a more accurate assessment of disease activity and treatment efficacy in JIA.
- JUVENILE IDIOPATHIC ARTHRITIS
- MAGNETIC RESONANCE IMAGING
- PATIENT OUTCOME ASSESSMENT
- RELIABILITY AND VALIDITY
- METACARPOPHALANGEAL JOINTS
Juvenile idiopathic arthritis (JIA) is the most common rheumatic disease in children. Its hallmark is inflammation of the synovial tissue of joints and tendon sheaths lasting more than 6 weeks, which, if left untreated, can cause destruction, deformations, and long-lasting disabilities1,2. The aim of treatment is to prevent these long-lasting and sometimes irreversible sequelae by inducing early disease remission. Consequently, potential drugs for the treatment of JIA should be tested in clinical trials for their efficacy in accomplishing this goal. Response to treatment in JIA is currently assessed using validated outcome measures and is calculated as the percentage of change of a core set of clinical variables consisting of the number of active joints, the number of limited joints, the physician’s global assessment of disease activity, the parent/patient global assessment of well-being, the Childhood Health Assessment Questionnaire score, and the erythrocyte sedimentation rate (ESR)3. The American College of Rheumatology pediatric (ACRp) 30 definition of improvement has been accepted by regulatory agencies for drug registration as the primary outcome in clinical trials in JIA.
In contrast with clinical variables, the use of magnetic resonance imaging (MRI) allows a more accurate assessment of the bone and surrounding soft tissue structures within a targeted joint. By directly visualizing the inflamed synovial membrane, contrast-enhanced MRI has an intuitive advantage in assessing treatment efficacy over the clinical outcome variables, which are all surrogate markers of synovial inflammation. Over the last decade, several randomized controlled trials in rheumatoid arthritis (RA) have included the Outcome Measures in Rheumatology (OMERACT) rheumatoid arthritis MRI scoring method (RAMRIS)4 to successfully determine therapeutic efficacy of different disease-modifying therapies5,6,7,8,9,10,11. Further, results of a study comparing the RAMRIS synovitis score with the ACRp criteria to assess treatment efficacy supported the potential of MRI as a primary efficacy outcome measure in JIA as well12. The wrist component of the 0–3 scale RAMRIS synovitis score, which assesses the distal radioulnar, radiocarpal, and the combined midcarpal and carpometacarpal joints, was found to be a reliable tool to assess disease activity in children with JIA13; unlike in RA, however, the sensitivity to change of the wrist component of the RAMRIS score was only moderate.
Another study, using a 0–2 scale synovial enhancement score, reported moderate to good agreement both within and between observers14. To further refine and standardize MR-scoring systems for JIA, international initiatives have combined forces15,16.
The metacarpophalangeal (MCP) joints are frequently affected in JIA patients with wrist involvement and have a functional effect on children’s physical and daily activities. This is why the second and third MCP joints are included in the reduced joint count of the juvenile arthritis disease activity score (JADAS), a validated clinical tool to assess disease activity in JIA17,18. The aim of our current study was therefore to examine the precision of an MRI-based synovitis score of the MCP joints and to compare the metric properties of the original and an extended MRI wrist score in a cohort of patients with JIA.
MATERIALS AND METHODS
Seventy patients with JIA followed at the Istituto Giannina Gaslini (Genoa, Italy) who had previously participated in observational studies aiming to assess the suitability of the wrist component of the RAMRIS when applied to JIA13,14,19,20,21 were included in the present study. The MRI examinations were performed between June 2006 and June 2008. Inclusion criteria were a diagnosis of JIA according to International League of Associations for Rheumatology criteria22, with wrist involvement as defined clinically. A need for sedation and contraindications to MRI were exclusion criteria. For the current analysis, patients whose MCP joints were not included in the field of view were excluded. An MRI scan acquired after a median of 1.2 years [interquartile range (IQR) 1.1–1.4] of followup was available for 38 patients. Full ethical approval was obtained from the Istituto Giannina Gaslini ethical review board and the study was approved with formal act (no. 114-09/06/2006) by the advisory board of the same institute. All participants provided informed consent and the study was performed according to good clinical practice guidelines and the Declaration of Helsinki.
Wrist MRI were acquired on a 1.5-T MRI scanner (Achieva Intera; Philips Medical Systems) using a Sense Flex Small Coil. The imaging protocol consisted of a 3-D turbo spin echo (TSE) T1, a TSE T2 fat-saturated, and a 3-D gradient echo spectral presaturation with inversion recovery image (Table 1), acquired immediately after the injection of 0.1 mmol/kg body weight of gadolinium-based contrast agent (gadoterate meglumine, Gd-DOTA, Dotarem Guerbet). The field of view included the distal radioulnar, radiocarpal, and midcarpal joints as well as the metacarpals and MCP joints. All sequences were acquired in a coronal plane. The MRI were independently read by 1 experienced reader (CM, 10 yrs of experience) and 2 relatively inexperienced readers (EHvD and FV, each 1 yr of experience), according to both the wrist component of the original RAMRIS score13 (which comprises assessment of synovitis in the distal radioulnar, radiocarpal, and the combined midcarpal and carpometacarpal joints) and an extended score adding assessment of the first to the fifth MCP joints.
The readers were blinded to the patient’s identity and clinical status. Synovitis was defined as an area in the synovial compartment of a thickness greater than the width of the normal synovium that shows postgadolinium enhancement of an intensity greater than the surrounding muscle4,14. A score of 0 is normal, while scores of 1 to 3 (mild, moderate, severe) increase by thirds of the presumed maximum volume of enhancing tissue in the synovial compartment (Figure 1), according to the definition and scoring system proposed by the OMERACT RAMRIS group4. A total score was obtained by summing the scores of the single joints, resulting in a total score ranging from 0 to 9 for the wrist component of the RAMRIS score and from 0 to 24 for the extended score. If not all MCP joints were within the field of view for an individual patient, the score was normalized by dividing by the patient’s maximally obtainable score and multiplying by the score’s theoretical maximum (i.e., 9 for the original RAMRIS and 24 for the extended score). This approach was chosen because it can be applied clinically when 1 or more MCP joints lie outside the field of view, without having to consider those joints negative or to exclude the MRI. For the 38 children having both a baseline and a followup MRI examination, both examinations were read sequentially, during the same session.
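The summation and normalization described above can be illustrated with a minimal sketch. The function name `normalized_total`, the use of `None` to mark a joint outside the field of view, and the example grades are assumptions made for illustration; they are not taken from the study.

```python
from typing import Optional, Sequence

MAX_GRADE = 3  # each joint is graded 0 (normal) to 3 (severe)


def normalized_total(grades: Sequence[Optional[int]]) -> float:
    """Sum per-joint synovitis grades, rescaling when joints are unscorable.

    `grades` holds one entry per joint in the score (3 entries for the wrist
    component, 8 for the extended score). None marks a joint that lies
    outside the field of view (an illustrative convention).
    """
    scored = [g for g in grades if g is not None]
    if not scored:
        raise ValueError("no joint could be scored")
    if any(g < 0 or g > MAX_GRADE for g in scored):
        raise ValueError("grades must lie between 0 and 3")
    obtainable_max = MAX_GRADE * len(scored)   # patient's maximally obtainable score
    theoretical_max = MAX_GRADE * len(grades)  # 9 (wrist) or 24 (extended)
    return sum(scored) / obtainable_max * theoretical_max


# Hypothetical extended-score reading with the fifth MCP joint out of view:
# 3 wrist recesses followed by MCP joints 1 to 5.
example = [2, 1, 2, 1, 1, 0, 1, None]
print(round(normalized_total(example), 2))
```

In this hypothetical reading the 7 scorable joints sum to 8 out of an obtainable 21, which is rescaled to the 0 to 24 range of the extended score rather than treating the missing joint as a 0.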
The following clinical and laboratory data were also recorded, according to international standardized and validated measurements: number of active joints, physician’s global assessment of disease activity, ESR, and the parent/patient’s global assessment of well-being. Using these 4 variables, the JADAS-71 was calculated18. The response to treatment was calculated using the ACRp-30 criteria3. The clinical examinations were performed by treating physicians who were blinded to the imaging results.
Statistical analysis
For both the wrist component of the RAMRIS and the extended score, the interreader reliability of the total score was assessed using the intraclass correlation coefficient (ICC)23, evaluating the 2 inexperienced readers separately as well as all readers together. The agreement for the total score was classified as follows: ICC < 0.4 indicated poor agreement, 0.4–0.8 moderate agreement, and ≥ 0.8 good agreement. The ICC of the 2 scores were compared using the F test. Mean differences and 95% limits of agreement (LOA) were assessed between all possible pairs of readers using the Bland-Altman approach24. Agreement on the evaluation of the individual joint recesses was evaluated by squared-distance weighted Cohen’s κ for all possible pairs of readers25. Cutoff points were defined as follows: κ < 0.2 indicated poor agreement, 0.2–0.4 fair agreement, 0.4–0.6 moderate agreement, 0.6–0.8 good agreement, and 0.8–1.0 excellent agreement.
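For readers less familiar with the pairwise agreement statistics named above, the following is a minimal sketch of the Bland-Altman limits of agreement and of Cohen’s κ with squared-distance weights for one pair of readers. It uses only the Python standard library and is not the R code used in the study; the function names and the use of the 1.96 normal quantile for the 95% limits are assumptions.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean difference, lower 95% LOA, upper 95% LOA) for two readers."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - half_width, bias + half_width


def weighted_kappa(a: Sequence[int], b: Sequence[int], n_grades: int = 4) -> float:
    """Cohen's kappa with squared-distance weights for grades 0..n_grades-1."""
    n = len(a)
    observed = [[0.0] * n_grades for _ in range(n_grades)]
    for x, y in zip(a, b):
        observed[x][y] += 1 / n  # joint proportion of grade pairs
    row = [sum(observed[i]) for i in range(n_grades)]
    col = [sum(observed[i][j] for i in range(n_grades)) for j in range(n_grades)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_grades):
        for j in range(n_grades):
            weight = (i - j) ** 2  # disagreement grows with squared distance
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when both readers use a single grade")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical grades from two readers for the same joint in six patients.
reader_1 = [0, 1, 2, 3, 1, 2]
reader_2 = [0, 1, 2, 3, 2, 1]
print(round(weighted_kappa(reader_1, reader_2), 3))
print(bland_altman([5, 6, 4, 7], [4, 6, 5, 5]))
```

With squared-distance weights, a disagreement of 2 grades is penalized 4 times as heavily as a disagreement of 1 grade, which suits an ordinal 0–3 scale.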
The sensitivity to change in subjects who responded to therapy according to the ACRp-30 criteria was calculated using the standardized response mean (SRM), which is the mean change over time across subjects divided by the SD of that change26. The threshold levels for SRM were defined as follows: 0.2–0.5 indicated small sensitivity to change, 0.5–0.8 indicated moderate sensitivity to change, and ≥ 0.8 indicated high sensitivity to change. Finally, to evaluate the score’s construct validity, correlations with clinical variables were calculated using Spearman’s ϱ. Construct validity examines whether the construct in question is related to other measures in a manner consistent with a priori prediction. Given that the MRI synovitis score is a measure of disease activity, it was predicted that the correlation with routine measures of disease activity would be in the moderate to high range. Statistical analysis was carried out using R statistics (version 3.1.2)27 and the IRR package28.
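The SRM definition and its thresholds can be sketched as follows. This is an illustrative example only: the function names are hypothetical, the scores are invented, and the sign convention (change taken as baseline minus followup, so that a falling synovitis score yields a positive SRM) is an assumption that the text does not state explicitly.

```python
from statistics import mean, stdev
from typing import Sequence


def standardized_response_mean(baseline: Sequence[float],
                               followup: Sequence[float]) -> float:
    """Mean of the per-subject change divided by the SD of that change."""
    # Assumed sign convention: improvement (a lower followup score) is positive.
    changes = [b - f for b, f in zip(baseline, followup)]
    return mean(changes) / stdev(changes)


def classify_srm(srm: float) -> str:
    """Apply the thresholds given in the text to the magnitude of the SRM."""
    magnitude = abs(srm)
    if magnitude >= 0.8:
        return "high"
    if magnitude >= 0.5:
        return "moderate"
    if magnitude >= 0.2:
        return "small"
    return "below the smallest threshold"


# Invented total scores for four responders at baseline and at followup.
srm = standardized_response_mean([10, 8, 12, 6], [4, 5, 6, 3])
print(round(srm, 2), classify_srm(srm))
```

Because the denominator is the SD of the change rather than the SD at baseline, the SRM rewards a score whose change is consistent across responders, not only one whose change is large.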
RESULTS
Of the 70 patients included, 55 (79%) were female. Twenty-one patients had systemic arthritis, 26 had polyarthritis (7 were rheumatoid factor–positive), 17 had extended and 6 persistent oligoarthritis (Table 2). The median disease duration was 4.0 years (IQR 1.1–6.4) and all but 4 patients had clinically active disease as shown by a median JADAS-71 of 18.6 (IQR 11.5–40.9). The distribution of synovial inflammation in the wrist and MCP joints as revealed by MRI is shown in Table 3. Synovitis in at least 1 MCP joint was detected by MRI in around 80% of the cases. The most frequently affected was the first MCP joint (69%). The frequency of involvement of the MCP joints was lower in patients with persistent oligoarticular JIA than in patients with a polyarticular disease course (Table 3).
The MRI reading for the extended score took about 5–10 min to complete, and slightly less time for the wrist component of the RAMRIS. The median baseline total synovitis score for the wrist component of the RAMRIS was 5 (IQR 3–6), compared to a median of 10 (IQR 6–14) for the extended score. At followup, these scores were 3 (IQR 1–4) and 4 (IQR 2–7), respectively.
The interreader reliability of the wrist component of the RAMRIS was moderate when comparing all 3 readers together (ICC 0.77; 95% CI 0.68–0.84; Table 4), whereas the interreader reliability of the extended score was good (ICC 0.86; 95% CI 0.80–0.91). There was strong evidence that the ICC for the extended score was better than for the original score [F test (69, 119.4) = 1.80, p = 0.002]. The ICC of the 2 less experienced readers ranged from 0.82 (95% CI 0.73–0.88) for the wrist component of the RAMRIS to 0.86 (95% CI 0.78–0.91) for the extended score. Further, mean differences (95% LOA) were −0.4 (−3.4, 2.6) for the original RAMRIS score and −0.6 (−6.7, 5.5) for the extended score (widest limits among all possible pairs of readers; Figure 2).
Weighted Cohen’s κ for the individual joint recesses indicated moderate to excellent agreement, ranging from 0.59 for the radiocarpal joint (when comparing the experienced reader and the second less-experienced reader) to 0.87 for the first MCP joint (when comparing the experienced reader and the first less-experienced reader; Table 4). Notably, weighted Cohen’s κ of all the MCP joints was higher than weighted Cohen’s κ of the joint recesses of the wrist component of the RAMRIS.
The correlations with clinical variables indicating disease activity improved for the extended score, in particular the correlation with the number of active joints (Spearman’s ϱ was 0.56 for the extended score and 0.29 for the wrist component of the RAMRIS) and with the JADAS-71 (0.51 and 0.35, respectively). Finally, the sensitivity to change in patients who responded to treatment according to the ACRp-30 criteria improved for the extended score with an SRM of 1.56, compared to an SRM of 1.13 for the wrist component of the RAMRIS.
DISCUSSION
The results of the current study indicate that adding the MCP joints to the wrist component of the RAMRIS score improves the metric properties of the MRI scoring system, thus allowing for a more accurate assessment of disease activity in JIA. This result is not entirely unexpected because the MCP joints are frequently affected in JIA patients with wrist involvement17. In a clinical trial comparing intermediate versus higher doses of methotrexate in 595 patients, the frequency of clinical involvement of the MCP joints ranged from 12.8% for the fifth MCP joint to 32.8% for the second MCP joint29. Based on these clinical observations, the second and third MCP joints have been included in the JADAS-27, an internationally validated clinical measure of disease activity in JIA18.
Among the patients included in our study, more than 80% had MRI findings consistent with synovitis in at least 1 of the MCP joints. This high percentage of MCP involvement suggests, in line with previous studies, that MRI is more sensitive than clinical examination, thus providing additional evidence to support the use of imaging to ensure accurate disease activity assessment.
Further, in contrast to the clinical evaluation, MRI demonstrated more frequent involvement of the first MCP joint as compared to the second to fifth MCP joints (69% vs 61–64%, Table 3). Because the first MCP joint is a target joint for osteoarthritis in RA, it has not been included in the RAMRIS score. However, in childhood this comorbidity is irrelevant; therefore, the first MCP joint should be carefully examined and included in the MRI scoring system.
In our study, the interreader reliability, as evaluated by weighted Cohen’s κ, was higher for the MCP joints compared to the more complex wrist joint recesses, and inclusion of these joints resulted in an overall significant improvement of the total synovitis score ICC. This suggests that the MCP joints were easier to score than the radiocarpal, radioulnar, and carpometacarpal joints. A poor agreement for the second to fifth carpometacarpal joints, which was partly due to a considerable disagreement between scores 0 and 1, was previously reported14. In our study, although the mean differences between readers were low, the 95% LOA were relatively wide, indicating that further refinement of the scoring system is needed.
Of note, our results have clearly demonstrated that the original as well as the extended MRI score can be reliably used by relatively inexperienced readers, after appropriate training sessions. This is a crucial point to support the feasibility of the proposed MRI scoring system and to promote its wider use in pediatric rheumatology.
The inclusion of the MCP joints led to a better construct validity of the MRI score, as shown by stronger correlations with clinical variables reflecting disease activity, most notably the number of active joints and the JADAS-71. The encouraging correlations between MRI measurement of synovitis of the wrist and MCP joints and conventional measures of global disease activity underpin the potential utility of the wrist as an index joint to reflect total burden of disease activity of children with JIA and wrist involvement. Because one of the major limitations of MRI is the restriction to scanning 1 joint per session, the value of such an “informative” joint as a surrogate for global joint involvement in children with JIA is considerable.
Finally, the inclusion of MCP joints led to an improvement in the sensitivity to change of the MRI scoring system, a crucial step toward the implementation of imaging-based outcome measures in clinical trials. The availability of sensitive imaging tools to assess changes in response to treatment potentially increases the statistical power for demonstrating the efficacy of novel potential antirheumatic drugs, allowing the minimization of the number of patients to be recruited in clinical trials as well as their duration. It has been shown that MRI of the unilateral wrist and MCP joints compared with conventional radiography of both hands, wrists, and forefeet requires less than half the number of patients and followup time to detect a difference between 2 treatment groups of patients with early RA30,31. Given this evidence, and the ethical imperative to limit the time that patients are exposed to potentially ineffective treatment in randomized controlled trials, we suggest including MRI as a key outcome measure in future therapeutic trials in JIA.
Some limitations of the present study should be considered. For ethical reasons, we did not collect data concerning the criterion validity of the MRI for the assessment of inflammation in the MCP joints; as a consequence, the true discriminative power of MRI for synovitis cannot be definitely determined. Studies in RA, however, have clearly demonstrated that the extent of synovitis in the MCP joints correlates with macroscopic findings on mini-arthroscopy32. One caveat should be mentioned regarding sensitivity to change. The readers in our study were aware of the chronological order of the MRI scans. It has been shown previously that this method leads to an increase in the sensitivity to change33. The lack of data from age-matched healthy controls is another limitation of our study. Such data would have strengthened the study, reducing the possibility of scoring normal joints as pathological. Finally, we have limited our evaluation to synovitis because the significance of bone irregularities and bone marrow changes on MRI in patients with JIA remains to be clarified.
The MCP joints are frequently affected in JIA patients with wrist involvement. Because of the effect of MCP involvement on children’s physical and daily activities and the frequency of involvement, these joints should be included in the JIA MRI wrist scoring system. Inclusion of the MCP joints improved the metric properties of the MRI scoring system significantly. In particular, the high sensitivity of the extended MRI score in detecting inflammatory changes makes this imaging technique a promising outcome measure for the assessment of treatment efficacy in JIA. In this perspective, optimizing the role of MRI as a robust biomarker and surrogate outcome remains a priority of the future direction of research in this field.
Acknowledgment
The authors thank the members of the OMERACT Working Group MRI in JIA and of the Health-e-Child Radiology group, in particular Derk F.M. Avenarius, Laura Tanturri de Horatio, Nikolay Tzaribachev, and Andrea S. Doria, for their support for our present study.
Footnotes
Dr. van Dijkhuizen was the recipient of funding from the 7th Framework program of the European Union, SP3-People, support for training and career development for researchers (Marie Curie), Network for Initial Training (ITN), FP7-PEOPLE-2011-ITN, under the Marie Skłodowska-Curie grant agreement No. 289903.
- Accepted for publication May 31, 2018.
REFERENCES
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.
- 16.
- 17.
- 18.
- 19.
- 20.
- 21.
- 22.
- 23.
- 24.
- 25.
- 26.
- 27.
- 28.
- 29.
- 30.
- 31.
- 32.
- 33.