Abstract
Objective. To develop semiaxial magnetic resonance imaging (MRI) scoring methods for assessment of sacroiliac joint (SIJ) bone marrow edema (BME) in patients with axial spondyloarthritis, and to compare the reliability with equivalent semicoronal scoring methods.
Methods. Two semiaxial SIJ MRI scoring methods were developed based on the principles of the semicoronal Berlin and Spondyloarthritis Research Consortium of Canada (SPARCC) methods. A global quadrant-based method was also developed. Baseline and 12-week MRI of the SIJ from 51 patients participating in a randomized double-blind placebo-controlled trial of adalimumab 40 mg every other week versus placebo were scored by the semiaxial and the corresponding semicoronal methods. Results were compared by linear regression analysis. The reproducibility and sensitivity were evaluated by intraclass correlation coefficients (ICC) and smallest detectable change [SDC, absolute values and percentage of the highest observed score (SDC-HOS)].
Results. Interreader and intrareader ICC were moderate to very high for semiaxial scoring methods (baseline 0.83–0.88 and 0.85–0.97; change 0.33–0.78), and high to very high for semicoronal scoring methods (baseline 0.90–0.92 and 0.93–0.97; change 0.77–0.89). Association between semiaxial and semicoronal scores was high for both the Berlin and SPARCC methods (baseline: R2 = 0.93 and 0.88; change: R2 = 0.82 and 0.87, respectively), but lower for the global method (baseline: R2 = 0.79; change: R2 = 0.54). The SDC-HOS were 9.8–18.6% and 5.9–10.7% for the semiaxial and semicoronal methods, respectively.
Conclusion. Detection of SIJ BME in the semiaxial scan plane is feasible and reproducible. However, a slightly lower reliability of all 3 semiaxial methods supports the general practice of using the semicoronal scan plane in therapeutic studies.
Magnetic resonance imaging (MRI) of the sacroiliac joints (SIJ) has a key role in the diagnosis and followup of patients with axial spondyloarthritis (axSpA)1,2,3. This role relies on the ability of MRI to detect SIJ inflammation [i.e., bone marrow edema (BME) and/or osteitis] before structural lesions can be detected on conventional radiography and computed tomography4,5, and on MRI being the most sensitive method to detect changes in inflammation6,7. Several MRI scoring methods for SIJ inflammation have been developed8. In clinical trials of patients with axSpA, sacroiliitis is assessed with standardized and validated semiquantitative MRI scoring methods for inflammation, most often the Berlin MRI score9,10 or the Spondyloarthritis Research Consortium of Canada (SPARCC) score11. For the assessment of inflammation with these methods, the acquisition of a water-sensitive sequence in the semicoronal plane is needed. Usually, a short-tau inversion recovery (STIR) sequence is used for this purpose.
However, there is no evidence concerning which scan plane should be used in clinical practice, and consequently it is most often decided locally based on the local radiologist’s experience and time, and quite frequently STIR images in the semiaxial rather than semicoronal plane are obtained. This may be explained by the impression of a higher sensitivity of the semiaxial scan plane to detection of changes in the ligamentous part of the joint and at the “sacral corners” of the cartilaginous part of the SIJ12,13. Nevertheless, a direct comparison of the ability of the semicoronal versus semiaxial scan plane to detect and monitor lesions has never been performed, to our knowledge.
Consequently, the current MRI SIJ scoring methods do not allow assessment of semiaxial images, and this hinders the use of these methods for systematic followup of MRI of many patients treated in routine care. This is problematic because patients diagnosed and treated in routine care often present a broader spectrum of disease than patients recruited to clinical trials, and they may respond differently to treatment as compared to clinical trial patients14,15. Therefore, just as clinical registries have been very valuable in providing relevant additional clinical data on patients treated in routine practice, analyses of MRI scans performed in routine care may provide additional important objective information about treatment efficacy. However, to use such MRI in routine care, systematic and validated scoring methods that can be applied in both scan planes are needed.
The aim of our study was to develop and validate 3 semiaxial MRI scoring methods of SIJ inflammation, made by modifying the principles of the Berlin and SPARCC MRI SIJ methods and by developing a new global quadrant-based method. In addition, we compared the reproducibility and sensitivity to change of the semiaxial and semicoronal methods to investigate their performance.
MATERIALS AND METHODS
Patients
The study comprised 52 patients included in a previously described randomized, double-blind, placebo-controlled investigator-initiated study16. Patients were randomized in blocks of 4 to treatment with either subcutaneous (SC) injection of adalimumab (ADA) 40 mg or placebo every other week for 12 weeks. To be included in the study, patients had to fulfill the European Spondyloarthropathy Study Group classification criteria for SpA17, and have sacroiliitis on MRI and/or conventional radiography as judged by an SpA expert radiologist with 30 years of experience (AGJ). All patients fulfilled the imaging arm of the Assessment of Spondyloarthritis international Society criteria for axSpA1. Further, patients had to have a Bath Ankylosing Spondylitis Disease Activity Index > 40 mm (visual analog scale 0–100 mm) and clinical indication for treatment with a tumor necrosis factor-α (TNF-α) inhibitor. Before the MRI were read, 1 patient was excluded because MRI of the SIJ at baseline had not been performed in both the semiaxial and semicoronal planes.
MRI details
MRI were performed at 1.5T according to a standardized protocol at 3 different departments of radiology. The SIJ scans were performed before treatment initiation (i.e., at baseline) and at Week 12. The MRI sequences included a STIR sequence of the SIJ performed in both the semicoronal and the semiaxial scan planes.
The STIR sequences were performed with echo time 26–60 ms, repetition time 2500–4200 ms, inversion time 150–170 ms, and field of view (FOV) 240 × 240 mm with matrix 256 × 189 or 256 × 256 (semicoronal) and 224 × 168 or 256 × 256 (semiaxial) or FOV 300 × 300 mm with matrix 240 × 180 or 368 × 274 (both planes). The slice thickness was 4–5 mm and gap 0.5–0.6 mm in both scan planes. During the 12 weeks of study, the individual patients were scanned according to the same MRI protocol, i.e., with the same scan measures within the individual patients.
The MRI were anonymized and evaluated with the readers blinded for clinical, biochemical, and other imaging data. All scans were read on the same MRI workstation. Two readers evaluated the scans according to a standardized reading methodology, which included standardized paper-based score diagrams.
Semicoronal Berlin and SPARCC scoring methods
In the semicoronal Berlin10 and SPARCC11 scoring methods, the SIJ are divided into 4 quadrants by a horizontal line going midway through the joint on each image slice, resulting in an upper and lower iliac and sacral quadrant, respectively10,11 (Figure 1). In the Berlin scoring method, the volume of BME is graded semiquantitatively for each SIJ quadrant into 0: No BME; 1: < 33%; 2: 33–66%; and 3: > 66%10. The total score range is 0–24.
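The Berlin grading described above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and it assumes the extent of BME in a quadrant is available as a numeric fraction, whereas in practice readers estimate it visually.

```python
def berlin_grade(bme_fraction: float) -> int:
    """Map the fraction of a quadrant showing BME (0.0-1.0) to a Berlin grade.

    Grades follow the text: 0 = no BME, 1 = < 33%, 2 = 33-66%, 3 = > 66%.
    """
    if not 0.0 <= bme_fraction <= 1.0:
        raise ValueError("bme_fraction must lie between 0.0 and 1.0")
    if bme_fraction == 0.0:
        return 0
    if bme_fraction < 0.33:
        return 1
    if bme_fraction <= 0.66:
        return 2
    return 3


def berlin_total(quadrant_fractions: list[float]) -> int:
    """Sum the grades of the 8 quadrants (4 per joint, 2 joints); range 0-24."""
    if len(quadrant_fractions) != 8:
        raise ValueError("expected 8 quadrants (4 per joint, 2 joints)")
    return sum(berlin_grade(f) for f in quadrant_fractions)
```

With all 8 quadrants graded 3, the total reaches the stated maximum of 24.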
The SPARCC MRI SIJ Inflammation Index is based on 6 consecutive semicoronal slices covering the cartilaginous part of the SIJ11. BME is scored dichotomously (0: absent; 1: present) for each quadrant for each slice. An additional score of 1 is given per joint per slice if the signal is “intense,” corresponding to the signal intensity of the cerebrospinal fluid. Moreover, an additional score of 1 is given per joint per slice if there is a homogeneous and unequivocal increase in signal extending over a depth of at least 1 cm from the articular surface (“depth”)11. The total score range is 0–72.
The new semiaxial methods corresponding to the Berlin and SPARCC methods, and a new global method
In the semiaxial Berlin and SPARCC methods, the SIJ are also divided into 4 quadrants. On the semiaxial scan, the border between the upper and lower quadrants is defined by the particular image slice that displays the first sacral neural foramen (the image slice in Figure 1E), i.e., corresponding to the border between S1 and S2. The number of semiaxial slices that cover the SIJ from top to bottom is usually 12, and with the definition above there are most often 6 slices covering the upper and lower quadrants, respectively.
In the semiaxial Berlin method, the volume of BME is graded semiquantitatively for each SIJ quadrant with the same methodology as the semicoronal method, providing similar score range (0–24) to the conventional semicoronal Berlin method.
In the semiaxial SPARCC method, the 12 consecutive slices of the cartilaginous part are scored. BME, including depth and intensity, is scored according to the same definitions as described for the semicoronal SPARCC method. The total score range is 0–96, derived from a score of 0–48 for presence of BME, a score of 0–24 for “intensity,” and a score of 0–24 for “depth.”
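The arithmetic behind the two SPARCC score ranges can be checked with a short Python sketch. The function name is hypothetical. It also rests on an assumption inferred from the stated 0–48 BME range rather than spelled out in the text: each semiaxial slice contributes 4 scored regions (the iliac and sacral side of each joint), because the upper/lower division is made between slices, whereas each semicoronal slice contributes 8 quadrants.

```python
def sparcc_max_scores(n_slices: int, regions_per_slice: int, n_joints: int = 2) -> dict:
    """Return the maximum SPARCC component scores for a given slice layout.

    regions_per_slice counts the dichotomously scored regions on one slice
    across both joints (assumed: 8 semicoronal, 4 semiaxial).
    """
    bme = n_slices * regions_per_slice   # 1 point per region per slice
    intensity = n_slices * n_joints      # 1 point per joint per slice
    depth = n_slices * n_joints          # 1 point per joint per slice
    return {
        "bme": bme,
        "intensity": intensity,
        "depth": depth,
        "total": bme + intensity + depth,
    }


semicoronal = sparcc_max_scores(n_slices=6, regions_per_slice=8)
semiaxial = sparcc_max_scores(n_slices=12, regions_per_slice=4)
```

Under these assumptions the semicoronal layout gives 48 + 12 + 12 = 72 and the semiaxial layout gives 48 + 24 + 24 = 96, matching the ranges stated in the text.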
In the new semicoronal and semiaxial global scoring method, the SIJ are divided into 4 quadrants according to the same definitions as described above for the semicoronal and semiaxial Berlin and SPARCC methods. BME is scored dichotomously (0: absent; 1: present) per quadrant per joint. The total score range is 0–8 in both the semiaxial and the semicoronal scan planes.
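As a minimal sketch, the global score is simply a count of quadrants with BME across both joints. The quadrant labels and function name below are hypothetical and chosen only for illustration.

```python
QUADRANTS = ("upper_iliac", "lower_iliac", "upper_sacral", "lower_sacral")


def global_score(left_joint: dict, right_joint: dict) -> int:
    """Count quadrants with BME present (True) over both joints; range 0-8."""
    total = 0
    for joint in (left_joint, right_joint):
        for quadrant in QUADRANTS:
            # A missing key is treated as "BME absent" in this sketch.
            total += 1 if joint.get(quadrant, False) else 0
    return total


example = global_score(
    {"upper_iliac": True, "lower_iliac": False, "upper_sacral": True, "lower_sacral": False},
    {"upper_iliac": False, "lower_iliac": False, "upper_sacral": False, "lower_sacral": True},
)
```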
Reading exercises
The study comprised 2 reading exercises. Exercise 1 was performed as a prereading exercise to calibrate the readers and to assess the reliability for status scores. Baseline MRI in the semicoronal and semiaxial scan plane were anonymized according to 2 different series of random anonymization numbers. First, the readers independently of each other scored the semicoronal MRI according to the Berlin, SPARCC, and global quadrant method. Thereafter, the readers evaluated all semiaxial MRI according to the corresponding 3 new semiaxial methods.
In exercise 2, the semicoronal and semiaxial scans were reanonymized according to different series of numbers. First, both readers independently of each other read all semicoronal scans and thereafter all semiaxial scans. Baseline and followup scans were scored simultaneously in known time order.
Statistical analysis
Baseline and change scores were characterized with descriptive statistics. Intraobserver and interobserver agreement was assessed with intraclass correlation coefficients (ICC). A 2-way mixed-effects model with the patient as a random factor and the observer as a fixed factor was used and the results were given as single-measure ICC, absolute agreement. An ICC ≤ 0.40 was considered fair, > 0.40 to ≤ 0.60 moderate, > 0.60 to ≤ 0.80 good, > 0.80 to ≤ 0.90 very good, and > 0.90 to ≤ 1.00 excellent18. Scores were compared with independent t tests (between groups) and paired t tests (within groups). The association between the semicoronal and semiaxial methods was estimated for the mean scores of the 2 readers using ICC and linear regression analyses. Bland-Altman plots were constructed to visualize any systematic difference between the readers in relation to status and change scores (data not shown).
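For illustration, a single-measure, absolute-agreement ICC for a 2-way layout can be computed from the ANOVA mean squares as sketched below in Python. This uses the formula commonly given for that ICC form and is not necessarily the software routine used in the study. The function names are hypothetical, and the category labels follow the cutoffs stated above.

```python
def icc_absolute_single(scores: list[list[float]]) -> float:
    """Single-measure, absolute-agreement ICC from an n-subjects x k-raters table.

    ICC = (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n)
    Raises ZeroDivisionError if all scores are identical.
    """
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 raters")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    msr = k * sum((r - grand) ** 2 for r in row_means) / (n - 1)  # between subjects
    msc = n * sum((c - grand) ** 2 for c in col_means) / (k - 1)  # between raters
    sse = sum(
        (scores[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    mse = sse / ((n - 1) * (k - 1))                               # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def classify_icc(icc: float) -> str:
    """Label an ICC using the cutoffs stated in the text."""
    if icc <= 0.40:
        return "fair"
    if icc <= 0.60:
        return "moderate"
    if icc <= 0.80:
        return "good"
    if icc <= 0.90:
        return "very good"
    return "excellent"
```

Because absolute agreement is requested, a constant offset between readers lowers the ICC even when their rankings of patients are identical.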
Effect size (ES), smallest detectable change (SDC), and smallest detectable difference (SDD) were used to evaluate the sensitivity to change of all methods. ES was calculated as the difference between the mean baseline scores and followup scores at Week 12 divided by the SD of the baseline scores19. SDC was calculated as ± 1.96 × SD of the change scores divided by [√2 × √k] and SDD as ± 1.96 × SD of the baseline scores divided by √k, where k is number of readers20. The SDC was also calculated as the percentage of the highest observed score (SDC-HOS), defined as the mean of the highest scores of reader 1 and 2, and the percentage of the highest possible score (SDC-HPS), defined as the highest possible score of the scoring method. ES was considered poor if < 0.20, small if ≥ 0.20 but < 0.50, moderate if ≥ 0.50 but < 0.80, and large if ≥ 0.8019.
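The formulas above translate directly into the Python sketch below. The function names are hypothetical, and the SD values are taken as inputs because the text does not detail how they are pooled across readers. Classifying the ES on its signed value is likewise an assumption of this sketch.

```python
import math


def sdc(sd_change: float, k: int) -> float:
    """Smallest detectable change: 1.96 x SD(change scores) / (sqrt(2) x sqrt(k))."""
    return 1.96 * sd_change / (math.sqrt(2) * math.sqrt(k))


def sdd(sd_baseline: float, k: int) -> float:
    """Smallest detectable difference: 1.96 x SD(baseline scores) / sqrt(k)."""
    return 1.96 * sd_baseline / math.sqrt(k)


def sdc_hos(sdc_value: float, highest_reader1: float, highest_reader2: float) -> float:
    """SDC as a percentage of the mean of the two readers' highest observed scores."""
    return 100.0 * sdc_value / ((highest_reader1 + highest_reader2) / 2)


def sdc_hps(sdc_value: float, highest_possible: float) -> float:
    """SDC as a percentage of the highest possible score of the method."""
    return 100.0 * sdc_value / highest_possible


def effect_size(mean_baseline: float, mean_followup: float, sd_baseline: float) -> float:
    """ES: (mean baseline - mean followup) / SD of the baseline scores."""
    return (mean_baseline - mean_followup) / sd_baseline


def classify_es(es: float) -> str:
    """Label an ES using the cutoffs stated in the text."""
    if es < 0.20:
        return "poor"
    if es < 0.50:
        return "small"
    if es < 0.80:
        return "moderate"
    return "large"
```

For example, with k = 2 readers the SDC denominator is √2 × √2 = 2, so the SDC equals 0.98 times the SD of the change scores.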
The study was approved by the Ethical Committee of Copenhagen (KF-02-270781) and the Danish Medicines Agency, and conducted according to the Helsinki II Declaration. All participants provided written informed consent before inclusion in the study.
RESULTS
Patient characteristics
Table 1 provides the clinical characteristics of the patients. All patients had moderate to high disease activity. At inclusion, there were no statistically significant differences between treatment groups.
Reliability of the readers
Table 2 shows the results of exercise 1 and 2, including the reliability of the readers. There were no significant differences between the scores of the 2 readers, although reader 2 numerically scored slightly higher than reader 1. Similar results were achieved with nonparametric tests (results not shown). The intrareader (ICC 0.85–0.97, p < 0.0001) and interreader (0.77–0.98, p < 0.0001) reliability of the baseline scores were good to excellent for all methods (Table 2). For baseline MRI scores, the interreader ICC were numerically higher for the semicoronal methods (ICC 0.90–0.98) as compared to the semiaxial methods (ICC 0.77–0.92). For change scores, the interreader ICC of the semicoronal MRI scores were also good to very good (0.77–0.89), whereas the ICC for the semiaxial scores were fair to good (0.33–0.78, p < 0.0001).
Comparison of semicoronal and semiaxial methods
Absolute agreement on the presence/absence of BME between the semicoronal and the semiaxial planes was 88.0% (reader 1) and 80.3% (reader 2) in exercise 1, and 92.2% (reader 1) and 88.0% (reader 2) in exercise 2.
Table 3 provides the ICC of the comparison of the semicoronal versus semiaxial methods for the 2 readers. ICC of the baseline scores were good to excellent (ICC 0.78–0.96, p < 0.0001) for all 3 methods. The ICC for the change scores were also good to excellent (0.72–0.93, p < 0.0001) for the Berlin and SPARCC methods, but numerically lower (ICC 0.54–0.77) for the global methods.
Figure 2 shows the semicoronal versus the semiaxial baseline and change scores for exercise 2 with the linear equations best fitting the observed data. Linear regression analyses revealed very good correlation between the semicoronal and semiaxial methods for the Berlin and SPARCC methods (baseline: R2 = 0.93 and 0.88 and change: R2 = 0.82 and 0.87, respectively) and moderate correlation for the global methods (R2 baseline: 0.79 and change: 0.54).
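As a minimal sketch of how such a fitted line and its R2 are obtained, the Python function below performs an ordinary least-squares fit of one set of scores on another. The function name is hypothetical, and which plane is treated as the predictor is an assumption; the text does not state the direction of the regression.

```python
def linear_fit_r2(x: list[float], y: list[float]) -> tuple[float, float, float]:
    """Ordinary least-squares fit y = slope * x + intercept, plus R-squared."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared
```

For a simple linear regression with one predictor, R2 equals the squared Pearson correlation between the two sets of scores, which is what the last step computes.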
Comparison of treatment groups
No statistically significant differences between the treatment groups were observed (Table 4). Both the treatment group and the placebo group had statistically significant decreases in BME scores from baseline to Week 12 by all MRI methods (results not shown).
Sensitivity to change
The SDC, including SDC-HOS and SDC-HPS, were low for the Berlin and SPARCC methods and higher for the global method (Table 2). Moreover, the SDC-HOS were about twice as high for the semiaxial methods as compared to the corresponding semicoronal methods (Table 2). ES and the SDC stratified according to treatment group are presented in Table 4. ES was poor for all the scoring methods in patients treated with ADA and placebo. The standardized response mean was moderate in both treatment groups for all MRI methods.
DISCUSSION
In our study, we developed and validated new SIJ MRI scoring methods for BME applied on the semiaxial scan plane and compared these with the semicoronal methods. The semiaxial methods based on the SPARCC and Berlin methods both demonstrated good to excellent reliability, and good to excellent agreement between semicoronal and semiaxial scores. The SDC were low, indicating high sensitivity to change, although the SDC-HOS were about twice as high for the semiaxial methods as for the semicoronal methods. Finally, changes in MRI scores according to all methods were numerically, although not significantly, higher in the treatment group than the placebo group. Our results support that semiaxial MRI can provide reliable information on the efficacy of different therapies in clinical practice.
Prior to our present study, the only MRI scoring method that also used the semiaxial scan plane was the Aarhus method, developed by Madsen and Jurik8,13. According to this method, the semiaxial scan plane is evaluated in combination with the semicoronal scan plane, resulting in 1 single score for inflammation. Therefore, the Aarhus method is not directly comparable with the methods developed in our current study. However, in the validation study of the Aarhus method by Madsen and Jurik13, the interobserver agreement for BME on the patient level was κ 0.86, whereas our agreement for BME based on the semiaxial slices was κ 0.88. In the validation study of the semicoronal SPARCC SIJ MRI inflammation method, the status score ICC was 0.67–0.70 and the change score ICC was 0.53–0.7911. Thus, the reliability of the new semiaxial scores was at the level of that of the validation studies of the SPARCC and Aarhus methods.
Only a few studies have investigated sensitivity to change of MRI scoring methods for assessment of SIJ inflammation. Our finding of an SDC of 3.2 for the semicoronal SPARCC method is in concordance with the results of a study by van den Berg, et al21, in which they observed SDC in the range of 2.1 to 3.4 over 3 to 12 months for patients treated with nonbiologics. Similarly, Maksymowych, et al22 proposed a minimal important change of the semicoronal SPARCC method of 2.5, when the analyses are anchored to global assessment of change in MRI inflammation as well as anchored to clinical response. The lower sensitivity to change of the global method may partly be explained by the lower score range for this method as compared to the Berlin and SPARCC methods. Our results support that semiaxial MRI can provide reliable additional information on the efficacy of different therapies in clinical practice. However, the slightly lower reliability of the semiaxial methods supports the general practice of using the semicoronal scan plane in therapeutic studies.
The ES for all MRI methods (semicoronal and semiaxial) were small for patients treated with ADA and poor for patients treated with placebo. This may partly be explained by not all patients having BME at baseline. In the current study, the ES for the semicoronal Berlin method for patients treated with ADA was higher than previously reported in a cohort of patients with ankylosing spondylitis (AS) treated with TNF-α inhibitor (ES 0.22)23. The ES for the semicoronal SPARCC method is in concordance with results of 2 randomized double-blind placebo-controlled trials of patients with nonradiographic axSpA treated with etanercept (ES 0.39) versus placebo (0.08) for 12 weeks17 and patients with AS treated with ADA (0.36) versus placebo (0.12) for 24 weeks24.
Regardless of the observed high agreement between the semiaxial and semicoronal MRI SIJ scores, we experienced several potentially important differences when assessing MRI of the SIJ in the semiaxial versus the semicoronal scan plane. On the semiaxial MRI, we had the impression that the most anterior part of the upper anterior surface of the cartilaginous part of the SIJ was poorly visualized because of partial volume effects of bone and psoas muscle owing to the scan plane not being perpendicular to the steep slope of the joint surface in this area. This may be important because our experience is that many patients have BME and capsulitis in relation to the upper anterior superior surface of the SIJ, and it may contribute to the lower semiaxial scores. In contrast, the semiaxial scans easily displayed capsulitis and enthesitis in relation to the anterior inferior and posterior surfaces of the SIJ, which are poorly visualized in the semicoronal scan plane because of slice direction and partial volume effects. Because the SPARCC and Berlin scoring methods focus on BME and not capsulitis and enthesitis, we consider this a minor issue in relation to systematic assessment of the SIJ with the MRI scoring methods. Nevertheless, the very complex location and anatomy of the SIJ call for further studies of the utility of MRI scans performed in the semicoronal versus semiaxial scan plane for diagnosing and prognosticating axSpA and other conditions affecting the SIJ. Moreover, future studies should also investigate whether combined evaluation of the semiaxial and semicoronal sequences is superior to evaluation of one of the scan planes.
Highly standardized MRI protocols of the SIJ are important to avoid reader variation due to differences in scan measures and suboptimal angulation, irrespective of scan plane, when comparing changes over time. However, such deviations occurred in only a few patients, and because they concerned both semiaxial and semicoronal slices, we do not regard them as having any major effect on the results. A full validation of a new method requires other steps, e.g., comparison with histology/pathology. Nevertheless, in our study we have shown that both the semiaxial Berlin and SPARCC methods are reproducible and have face, construct, and discriminative validity. Another limitation of our study was the relatively small sample size.
This first comparison of semiaxial and semicoronal scoring methods documents that scoring of MRI sacroiliac joint inflammation according to the principles of the SPARCC and Berlin methods can also be done reliably by using semiaxial slices. This indicates that these modified scoring methods can be used to get important additional information from clinical practice cohorts in which semiaxial but not semicoronal STIR images were acquired. Our study also indicates that there is no advantage in using semiaxial rather than semicoronal slices in therapeutic studies because the reliability and sensitivity to change are slightly inferior, thus supporting that acquisition of semicoronal slices continues to be the standard method used in clinical trials.
Acknowledgment
We thank personnel at the Department of Radiology, Sygehus Lillebaelt, Vejle, and the MRI Center Thava Aabenraa for their assistance with the MRI scans and radiography, and the staff at Slagelse Hospital for including and following patients.
Footnotes
AbbVie provided the study drug for the first 24 weeks of the study and provided financial support to this investigator-initiated study, but had no role in the planning and conduct of the study, or analysis and interpretation of the results.
- Accepted for publication June 21, 2017.