To the Editor:
I was interested to read the paper by Christiansen, et al published in the January 2017 issue of The Journal of Rheumatology1. The authors aimed to determine the reproducibility of sacroiliac joint (SIJ) radiographs among readers with varying levels of experience, and to identify potential drivers of disagreement in classification among 5 predefined radiographic lesion types1. One hundred four patients with low back pain lasting ≥ 3 months who met the Assessment of Spondyloarthritis international Society definition for a positive SIJ magnetic resonance image were recruited. Seven blinded readers (2 musculoskeletal radiologists, 5 rheumatologists) classified pelvic radiographs according to the modified New York criteria (mNY) and recorded the presence or absence of 5 lesion types in both SIJ: erosion, sclerosis, ankylosis, joint space widening, and joint space narrowing. Reproducibility of mNY classification was assessed among 21 reader pairs, and potential drivers of disagreement were identified among the 5 lesion types1. According to their results, mean κ values (percent concordance) were 0.39 (84.1%) for mNY classification over 21 reader pairs, 0.46 (79.8%) between the 2 musculoskeletal radiologists, and 0.55 (86.5%) and 0.36 (77.9%) between the most experienced rheumatologist and the 2 radiologists, respectively1.
It is crucial to know that using κ is a common mistake in reliability analysis for qualitative outcomes. The κ statistic has 2 important weaknesses as a measure of agreement: (1) it depends on the prevalence in each category, so tables with the same percentages of concordant and discordant cells can yield different κ values; and (2) it depends on the number of categories2,3,4,5,6,7,8,9,10. Moreover, reliability analysis calls for an individual-based approach rather than a global average. Therefore, reporting mean κ values is another methodological issue2,3,4,5,6,7,8,9,10. Finally, an appropriate correction should be applied when multiple comparisons are made. The authors concluded that reproducibility of radiographic SIJ classification in a spondyloarthritis (SpA) inception cohort was only fair to moderate among 7 readers with varying levels of experience, questioning the applicability of mNY in early SpA1. Such a conclusion should be drawn only after the above-mentioned statistical and methodological issues have been taken into account. Otherwise, misdiagnosis and mismanagement of patients may occur.
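The prevalence dependence noted in point (1) can be illustrated with a minimal sketch. The two 2 × 2 tables below are hypothetical counts chosen for illustration only; they are not data from Christiansen, et al. The function name cohen_kappa and the cell labels a, b, c, d are likewise assumptions of this sketch. Both tables have 80% concordance, yet the κ values differ markedly because the marginal prevalence differs.

```python
def cohen_kappa(a, b, c, d):
    """Return (observed agreement, Cohen's kappa) for a 2x2 table of two readers.

    a = both readers positive, b = reader 1 positive only,
    c = reader 2 positive only, d = both readers negative.
    """
    n = a + b + c + d
    p_o = (a + d) / n  # observed proportion of agreement
    # Agreement expected by chance, from the readers' marginal totals
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical table with balanced prevalence (about 50% positive)
po_balanced, kappa_balanced = cohen_kappa(40, 10, 10, 40)

# Hypothetical table with skewed prevalence (about 85% positive)
po_skewed, kappa_skewed = cohen_kappa(75, 10, 10, 5)

print(f"balanced: concordance={po_balanced:.2f}, kappa={kappa_balanced:.2f}")
print(f"skewed:   concordance={po_skewed:.2f}, kappa={kappa_skewed:.2f}")
```

Under these assumed counts, concordance is identical in the two tables (80 concordant and 20 discordant readings out of 100), while κ is approximately 0.60 in the balanced table and approximately 0.22 in the skewed one.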
REFERENCES
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.