Abstract
Objective. To evaluate the intraobserver and interobserver reliability of the ultrasonographic (US) assessment of subtalar joint (STJ) synovitis in patients with rheumatoid arthritis (RA).
Methods. Following a Delphi process, 12 sonographers conducted an US reliability exercise on 10 RA patients with hindfoot pain. The anteromedial, posteromedial, and posterolateral STJ were assessed using B-mode and power Doppler (PD) techniques according to an agreed US protocol and using a 4-grade semiquantitative grading score for synovitis [synovial hypertrophy (SH) and PD signal] and a dichotomous score for the presence of joint effusion (JE). Intraobserver and interobserver reliability were computed by Cohen’s and Light’s κ. Weighted κ coefficients with absolute weighting were computed for B-mode and PD signal.
Results. Mean weighted Cohen’s κ for SH, PD, and JE were 0.80 (95% CI 0.62–0.98), 0.61 (95% CI 0.48–0.73), and 0.52 (95% CI 0.36–0.67), respectively. Weighted Cohen’s κ for SH, PD, and JE in the anteromedial, posteromedial, and posterolateral STJ were −0.04 to 0.79, 0.42–0.95, and 0.28–0.77; 0.31–1, −0.05 to 0.65, and −0.2 to 0.69; 0.66–1, 0.52–1, and 0.42–0.88, respectively. Weighted Light’s κ for SH was 0.67 (95% CI 0.58–0.74), 0.46 (95% CI 0.35–0.59) for PD, and 0.16 (95% CI 0.08–0.27) for JE. Weighted Light’s κ for SH, PD, and JE were 0.63 (95% CI 0.45–0.82), 0.33 (95% CI 0.19–0.42), and 0.09 (95% CI −0.01 to 0.19), for the anteromedial; 0.49 (95% CI 0.27–0.64), 0.35 (95% CI 0.27–0.4), and 0.04 (95% CI −0.06 to 0.1) for posteromedial; and 0.82 (95% CI 0.75–0.89), 0.66 (95% CI 0.56–0.8), and 0.18 (95% CI 0.04–0.34) for posterolateral STJ, respectively.
Conclusion. Using a multisite assessment, US appears to be a reliable tool for assessing synovitis of STJ in RA.
The incidence of subtalar joint (STJ) disease in patients with rheumatoid arthritis (RA) is greatly increased between 5 and 10 years of disease duration and regularly precedes changes in the tibiotalar joint1,2,3. However, the STJ is notoriously difficult to assess clinically and frequently overlooked in favor of the more accessible tibiotalar joint.
Valgus or everted deformity of the STJ is a typical feature of foot disease in patients with RA; it is associated with localized pain and joint stiffness, and with progressive impairment of gait and disability. The level of hindfoot valgus deformity has been shown to increase with disease duration, especially in the first 5 years3. The extent of reported foot problems in patients with RA suggests that the provision of effective, timely, and targeted care is essential to prevent deformity and maintain mobility. Measures to prevent or delay the progression of STJ valgus deformity in RA must combine management of both the synovitis and any underlying mechanical dysfunction4,5,6,7.
However, because of its complex anatomy, the diagnosis of STJ synovitis is difficult, with no single clinical examination (CE) technique of sufficient sensitivity and specificity to be of practical use. The STJ consists of 2 parts: an anterior part of the calcaneal bone articulating with the navicular and talar bones (talocalcaneonavicular joint), and a posterior part, which consists of the articulation of calcaneus with the talar bone (talocalcaneal joint). The posterior facet is the only one in which the capsule forms a recess8. Assessing the characteristics of inflammation in this joint and its effect on the functional status of patients with RA will enable the clinician to target treatment interventions and improve commonly reported symptoms in the foot and ankle.
Ultrasound (US) is a well-tolerated, noninvasive, and relatively inexpensive imaging technique that can be performed in clinic or at the bedside and has the potential of overcoming the limitations of CE. Most previous studies evaluating the foot using US in patients with RA have focused on the forefoot, with relatively few specifically assessing the hindfoot and in particular, the STJ9,10,11. One reason for this is a perceived difficulty of evaluating this joint with US owing to its complexity and a lack of standardization in this area.
We hypothesized that US might be used as a reliable outcome measure to evaluate synovitis of the STJ in patients with RA. The objectives of our study were first, to develop an expert consensus-derived definition of synovitis and scanning protocol for the STJ, and second, to test the reliability of the definitions and protocol.
MATERIALS AND METHODS
Study design
Our study comprised a series of sequential steps. First, we conducted a systematic literature review (SLR) with 2 aims: to identify a set of potential elementary lesions that defined US-detected synovitis in the STJ of patients with RA, and to determine the anatomical sites that have previously been used to provide transducer access for the STJ. The proposed lesions and sites were discussed at an Outcome Measures in Rheumatology (OMERACT) meeting that informed the development of a Delphi consensus exercise. Subsequently, a patient-related reliability exercise was undertaken using the definitions and methodology proposed. The steps are described in more detail below.
SLR
Members of the OMERACT Foot and Ankle US Scoring System in RA (FUSS-RA) group proposed the key question relating to US-detected synovitis in the STJ of patients with RA to guide the SLR: Is US a more accurate technique than clinical assessment and comparable with magnetic resonance imaging (MRI) for the detection of synovitis in the STJ of patients with RA? The PICOS (Patient/Population, Intervention, Comparison/Comparator, Outcome, Study) terms used included the following: (1) Problem: RA, STJ, rearfoot, hindfoot, ankle; (2) Intervention: US, sonography, ultrasonography; (3) Comparison: clinical assessment, magnetic resonance imaging, MRI; (4) Outcome: synovitis, inflammation; and (5) Study: assessment, score, measure, evaluation, and outcome.
An extensive systematic literature search using MEDLINE, Embase Classic + Embase, Evidence-Based Medicine Reviews — Cochrane Database of Systematic Reviews, and Allied and Complementary Medicine databases through Ovid from inception to October 2015 was carried out to identify relevant publications. In anticipation of the relative scarcity of available publications, no limitations regarding study type, research design, or language were applied. In addition to peer-reviewed publications, conference abstracts from international conferences were also identified. Additional articles obtained through a hand search were included and FUSS-RA members were asked to review the final list of included papers and invited to suggest articles and abstracts that had not been included. All abstracts were read by 2 reviewers (HJS and RJW) and selected full-text articles and abstracts (when full-text articles were not available) were also reviewed by a third person (GAWB).
Consensus process
Reaching consensus consisted of 3 phases: (1) a Delphi consensus process to define an US protocol for assessing STJ synovitis in patients with RA, which included identifying the anatomical sites for imaging (“scanning windows”) the STJ and appropriate scoring among experts in musculoskeletal US; (2) the collection of predefined US images (as agreed in Phase 1) of the STJ representative of all levels of STJ synovitis by experts in musculoskeletal US from patients with RA seen in their clinical practice; and (3) final consensus on the assigned scores using the collection of images (obtained in Phase 2) of STJ synovitis reviewed during a meeting of experts prior to undertaking a patient-based reliability exercise in patients with RA.
Delphi process
A Delphi consensus process was undertaken through 2 consecutive written questionnaires sent by e-mail to 21 sonographers from 11 countries, all of whom were experts [and members of the OMERACT/European League Against Rheumatism (EULAR) imaging network] in musculoskeletal US.
Round 1 included 25 statements divided into 3 sections on the following topics: (1) US scanning windows for assessing the STJ; (2) US-defined elementary lesions of STJ abnormalities on B-mode and Doppler mode using the OMERACT definition of synovitis12; and (3) US definitions for scoring joint synovitis on B-mode and Doppler mode. The participants were asked to rate their level of agreement for each statement according to a 5-point Likert scale (1 = strongly disagree to 5 = strongly agree).
Round 2 consisted of several statements not previously agreed on and some new statements generated from the comments supplied in the first round. The aim was to achieve consensus on all the statements to generate an US protocol. Round 2 comprised 16 statements included in 6 sections: positioning of the patient; regions of the subtalar joint to be scanned; US scanning windows for assessing the STJ; location of image acquisition; image acquisition (i.e., plane); US-defined elementary lesions of STJ abnormalities on B-mode and Doppler mode using the OMERACT definition of synovitis; and US definitions for scoring joint synovitis on B-mode and Doppler mode. Consensus in both rounds was achieved if ≥ 75% of responders scored an item either 4 or 5.
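The consensus rule (at least 75% of responders rating a statement 4 or 5 on the 5-point Likert scale) can be expressed as a short check. The following is a minimal sketch, not the authors' code; the function name `consensus_reached` and the example ratings are hypothetical.

```python
def consensus_reached(ratings, threshold=0.75, agree_scores=(4, 5)):
    """Return True if the share of responders rating 4 or 5 meets the threshold.

    `ratings` is a list of Likert scores (1 = strongly disagree, 5 = strongly agree)
    given by the responders to a single Delphi statement.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    agreeing = sum(1 for r in ratings if r in agree_scores)
    return agreeing / len(ratings) >= threshold


# Hypothetical ratings from 8 responders for one statement (illustration only)
example_ratings = [5, 4, 4, 3, 5, 4, 2, 5]
print(consensus_reached(example_ratings))
```

In this illustration, 6 of 8 responders score 4 or 5, which is exactly 75% and therefore meets the threshold.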
Collection of US images representative of the semiquantitative scoring system for STJ synovitis
The respondents to the Delphi survey were requested to collect US images of STJ synovitis in patients with RA that represented the 4 grades (0–3) of synovitis scores [synovial hypertrophy (SH) and power Doppler (PD)] and dichotomous joint effusions (JE) scores agreed in the Delphi process (Figure 1). Each expert was asked to collect at least 1 US image in both transverse and longitudinal planes representing each B-mode and PD grade of STJ synovitis and the presence or absence of JE.
US reliability assessment
The second part of the study consisted of a reliability exercise on patients with RA conducted during 2 days in Amsterdam, the Netherlands. This exercise included the intraobserver and interobserver reliability assessment of US in scoring STJ synovitis on B-mode and PD mode, and JE in the anteromedial, posteromedial, and posterolateral STJ regions12.
Patients
Ten consecutive patients with RA according to the American College of Rheumatology/EULAR 2010 criteria13 and having rearfoot pain were recruited from the outpatient rheumatology clinics of the MC Groep hospitals. The local MC Groep hospitals ethics committee approved the study and informed consent was obtained from all patients prior to the study commencing (IB-2016/5).
Patient data were recorded at study entry and included demographics, RA characteristics, previous imaging modalities, RA treatment, and Disease Activity Score in 28 or 44 joints (Table 1).
Each patient was randomly assigned to a scanning machine, where they were assessed in 2 rounds, 1 in the morning and another in the afternoon. The STJ of both feet were included in the US assessment.
Sonographers
The sonographers consisted of 10 rheumatologists and 2 podiatrists, all experienced in conducting musculoskeletal US examinations of the foot. All sonographers were limited to a scanning protocol lasting a maximum of 10 min.
US examination
US was performed using 5 commercially available real-time scanners (2 MylabAlpha, 2 MylabEight, and 1 MylabClass C; Esaote), each equipped with a multifrequency linear array probe (16–8 MHz). B-mode and PD settings for each US machine were optimized for image resolution and sensitivity by an application specialist and expert sonographer before the patient-based reliability exercise. The sonographers were not allowed to change these settings during the reliability exercise. Patient positioning was supine with the ankle supported to enable dynamic assessment. All US assessments were non–weight-bearing, and the sonographers were able to move the STJ of all our patients.
Scoring system
Scanning and image acquisition were undertaken in line with the consensus-driven scanning protocol agreed upon before the start of the meeting. For scoring purposes, the site of maximum pathology was recorded.
The predefined STJ sites were scored by B-mode for SH and by PD mode for PD signal (both grades 0–3), and by B-mode for JE (absence = 0, presence = 1).
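One way to record a single observation under this scoring system is sketched below, assuming one record per sonographer, foot, and STJ region. The class name `StjScore` and its field names are hypothetical and are not taken from the study's data collection forms.

```python
from dataclasses import dataclass

# The 3 STJ regions assessed in the reliability exercise
REGIONS = ("anteromedial", "posteromedial", "posterolateral")


@dataclass(frozen=True)
class StjScore:
    """One scored STJ region (hypothetical record layout)."""
    region: str
    sh: int  # synovial hypertrophy on B-mode, grade 0-3
    pd: int  # power Doppler signal, grade 0-3
    je: int  # joint effusion on B-mode, 0 = absent, 1 = present

    def __post_init__(self):
        # Reject values outside the agreed scoring ranges
        if self.region not in REGIONS:
            raise ValueError(f"unknown region: {self.region}")
        if self.sh not in range(4) or self.pd not in range(4):
            raise ValueError("SH and PD must be graded 0-3")
        if self.je not in (0, 1):
            raise ValueError("JE must be 0 (absent) or 1 (present)")


# Hypothetical example record
score = StjScore(region="posterolateral", sh=2, pd=1, je=1)
print(score)
```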
Statistical analysis
Statistical analysis was performed using SPSS, version 17.0 (SPSS) and R language (R Foundation for Statistical Computing). Simple summary statistics were calculated from the responses to the Delphi questionnaires. Continuous variables were presented as the mean and SD, and categorical variables as frequencies and percentages.
Intra- and interobserver reliability were assessed using the standard κ coefficient and the weighted κ coefficient with absolute weighting [κ (w)]. While intraobserver coefficients were evaluated on pairs of measures performed by the same sonographer at each site, calculation of the interobserver coefficient was based exclusively on the first measure of those pairs. Percentage of observed agreement (i.e., percentage of observations that obtained the same score) was also calculated. Interobserver reliability was studied by calculating the mean κ for all pairs (i.e., Light’s κ)14. Intraobserver and interobserver reliability were determined for each STJ region (anteromedial, posteromedial, and posterolateral) separately. Agreement was computed for the elementary lesions of inflammation (i.e., SH, JE, and PD signal). Because the prevalence of the lesions was not determined in advance, and to correct 1 of the 2 κ paradoxes (i.e., artificially low or high κ due to difference in prevalence), we also tested the κ max, which expresses the best κ reliability possible based on the observed prevalence15,16.
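The reliability measures named above can be illustrated with a short sketch. This is not the analysis code used in the study (which was run in SPSS and R). It assumes that "absolute weighting" means disagreement weights of |i − j| between grades, and that κ max is the usual unweighted maximum attainable κ given the two raters' marginal distributions. All function names and the example ratings are hypothetical.

```python
from itertools import combinations


def observed_agreement(a, b):
    """Share of observations given the same score by both ratings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def _marginals(a, b, categories):
    """Proportion of observations in each category for each rating list."""
    n = len(a)
    row = [sum(x == c for x in a) / n for c in categories]
    col = [sum(y == c for y in b) / n for c in categories]
    return row, col


def cohen_kappa(a, b, categories, weighted=False):
    """Cohen's kappa; weighted=True penalises disagreements by |i - j|."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n, m = len(a), len(categories)
    idx = {c: k for k, c in enumerate(categories)}
    obs = [[0.0] * m for _ in range(m)]
    for x, y in zip(a, b):
        obs[idx[x]][idx[y]] += 1 / n
    row, col = _marginals(a, b, categories)

    def w(i, j):
        return abs(i - j) if weighted else float(i != j)

    d_obs = sum(w(i, j) * obs[i][j] for i in range(m) for j in range(m))
    d_exp = sum(w(i, j) * row[i] * col[j] for i in range(m) for j in range(m))
    if d_exp == 0:
        return float("nan")  # undefined when both raters use one identical category
    return 1 - d_obs / d_exp


def lights_kappa(ratings_by_rater, categories, weighted=False):
    """Light's kappa: mean of Cohen's kappa over all pairs of raters."""
    values = [cohen_kappa(a, b, categories, weighted)
              for a, b in combinations(ratings_by_rater, 2)]
    return sum(values) / len(values)


def kappa_max(a, b, categories):
    """Largest unweighted kappa attainable with the observed marginals."""
    row, col = _marginals(a, b, categories)
    p_e = sum(r * c for r, c in zip(row, col))
    p_o_max = sum(min(r, c) for r, c in zip(row, col))
    if p_e == 1:
        return float("nan")
    return (p_o_max - p_e) / (1 - p_e)


# Hypothetical SH grades (0-3) from 3 sonographers scoring the same 6 joints
grades = [0, 1, 2, 3]
raters = [
    [0, 1, 2, 3, 1, 0],
    [0, 1, 2, 2, 1, 1],
    [0, 2, 2, 3, 1, 0],
]
print(observed_agreement(raters[0], raters[1]))
print(cohen_kappa(raters[0], raters[1], grades, weighted=True))
print(lights_kappa(raters, grades, weighted=True))
print(kappa_max(raters[0], raters[1], grades))
```

For intraobserver reliability, the same `cohen_kappa` function would be applied to the two scoring rounds of a single sonographer; for interobserver reliability, `lights_kappa` would be applied to the first-round scores of all sonographers.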
κ coefficients were interpreted according to Landis and Koch17. κ values of 0–0.20 were considered poor, 0.20–0.40 fair, 0.40–0.60 moderate, 0.60–0.80 good, and 0.80–1 excellent.
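The interpretation bands can be written as a small lookup. This is a sketch under two assumptions that the text does not state: a value lying exactly on a shared boundary (e.g., 0.80) is assigned to the lower band, and negative values are labelled poor. The function name `interpret_kappa` is hypothetical.

```python
def interpret_kappa(kappa):
    """Map a kappa value to the qualitative bands used in this study.

    Assumptions: boundary values fall in the lower band, and values
    below 0 are treated as poor.
    """
    bands = [(0.20, "poor"), (0.40, "fair"), (0.60, "moderate"), (0.80, "good")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "excellent"


# Overall weighted Light's kappa values reported in the abstract for SH, PD, and JE
for name, value in [("SH", 0.67), ("PD", 0.46), ("JE", 0.16)]:
    print(name, interpret_kappa(value))
```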
RESULTS
SLR
There were 334 articles identified and screened; 310 were excluded because they did not include the STJ. A further 6 were excluded because they involved patients with juvenile arthritis and not RA. Four others were excluded because they were MRI only, and in 7 others, the US evaluation was not reported. Twenty-four full-text articles and 6 abstracts were assessed for eligibility; 7 studies were included in the qualitative synthesis, of which 1 was an abstract only.
The SLR yielded only 6 full papers from 1993 to 201311,18,19,20,21,22. Although the scanning region was reported in 4 of the papers11,20,21,22, details regarding probe position and expected image were given in only 1 paper14. This confirmed the lack of standardization in the US evaluation regarding definitions and methodology. Nevertheless, these studies provided a useful basis for helping choose the questions for the Delphi consensus questionnaire.
Delphi process
Consensus was reached on the following statements in Round 1:
US examination of the STJ should include the talocalcaneal joint.
The anterior talocalcaneal joint should be scanned from the medial aspect of the ankle.
The posterior talocalcaneal joint should be scanned from the posterior aspect of the ankle.
Both greyscale and Doppler US should be used to scan the STJ.
The OMERACT scoring system for synovitis (grade 0–3) is suitable for assessing the STJ.
The talocalcaneonavicular joint is not routinely scanned as part of the STJ assessment.
The sinus tarsi is not routinely scanned as part of the STJ assessment.
The levels of agreement for some statements in the first round, especially those relating to assessing abnormalities and guided injections in the STJ, were very low.
It was considered that some of the disagreements relate to the understanding of the complex and difficult-to-visualize anatomy of the joint. For this reason, images were included in Round 2.
Consensus was reached on the following statements in Round 2:
US examination of the STJ should only include the talocalcaneal joint (i.e., excluding the talocalcaneonavicular joint and the sinus tarsi).
Scanning the talocalcaneal joint should include the medial aspect of the ankle, in the region of the sustentaculum tali.
Scanning the talocalcaneal joint should include the posterior-lateral aspect of the ankle.
If scanning the medial talocalcaneal joint, the patient should be supine with the knee flexed to 20° and the foot in 10° eversion.
If scanning the posterior talocalcaneal joint, the patient should be prone with their knee in extension and the foot dorsiflexed to 90°, hanging off the examination couch.
Each joint area should be scanned in the longitudinal (sagittal or coronal) plane and pathology should be confirmed in the orthogonal plane.
For each image acquisition, the sonographer should scan across the whole area (e.g., the entire medial talocalcaneal joint) and record the site of maximum pathology.
In the case of normal joints, the sonographer should obtain an image consistent with that of a normal joint, previously agreed upon by consensus.
Assessing synovitis in the talocalcaneal joint should include SH.
Assessing synovitis in the talocalcaneal joint should include effusion.
Assessing synovitis in the talocalcaneal joint should include synovial Doppler activity.
The OMERACT scoring system for synovitis (grade 0–3) is suitable for assessing the talocalcaneal joint.
Greyscale (B-mode) SH in the talocalcaneal joint can be scored semiquantitatively from 0 to 3 (i.e., according to OMERACT definitions of SH).
Synovial Doppler activity in the talocalcaneal joint can be scored semiquantitatively from 0 to 3 (i.e., according to OMERACT definitions of SH with Doppler signal).
Consensus was not achieved for a semiquantitative scoring system for effusion in the talocalcaneal joint; therefore, a dichotomous system was used.
Consensus meeting
A meeting of the US experts was held 1 day prior to the patient-based reliability exercise. Here, all collected images were discussed and the assigned scores were either agreed on immediately or following discussion by the group. Probe positions were agreed for scanning the antero- and posteromedial and the posterolateral aspects of the STJ, which included the option to sweep the probe anteriorly or posteriorly to achieve optimal visualization of the STJ (Figure 2). The posterior probe position was felt to be important but not critical and was finally dropped for practical reasons, including time limitation. A final consensus on the image acquisition protocol and the STJ synovitis and JE scoring system was reached, which served as the method in the patient reliability exercise during the following 2 days.
The patient characteristics for the real-time reliability scanning session
Table 1 summarizes the clinical characteristics of the patients with RA.
Prevalence of US abnormalities
Data obtained from the 2 days revealed that the mean prevalence of US-detected SH on B-mode was 47% (range 26–65). Mean prevalence of PD signal was 13% (range 2–23). All PD signals were grade 1 or 2. Mean prevalence of JE was 35% (range 20–54). The mean prevalence of SH, PD, and JE for the anteromedial side was 52% (25–84), 11% (0–12), and 41% (22–60), respectively; for the posteromedial side 48% (8–66), 18% (0–38), and 23% (5–48), respectively; and for the posterolateral side, 50% (20–74), 10% (0–10), and 43% (22–65), respectively.
Intraobserver reliability
The κ values and 95% CI for the intraobserver concordance are shown in Tables 2 and 3. Both B-mode and Doppler mode showed good intraobserver reliability for SH and PD.
Interobserver reliability
Table 4 displays the κ values and 95% CI for the interobserver concordance.
DISCUSSION
The purpose of our study was to develop a standardized method of evaluating synovitis in the STJ of patients with RA for future clinical practice and longitudinal research studies. In the latter case, a reliable imaging scoring system would be of benefit to monitor the effects of drug and mechanical interventions, such as footwear and orthoses, in patients with rearfoot problems.
To our knowledge, this is the first multiobserver US study of STJ synovitis in patients with RA. Using experienced sonographers (rheumatologists and podiatrists), we focused on the evaluation of reliability of the technique, following agreement on pathological definitions and scanning methodology.
We found that reliability was dependent on pathology and which aspect of the joint was being assessed. Good scores were found for SH, moderate to good for PD signals, and poor results for JE. The posterolateral site appeared the most reliable for assessment of synovitis.
Few studies have reported the outcome of US assessment of the STJ in RA9,10. Only 1 study has reported the effect of using US to determine the need for local injection therapy in patients with RA and ankle, rear-, and midfoot problems23. In this study, the use of US led the clinician to conduct unplanned corticosteroid injections in 35% of subtalar joints examined.
The presence of effusion is of significant interest when trying to expose synovitis10,11. US can detect very small fluid accumulations24. Remarkably, although effusion was detected in 35% of patients, the interobserver reliability was poor, suggesting that this elementary lesion of synovitis should not be used as a clinical finding of STJ inflammation.
Although most patients showed clinical evidence of active RA, very few sonographers scored a PD grade of 3. This low prevalence of highly active inflammation may be due to the current treat-to-target strategy and the effectiveness of the new treatments of RA (i.e., biological therapy), though other explanations could be anatomical, because deep joints generally show lesser PD signals. Yet another explanation is that a 4-grade semiquantitative scoring system is not suitable for the STJ. However, validity studies of STJ disease are required to confirm these assumptions.
Differences in the reliability of scoring of the 3 assessed STJ sites may be present but are not clear-cut because the CI overlap to some extent. Among the assessed sites, the most reliable acquisition site seems to be the posterolateral STJ, at least for PD. For practical purposes, we did not assess the posterior position; consequently, we cannot draw definitive conclusions about which site of the STJ is the most reliable for assessment of inflammation.
Some important study limitations should be taken into account. First, only 10 patients were assessed, although the patients chosen exhibited a range of severity scores. Second, the lack of a gold standard, typically MRI, prevented us from confirming the presence of STJ synovitis in the rearfoot of patients with RA. The next phase of our work will be to compare our agreed methodology with MRI. Third, the participants involved were all considered experts in musculoskeletal sonography. Therefore, these reliability results cannot be extrapolated to a population of less experienced sonographers, confirming the need for a detailed protocol with a supporting atlas of images to identify synovitis in the STJ of patients with RA. A limitation common to many US studies is the absence of blinding of the investigator. However, because synovitis of the STJ is more difficult to detect by clinical examination than that of joints of the hand5, sonographer bias can be discounted. Finally, as stated previously, an important limitation is that the posterior site of the STJ was not assessed and further research on the STJ should also include this probe position.
This study also has several strengths, including the US investigation of patients to allow the acquisition phase of images to be considered. Additional strengths are the inclusion of multiple sonographers from different professional backgrounds and the use of various US machines, suggesting a high applicability in daily practice.
Our present study demonstrated that experienced sonographers can reach a high intra- and interobserver reliability for the US assessment of STJ synovitis in patients with RA, using a consensus-driven US protocol and agreed scoring system. Larger studies are needed to demonstrate that these findings are also valid in clinical settings with less-experienced sonographers.
Acknowledgment
The authors thank Esaote BV, Maastricht, the Netherlands, for supplying the ultrasound equipment.
Footnotes
UCB, Celgene, and Janssen-Cilag BV, the Netherlands, provided financial support for the meeting that was the basis of this article. Dr. H.J. Siddle is funded through a Health Education England/National Institute for Health Research Clinical Lectureship.
Accepted for publication July 26, 2018.