Abstract
Objective. Bone erosions in rheumatoid arthritis (RA) are the subject of a growing body of research. Both earlier and current classification criteria for RA include erosions as a significant component. Ultrasound (US) can detect bone changes on accessible surfaces. The study group therefore performed a systematic literature review of the assessment of RA bone erosions with US.
Methods. A systematic search of PubMed and Embase was performed. Data on the definitions of RA bone erosions, their size, scoring, relation to synovitis, comparators, and elements of the OMERACT (Outcome Measures in Rheumatology Clinical Trials) filter were collected and analyzed.
Results. The selection process identified 58 original research papers. The assessed joints were most frequently metacarpophalangeal (MCP; 41 papers), proximal interphalangeal (19 papers), and metatarsophalangeal joints (MTP; 18 papers). The OMERACT definition of RA bone erosion on US was used most often (17 papers). Second and fifth MCP and fifth MTP were recommended as target joints. Conventional radiography was the most frequently used comparator (27 papers), then magnetic resonance imaging (17 papers) and computed tomography (5 papers). Reliability of assessment was presented in 20 papers and sensitivity to change in 11 papers.
Conclusion. This paper presents the results of a systematic literature review of bone erosion assessment in RA with US. The review suggests that US can be a helpful adjunct to existing methods of imaging bone erosions in RA. It analyzes definitions, scoring systems, comparators used, and elements of the OMERACT filter, and presents recommendations for a future research agenda based on its results.
The identification of bone erosions is crucial to early diagnosis, the prediction of future bone damage, and the monitoring of therapeutic outcomes in patients with rheumatoid arthritis (RA)1,2. Until relatively recently, conventional radiography (CR) was considered the mainstay for their detection. However, with the availability of newer imaging techniques such as ultrasound (US), magnetic resonance imaging (MRI), and computed tomography (CT)3,4, it was acknowledged that radiography was less sensitive in detecting early bone damage. In addition, some of the newer techniques (US and MRI) had the added advantage of being able to visualize soft tissues simultaneously. As a result, these techniques were explored as potential alternatives to radiography. Each technique has its own strengths and weaknesses. For example, CT has a high resolution for bone damage but involves ionizing radiation, which makes it undesirable for repeated examinations. MRI has the benefit of simultaneously assessing soft tissue inflammation, but access to scanners is generally limited, and the examination is expensive and time-consuming, especially for multijoint assessments. In contrast, US is safe, widely available, allows the simultaneous imaging of bone and soft tissue, and allows the examination of many joints in a relatively short period of time.
Increasing interest in US led to a consideration of how to standardize the technique. One of the first advances, in 2004, was the creation of the OMERACT (Outcome Measures in Rheumatology Clinical Trials) Ultrasound Task Force5. This resulted in the first published consensus-derived definitions for US-related pathologies, including erosions. Although work by different groups has been ongoing since then, there has been, to date, no clear agreement on which joints to examine or on how to score and monitor changes, and therefore no agreement on how to implement US assessment of bone erosions in clinical and research practice.
The aim of our study was to systematically review current literature to identify levels of evidence for the use of US for the detection of bone erosions in patients with RA and to identify any gaps. The specific objectives of our study were (1) to determine the level of homogeneity in the US definitions for erosions in the published literature, and (2) to evaluate the metric properties of US for the detection and quantification of bone erosions according to the OMERACT filter, including recommendations for target joints. It was hoped that these data could then be used toward the development of guidelines at a later stage.
MATERIALS AND METHODS
Search strategy and study selection
The search for original articles, published in English between January 1985 and May 2014 and referring to bone erosions and US, was carried out in PubMed and Embase databases. Reviews, letters, comments, and abstracts from scientific congresses were not included.
To obtain the largest number of references, the search was performed in PubMed and Embase with the following key words: “ultrasound OR ultrasonography OR sonography AND erosions OR cortical defect OR interruption of the bone surface OR change in the bone surface OR intra-articular discontinuity OR cortical break OR interruption of the bone margin AND rheumatoid arthritis”.
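The key words above are quoted without parentheses, so the precedence of AND and OR is left to the database. As a minimal sketch, assuming the intended logic is (imaging terms) AND (erosion terms) AND (disease term), the query could be assembled with explicit grouping as follows. The grouping, the quoting of multi-word phrases, and the helper names `or_group` and `build_query` are illustrative assumptions, not the authors' documented search string.

```python
# Hypothetical sketch: the grouping below is an assumption about the intended
# logic, not the authors' documented query.
MODALITY_TERMS = ["ultrasound", "ultrasonography", "sonography"]
EROSION_TERMS = [
    "erosions",
    "cortical defect",
    "interruption of the bone surface",
    "change in the bone surface",
    "intra-articular discontinuity",
    "cortical break",
    "interruption of the bone margin",
]
DISEASE_TERMS = ["rheumatoid arthritis"]


def or_group(terms):
    """Join terms with OR inside parentheses, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """Combine the OR-groups with AND so that every concept must be present."""
    return " AND ".join(or_group(g) for g in groups)


query = build_query(MODALITY_TERMS, EROSION_TERMS, DISEASE_TERMS)
print(query)
```

Keeping each concept in its own list makes it straightforward to add a synonym without disturbing the overall AND structure.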
Only references with available abstracts were assessed. Titles, abstracts, and full reports of the identified articles were systematically screened by the first author (MS) with regard to inclusion and exclusion criteria. The final search was verified by the second and last authors (LT, MADA).
Articles that did not meet inclusion criteria were excluded at any step of the study selection.
Data extraction
All data were extracted from the selected articles using a standardized spreadsheet previously developed and validated for systematic reviews in musculoskeletal US6. All selected articles were rated to determine US definitions of bone erosions and size, and to evaluate the quality of the studies according to the OMERACT filter7. A standardized tool for assessing the quality of the analyzed studies was developed and assessed in a binary mode (yes/no) based on a set of 6 predefined criteria: (1) Was the patient population well-defined in the methods section? (2) Was the definition of US erosions clearly formulated? (3) Was there a description of an erosion score? (4) What joints were evaluated and were target joints recommended? (5) Was the choice of comparator adequately explained and results completely given? and (6) Was 2-dimensional or 3-D evaluation performed? Attention was also given to the quantification of erosions and the simultaneous presence of synovial changes (synovial hypertrophy ± Doppler signal) detected with US.
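The binary quality tool described above can be pictured as one yes/no record per paper. The following is a minimal sketch only: the field names are hypothetical labels for the 6 criteria, and adding up the "yes" answers into a single number is an assumption, since the text does not state how the answers were aggregated.

```python
from dataclasses import dataclass, fields


@dataclass
class QualityChecklist:
    """Yes/no answers for one paper; field names are hypothetical labels."""

    population_well_defined: bool
    erosion_definition_clear: bool
    erosion_score_described: bool
    target_joints_recommended: bool
    comparator_explained: bool
    dimensionality_reported: bool  # 2-D or 3-D evaluation stated

    def criteria_met(self) -> int:
        """Count the criteria answered 'yes' (summing is an assumption)."""
        return sum(bool(getattr(self, f.name)) for f in fields(self))


# Invented example paper, for illustration only (not data from the review).
example_paper = QualityChecklist(
    population_well_defined=True,
    erosion_definition_clear=True,
    erosion_score_described=False,
    target_joints_recommended=False,
    comparator_explained=True,
    dimensionality_reported=True,
)
print(example_paper.criteria_met(), "of", len(fields(example_paper)))
```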
Evaluation methods
Face validity, construct validity, criterion validity, and discriminant validity (i.e., reliability and responsiveness) were independently evaluated in every paper, including whether the methods for assessing them and their measurements were available. Face validity, which is essentially subjective, was analyzed according to the conclusions of the authors. Criterion validity was considered achieved when US results were compared with a true “gold standard” (CR, MRI, CT, macroscopic view, or phantom).
Construct validity was considered achieved when US evaluation of erosions was demonstrated to be consistent with theoretical concepts. Reliability was evaluated according to the design used (image-reading reliability or patient-scanning reliability).
Responsiveness or sensitivity to change was evaluated by the ability of the tool to demonstrate change, usually in response to an intervention.
Statistical analysis
Descriptive statistics were used to report data. Frequencies and percentages were used for categorical variables.
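As a minimal sketch of this descriptive approach, the percentages can be reproduced from the counts and denominators reported in the Results. The helper name `percentage` and the dictionary labels are illustrative; rounding to the nearest whole percent is an assumption about how the figures were reported.

```python
def percentage(count: int, total: int) -> int:
    """Return count/total as a percentage rounded to the nearest whole number."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total)


TOTAL_PAPERS = 58

# (count, denominator) pairs taken from the Results section of this review.
reported = {
    "clear definition of erosion": (46, TOTAL_PAPERS),
    "relation to synovitis assessed": (15, TOTAL_PAPERS),
    "reliability tested on the same patients": (13, 21),
}

for label, (count, total) in reported.items():
    print(f"{label}: {count}/{total} = {percentage(count, total)}%")
```

Stating the denominator next to each count matters here because some percentages refer to all 58 papers and others to a subset.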
RESULTS
The primary search identified 147 articles in PubMed and 166 in Embase, which after further analysis were reduced to 58 original research papers (Figure 1). The selected papers were then divided among the group members and assessed according to an agreed scoring sheet (for the evaluation scores, see Table 1)4,5,8–63.
The data extracted from the articles during the search are displayed in Table 1, Table 2, and Table 3.
The assessed articles included 43 case series, 11 case control studies, 2 reliability studies, 1 expert consensus, and 1 experimental study. As the inclusion criteria stated, all studies involved patients with RA, apart from the expert consensus5 and the experimental study41, which were selected because they brought original data not requiring patients to be assessed. Other arthritides, such as psoriatic arthropathy, juvenile idiopathic arthritis, spondyloarthropathy, gout, and osteoarthritis, were also represented, either in control groups or in parts of more heterogeneous study populations.
The assessed joints were predominantly the finger joints: metacarpophalangeal (MCP) in 41 studies, proximal interphalangeal (PIP) in 19 studies, and distal interphalangeal in 1 study. In 5 studies, the finger joints were scanned only from the dorsal aspect; in all the others, they were scanned from both the dorsal and volar aspects. Erosions in metatarsophalangeal (MTP) joints were described in 18 papers. Erosive changes were analyzed in the wrists in 13 papers, the elbows in 2, and the shoulders in 6; the distal ulna as a separate joint area was analyzed in 5 papers. Erosions of the knee were studied in 3 papers, and the heel and the sternoclavicular joint in 1 paper each.
Definitions, size, and scoring of bone erosions on US
In 46 papers (79%), there was a clear description or definition of bone erosions. In 17 papers, the authors used the definition proposed by the OMERACT group6. Other definitions included visualization of erosive findings in 2 planes, and described the changes as “cortical breaks,” “interruptions,” or “defects.” A suggestion of an irregular floor of the change was also raised32. When the size of the cortical break was proposed for determining the change as an erosion, the detection limit was set at 2 mm in diameter in 7 papers (14%) and at 1 mm in 2 papers (4%).
In 26 studies, a semiquantitative scale of evaluation of bone erosions was used, most frequently with a scale from 0 to 3, based on either the extent of the changes or their size.
Recommendation of target joints and relation to synovitis
Only a few papers addressed the subject of recommended joints for detection or monitoring of bone erosions. The suggested joints were the second MCP joint, fifth MCP joint, fifth MTP joint (5 papers each), MCP joints in general (2 studies), PIP joints (2 papers), elbow joint (1 paper), and the distal ulna (1 paper).
The relationship of bone erosions to US synovitis (synovial hypertrophy ± Doppler signal) of the affected joint was assessed in 15 papers (26%). Yet the direct relationship between the presence of synovitis and the development of bone erosions detected with US was not studied. Ohrndorf, et al showed in a cohort of patients with RA a fall in greyscale and Doppler variables in parallel with a decrease in the erosion score, suggesting a healing process of erosions under treatment62.
Elements of the OMERACT filter, used comparators, and type of visualization
Table 3 presents the results of the 58 selected papers when tested according to the OMERACT filter.
Truth
Face validity was achieved in 50 studies, criterion validity assessed in 27 papers, and construct validity in 20 studies. The most frequent comparator was CR (27 papers, 47%), then MRI (17 studies, 30%), CT (5 studies, 9%), and macroscopic artificial bone erosions in 1 paper (2%).
Discrimination: Reliability
Reliability of detection of bone erosions was described in 21 papers; both intra- and interobserver reliability were assessed. Most often (in 13 of 21 studies, 62%), reliability was tested dynamically by different ultrasonographers examining the same patients, but assessment of sets of static images and videos was also used.
Discrimination: Responsiveness
Sensitivity to change was assessed in 11 studies, and in 2 of them a reduction in the size of erosions in the evaluated joints was reported55,61. In 6 of these 11 studies, erosive changes were described as present/absent. In the other 5, scoring systems were used: 3 based on size and 2 based on description.
Feasibility
None of the analyzed papers reported information about the feasibility of US for examining erosions.
Visualization
The use of 3-D probes for detection of bone erosions was assessed in only 3 papers. In 2 of these, the numbers of patients were extremely small (2 in each).
DISCUSSION
US is increasingly being used as a tool by rheumatologists for the assessment of patients with suspected or proven RA. Its many advantages include its wide availability and ability to be performed at the point of care, supplementing the clinical examination with important information. Bone erosions are one of the major features of RA and prevention of their emergence is one of the main aims of the treatment, stressing the need for sensitive imaging methods. Moreover, their presence is a prognostic sign of aggressive disease. Therefore, the detection and followup of bone erosions in RA are of major importance for initial and current evaluation of the progression of the disease and affect further treatment decisions.
Our systematic literature review shows that researchers most frequently use US to assess finger and toe joints, followed by the wrists and shoulders, and rarely the knee joints; this is in accordance with studies showing that the earliest detectable erosive bone changes appear in the small joints in patients with RA64. As shown in the systematic review by Baillet, et al65, US is more sensitive for detecting erosions in both the small and large joints of the extremities than CR, especially in early RA, and has a sensitivity similar to MRI6. This is especially true for joint surfaces that are easily accessible to US, whereas in areas with no or only limited accessibility for US, MRI has an obvious advantage, as shown by Døhn, et al4.
The definitions used in describing erosions showed greater homogeneity after the publication of the OMERACT consensus definitions by Wakefield, et al5, predominantly using the definitions presented there. The OMERACT definition did not encompass the size criterion, and with the increasing resolution of the US probes, it may be necessary to include both shape and size of the findings. The most frequently suggested cutoff between a normal variation in bone surfaces and a true erosion was 2 mm. Studies showing that there are erosion-like changes such as vessel channels46 even in healthy controls underline the necessity for further studies that explore the limits between normality and findings regarded as true bone erosions.
The literature review also revealed a lack of consensus regarding scoring systems for erosive changes. Moreover, a common limitation of many of the suggested scoring systems is the inability to show progression once the top score is reached (e.g., erosion > 4 mm). When the number of erosions is counted as a scoring system, a problem arises when 2 or more erosions merge into 1 large erosion: this is in fact a progression, yet it will be scored as an improvement. This highlights the necessity for a consensus-based scoring system that takes these issues into account.
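Both limitations can be made concrete with a small sketch. The 0 to 3 size scale below is hypothetical: only the "> 4 mm" top category is taken from the example in the text, the other thresholds are illustrative assumptions, and the diameters are invented values rather than data from any reviewed study.

```python
from typing import Sequence


def size_score(diameter_mm: float) -> int:
    """Hypothetical 0-3 size scale; the lower thresholds are illustrative only."""
    if diameter_mm <= 0:
        return 0
    if diameter_mm < 2:
        return 1
    if diameter_mm <= 4:
        return 2
    return 3  # top category: any erosion larger than 4 mm


def count_score(erosion_diameters_mm: Sequence[float]) -> int:
    """Score a joint by the number of erosions detected in it."""
    return len(erosion_diameters_mm)


# Ceiling effect: the erosion grows from 5 mm to 8 mm, but the score stays at 3.
baseline_mm, followup_mm = 5.0, 8.0
print("size score:", size_score(baseline_mm), "->", size_score(followup_mm))

# Merging: two erosions fuse into one larger erosion, so the count falls
# although the damage has progressed.
before_mm = [2.0, 3.0]
after_mm = [6.0]
print("count score:", count_score(before_mm), "->", count_score(after_mm))
```

In this sketch, a measure that is not capped and not count-based, such as the summed diameter, would rise in both scenarios; whether such a measure is suitable in practice is exactly the kind of question a consensus process would need to settle.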
Bone erosions can be detected with US in all joints affected by RA; however, our review demonstrates that the recommended target joints include MCP and MTP, and in particular the second and fifth MCP joints and the fifth MTP joint.
Some publications also reported on the presence of synovitis concomitant with the presence of bone erosions in the assessed joints, referring to the possible causal effect of the pannus on the emergence of bone erosions. Yet the studies on coexistence of synovial changes and development of bone erosions with US are too few to conclude anything substantial.
The metric properties of US for erosions were analyzed using the OMERACT filter and the analyzed literature reported good intra- and interobserver reliability. The sensitivity to change was assessed in 11 papers, showing that it was possible to assess changes in bone erosions over time, either as present/absent or with scoring systems. However, the weakness of the analysis lies in the differences among the systems, making general conclusions difficult. The criterion validity (comparison to micro- or macroscopic appearance) was assessed in only 1 paper using macroscopic bovine changes as a comparator41. The construct validity was analyzed mainly in relation to MRI and CR, showing good correlation with the former and better sensitivity than the latter, especially in early RA, for detection of bone erosions. Somewhat surprisingly, CT was used as the gold standard in only 5 publications. CT is more advantageous than MRI for the assessment of bones. However, its inability to detect inflammation such as synovitis, tenosynovitis, or bone marrow edema, and the radiation risk are undoubtedly the principal reasons limiting its use in daily clinical practice and in trials.
A novel technique for evaluating bone erosions by US is 3-D US35,39,55. This technique was evaluated in only 3 papers, and further studies are warranted before its use for diagnosis and monitoring can be assessed; for monitoring in particular, however, the modality holds promise.
The assessed papers did not allow confirmation that differentiation between arthritides with US on the basis of presence, character, and localization of bone erosions is possible. However, as shown by Zayat, et al, the presence of any erosive changes in the fifth MTP was specific for RA61. It should be pointed out that some of the included papers in this review15,35,39,48,51,53 contain a small number of patients and therefore their importance lies more in supporting evidence from larger studies and demonstrating the development of US in erosion diagnostics and monitoring.
US may be useful in the detection of RA bone erosions. It appears to be a valid (especially for face and content validity) and reliable tool for evaluating erosions. The definition is fairly congruent, but a consensus on scoring erosions is required to improve the comparability of studies and to improve the value of US in RA management. Recommendations for future research include (1) assessment of joints other than MCP, PIP, and MTP (wrists, ankles, knees, elbows, and shoulders) for bone erosions; (2) consensus on a size-based definition; (3) longitudinal studies of the development of erosions, especially when inflammation is well controlled; (4) more widespread use of CT as comparator; (5) more widespread use of 3-D visualization of bone erosions; (6) differentiation between arthritides with US on the basis of presence, character, and localization of bone erosions; and (7) differentiation between bone erosions and physiological bone profile interruptions.
Accepted for publication August 27, 2015.