Extended report
Ultrasound definition of tendon damage in patients with rheumatoid arthritis. Results of an OMERACT consensus-based ultrasound score focussing on the diagnostic reliability
  1. George A W Bruyn (1),
  2. Petra Hanova (2),
  3. Annamaria Iagnocco (3),
  4. Maria-Antonietta d'Agostino (4),
  5. Ingrid Möller (5),
  6. Lene Terslev (6),
  7. Marina Backhaus (7),
  8. Peter V Balint (8),
  9. Emilio Filippucci (9),
  10. Paul Baudoin (1),
  11. Richard van Vugt (10),
  12. Carlos Pineda (11),
  13. Richard Wakefield (12),
  14. Jesus Garrido (13),
  15. Ondrej Pecha (14),
  16. Esperanza Naredo (15)
  on behalf of the OMERACT Ultrasound Task Force
  1. Rheumatology Department, MC Groep Hospitals, Lelystad, The Netherlands
  2. Department of Rheumatology, Institute of Rheumatology, Prague, Czech Republic
  3. Department of Rheumatology, Sapienza Università di Roma, Rome, Italy
  4. Department of Rheumatology, Université Paris Ouest-Versailles-Saint Quentin en Yvelines, Hôpital Ambroise Paré, APHP, Boulogne-Billancourt, France
  5. Department of Rheumatology, Instituto Poal, Barcelona, Spain
  6. Department of Rheumatology, Copenhagen University Hospital at Glostrup, Copenhagen, Denmark
  7. Department of Rheumatology, Charite University Hospital, Berlin, Germany
  8. Department of Rheumatology, National Institute of Rheumatology and Physiotherapy, Budapest, Hungary
  9. Department of Rheumatology, Clinica Reumatologica, Universitá Politecnica delle Marche, Jesi, Ancona, Italy
  10. Department of Rheumatology, VU Medisch Centrum, Amsterdam, The Netherlands
  11. Department of Rheumatology, National Institute of Rehabilitation, Mexico City, Mexico
  12. Academic Unit of Musculoskeletal Disease, University of Leeds, Leeds, UK
  13. Department of Social Psychology and Methodology, Faculty of Psychology, Autonoma University, Madrid, Spain
  14. Technology Centre ASCR, Prague, Czech Republic
  15. Department of Rheumatology, Hospital General Universitario Gregorio Marañón, Madrid, Spain

Correspondence to George A W Bruyn, Department of Rheumatology, MC Groep Hospitals, Lelystad 8333 AA, The Netherlands; gawbruyn{at}wxs.nl

Abstract

Objective To develop the first ultrasound scoring system of tendon damage in rheumatoid arthritis (RA) and assess its intraobserver and interobserver reliability.

Methods We conducted a Delphi study on ultrasound-defined tendon damage and ultrasound scoring system of tendon damage in RA among 35 international rheumatologists with experience in musculoskeletal ultrasound. Twelve patients with RA were included and assessed twice by 12 rheumatologists-sonographers. Ultrasound examination for tendon damage in B mode of five wrist extensor compartments (extensor carpi radialis brevis and longus; extensor pollicis longus; extensor digitorum communis; extensor digiti minimi; extensor carpi ulnaris) and one ankle tendon (tibialis posterior) was performed blindly, independently and bilaterally in each patient. Intraobserver and interobserver reliability were calculated by κ coefficients.

Results A three-grade semiquantitative scoring system was agreed for scoring tendon damage in B mode. The mean intraobserver reliability for tendon damage scoring was excellent (κ value 0.91). The mean interobserver reliability assessment showed good κ values (κ value 0.75). The most reliable were the extensor digiti minimi, the extensor carpi ulnaris, and the tibialis posterior tendons. An ultrasound reference image atlas of tenosynovitis and tendon damage was also developed.

Conclusions Ultrasound is a reproducible tool for evaluating tendon damage in RA. This study strongly supports a new reliable ultrasound scoring system for tendon damage.

  • Rheumatoid Arthritis
  • Tendinitis
  • Ultrasonography

Introduction

Tenosynovitis is one of the key features of the clinical pattern in patients with rheumatoid arthritis (RA).1–3 Histologically, tenosynovitis exhibits features similar to those of joint synovitis, including hyperplasia of the synovial lining and infiltration of particular types of leukocytes, notably CD4 T cells and CD68+ macrophages.4 Longstanding tenosynovitis may result in tendon damage either by synovial proliferation or by bony attrition, resulting in tendon rupture with consequent disability.5–7 The most common ruptures of the tendons of the hand involve the extensor pollicis longus (EPL) tendon and the extensor digiti minimi (EDM) tendon. It is assumed that partial tears in tendons progressively evolve into complete ruptures. Although clinical examination (CE) may disclose complete rupture, CE of partial tears is notoriously unreliable.5–7

Musculoskeletal ultrasound is a readily available, useful and versatile imaging modality with high patient acceptability. Musculoskeletal ultrasound has proven to be more accurate than CE in detecting synovitis and tenosynovitis.8

Despite these attractive features, the technique is still considered examiner-dependent and machine-dependent. This opinion is based mainly on the fact that both acquisition and interpretation of ultrasound images determine the metric properties. Over the past decade, the Outcome Measures in Rheumatology in Clinical Trials (OMERACT) Ultrasound Task Force, a group of interested international sonographers, has worked to address the metric qualities of musculoskeletal ultrasound in RA.9,10 More recently, the Task Force has looked at the intraobserver and interobserver reliability of ultrasound for detecting and grading of greyscale tenosynovitis and tenosynovial power Doppler activity in patients with RA.11–13 The present study is an extension of these tendon studies and focuses on tendon damage.

The aim of the present study was threefold: to achieve consensus on the elementary lesions and definition of tendon damage in RA; to develop a novel ultrasound scoring system for tendon damage in B mode; and to assess the intraobserver and interobserver reliability of this scoring system for tendon damage in RA patients among rheumatologists with extensive experience in musculoskeletal ultrasonography (MSUS).

Patients and methods

A two-step study

The study was carried out in two steps. The first step consisted of a Delphi exercise, aiming to find agreement on ultrasound definitions of normal tendons, peritendinous structures, tenosynovitis and tendon damage in RA; furthermore, the Delphi exercise was done to reach consensus on the ultrasound grading of tenosynovitis and tendon damage in RA patients. Details on the methodology of the first step have previously been reported by Naredo et al.13

Ultrasound reliability assessment

The first step of the study was followed by a two-day patient-reliability exercise, which took place in Amsterdam, The Netherlands. Each day was divided into a morning and an afternoon session. The afternoon session was a repetition of the morning session in order to assess the intraobserver reliability.

Patients

Twelve patients with RA according to the American College of Rheumatology 1987 criteria14 representing all degrees of disease activity (severe, moderate, low and remission as defined by DAS28) were recruited from the outpatient rheumatology clinic (MC Groep hospitals). Demographic and clinical data were recorded for all patients.

The 12 patients were equally divided over 2 days. Both wrists and ankles were studied for the ultrasound investigation. All patients were assessed twice, that is, during the morning and again in the afternoon. The local ethics committee approved the study and all patients gave written consent according to the Declaration of Helsinki.

Ultrasonographers

Twelve rheumatologists with extensive experience in ultrasound, that is, more than 10 years, participated in the present study.

Tendons

At the wrist, the following extensor tendons enclosed in a synovial sheath were selected: the second extensor compartment, that is, the extensor carpi radialis brevis and longus; the third, that is, the extensor pollicis longus (EPL); the fourth, that is, the extensor digitorum communis (EDC); the fifth, that is, the extensor digiti minimi (EDM); and the sixth, the extensor carpi ulnaris (ECU). At the ankle, the tibialis posterior tendon was included. Since flexor tendons at the wrist may show a high level of anisotropy making ultrasound evaluation of tendon damage difficult, they were not included in the ultrasound evaluation.12,13

Ultrasonography

Bilateral ultrasound investigation was performed with six Esaote ultrasound scanners (one Mylab 70 XVision and five Mylab Class C; Esaote, Genoa, Italy) by means of linear array transducers (6–18 MHz or 4–13 MHz). The B mode settings of each ultrasound machine were optimised and fixed. Dynamic investigation by flexion and extension of particular fingers was allowed to improve differentiation of tendon pathologies.

The 12 ultrasonographers performed the ultrasound examination of the selected tendons independently, consecutively and blinded to the clinical data, and assessed tendon damage in B mode according to the agreed scoring system. The extensor tendons of the wrist were scanned from the level of Lister's tubercle downwards to the level of the extensor retinaculum; the tibialis posterior tendon was scanned from a level proximal to the medial malleolus to slightly distal to it.12,13 Maximal scanning time was 15 min per patient. The scanning time included the time to fill out the scoring sheet.

Atlas

All members of the OMERACT US task group collected images which were used to develop a US reference image atlas of tenosynovitis and tendon damage.

Statistical analysis

Statistical analysis was performed with the software package SPSS, version 17.0. Normally distributed continuous data were summarised with means and SDs or 95% CIs; non-normally distributed data were summarised with median and range.

Intra- and interobserver agreement was assessed by κ coefficients. Cohen's κ coefficient was calculated for intraobserver agreement, whereas Light's κ was calculated for interobserver agreement.15,16 The comparison of the κs between first and second occasion was conducted using the Root Mean Square Difference index, and by the product-moment correlation coefficient. Basic statistics and interobserver reliability represented by the intraclass correlation coefficient (ICC) with 95% CI were determined for each tendon compartment separately.
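As an illustration of the two agreement statistics named above, the following is a minimal Python sketch, not the SPSS procedure used in the study. It assumes unweighted Cohen's κ for a pair of rating sequences and takes Light's κ to be the mean of Cohen's κ over all pairs of raters; the function names and the example grades are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equal-length sequences of grades."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(a)
    # Proportion of items on which the two sequences agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from the marginal grade frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one and the same grade throughout
    return (observed - expected) / (1.0 - expected)


def light_kappa(ratings_by_rater):
    """Light's kappa: mean of Cohen's kappa over all pairs of raters."""
    pairs = list(combinations(ratings_by_rater, 2))
    if not pairs:
        raise ValueError("at least two raters are required")
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical dichotomised grades (0 = normal, 1 = damage), not study data.
morning = [0, 0, 1, 1, 0, 1, 0, 0]
afternoon = [0, 0, 1, 0, 0, 1, 0, 1]
print(round(cohen_kappa(morning, afternoon), 3))
```

For intraobserver agreement the two sequences would be one observer's morning and afternoon scores; for interobserver agreement each sequence would hold one observer's scores from the same session.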

ICC and κ values are comparable; κ values were interpreted as follows: 0–0.20 poor, 0.20–0.40 fair, 0.40–0.60 moderate, 0.60–0.80 good and 0.80–1 excellent agreement.
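The interpretation bands can be expressed as a small lookup, sketched here in Python. Because the stated bands share their boundary values (for example, 0.20 appears in both "poor" and "fair"), this sketch assumes that each upper bound belongs to the lower band; that choice, the handling of negative values and the function name are assumptions, not part of the study.

```python
def interpret_kappa(kappa):
    """Return the agreement label for a kappa (or ICC) value.

    Assumption: upper bounds are inclusive, so 0.20 is 'poor' and 0.80 is 'good'.
    Negative values fall outside the stated scale and are labelled 'poor' here.
    """
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    bands = [
        (0.20, "poor"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "good"),
        (1.00, "excellent"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label


# The mean values reported in the abstract.
print(interpret_kappa(0.91))  # excellent
print(interpret_kappa(0.75))  # good
```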

Results

Delphi process

The results of the Delphi exercise regarding ultrasound definitions of normal tendons, anatomically related structures and tenosynovitis have been previously reported.13 Regarding the statements on tendon damage, there was group agreement on the definition of tendon damage and the scoring system after two rounds.

In particular, group agreement was achieved on the following items: tendon damage can be defined on B mode as internal and/or peripheral focal tendon defect (ie, absence of fibres) in the region enclosed by tendon sheath, seen in two perpendicular planes; the grade of tendon damage should be assessed in both longitudinal and transverse planes; and a four-grade semiquantitative scoring system (ie, grade 0, normal; grade 1, minimal; grade 2, moderate; grade 3, severe) can be used to score tendon damage on B mode.

Review of sent ultrasound images of tendons and consensus finding on scoring system

Out of 28 consulted experts, 19 (68%) sent a set of ultrasound images covering all grades of tendon damage to the organisers of the study (GAWB and EN). All the participants of the reliability exercise reviewed these images in a consensus meeting on the evening prior to the exercise. During this review process, it was noted that a four-grade semiquantitative scoring system did not work for most experts. Based on their opinion, the following scoring system for tendon damage in B mode was agreed: grade 0, normal tendon; grade 1, partial tendon damage seen in two orthogonal planes; and grade 2, complete tendon rupture seen in two orthogonal planes. Greyscale examples of tendon damage are shown in figures 1–3.

Figure 1

(A) Transverse scan in B mode of a normal extensor carpi ulnaris tendon, residing in its groove on the distal ulna. Tendon damage grade 0. Dimension unit indicates 10 mm. (B) Longitudinal scan in B mode of a normal extensor carpi ulnaris tendon. Tendon damage score grade 0. Dimension unit indicates 10 mm.

Figure 2

(A) Longitudinal scan of extensor carpi ulnaris tendon. Tendon damage score grade 1. Asterisks indicate an area of synovial proliferation within the tendon sheath, arrow points to partial rupture. (B) Transverse scan of extensor carpi ulnaris tendon. Tendon damage score grade 1. Asterisks indicate tenosynovitis, arrows point to partial rupture.

Figure 3

(A) Transverse sonogram showing a stump (arrow) of the completely ruptured extensor digiti minimi tendon. The stump is surrounded by fluid and synovial proliferation. (B) Longitudinal sonogram of the extensor digiti minimi tendon, showing the site of complete rupture (**) and distension of the tendon sheath due to fluid and synovial proliferation (arrow).

Patient characteristics

The demographics and disease-related characteristics of the patients with RA are summarised in table 1.

Table 1

Demographical, disease-related characteristics and ultrasound grading

Prevalence of ultrasound abnormalities

Overall, 3456 tendon compartment assessments were performed by ultrasound in B mode (144 per investigator per session). Of these, 804 ultrasound investigations showed either a grade 1 or a grade 2 tendon lesion (23%). The prevalence of lesions per tendon compartment is shown in table 2.
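The totals can be checked with a short calculation, sketched here in Python. It assumes that the figure of 144 refers to one investigator in one session (12 patients, six compartments, two sides) and that the overall total covers 12 investigators and two sessions; the variable names are illustrative only.

```python
patients = 12
compartments_per_side = 6  # five wrist extensor compartments plus tibialis posterior
sides = 2                  # bilateral examination
observers = 12
sessions = 2               # morning and afternoon (assumed to both count towards the total)

per_observer_per_session = patients * compartments_per_side * sides
total_assessments = per_observer_per_session * observers * sessions

damaged = 804              # grade 1 or grade 2 findings reported in the text
prevalence = damaged / total_assessments

print(per_observer_per_session)  # 144
print(total_assessments)         # 3456
print(f"{prevalence:.1%}")       # 23.3%
```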

Table 2

Absolute frequencies and percentages of tendon damage according to tendon compartment

Intraobserver and interobserver agreement

Table 3 shows the κ coefficient estimates of interobserver agreement calculated for pairs of investigators at first and second (italic) occasion. Intraobserver agreement is shown as bold numbers on the diagonal line. Means are calculated below the table.

Table 3

Cohen's and Light's κ estimates of intraobserver and interobserver agreement on the first occasion and second occasion; both days analysed together

In table 4, the interobserver reliability of the tested scoring system within particular compartments is estimated.

Table 4

Descriptives and interobserver reliability (ICC) of the tested scoring system within particular tendon compartments

Atlas

GS and PD US images of tendon lesions were collected into a US reference image atlas. The reference images include US images of tenosynovitis and tendon damage of various grades affecting tendons frequently involved in RA. Multiple examples covering semiquantitative grades of tenosynovitis (0-3) and tendon damage (0-2) are shown in the online supplementary material. In addition to the typical images, the atlas comprises a series of challenging Doppler images. With the guidance of the reference images displayed in this atlas, US scans of tendon abnormalities in RA can easily be scored for various grades of tenosynovitis and tendon damage both in clinical practice and in research trials (see online supplementary material).

Discussion

To our knowledge, the present exercise is the first multiobserver study that assesses the reproducibility of ultrasound in scoring tendon damage in patients with RA. The results show a high intraobserver and interobserver reliability among experienced rheumatology ultrasonographers. The findings may be relevant for both daily clinical practice and trials. As yet, no ultrasound studies have assessed the grading of tendon damage in RA; a reliable imaging scoring system may be used to identify and follow up tendons at risk of rupture. Furthermore, an ultrasound scoring system for tendon disease may serve as an imaging biomarker for clinical drug trials.

Only a few studies have looked at the ultrasound assessment of tendon damage in RA, observing a wide variability. Filippucci and colleagues found partial tendon tears in 12% and complete tears in 3% in a cross-sectional analysis of 90 patients with RA.17 Micu and colleagues found tendon damage in over 50% of their patients.18 Our work reveals a partial rupture in 21% and a complete rupture in 2%.

While the Delphi exercise resulted in a four-grade tendon damage score, the common opinion of the rheumatologists attending the patient exercise meeting was that a three-grade semiquantitative score (normal, partial and complete rupture) was more practical. As the meeting consensus was final, we used this score for the patient exercise.

Very few observers finally scored a complete rupture, that is, grade 2. This low prevalence of complete tendon rupture is probably due to the current treat-to-target strategy and the effectiveness of the new treatments of RA, that is, biological therapy.19 This finding was taken into account by the statistics processing and data were dichotomised where 0 was a completely normal tendon and 1 represented damage.
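The dichotomisation described above can be sketched in a few lines of Python. This is an illustration under the assumption that grades 1 and 2 were both mapped to 1; the function name is hypothetical.

```python
def dichotomise(grades):
    """Map three-grade scores (0, 1, 2) to 0 = normal tendon, 1 = any damage."""
    result = []
    for grade in grades:
        if grade not in (0, 1, 2):
            raise ValueError(f"unexpected grade: {grade!r}")
        result.append(0 if grade == 0 else 1)
    return result


# Hypothetical scores, not study data.
print(dichotomise([0, 1, 2, 0]))  # [0, 1, 1, 0]
```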

Some differences in reliability of scoring of tendon damage within particular compartments were found. This is probably because some tendons are more difficult to examine than others. The most reliable tendons were the ECU and tibialis posterior tendon, both non-splitting and relatively thick and straight-running tendons. One other very reliable tendon was the EDM. The ECU also proved to be the most often damaged tendon of all tendons investigated in the study. All the above-mentioned tendons have been reported to be frequently involved in RA.5,20,21 The presence of involvement of the ECU predicts the development of erosions;21 therefore, the outcome of this study may support the use of the ECU as a reliable biomarker.

The least reliable tendons were EDC and EPL, probably due to splitting in some finger extensor tendons which may cause difficulties in interpretation of tendon damage, especially in transverse view of the EDC tendon. The EPL tendon can be difficult to follow in its course while crossing other tendons.

There are limitations inherent in our study. First, only 12 patients were assessed. However, similar numbers of patients have been assessed in other multiexaminer reliability studies for feasibility reasons. The difference in reliability noted between the right and left compartments three and four is related to the lower prevalence of positive findings on the left side compared with the right, rather than true differences among observers in scoring lesions. Second, the lack of a gold standard, for example, MRI or surgery, prevented determination of the true prevalence of tendon damage lesions. However, a concurrent validity study showed comparable accuracy in diagnosing tendon damage between ultrasound and MRI.22 Additionally, this was not a validity but a reliability study. Finally, the rheumatologists involved were all experts in ultrasonography. Thus, it cannot be taken for granted that these reliability results can be extrapolated to a population of less experienced rheumatologists. It is reassuring, however, that the broad US7 experience in Germany has revealed a good correlation between experts and less experienced rheumatologists.23

A strength of this study is the inclusion of the acquisition phase of ultrasound images. Another strength is that the reliability was assessed separately for particular tendon compartments.

In conclusion, the present study suggests that rheumatologists-ultrasonographers can have a high reliability in their performance of ultrasound assessment of 12 target tendon compartments in RA patients, with the best scores in the EDM, ECU and tibialis posterior tendons.

Footnotes

  • Handling editor Tore K Kvien

  • Correction notice This article has been corrected since it was published Online First. Figures 3A and 3B have been replaced and the legend amended. In addition, the grant numbers have been included in the Funding section.

  • Acknowledgements We thank the patients who participated in the reliability session. We are grateful to Esaote Netherlands BV for providing the ultrasound machines. We express our gratitude to Ria de Kort and Marian de Waal for logistical support.

  • Collaborators OMERACT Ultrasound Task Force members: Sibel Aydin, Artur Bachta, Paz Collado, Cristina Estrach, Jane E Freeston, Frederique Gandjbakhch, Marwin Gutierrez, Hilde B Hammer, Kei Ikeda, Frederick Joshua, Sandrine Jousse-Joulin, David Kane, Helen I. Keen, Juhani M Koski, Peter Mandl, Zunaid Karim, Wolfgang A Schmidt, Nanno Swen, Philip G Conaghan.

  • Contributors Study design: EN, GAWB, M-AD. Acquisition of data: EN, M-AD, PH, IM, PVB, EF, AI, CP, LT, MB, PB, RvV, RW, GAWB. Analysis and interpretation of data: GAWB, PH, JG, OP, EN. Manuscript: GAWB, EN. Statistical analysis: PH, OP, JG.

  • Funding Roche Netherlands BV provided funding for the reliability exercise. Roche Netherlands BV did not participate in the study design, data collection, data analysis, or writing of the manuscript. Supported by the project (Ministry of Health, Czech Republic) for conceptual development of research organization 023728 (Institute of Rheumatology) and by project No. NT12437.

  • Competing interests None.

  • Ethics approval MEC hospital approval.

  • Provenance and peer review Not commissioned; externally peer reviewed.