RT Journal Article
SR Electronic
T1 Ultrasound as an Outcome Measure in Gout. A Validation Process by the OMERACT Ultrasound Working Group
JF The Journal of Rheumatology
JO J Rheumatol
FD The Journal of Rheumatology
SP 2177
OP 2181
DO 10.3899/jrheum.141294
VO 42
IS 11
A1 Lene Terslev
A1 Marwin Gutierrez
A1 Wolfgang A. Schmidt
A1 Helen I. Keen
A1 Emilio Filippucci
A1 David Kane
A1 Ralf Thiele
A1 Gurjit Kaeley
A1 Peter Balint
A1 Peter Mandl
A1 Andrea Delle Sedie
A1 Hilde Berner Hammer
A1 Robin Christensen
A1 Ingrid Möller
A1 Carlos Pineda
A1 Eugene Kissin
A1 George A. Bruyn
A1 Annamaria Iagnocco
A1 Esperanza Naredo
A1 Maria Antonietta D’Agostino
YR 2015
UL http://www.jrheum.org/content/42/11/2177.abstract
AB Objective. To summarize the work performed by the Outcome Measures in Rheumatology (OMERACT) Ultrasound (US) Working Group on the validation of US as a potential outcome measure in gout. Methods. Based on the lack of definitions, highlighted in a recent literature review on US as an outcome tool in gout, a series of iterative exercises were carried out to obtain consensus-based definitions on US elementary components in gout using a Delphi exercise and subsequently testing these definitions in static images and in patients with proven gout. Cohen’s κ was used to test agreement, and values of 0–0.20 were considered poor, 0.20–0.40 fair, 0.40–0.60 moderate, 0.60–0.80 good, and 0.80–1 excellent. Results. With an agreement of > 80%, consensus-based definitions were obtained for the 4 elementary lesions highlighted in the literature review: tophi, aggregates, erosions, and double contour (DC). In static images interobserver reliability ranged from moderate to almost perfect, and similar results were found for the intrareader reliability. In patients the intraobserver agreement was good for all lesions except DC (moderate). The interobserver agreement was poor for aggregates and DC but moderate for the other components. Conclusion. These first steps in evaluating the validity of US as an outcome measure for gout show that the reliability of the definitions ranged from moderate to excellent in static images and somewhat lower in patients, indicating that a standardized scanning technique may be needed before testing the responsiveness of those definitions in a composite US score.