Abstract
Objective. To investigate the associations of Outcome Measures in Rheumatology (OMERACT) ultrasound scores for knee osteoarthritis (OA) with pain severity, other symptoms, and OA severity on radiographs and magnetic resonance imaging (MRI).
Methods. Participants with symptomatic and mild to moderate radiographic knee OA underwent baseline dynamic ultrasound (US) assessment according to standardized OMERACT scanning protocol. Using the published US image atlas, a physician operator obtained semiquantitative or binary scores for US pathologies. Clinical severity was measured on numerical rating scale (NRS) and Knee Injury and Osteoarthritis Outcome Score (KOOS) symptoms and pain subscores. OA severity was assessed using the Kellgren-Lawrence (KL) grade on radiographs and MRI Osteoarthritis Knee Score (MOAKS) on noncontrast-enhanced MRI. Separate linear regression models were used to determine associations of US OA pathologies with pain and KOOS subscores, and Spearman correlations were used for US scores with KL grade and MOAKS.
Results. Eighty-nine participants were included. Greater synovial hypertrophy, power Doppler (PD), and meniscal extrusion scores were associated with worse NRS pain [β 0.92 (95% CI 0.25–1.58), β 0.73 (95% CI 0.11–1.35), and β 1.01 (95% CI 0.22–1.80), respectively]. All greater US scores, except for cartilage grade, demonstrated significant associations with worse KOOS symptoms, whereas only PD and meniscal extrusion were associated with worse KOOS pain. All US scores, except for PD and cartilage grade, were significantly correlated with KL grade. US pathologies, except for cartilage, revealed moderate to good correlation with their MOAKS counterparts, with US synovitis having the greatest correlation (0.69, 95% CI 0.60–0.78).
Conclusion. OMERACT US scores revealed significant associations with pain severity, KL grade, and MOAKS.
Osteoarthritis (OA) is one of the most prevalent chronic health conditions causing pain and disability among elderly adults1. Approximately 15.4% of the adult population have symptomatic OA2. By 2030, OA is predicted to be the single greatest cause of disability globally, with an estimated 35% prevalence3.
The pathophysiology of knee OA is complex and involves multiple tissue pathologies affecting the whole joint structure4. Pathologies include synovitis, synovial hypertrophy, effusion, power Doppler (PD) signals, meniscal damage, cartilage loss, and bony osteophyte5,6. Imaging tools are used to visualize the severity of these pathologies, but each has its own limitations7. The plain radiograph involves radiation and can view only the bony structure, while magnetic resonance imaging (MRI) is expensive and not readily accessible in clinical practice4. Ultrasound (US) is a noninvasive imaging tool that can detect soft tissues as well as the bony cortex, including osteophytes, in OA6.
One concern expressed about US has been observer dependence. As such, the Outcome Measures in Rheumatology (OMERACT) group8 used international consensus and reliability testing to develop standardized knee US scanning methods and grading scores for synovitis, synovial hypertrophy, effusion, PD, cartilage thinning, osteophytes, and meniscal extrusion; however, the validity of these grading scores has not been tested. Therefore, the objective of this study is to examine the associations of the OMERACT knee OA US grading scores by testing their relationship with pain severity, clinical symptoms, and severity on plain radiograph and MRI findings.
MATERIALS AND METHODS
Study design and participant selection. This is a cross-sectional analysis using baseline data from the Sydney, Australia, site of the ongoing RESTORE (platelet-Rich plasma as a symptom- and disEaSe-modifying Treatment fOR knee ostEoarthritis) clinical trial (trial registration number: ACTRN12617000853347)9. Inclusion and exclusion criteria were the same as for the RESTORE study9. Briefly, eligible participants met the following inclusion criteria: (1) aged > 50 years; (2) knee pain on most days in the last month; (3) osteophytes on radiographs; and (4) a minimum pain score of 4 on an 11-point numeric rating scale (NRS) for the last week.
The exclusion criteria included (1) Kellgren-Lawrence (KL) grade 1 or grade 4; (2) predominant lateral tibiofemoral disease; (3) systemic or inflammatory joint disease; (4) history of crystalline or neuropathic arthropathy; and (5) unwillingness to discontinue nonsteroidal antiinflammatory drugs and other analgesic usage for knee pain, except for acetaminophen (paracetamol) for rescue pain relief, from 2 weeks prior to baseline assessment.
For those participants with bilaterally eligible knees, the most symptomatic knee was deemed the study knee. The cohort included here is a convenience sample recruited from the baseline visit, and all participants available for a US visit between September 2017 and February 2019 were included.
Participants’ demographic data such as age, sex, height, weight, and symptom duration were collected as previously described9. BMI was calculated as weight divided by height squared (kg/m²). This study was approved by the Northern Sydney Local Health Districts Human Research Ethics Committee (HREC/16/HAWKE/430).
Clinical assessment. On the same day as the US scan, average overall knee pain severity over the last week was measured using an 11-point NRS with terminal descriptors “no pain” (score 0) and “worst pain possible” (score 10), with the highest scores denoting the worst pain. This outcome measure is recommended for inclusion in knee OA clinical trials by the Osteoarthritis Research Society International10. The Knee Injury and Osteoarthritis Outcome Score (KOOS) pain and other symptoms subscores were also collected. The KOOS is a knee-specific self-report outcome measure with high test-retest reliability, internal consistency, and face and content validity. Likert responses range from none to extreme, and scores range from 0 to 100, with lower scores indicating worse symptoms. The KOOS pain subscale is scored from 9 questions about knee pain frequency experienced in the last week and the amount of knee pain experienced during specific activities such as twisting, bending, and walking. The KOOS other symptoms subscale is scored from 7 questions regarding other symptoms experienced in the last week, such as swelling, restricted range of motion, and mechanical symptoms.
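The 0 to 100 KOOS scaling described above can be illustrated with a minimal sketch. It assumes the standard KOOS scoring convention, which is not spelled out in this article: each item is scored 0 (none) to 4 (extreme), and the subscale score is 100 minus the mean item score rescaled to 0–100. The function name and the example responses are hypothetical.

```python
def koos_subscale_score(item_responses):
    """Convert KOOS item responses (0 = none ... 4 = extreme) to a 0-100 score.

    Assumed convention: score = 100 - (mean item response * 100 / 4),
    so 100 means no problems and 0 means extreme problems.
    """
    if not item_responses:
        raise ValueError("at least one answered item is required")
    if any(r not in (0, 1, 2, 3, 4) for r in item_responses):
        raise ValueError("each item response must be an integer from 0 to 4")
    mean_response = sum(item_responses) / len(item_responses)
    return 100.0 - mean_response * 100.0 / 4.0


# Hypothetical responses to the 9 KOOS pain items for one participant.
pain_items = [2, 1, 2, 3, 2, 1, 2, 2, 3]
print(round(koos_subscale_score(pain_items), 1))
```

Because higher item responses lower the score, a negative regression coefficient on a KOOS subscale corresponds to worse symptoms.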
Radiological assessment. Participants underwent bilateral weight-bearing posteroanterior radiography (model R-20 J; Shimadzu Corporation) before US and MRI examinations. KL grade was assessed by a rheumatologist (SY, with 7 yrs of experience in grading radiographs of knee OA) who was blinded to clinical, US, and MRI scores.
US evaluation. A physician operator [WMO, with 6 yrs of musculoskeletal US experience and certified in musculoskeletal US in rheumatology (RhMSUS) by the American College of Rheumatology], blinded to clinical, radiograph, and MRI findings, performed and scored the US scans of the study knee11. These were done dynamically and extensively over a wide area with a multifrequency linear 14L5 transducer (using 10 MHz) of the Aplio Platinum 500 machine (Toshiba), according to the standardized OMERACT scanning protocol4,8. The US scores for 7 disease manifestations were then graded by the same operator using the OMERACT knee US OA atlas: (1) semiquantitative scores for synovitis (0–3; combined synovial hypertrophy and effusion); binary scores (0–1) for (2) synovial hypertrophy ≥ 4 mm, (3) effusion ≥ 4 mm12, and (4) PD signals, scored separately at the suprapatellar recess in a longitudinal plane and at the medial and lateral parapatellar recesses in a transverse plane; semiquantitative scores for (5) osteophytes (0–3) from the medial and lateral joint aspects in a longitudinal plane and (6) meniscal extrusion (0–2; only the medial joint aspect) in a longitudinal plane; and (7) semiquantitative scores for cartilage abnormalities (0–3) in a transverse plane on a maximally flexed knee (Supplementary Data 1, available with the online version of this article). The application specialist from Toshiba optimized the machine settings, providing greyscale gain = 85%, probe frequency = 10 MHz, Doppler frequency = 6 MHz, Doppler gain = 40%, pulse repetition frequency = 14.8 kHz, and wall filter = 5. The US operator was not allowed to change these, except for depth and focus, throughout the study.
The maximum score approach (i.e., the highest score of the same US features such as synovitis and osteophyte from different scanned sites taken as the final score of the whole knee)13 was then used to correlate with clinical, radiographic, and MRI data of the study knee. For the whole knee scan for these 7 disease manifestations, it took around 8 min for scanning and about 13 min for scoring.
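The maximum score approach can be sketched as follows. This is only an illustration of the rule described above (take the highest grade across scanned sites rather than summing them); the function name, site names, and grades are hypothetical and are not taken from the study data.

```python
def whole_knee_scores(site_scores):
    """Collapse per-site grades into one whole-knee score per US feature.

    site_scores maps a feature name to a dict of {scan site: ordinal grade}.
    The whole-knee score is the highest grade seen at any site (not a sum).
    """
    return {feature: max(grades.values()) for feature, grades in site_scores.items()}


# Hypothetical grades for one study knee (site names are illustrative).
knee = {
    "synovitis": {"suprapatellar": 0, "medial_parapatellar": 2, "lateral_parapatellar": 1},
    "osteophytes": {"medial": 3, "lateral": 1},
    "power_doppler": {"suprapatellar": 0, "medial_parapatellar": 0, "lateral_parapatellar": 1},
}
print(whole_knee_scores(knee))
```

In this hypothetical knee, a scan limited to the suprapatellar recess would record grade 0 synovitis, whereas the whole-knee maximum records grade 2.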
Interrater and intrarater reliability. Testing of interrater reliability was limited to suprapatellar synovitis and PD, medial osteophytes, and medial meniscal extrusion. A second trained reader (DP, with 8 yrs of musculoskeletal US experience) independently performed the US scans of the study knee in 20 patients after the first US operator finished scanning, and provided the independent grading. To evaluate intrarater reliability of all 7 ultrasound OA manifestations, the same operator (WMO) rescanned 10 patients 1 week later and calculated US scores while blinded to the previous scores.
MRI evaluation. On the same day as the US scanning, the study knee was imaged with a 3T MRI scanner (Siemens Skyra, Siemens Healthcare) using a 15-channel transmit/receive knee coil. The following 5 MRI sequences were performed: (1) sagittal T2-weighted dual-echo steady-state; (2) sagittal proton density–weighted fat-suppressed noncontrast turbo spin-echo (TSE); (3) coronal proton density–weighted TSE; (4) coronal proton density–weighted fat-suppressed TSE; and (5) axial proton density–weighted fat-suppressed TSE. Technical details of the sequences can be found in Supplementary Data 2 (available with the online version of this article).
The semiquantitative MOAKS grading involves evaluation of the cartilage loss (any or full-thickness) from patellofemoral, medial, and lateral tibiofemoral compartments; osteophytes from 12 different sites; medial meniscal extrusion; effusion-synovitis over the suprapatellar and parapatellar areas; and Hoffa synovitis over the Hoffa fat pad at the infrapatellar area as described by Hunter, et al13. The maximum score of the same MRI features, such as cartilage loss (any or full thickness), and osteophytes from all sites, was taken as the whole knee score for that MRI feature.
Interrater and intrarater reliability of MRI. Scoring of the MOAKS was performed by WMO, who obtained imaging training from an experienced musculoskeletal radiologist (JML, with 25 yrs of experience in musculoskeletal MRI). Both readers independently scored the MRI images of 10 consecutive participants. The readers were blinded to clinical features and symptoms, and radiograph and US scores. WMO also performed the second reading of all MRI images 1 month apart to obtain the intrarater reliability.
Statistics. Descriptive statistics of categorical variables were expressed as frequencies and percentages. Descriptive statistics of continuous variables were calculated as mean and SD for normally distributed data, and median and range for nonnormally distributed data. Although it might seem that the OMERACT ultrasound scoring system is 1 single scoring system, in fact, it consists of 7 US scoring systems, covering both structural and inflammatory features present in knee OA. For all these scoring systems, the relationship has to be assessed separately. To investigate whether these US features were associated with pain and other symptoms, separate linear regression models were fit with each US feature as predictor, adjusting for age, sex, BMI, duration of disease, and radiographic KL grade. Spearman correlations were calculated to determine the relationship of US features with radiographic KL grade and MRI MOAKS. Correlation coefficients were interpreted according to the Evans classification14: < 0.20 = very weak; 0.20–0.39 = weak; 0.40–0.59 = moderate; 0.60–0.79 = strong; and ≥ 0.80 = very strong. The study was powered for the association of the 7 US pathologies with visual analog scale joint pain. With 7 potential predictors, testing at the 5% significance level with 80% power, and assuming a minimum R² of 0.3, forty-two patients were required to show that the US scores explained a statistically significant amount of the variation in joint pain. All statistics were conducted with SPSS version 23, and a significant association/correlation was defined as a P value < 0.05.
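As an illustration of the correlation analysis, the sketch below computes a Spearman coefficient (the Pearson correlation of tie-averaged ranks) and labels it with the Evans cutoffs quoted above. The analyses in this study were run in SPSS; this is not the authors' code. The function names and example grades are hypothetical, and treating a coefficient of exactly 0.80 as "very strong" is an assumption.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


def evans_label(rho):
    """Label the absolute correlation using the Evans cutoffs."""
    r = abs(rho)
    if r < 0.20:
        return "very weak"
    if r < 0.40:
        return "weak"
    if r < 0.60:
        return "moderate"
    if r < 0.80:
        return "strong"
    return "very strong"  # assumption: exactly 0.80 counts as very strong


# Hypothetical US synovitis grades and MRI effusion-synovitis grades.
us_grades = [0, 1, 1, 2, 2, 3, 3, 1]
mri_grades = [0, 1, 2, 2, 1, 3, 2, 1]
rho = spearman_rho(us_grades, mri_grades)
print(round(rho, 2), evans_label(rho))
```

Tie-averaged ranks matter here because ordinal grades with only 2 to 4 levels produce many ties.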
RESULTS
Demographic and clinical characteristics, and US and MRI findings. Eighty-nine participants were included in this study, with 48 (53.9%) female, BMI of 27.5 ± 6.4, pain of 5.8 ± 1.5 on an NRS scale, 59.6% of participants having KL grade III, and 95.5% and 47.2% showing US synovitis grade ≥ 1 and PD signals, respectively. However, synovial hypertrophy and effusion on US were present in 47.2% and 59.6% of the participants, respectively, using quantitative cutoffs of 4 mm. All participants had osteophytes and meniscal extrusion on US, with 95.5% having cartilage abnormalities. Table 1 demonstrates the other characteristics in detail.
Table 1. Baseline clinical, radiographic, ultrasound, and MRI data of study participants.
Reliability for US scores. The κ statistics for interrater reliability ranged from 0.55 to 0.88, indicating moderate to excellent agreement, and the κ statistics for intrarater reliability ranged from 0.63 to 1.00, indicating good to excellent reliability (Table 2).
Table 2. Intrarater and interrater reliability of OMERACT ultrasound scores in knee OA.
Reliability for MOAKS. The κ statistics for the interrater reliability ranged from 0.42 to 0.90, indicating moderate to excellent agreement for individual MRI lesions, while intrarater reliability was mostly good to excellent, as shown by κ statistics ranging from 0.64 to 0.92 (Supplementary Data 3, available with the online version of this article).
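For readers unfamiliar with the κ statistic, a minimal sketch of an unweighted Cohen κ for 2 raters follows. The article does not state whether weighted or unweighted κ was used, so the unweighted form is an assumption, and the function name and example grades are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the two raters gave the same grade.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal grade frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades from two readers for the same 10 knees.
reader_1 = [0, 1, 1, 2, 2, 3, 3, 1, 0, 2]
reader_2 = [0, 1, 2, 2, 2, 3, 2, 1, 0, 2]
print(round(cohens_kappa(reader_1, reader_2), 2))
```

κ corrects raw percentage agreement for the agreement expected by chance, which is why it can be well below the raw agreement when a few grades dominate.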
Association of US findings with clinical symptoms. After adjusting for the confounders, only the OMERACT scores for synovial hypertrophy, PD signals, and meniscal extrusion were significantly associated with pain severity on NRS (Table 3). For example, when power Doppler was present (0–1), the pain NRS increased by 0.73 units (β 0.73, 95% CI 0.11–1.35).
Table 3. The association of OMERACT ultrasound knee OA scores with NRS pain, KOOS symptoms, and KOOS pain.
All OMERACT scores except for cartilage grade demonstrated significant associations with KOOS other symptoms (Table 3). For example, when PD signals were present (0–1), the KOOS other-symptoms score decreased (worsened) by 6.1 units (β –6.12, 95% CI –10.93 to –1.31). Only meniscal extrusion and PD signals were significantly associated with KOOS pain (Table 3). For example, for a 1 unit increase on meniscal extrusion grade (0–2 on a semiquantitative score), knee pain on the KOOS score decreased (worsened) by 10.8 units (β –10.84, 95% CI –18.57 to –3.10).
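The β coefficients reported above are adjusted regression slopes: the expected change in the outcome per 1-unit increase in a US score, with the covariates held fixed. The sketch below shows, on synthetic data, how such a coefficient is obtained by ordinary least squares. It is not the authors' SPSS analysis; the function name, the reduced covariate set (age and BMI only), and all data values are hypothetical.

```python
def fit_ols(rows, outcome):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: list of predictor lists (one per participant, without intercept).
    outcome: list of outcome values. Returns [intercept, b1, b2, ...].
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]
    p = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * y for x, y in zip(X, outcome))] for i in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; model is not identifiable")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    beta = [0.0] * p
    for i in range(p - 1, -1, -1):
        rhs = A[i][p] - sum(A[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = rhs / A[i][i]
    return beta


# Synthetic participants: [PD present (0/1), age in years, BMI].
predictors = [[0, 55, 25], [1, 60, 30], [0, 65, 28],
              [1, 70, 24], [1, 58, 33], [0, 72, 27]]
# Synthetic, noise-free NRS pain built with a PD coefficient of 0.7.
pain = [3.0 + 0.7 * pd + 0.02 * age + 0.05 * bmi for pd, age, bmi in predictors]

intercept, beta_pd, beta_age, beta_bmi = fit_ols(predictors, pain)
print(round(beta_pd, 3))  # adjusted change in pain when PD is present
```

For a binary predictor such as PD, the coefficient is the adjusted difference in the outcome between knees with and without the finding; for a graded predictor such as meniscal extrusion, it is the change per grade.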
Association of US findings with radiographic KL grade. US synovitis, synovial hypertrophy, effusion, osteophyte, and meniscal extrusion scores were significantly correlated with KL grade, whereas PD signals and cartilage scores were not (Figure 1 and Supplementary Data 4, available with the online version of this article).
Figure 1. The association of OMERACT ultrasound OA scores with KL grade on radiograph. KL: Kellgren-Lawrence; OMERACT: Outcome Measures in Rheumatology; US: ultrasound.
Association of US findings with MOAKS. The associations between US features and their MRI counterparts are presented in Figure 2 and Supplementary Data 5 (available with the online version of this article). Synovitis, synovial hypertrophy, effusion, PD signals, osteophyte, and meniscal extrusion on US were significantly associated with their respective MRI counterparts, with the largest correlation for US synovitis (Figure 3). Measures of osteophytes and meniscal extrusion showed significant associations between the 2 imaging modalities, while US cartilage grade showed a significant but weak relationship with MRI cartilage loss (any or full thickness).
Figure 2. The association of OMERACT ultrasound OA scores with MOAKS on MRI. OMERACT: Outcome Measures in Rheumatology; MOAKS: Magnetic Resonance Imaging Osteoarthritis Knee Score; MRI: magnetic resonance imaging; US: ultrasound.
Figure 3. The demonstration of ultrasound and MRI synovitis from 3 synovial recesses of the knee in the same patient. (A) OMERACT grade 3 synovitis at the suprapatellar recess on a longitudinal scan. (B) OMERACT grade 3 synovitis at the medial parapatellar recess on a transverse scan. (C) OMERACT grade 3 synovitis at the lateral parapatellar recess on a transverse scan. (D) MOAKS grade 3 effusion-synovitis on the axial noncontrast-enhanced MRI scan. BML: bone marrow lesion; MOAKS: Magnetic Resonance Imaging Osteoarthritis Knee Score; MRI: magnetic resonance imaging; OMERACT: Outcome Measures in Rheumatology.
DISCUSSION
To our knowledge, this is the first study examining the associations of OMERACT knee US scores with pain severity and other symptoms using well-validated self-reported questionnaires and standard imaging tools widely used in the OA clinical and research setting. We found significant associations of US scores such as PD signal, synovial hypertrophy, and meniscal extrusion with NRS pain and KOOS pain subscore as well as KOOS symptoms. Significant associations with radiographic severity were detected in all US pathologies except for PD signals and cartilage grades, with meniscal extrusion showing the highest associations. US synovial and structural disorders had significant associations with their MRI counterparts with moderate to strong correlation for synovitis, synovial hypertrophy, PD signals, meniscal extrusion, and osteophytes. Thus, our findings further support the use of the OMERACT US scores in the knee OA research setting.
The OMERACT scanning protocol involved scanning over a wide area as well as multiple sites instead of a single predefined location. This can increase the chance of detecting more pathologies, if present, compared to a single predefined scan, due to the capability of scanning the entire joint. In addition, the maximum score of a certain US pathology from different scanning sites was used as a single final score in our study instead of adding them because the semiquantitative score is an ordinal and not an interval scale15. This method is commonly used in MRI research13,16. It might provide better coverage of pathologies present in the whole knee compared to a single location-specific score. As an example, out of 16 patients with grade 0 synovitis in the suprapatellar recess in our study, 8 demonstrated ≥ grade 1 synovitis in the medial parapatellar recess. This is also supported by the fact that the prevalence of MRI effusion-synovitis, which takes into account synovitis in all synovial recesses on axial MRI scan, is almost the same as that of US synovitis in our study (93.3% vs 95.5%).
Interrater reliability was tested in the medial compartment because our study participants had predominantly medial OA. Compared with the OMERACT reliability exercises, which reported moderate to good agreement across 2 rounds (κ = 0.52 and 0.51 for synovitis, κ = 0.54 and 0.58 for meniscal extrusion, and κ = 0.57 and 0.62 for osteophytes), our results were comparable for synovitis (κ = 0.55) and meniscal extrusion (κ = 0.55), whereas we had better agreement for osteophytes (κ = 0.88). In addition, in this study, we recruited a sonographer to perform and score the US scan independently in 20 patients (only 22% of the whole study sample). To address the perception of operator dependency in US, it would be helpful in future studies to also have an uninvolved reader assess the US images and determine the agreement between those 2 US readers, which could support the lack of operator dependency.
The prevalence of synovitis, when assessed using the OMERACT atlas maximum score approach8, is high (> 95%). However, for synovial hypertrophy and effusion, which used the strict criteria of 4-mm cutoffs (for which there is no published atlas), the prevalence of these synovial disorders reduces to approximately 50%, in agreement with a metaanalysis report on knee OA (49%, 95% CI 30.5–67.6)17. This may indicate that the OMERACT atlas for grade 1 synovitis might include people with normal physiological fluid, which can be up to 3 mm thick, as the semiquantitative grading score is visually based on the amount of distension of knee recesses using the standardized atlas12.
The association of synovial pathologies with pain and symptoms did not show consistent results in the literature. Some authors reported significant associations18,19,20,21, whereas others determined no association22,23,24,25. This may be due to using different cutoffs (4 mm vs 2 mm for synovial hypertrophy), different grading methods (semiquantitative or qualitative), and application of varying case definitions and inclusion of different disease severity in the study protocols. The utilization of standardized OMERACT US knee score in future studies will help minimize heterogeneity of such scanning protocols and grading methods. Our study using the OMERACT synovitis atlas and quantitative cutoff (4 mm) for synovial hypertrophy demonstrated significant correlation.
US synovitis was strongly correlated with MRI effusion-synovitis but was not significantly associated with NRS pain or KOOS pain. This finding further supports the symptom-structure discordance widely recognized in the OA imaging literature26. This is because pain is a very subjective phenomenon27, and psychosocial factors and neurobiological mechanisms such as pain sensitization28 can influence the association. Although synovial hypertrophy had significant associations with NRS pain and KOOS symptoms, it had only a moderate correlation with MRI synovitis. Of note, MRI was not contrast-enhanced in our study and so was not optimal for detecting synovial hypertrophy29, thereby placing MRI at a disadvantage in the level of association. Our magnitude of association is consistent with that reported by 2 studies20,30, although they utilized different US scanning methods and grading definitions (different quantitative cutoffs for semiquantitative scores) for both MRI and US scores.
PD signals and meniscal extrusion are important predictors for NRS pain. This finding is reinforced by the significant associations of these US pathologies with KOOS pain, a different composite measure of pain characteristics involving pain frequency and amount of pain during specific activities. Although PD signals have been a focus of interest in rheumatoid arthritis31, there is a paucity of publications reporting the isolated association of PD signals with pain severity, because of the very low prevalence of PD observations in the studies19,23,32, because the extent of association was based on a total inflammatory score combining synovitis and PD signals33,34, or because the scanning protocol did not include evaluation of PD signals. Iagnocco, et al32 observed PD signals in only 1 patient in their sample (n = 17), while Hall, et al obtained 10 observations in 62 patients with symptomatic knee OA23, leading to a lack of power to detect any significant associations. Song, et al reported a significant association of PD signals with pain (r = 0.37, P = 0.02)20, which is confirmed by our study.
As expected, PD is not a significant predictor of KL grade, perhaps because PD is a sensitive and reliable marker only for the acute and active inflammatory phase of arthritis35,36. Knee OA is recognized as an off-and-on disease with exacerbation and remission27, whereas KL grade reflects the collective structural outcome accumulated over a long-term disease process and is focused on change in the bone37,38.
Discordant results were reported for the association of meniscal extrusion with pain, some with significant results22,39 and others with negative results21,40,41. Chan, et al22 reported that medial meniscal extrusion measured in mm showed significant association with extent of pain during stair climbing, while the degree of meniscal extrusion was significantly increased in painful knee OA compared with painless knee39. On the other hand, significant association was not detected between presence of meniscal extrusion (cutoff > 3 mm) and pain severity in a case-control design40,41. In a previous study, Kijima, et al reported that meniscal extrusion > 4.3 mm cutoff provided high sensitivity (85%) and specificity (85%) for presence of knee pain in the general population42. In MRI studies, meniscal extrusion plays a crucial role in OA pathogenesis, progression, and symptom genesis43,44.
The meniscal extrusion showed the strongest association with KL grade, perhaps due to the fact that our sample was limited only to KL grades II and III, the difference of which is only joint space narrowing (JSN). Hunter, et al reported that the meniscus accounts for a substantial proportion of the variance explained in JSN45.
Unexpectedly, cartilage grade did not reveal a significant association with KL grade. Several reasons might contribute to this: (1) the location where cartilage US measures were taken might not exactly represent the actual maximal weight-bearing area on standing; and (2) cartilage thinning might be on the tibial cartilage, which is inaccessible to US. Further analysis after dichotomizing the cartilage score (cartilage thinning present or not, by combining grades 0 and 1, and grades 2 and 3, respectively) also remained nonsignificant. The authors of the OMERACT US OA atlas discussed that US cartilage grade needs further research due to assessment problems8. US cartilage grade also failed to show a significant association with any other outcome measure except for MRI cartilage loss, which revealed a significant but weak association. In the MRI literature, the associations between cartilage abnormalities and symptoms are not consistent46,47.
While it is important to standardize outcome tools in clinical trials, and this study does support the usefulness of the OMERACT US knee OA protocol as a scoring system, the utility of this US scoring tool for meaningful use in clinical practice needs further research for several reasons. Cartilage loss correlated with nothing but MRI, and PD did correlate with NRS pain, but as yet, antisynovial/antiinflammatory therapies have not been very promising in knee OA, and baseline inflammation has not consistently been shown to predict response to antiinflammatory/antisynovial therapies48,49.
This study has limitations. We did not include psychosocial factors that can have an effect on the level of symptom-structure association. However, the important known confounders are adjusted in our analysis. Another limitation is that the anatomical site of US scoring might take place in a different location from measurements on an MRI in the absence of an invasive marker, as in the cartilage and osteophyte scores. Similarly, the radiographs were obtained in weight-bearing position, whereas the US and MRI were obtained with a person lying supine. The last limitation is that the study relies mainly on results of linear regression and correlation analyses. Therefore, the lack of correlation between variables may not necessarily represent a lack of a relationship, as some relationships may be nonlinear.
In conclusion, most of the OMERACT US OA scores had a significant but modest association with symptoms and imaging scores from radiographs and MRI. These results support the construct validity of the OMERACT US scores and their use as an outcome in future US studies. As this is a cross-sectional study, longitudinal studies are required to determine their responsiveness to change and to further establish their value as outcome measures in interventional studies.
ACKNOWLEDGMENT
This study was funded by the National Health and Medical Research Council (ID: 1106274). We acknowledge Canon Medical Systems for providing the Toshiba ultrasound machine and the study participants.
Footnotes
W.M. Oo received a Presidential scholarship of Myanmar for his PhD course. D.J. Hunter is supported by an NHMRC practitioner fellowship.
DJH provides consulting services to Pfizer, Lilly, Merck Serono, and TLC.
- Accepted for publication April 28, 2020.
- Copyright © 2021 by the Journal of Rheumatology
DATA AVAILABILITY
Data are available from the corresponding author on reasonable request.
ONLINE SUPPLEMENT
Supplementary material accompanies the online version of this article.