Abstract
Objective. To investigate criterion validity and intraobserver reliability of magnetic resonance imaging (MRI) in hand osteoarthritis (HOA).
Methods. In 16 patients with HOA (median age 57 yrs, 62% women, 13 with erosive OA), 3 Tesla MRI scans with gadolinium-chelate administration of right second to fifth distal interphalangeal/proximal interphalangeal joints were scored according to the Oslo HOA scoring method for synovial thickening, bone marrow lesions (BML), osteophytes, joint space narrowing (JSN), and erosions (grade 0–3). Ultrasound (US) was scored for synovial thickening and osteophytes, radiographs for osteophytes and JSN (Osteoarthritis Research Society International score), and anatomical phases (Verbruggen-Veys score). Pain was assessed during physical examination. Correlations of MRI with US and radiographic features were assessed with generalizability theory. With generalized estimating equations analyses, MRI features were associated with pain, adjusting for confounding.
Results. Forty-three percent, 27%, 77%, and 61% of joints had synovial thickening (moderate/severe), BML, osteophytes, and erosions on MRI, respectively. Intraobserver reliability, assessed in 6 patients, was good (ICC 0.77–1.00). Correlations between osteophytes, JSN, and erosions on radiographs and MRI were moderate, substantial, and fair (ICC 0.53, 0.68, and 0.32, respectively); MRI showed more lesions than radiography. Correlation between synovial thickening and osteophytes on MRI and US was moderate (ICC 0.43 and 0.49, respectively). MRI was more sensitive for synovial thickening, US for osteophytes. Pain was associated with moderate/severe synovial thickening (adjusted OR 2.4, 95% CI 1.06–5.5), collateral ligaments (4.2, 2.2–8.3), BML (3.5, 1.6–7.7), erosions (4.5, 1.7–12.2), and osteophytes (2.4, 1.1–5.2).
Conclusion. MRI is a reliable and valid method to assess inflammatory and structural features in HOA. It gives additional information over radiographs and US.
Hand osteoarthritis (HOA) is a prevalent musculoskeletal disease that can lead to pain or functional limitations1,2. The OA process results in structural involvement of all compartments of the joint, including cartilage, subchondral bone, synovium, capsule, and ligaments3. In HOA, several subsets can be distinguished, of which nodal and erosive OA preferentially involve the interphalangeal (IP) joints1,4.
Patients with nodal OA in the IP joints present with bony enlargements, deformities, and loss of range of motion4. These classical structural hallmarks of HOA can be visualized on conventional radiographs as osteophytes, malalignment, and joint space narrowing (JSN)5. In addition, in erosive OA, subchondral erosions with widening can be seen4. However, radiography is an insensitive imaging modality, and a more sensitive method that visualizes not only structural changes but also soft tissues is needed. Ultrasound (US) has been introduced to visualize osteophytes and soft tissues in HOA. It has been shown that US is more sensitive than radiography in detecting osteophytes and erosions, and moreover that synovitis is frequently seen in HOA1,6,7,8.
In knee OA, magnetic resonance imaging (MRI) seems to be a valid imaging modality that enables visualization of the subchondral bone, including bone marrow lesions (BML) and soft tissues9,10. For HOA, few studies used MRI to investigate abnormalities in soft tissue and subchondral bone4,11,12,13. An MRI scoring method supported by an atlas was proposed that facilitates research with MRI in HOA. The Oslo Hand OA MRI score (OHOA-MRI score) was developed as a reliable method to assess key features in HOA14. To be able to use MRI and a scoring system for HOA, it is, however, necessary to prove validity, reliability, and feasibility.
The purpose of our present study was therefore to test the intraobserver reliability and criterion validity of the MRI in a severe HOA population.
MATERIALS AND METHODS
Patient population
Sixteen patients with HOA, fulfilling the American College of Rheumatology criteria15, were recruited from the rheumatology outpatient clinic from July 2008 to October 2010. The patients were all participants of an international placebo-controlled medication study (Clinical Trial Governance reference: EudraCT 2007-003, 994-18). For our study, baseline data of the participants in the Netherlands were used. Participants had at least 1 joint in J, E, or R phase of the Verbruggen-Veys score (defined below) in the IP joints on conventional radiographs and pain ≥ 30 mm on the visual analog scale (VAS). Patients were excluded if they had chronic inflammatory rheumatic diseases [e.g., rheumatoid arthritis (RA), spondyloarthritis, psoriatic arthritis, hemochromatosis, gout, or chondrocalcinosis].
Approval of the study by the medical ethics committee of the Leiden University Medical hospital and signed informed consent were obtained.
Clinical assessment
Demographic characteristics were collected by standardized questionnaires. All patients completed a 100-mm VAS to assess hand pain over the past 48 h. Use of analgesics was allowed during the study. Pain upon palpation and bony and/or soft swelling (“absence”/“presence”) for each distal and proximal IP joint (DIP, PIP) were assessed by a single observer (WYK) during physical examination using the Doyle Index, which has been validated for HOA16.
MRI examinations
The second to fifth DIP and PIP joints of the right hand were imaged in a 4-channel wrist coil using a 3T MRI Unit (Achieva 3T; Philips Medical Systems) with the patient positioned supine with the arm in neutral position parallel to the body. In all patients, the following sequences were obtained: coronal turbo spin echo [TSE; slice thickness (ST) 2 mm, repetition time/echo time (TR/TE) 1139/20 ms], coronal frequency selective fat-suppressed T2-weighted images (ST 3 mm, TR/TE 4013/60 ms), sagittal T1-TSE (ST 3 mm, TR/TE 450/20 ms), sagittal frequency selective fat-suppressed T2-weighted images (ST 3.5 mm, TR/TE 7768/60 ms), coronal post-gadolinium-chelate (Gd)-DOTA fat-suppressed images (ST 2 mm, TR/TE 1138/20 ms), and sagittal post-Gd-DOTA fat-suppressed images (ST 3 mm, TR/TE 995/20 ms; 0.1 mmol/kg, Dotarem). In 4 patients, additional images were obtained with the following sequences: axial native T1-weighted images (ST 3 mm, TR/TE 633/20 ms), and post-Gd-DOTA frequency selective fat-suppressed T1- (ST 3 mm, TR/TE 570/20 ms) and axial frequency selective fat-suppressed T2-weighted images (ST 3 mm, TR/TE 4490/60 ms). MRI examinations were obtained on the same day as clinical assessments and radiographs.
MRI features were scored by a single reader (WYK) after a training session of 1 week with the developers of the OHOA-MRI score. T1-weighted fat-suppressed Gd images were used to score synovial thickening (grade 0–3), flexor tenosynovitis (grade 0–3), and bone cysts (grade 0–1, proximal and distal). T1-weighted images were used to score collateral ligaments (presence or absence, grade 0–1, radial and ulnar; absence was defined as a nonvisible or noncontinuous collateral ligament), bone erosions (grade 0–3, proximal and distal), osteophytes (grade 0–3, proximal and distal), JSN (grade 0–3), and malalignment (grade 0–1, sagittal and frontal plane). T2-weighted fat-suppressed images were used to detect BML at insertion sites of collateral ligaments (grade 0–1, radial, ulnar, proximal, and distal) and other BML (grade 0–3, proximal and distal). For the analyses, collateral ligaments, cysts, and erosions were dichotomized as present/absent. To be able to compare osteophytes on MRI with osteophytes on radiographs and US, the highest score given to either the distal or proximal part of the joint on MRI images was used. For instance, when a joint had a score 1 at the distal part and score 3 at the proximal part of the joint, score 3 was assigned to that joint.
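The joint-level osteophyte rule described above can be sketched in a few lines. This is only an illustration of the rule as stated in the text, not the code used in the study; the function name and its argument names are hypothetical.

```python
def joint_osteophyte_score(distal: int, proximal: int) -> int:
    """Return the joint-level osteophyte grade: the higher of the two site grades (0-3)."""
    for grade in (distal, proximal):
        if grade not in (0, 1, 2, 3):
            raise ValueError(f"grade must be 0-3, got {grade}")
    return max(distal, proximal)


# Example from the text: distal grade 1 and proximal grade 3 give joint grade 3.
print(joint_osteophyte_score(distal=1, proximal=3))
```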
MRI sequences were adopted according to the original article of the OHOA-MRI score, with the exception of the T1-weighted fat-suppressed images that are normally not used in MRI. Instead, T1-weighted images without fat suppression were acquired.
Because the study was designed before the OHOA-MRI was published, and the axial planes were not included in the original protocol but sagittal planes were, only the last 4 patients had additional axial planes.
MRI images of 6 patients (3 with coronal and sagittal planes only, and 3 with coronal, sagittal, and axial planes) were scored twice with an interval of 5 weeks to determine intraobserver reliability.
US assessment
US was performed by 1 experienced ultrasonographer (WYK), always in the presence of a second ultrasonographer (MCK), and the 2 scored together in consensus. A Toshiba Applio scanner (Toshiba Medical Systems) with a 10–14 MHz linear array transducer was used. Settings were optimized by the application specialist of the manufacturer of the machine. For logistical reasons, US was performed 3–19 weeks before the MRI and clinical assessment (median 6 weeks).
All hand joints were scanned from the dorsal side only in longitudinal and transverse planes. Features had to be present in both planes. Each joint was scored for osteophytes, power Doppler signal (PDS), and synovial thickening7,17. All US features were scored on a 4-point scale (0 = none, 1 = mild, 2 = moderate, 3 = severe). The intraobserver reliability was substantial to almost perfect (ICC 0.62–0.91)7.
Conventional radiographs
Radiographs (dorso-volar) of the right hand, using a standardized protocol, were read by WYK and scored for osteophytes (grade 0–3), JSN (grade 0–3), and cysts (grade 0–1) using the OARSI-atlas18. Erosions were scored according to the Verbruggen-Veys scoring method, defined as an erosive (E-phase) or remodeled phase (R-phase). J-phase was defined as a joint with complete joint space loss in part or the whole joint19. The intraobserver reliability was good to excellent (ICC 0.62–0.94) for all radiographic features.
Statistical analysis
Data were analyzed using SPSS, version 20.0 (IBM SPSS statistics).
Reliability was determined by estimating ICC using generalizability theory, a random factor model ANOVA approach that estimates the components of variance within each model. This method was more suitable than traditional ICC analyses because outcomes were measured at the joint level, with unique joints clustered within each patient. The ICC calculated in our study were not identical to the classical definition of ICC and are called G-coefficients, as defined by Streiner and Norman20. We retained the term ICC to indicate that the results were comparable to the classical ICC. Correlations were interpreted as follows: 0–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.00 almost perfect.
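The interpretation bands listed above amount to a simple lookup. The sketch below is illustrative only; the function name is hypothetical. The usage lines apply it to the MRI versus radiography coefficients reported in the Results.

```python
def interpret_coefficient(value: float) -> str:
    """Map a coefficient between 0 and 1 to the interpretation bands used in this article."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("coefficient must lie between 0 and 1")
    if value <= 0.20:
        return "slight"
    if value <= 0.40:
        return "fair"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "substantial"
    return "almost perfect"


# MRI versus radiography coefficients reported in the Results.
for feature, coefficient in [("osteophytes", 0.53), ("JSN", 0.68), ("erosions", 0.32), ("cysts", 0.43)]:
    print(feature, coefficient, interpret_coefficient(coefficient))
```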
Elementary sources of variance in data were called facets in generalizability theory. For intraobserver reliability, relevant facets in our study were patients (0–16) and hand joints (0–8). Dependent variables were the separate features of each imaging modality.
In generalizability theory, a distinction was made between fixed and random facets. The facets “patient” and “hand joints” were defined as random facets. The facet “hand joints” was nested within the facet “patient” because each patient had a unique set of hand joints.
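For contrast with the G-coefficients used here, the classical ICC that the text refers to can be sketched as follows. This is a two-way random-effects, absolute-agreement, single-measure ICC computed from ANOVA mean squares. It treats every joint as an independent subject and therefore does not model joints nested within patients, which is the reason generalizability theory was preferred in our study. The function name and the example scores are hypothetical; the study analyses were run in SPSS.

```python
def icc_agreement(scores):
    """Classical two-way random-effects ICC (absolute agreement, single measure).

    scores: one row per joint; each row holds one score per scoring occasion.
    """
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 joints and 2 occasions, with equal row lengths")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for joints (rows), occasions (columns), and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("scores show no variance; ICC is undefined")
    return (ms_rows - ms_error) / denominator


# Hypothetical grades (0-3) for 6 joints scored on 2 occasions.
example = [[0, 0], [1, 1], [2, 3], [3, 3], [1, 2], [0, 0]]
print(round(icc_agreement(example), 2))
```

A G-coefficient for the nested design would additionally separate the variance attributable to patients from that attributable to joints within patients; that step is not shown here.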
To study criterion validity of MRI features, concurrent validity was evaluated by comparing MRI with radiograph and US features in the second to fifth DIP/PIP joints of the right hand only (128 joints). Generalizability theory was used to determine correlations between MRI and US or radiographic features because these analyses required a separate outcome per joint, with 8 unique joints clustered within each patient. Generalizability theory is a statistical method capable of analyzing this nested model.
For the different imaging modalities, the facets were defined as “patient” (0–16), “hand joints” (0–8), and “method” (MRI, US, conventional radiograph). The dependent variables were imaging features. The facets “patient” and “hand joints” were defined as random facets, the imaging modality as fixed facet. The facet “hand joints” was again nested within the facet “patient”.
Based on results from earlier studies21,22,23, we expected radiographs to be less sensitive than MRI in detecting features. We therefore expected the imaging modalities to correlate, but with coefficients between about 0.4 and 0.8 rather than 1.
We expected to find higher correlations between MRI and US because they were both considered to be more sensitive imaging modalities when compared with radiographs.
The Mann-Whitney U test was used to compare affected joints between the different imaging modalities. A p value < 0.05 was considered significant.
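The test named above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: it uses the normal approximation with a tie correction and no continuity correction, so p values may differ slightly from those of SPSS, which was used for the study analyses. The function names and the example grades are hypothetical.

```python
from collections import Counter
from math import erfc, sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for x, U for y, two-sided p value from the normal approximation)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1

    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:  # all observations identical
        return u1, u2, 1.0
    z = (u1 - n1 * n2 / 2) / sqrt(variance)
    return u1, u2, erfc(abs(z) / sqrt(2))


# Hypothetical per-joint grades (0-3) from two imaging modalities.
modality_a = [2, 3, 1, 2, 3, 2, 1, 3]
modality_b = [0, 1, 1, 0, 2, 1, 0, 1]
print(mann_whitney_u(modality_a, modality_b))
```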
To study the relationship between MRI features (as independent variables) and pain on the individual joint level, we associated MRI features with pain upon palpation in hand joints using generalized estimating equations with robust variance estimators to account for the correlation of observations within the same person. Adjustments were made for age, sex, and body mass index (BMI). Results were presented as OR with 95% CI.
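To show how an OR with a 95% CI is read, the sketch below computes an unadjusted OR from a 2 × 2 table using the Woolf (log) interval. This is deliberately simpler than the analysis described above: it does not adjust for age, sex, or BMI, and it ignores the clustering of joints within patients that the generalized estimating equations account for. The function name and the counts are hypothetical and are not study data.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Woolf (log) confidence interval.

    a: painful joints with the feature      b: non-painful joints with the feature
    c: painful joints without the feature   d: non-painful joints without the feature
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    standard_error = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * standard_error)
    upper = exp(log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
estimate, lower, upper = odds_ratio_ci(a=30, b=20, c=15, d=55)
print(f"OR {estimate:.1f} (95% CI {lower:.1f}-{upper:.1f})")
```

An interval that excludes 1 corresponds to a significant association at the 0.05 level.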
RESULTS
Study population
Sixteen patients were included [median age 56.7 yrs (range 42.0–70.7), 62% women, median BMI 25.7 kg/m2 (range 20.2–32.4)]. The median symptom duration was 6.5 years. Erosive OA was found in 13 patients and median VAS pain was 70 mm (range 35–93). The median number of swollen and tender joints was 2.5 (1–6) and 5 (1–12), respectively. Bony swelling was present in 61% and soft swelling in 18% of the joints palpable during clinical assessment.
In 1 patient, the contrast arrived subcutaneously instead of intravenously. Therefore, (teno)synovitis could not be assessed in 8 joints and consequently the number of joints assessed by MRI for the presence of synovial thickening and structural changes varied. In 2 DIP joints, correct scoring was not possible for some features because of incorrect positioning of the joint in the coil.
MRI-detected synovial thickening was present in 117 joints (98%). If the cutoff for MRI synovitis was set on grade ≥ 2 (moderate to severe), 51 joints (43%) had synovial thickening. Flexor tenosynovitis was seen in 36 (30%), erosions in 77 (61%), bone cysts in 16 (13%), and BML in 36 (27%) joints on MRI. Collateral ligaments were present in 84 joints (66%) and BML at the insertion sites of collateral ligaments in 17 joints (13%). Osteophytes and JSN were seen in 98 (77%) and 116 (91%) joints on MRI, respectively. Malalignment was only seen in the 2 DIP joints on MRI. Table 1 shows the distribution of these features stratified for DIP/PIP joints.
Reliability
The intraobserver reliability of MRI features as determined in 6 patients with 48 hand joints was substantial to almost perfect, as depicted in Table 2.
Validity of MRI versus US
US detected synovial thickenings (grade ≥ 1) in 54 (42%) of 128 joints (20 DIP, 34 PIP), PDS in 29 joints (23%, 13 DIP, 16 PIP), and osteophytes in 127 joints (64 DIP, 63 PIP). MRI was significantly more sensitive for the detection of synovial thickening compared with US (p < 0.0001), while MRI was less sensitive for osteophytes (p < 0.0001).
A moderate correlation coefficient of 0.43 was found between synovial thickening on MRI (graded 0–3) and on US (graded 0–3). When presence of MRI synovial thickening was defined as grade > 1, an ICC of 0.54 was found.
Correlation coefficient between osteophytes on US (grade 0–3) and MRI (grade 0–3) was 0.49.
Validity of MRI versus radiography
Radiographic osteophytes (grade ≥ 1) were present in 53 (41%) and JSN (grade ≥ 1) in 97 (76%) joints, significantly less than on MRI (77%, p < 0.001 and 91%, p = 0.001, respectively). Radiographic erosions were detected in 23 joints (18%), significantly less than on MRI (61%, p < 0.001). Twenty-two joints with radiographic erosions were erosive on MRI as well. Radiographic bone cysts were seen in 25 joints (20%), significantly more than on MRI (12%, p < 0.001; Table 3).
The correlation coefficients for osteophytes (0–3), JSN (0–3), erosions (0–1), and cysts (0–1) were 0.53, 0.68, 0.32, and 0.43, respectively, indicating fair to substantial correlations between MRI and radiographic features.
Validity of MRI features with pain upon palpation at joint level
We hypothesized that joints with osteoarthritic MRI features would be painful more often. Therefore, associations between pain upon palpation and synovial thickening were calculated.
Only 3 joints were classified as grade 0 for synovial thickening and could not be used as reference category. Therefore, synovial thickening was dichotomized into no/mild (grade 0/1) versus moderate/severe (grade 2/3) for the analyses. All other features were dichotomized as presence (grade 1–3) or absence (grade 0).
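The dichotomization described above can be sketched as follows. The function name and the feature labels are hypothetical; only the cutoffs come from the text.

```python
def dichotomize(feature: str, grade: int) -> int:
    """Return 1 if the feature counts as present for the pain analysis, else 0.

    Synovial thickening: grade 0/1 (no/mild) -> 0, grade 2/3 (moderate/severe) -> 1.
    All other features:  grade 0 -> 0, grade 1-3 -> 1.
    """
    if grade not in (0, 1, 2, 3):
        raise ValueError(f"grade must be 0-3, got {grade}")
    threshold = 2 if feature == "synovial_thickening" else 1
    return int(grade >= threshold)


print(dichotomize("synovial_thickening", 1))  # mild thickening is grouped with grade 0
print(dichotomize("osteophytes", 1))          # any osteophyte counts as present
```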
Pain upon palpation was significantly associated with the presence of moderate/severe synovial thickening, BML, erosions, osteophytes, and abnormal collateral ligaments after adjustments for age, sex, and BMI (Table 4). A positive trend was seen with BML at the insertion sites of collateral ligaments and JSN.
DISCUSSION
In this severe, (pre)erosive, HOA population, MRI was found to be a reliable method to investigate OA characteristics in HOA, as shown by substantial to almost perfect intraobserver reliability of all MRI features.
MRI criterion validity was confirmed by comparing MRI with US, radiography, and clinical features, showing fair to substantial correlations.
Comparison with physical examination showed that MRI abnormalities, such as synovial thickening and osteophytes, but also abnormal collateral ligaments, BML, and bone erosions, were associated with pain upon palpation in individual joints.
Until now, radiographs have been used as the gold standard for detection of HOA features for diagnosis and research purposes. Unfortunately, this imaging modality has limitations because it is unable to show soft tissue. US has been used not only for visualization of structural but also inflammatory features. A drawback of this imaging modality is, however, the inability of the US beam to penetrate through bone, making it more difficult to visualize subchondral abnormalities, such as BML, although erosions might be sensitively detected by US. MRI has the possibility to identify both soft tissue and structural abnormalities, as well as abnormalities in subchondral bone, and is, therefore, potentially a better alternative to radiographs as the gold standard.
To test this hypothesis, concurrent validity was assessed by comparing features detected on radiographs and US with those found on MRI. As expected, correlations found were between 0.40 and 0.80 for all features, except for erosions. MRI is, therefore, a valid method.
Erosions detected on MRI versus radiographs showed a lower correlation than expected (0.32). This might be explained by the fact that erosions on MRI were not always identified as erosions on radiographs, but were classified as cysts. The latter became obvious when comparing the presence of cysts and/or erosions on MRI and radiographs on joint level. The observation that cysts found on radiographs appear to be erosions on MRI was also made by Haugen, et al21.
In the present study, MRI showed far more joints with synovial thickening than did US. Only a few studies compared synovial thickening on MRI and US earlier.
Vlychou, et al studied metacarpophalangeal, PIP, and DIP joints of 1 hand of patients with erosive HOA (n = 13) and nonerosive HOA (n = 7). In this study population, means of affected joints appeared higher in US compared with MRI, but results have to be interpreted with caution because of the small sample sizes; analyses were done on patient level22.
Wittoek, et al8 studied 8 IP joints of 14 patients (9 erosive HOA, 5 nonerosive HOA) and found more synovitis using 3 Tesla MRI (20% of all joints) compared with US (15% of joints) with a percentage exact agreement of 87%. The authors used recommendations for hand joint pathology in RA. In these recommendations, synovitis on contrast-enhanced MRI is defined as an area in the synovial compartment that shows above normal post-gadolinium enhancement of a thickness greater than the width of the normal synovium.
After contrast administration, normal synovial tissue enhances as well as abnormal and thickened synovial tissue. The threshold for abnormal synovial thickening is most likely set too low in the present study. A reason for this might be that more detail could be visualized on the high-resolution images of the 3 Tesla MRI machine. Thin synovial tissue is seen in these images while this is less visible on the atlas used as a reference, which is based on images derived from a 1 Tesla MRI machine. Moreover, sequences used were not obtained directly but were constructed afterward, a practice that results in a lower resolution of images.
When MRI synovial thickening scores 0 and 1 were considered within the normal limits, MRI and US demonstrated synovial thickening in 43% and 42% of hand joints, respectively, and correlation between the 2 modalities increased.
It was expected that US and MRI would show more osteophytes compared with radiographs because these 2 imaging modalities are capable of scanning in different planes, thus enabling osteophytes at locations other than on the sides to be detected. US, however, detected more osteophytes compared with MRI. This is in concordance with earlier studies8,21. The reason for this higher sensitivity might be the ability to scan around the joint in a continuum using US while MRI is performed in coronal and sagittal slices. Maybe this makes it more difficult to discern osteophytes that are in between 2 images.
MRI features of OA were frequently seen in the hand joints of our HOA population. The prevalence of MRI abnormalities is comparable with those described earlier. In our present study, 61% erosions, 77% osteophytes, and 27% BML were found. Wittoek, et al8 studied 9 patients with erosive HOA using 3 Tesla MRI and found 63% erosions, 57% osteophytes, and 52% BML. In another study in patients with HOA, done by the developers of the OHOA-MRI score11, osteophytes were found in 89%, erosions in 51%, and BML in 13% of joints.
The association of MRI features with pain was also investigated to increase the understanding of causes of pain in HOA and validate MRI with clinical features. We showed that moderate/severe synovitis and BML were positively associated with pain, suggesting that inflammation is an underlying cause of pain in HOA. This is in line with an earlier study in HOA11 and a US study in HOA showing that synovial thickening and PDS are associated with more pain per joint7. In our present study, we did not study the association between pain upon pressure and US or radiographic features.
The MRI images were scored using the OHOA-MRI score14. Our 3 Tesla MRI images (supplementary content is available from the authors on request) were of good quality with higher spatial resolution compared with the 1 Tesla images of the atlas that were made by the developers of the OHOA-MRI score.
After implementing and using the scoring method, we observed some items that need consideration.
First, it is not common practice to use T1-weighted fat-suppressed images, as the OHOA-MRI developers recommend. In T1 sequences, all water-containing structures appear black in the image, leaving good visualization of fat-containing structures. After suppression of the latter, it is difficult to discern any structure. Therefore, T1-weighted images were used instead.
Also, the present scoring method scores collateral ligaments as “absence” or “presence,” suggesting that the absence of collateral ligaments is a rupture of these ligaments. However, if abnormalities around collateral ligaments are present, more signal will be visualized on MRI, mimicking the “absence” of the ligament as illustrated in the MRI-atlas. We therefore suggest scoring collateral ligaments as “normal”/“abnormal” in further studies.
Although the objective of our study did not allow the investigation of feasibility, it was noticed during scoring of MRI images that a considerable amount of time was needed for the assessment of 1 patient (about 75–90 min). Feasibility should be a topic for further studies.
Several limitations of our study should be addressed. MRI images were obtained in a highly selected population with severe complaints. The sample size was small. This could influence the results, especially on the patient level. All analyses were, however, performed on joint level, taking into account patient effect. Therefore, we believe that results are of importance. Further studies including larger samples of patients are warranted to confirm these findings.
No finger joints of a control group were imaged with MRI, because our study focused on the validity of MRI in patients with HOA.
For logistical reasons, US was performed some weeks before the MRI. This might have influenced the results on the correlation between MRI- and US-detected synovial thickenings, because synovial thickening can fluctuate over time24. Therefore, it is possible that the correlation is underestimated.
Because the OHOA-MRI scoring method was published during the course of the present study, axial sequences were not performed for all patients. Therefore, features such as synovitis could not be scored optimally in the patients for whom these sequences were lacking. This might have underestimated correlations.
Regarding the scoring of MRI, only 1 observer reviewed all MRI images because the scoring was time consuming. However, the intraobserver reliability is substantial to almost perfect and the reader was trained by the developers of the OHOA-MRI scoring method. In the future, MRI studies in less-selected HOA populations with followup data are needed to confirm these findings. In addition, further investigation in a longitudinal study is recommended to examine other metric properties of the scoring method: longitudinal inter- and intraobserver reliability and sensitivity to change. Furthermore, the influence of variation in the acquisition of the MRI images should be studied for further validation of this method.
Acknowledgment
The authors acknowledge Ida Haugen for her effort in teaching the OHOA-MRI scoring method.
- Accepted for publication March 16, 2015.