Assessing the prevalence of hand osteoarthritis in epidemiological studies. The reliability of a radiological hand scale

Abstract

OBJECTIVE The hands are often involved in the osteoarthritic disease process. A radiological grading scale is presented, derived from a published atlas, to assess the prevalence of hand osteoarthritis (OA) involvement in clinical and epidemiological studies and its reproducibility is studied.

METHODS This hand scale is based on the radiological feature “joint space narrowing”, which represents the macromorphological process of cartilage loss. Osteophytes and sclerosis are less important unless seen in conjunction with joint space narrowing. Nine individual joints per hand (four proximal interphalangeal joints (PIP), four distal interphalangeal joints (DIP), first carpometacarpal joint (CMC-1)) are scored dichotomously for the presence of OA. To save time and to increase reliability, a severity grading of radiological features is not performed. To determine inter-rater and intra-rater reliability of the individual joints and the presence of OA in two separate joint groups (⩾ 2 PIP or DIP and at least one CMC-1, used to define “generalised OA” in the ongoing Ulm Osteoarthritis Study), 50 pairs of anteroposterior hand radiographs were read by two investigators twice within one month. The κ coefficient was calculated to quantify the strength of agreement.

RESULTS On average five minutes were needed to score one hand radiograph. Both raters were able to reproduce their own readings in all individual joints and for the presence of OA in two separate joint groups after one month. Reliability was highest for the PIP joints (κ: 0.56–1.00); it was slightly lower for the DIP joints (0.38–0.87), for the CMC-1 joints (0.58–0.69) and for OA in two separate joint groups (0.54). The values for inter-rater agreement were good as well: κ coefficients ranged from 0.52 to 0.92.

CONCLUSION This grading scale was shown to be reliable within and between readers for all the individual joints as well as for the presence of OA in two separate joint groups. Scoring a limited number of joints dichotomously makes this scale efficient and therefore useful for clinical and epidemiological trials, when dealing with large patient samples.

  • hand osteoarthritis
  • radiographic grading scales
  • reliability

The hand is a frequent site of disease involvement in osteoarthritis (OA).1 It has been estimated that more than 70% of people aged 65 years or older are affected.2 While many previous studies on the prevalence of hand OA were based on the presence of Heberden's nodes,3-5 more recent investigations use radiological changes to grade the disease.6-9

The instrument mainly applied in the past to score OA in hands was the Kellgren and Lawrence score.6,7 This overall score distinguishes five degrees of severity of OA according to the presence of the radiological features: osteophytes, joint space narrowing (JSN), subchondral sclerosis, cysts and flattening of condyles. Subsequent investigations identified certain deficiencies of that score, and its appropriateness in clinical and epidemiological studies was questioned.8,9 One important point of criticism is the significance of osteophytes as the main radiological feature in scoring the severity of disease.8 Identification of OA can be performed only if definite osteophytes are present. Another point of criticism is the moderate reliability of this score.8,9

These deficiencies have led to the development of other radiological scores that grade individual features of OA separately and independently of the presence of osteophytes. New radiological scores for hand OA were created and have proved to be reliable.8,9 However, these scores assess the features in great detail, and scoring is therefore time consuming. This might be a disadvantage for clinical and epidemiological studies dealing with large patient samples.

We present a hand scale based on a simple dichotomy (OA present or absent) that groups several radiographic features, to assess the prevalence of hand OA and to measure the number of joints involved in the disease process. As JSN resulting from cartilage loss is the main radiological feature of OA, our scale is based on the presence of that feature. Other radiological parameters are only used in conjunction with JSN.

Methods

SUBJECTS AND RADIOGRAPHS

Anteroposterior radiographs of both hands of 50 patients from the Ulm Osteoarthritis Study10 were chosen for this investigation. Subjects ranged in age between 51 and 79 years. The Ulm Osteoarthritis Study is a cross sectional and longitudinal investigation evaluating radiographic and clinical patterns of hand, hip and knee OA in patients with advanced OA of a large weight bearing joint in south west Germany.

THE HAND SCALE

Hand radiographs were scored for the presence of OA in nine finger joints of each hand: the four distal interphalangeal joints (DIP), the four proximal interphalangeal joints (PIP) and the first carpometacarpal joint (CMC-1). These joints were chosen because they have been shown to be most frequently affected in the OA disease process of the hand11 and have been previously used to define generalised OA.10 Using the atlas published by Altman et al 12 as a guideline for whether significant JSN, osteophytes or sclerosis were present, the scoring was performed in the following manner: a joint was regarded as affected if JSN of grade two or more was present, or if JSN of grade one was present together with sclerosis or osteophytes of grade two or more. OA was not diagnosed if osteophytes or sclerosis were observed without JSN. The number of affected joints of both hands was counted separately for the interphalangeal joints (0–16) and for the CMC-1 joints (0–2).
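The decision rule described above can be written out as a short Python sketch. It is illustrative only and was not part of the study: the names `JointGrades`, `joint_affected` and `count_affected`, the joint labels, and the assumption that each feature is recorded as an integer atlas grade are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class JointGrades:
    """Atlas grades for one joint (assumed to be integers, 0 = normal)."""
    jsn: int          # joint space narrowing
    osteophytes: int
    sclerosis: int


# Hypothetical joint labels: fingers 2-5 for DIP and PIP, plus CMC-1.
IP_JOINTS = [f"{kind}-{n}" for kind in ("DIP", "PIP") for n in range(2, 6)]
CMC_JOINT = "CMC-1"


def joint_affected(g: JointGrades) -> bool:
    """Dichotomous score: True if the joint counts as osteoarthritic."""
    if g.jsn >= 2:
        return True
    # JSN grade one counts only together with marked sclerosis or osteophytes.
    if g.jsn == 1 and (g.osteophytes >= 2 or g.sclerosis >= 2):
        return True
    # Osteophytes or sclerosis without JSN are never scored as OA.
    return False


def count_affected(hands: Dict[str, Dict[str, JointGrades]]) -> Tuple[int, int]:
    """Return (affected IP joints, 0-16; affected CMC-1 joints, 0-2)."""
    ip = sum(joint_affected(hand[j]) for hand in hands.values() for j in IP_JOINTS)
    cmc = sum(joint_affected(hand[CMC_JOINT]) for hand in hands.values())
    return ip, cmc
```

Keeping the per-joint rule separate from the counting step mirrors the scale itself: each joint is judged on its own, and the two totals are formed afterwards.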

TRAINING OF RATERS

After two orthopaedic surgeons (SK, JF) had familiarised themselves with the OA hand scale and the atlas of Altman et al 12 they subsequently scored 50 hand radiographs (25 patients) in three two-hour training sessions. For that purpose, radiographs were randomly selected from the Ulm Osteoarthritis Study population.

Both investigators compared their results and discussed them until consensus was achieved.

RATING OF RADIOGRAPHS

Both anteroposterior hand radiographs of another 50 patients of the Ulm Osteoarthritis Study (100 hands) were randomly selected. After blinding for individual data (date of birth, name and sex), all radiographs were independently read by both raters. To determine intra-rater reliability, the rating was repeated by both investigators one month later.

DATA ANALYSIS

κ Statistics were used to quantify the inter-rater and intra-rater agreement of both raters for each individual joint as well as for the presence of OA in two or more finger joints (PIP or DIP) and at least one CMC-1 joint (used to define generalised OA in the Ulm Osteoarthritis Study in patients with hip or knee OA). All analyses were performed with SAS (Statistical Analysis System, Version 6.12, SAS Inc, Cary, NC).
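As an illustration of the agreement statistic, the sketch below computes an unweighted Cohen's κ for two sets of dichotomous readings, κ = (p_o - p_e) / (1 - p_e), where p_o is the observed and p_e the chance-expected proportion of agreement. It also encodes the joint-group criterion described above. The study's analyses were run in SAS; this Python code is not the authors' program, the function names are hypothetical, and the use of the unweighted two-rater form of κ is an assumption.

```python
import math
from typing import Sequence


def cohen_kappa(first: Sequence[bool], second: Sequence[bool]) -> float:
    """Unweighted Cohen's kappa for two dichotomous readings of the same joints."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("readings must be non-empty and of equal length")
    n = len(first)
    # Observed proportion of joints on which both readings agree.
    observed = sum(bool(a) == bool(b) for a, b in zip(first, second)) / n
    # Proportion scored "OA present" in each reading.
    p1 = sum(bool(a) for a in first) / n
    p2 = sum(bool(b) for b in second) / n
    # Agreement expected by chance from the two marginal proportions.
    expected = p1 * p2 + (1 - p1) * (1 - p2)
    if expected == 1.0:
        # Both readings put every joint in the same category: kappa is undefined.
        return math.nan
    return (observed - expected) / (1 - expected)


def meets_group_criterion(ip_affected: int, cmc_affected: int) -> bool:
    """OA in two separate joint groups: >= 2 IP joints and >= 1 CMC-1 joint."""
    return ip_affected >= 2 and cmc_affected >= 1
```

The same function serves for intra-rater agreement (one rater's first and second reading) and for inter-rater agreement (the two raters' readings of the same films).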

Results

EFFICIENCY

The average time required to read one anteroposterior hand radiograph with the new hand scale was five minutes. The average time that was necessary using Kallman's8 or Lane's9 scale ranged between 10 and 15 minutes per hand.

PREVALENCE OF OA ACCORDING TO THE HAND SCALE

Table 1 and table 2 show the prevalence of radiographic OA according to the hand scale for the nine joints of each hand. Results are presented separately for the right and the left hand as well as for both readings of both raters.

Table 1

Radiographic osteoarthritis according to the new hand scale for the joints of the right hand (second rating in parentheses) (n=50)

Table 2

Radiographic osteoarthritis according to the new hand scale for the joints of the left hand (second rating in parentheses) (n=50)

The most frequently affected joint in the right and the left hand in all readings was the second DIP joint, followed by the third and the fifth DIP joint.

INTER-RATER AND INTRA-RATER RELIABILITY OF INTERPHALANGEAL JOINTS, CMC JOINTS AND PRESENCE OF OA IN TWO SEPARATE JOINT GROUPS

Figures 1 and 2 show the intra-rater agreement of both readers. Both raters were able to reproduce their own readings in all the joints after one month. Reliability was generally higher for the PIP joints than for the DIP joints and slightly lower for the CMC-1 joints. Results of the inter-rater reliability are shown in figure 3. Raters were able to reproduce each other's readings to an acceptable degree. Agreement between raters ranged from 0.52 (DIP-2, right hand) to 0.92 (CMC-1, left hand).

Figure 1

Intra-rater reliability of rater 1 (κ statistics) for the right (first column) and the left (second column) hand. (CMC-1 = first carpometacarpal joint, PIP = proximal interphalangeal joints, DIP = distal interphalangeal joints).

Figure 2

Intra-rater reliability of rater 2 (κ statistics) for the right (first column) and the left (second column) hand. Abbreviations as in figure 1.

Figure 3

Inter-rater reliability of rater 1 and 2 (κ statistics) for the right (first column) and the left (second column) hand. Abbreviations as in figure 1.

The reliability of the presence of OA in two separate joint groups was acceptable as well. Table 3 lists the results.

Table 3

Inter-rater and intra-rater agreement for the presence of osteoarthritis (OA) in two separate joint groups (⩾2 finger joints (PIP or DIP) and at least 1 CMC-1 joint)

Discussion

In this study we present a radiological scale based on the atlas of Altman et al 12 to measure the prevalence of hand OA in clinical and epidemiological study samples, which is short, time efficient and as reliable as already existing scores. Our scale uses a simple dichotomy (OA present or absent) to assess prevalence and to measure the number of joints involved in the disease process. The reasons for creating such a short scoring system are the limitations of existing radiological grading systems such as the Kellgren and Lawrence score6,7 and newer instruments, for example those published by Kallman et al 8 or Lane et al.9 These limitations include the number of joints being graded and thus the amount of time needed for the procedure, especially in large patient samples.

The proposed hand scale is based on the presence of JSN as the most important individual radiological feature in OA. Other radiological features are regarded as less important in this index unless they occur in conjunction with JSN. This focus on the presence of JSN contrasts with the idea of the Kellgren and Lawrence score,6 which treats osteophytes as the dominant radiological feature of OA. In hand OA, investigators argue that finger joints often appear narrowed and sclerotic on radiographs without showing significant osteophytes.8 We had the same impression when reading the anteroposterior hand radiographs within the Ulm Osteoarthritis Study.10

Scoring features dichotomously makes our scale easy to handle and less time consuming (on average five minutes for one radiograph). An additional effect of the dichotomous reading seems to be an increase in the reliability of the scale. Within this scale only nine individual joints per hand are scored (eight IP joints and the CMC-1 joint). We selected these joints because they have been shown to be the most often affected ones11 in the OA disease process and because those joints have been used in conjunction with knee or hip OA to decide whether or not generalised OA is present.10,13,14,15

Other investigators such as Kallman et al 8 or Lane et al 9 have included the scaphotrapezoid joint in their radiographic grading scores. We found this joint difficult to score and not reliable enough, at least within our scale that is based on JSN, because of its oblique projection on anteroposterior hand radiographs. Including further hand or finger joints would certainly provide additional information, but we judge the resulting loss of time efficiency in handling large patient samples to outweigh that benefit.

Within the training period both readers tested, in a limited number of hand radiographs (n=10), the amount of time needed to read films with other scales that include more joints. On average more than double the time (10–15 minutes) was needed when compared with our hand scale.

Our instrument was reliable between and within raters for all the individual joints (figs 1, 2 and 3). A comparison with the results of the Kellgren and Lawrence score6,7 or with those of Kallman et al and Lane et al 8,9 is not possible because of the nature of our assessment. We have measured the reliability of scoring whether or not OA is present in an individual joint, whereas other authors have measured the reliability of individual radiological features of OA. In our scale, reliability was highest for PIP joints and slightly lower for DIP joints and for CMC-1 joints.

Regarding the frequencies of joint involvement, we registered an increasing prevalence of OA in CMC joints, but not in interphalangeal joints, in all second readings of both raters. We think that this phenomenon is a result of the ongoing learning process in scoring hand joints, especially CMC joints, which are more difficult to score than interphalangeal joints, possibly because of a slightly oblique projection of the joint space on anteroposterior radiographs.

In summary, our hand scale has been shown to be reliable within and between readers, for all individual joints as well as the presence of OA in two separate joint groups. Scoring a limited number of joints dichotomously makes this scale very efficient and therefore interesting for investigations on hand OA in large patient samples.

References