Abstract
Objective. Radiographic progression is usually assessed by Sharp-based methods (van der Heijde-modified Sharp score and the Genant-modified Sharp score). The aim of this study was to evaluate, in a range of randomized controlled trials (RCT), the presence of erosions and joint space narrowing (JSN) in all individual joints, as well as progression in these joints, and to determine if any redundancy exists due to infrequently involved joints.
Methods. Four databases of rheumatoid arthritis RCT that were all scored according to van der Heijde’s modification of the Sharp score were included in a descriptive analysis.
Results. Irrespective of different readers, different patient populations, and different disease durations per trial, similar patterns emerged. Both erosions and JSN occurred in all sites. Erosions occurred most frequently in the feet, preferentially in the 5th metatarsophalangeal joint (MTP-5). JSN occurred most frequently in the wrist. Change from baseline in erosions and JSN followed the pattern of involvement at baseline, so that MTP-5, and to a lesser extent MTP-3 and MTP-4, preferentially showed progression in erosive damage. Joints in the wrist showed the highest tendency to worsen over time with respect to JSN.
Conclusion. These data indicate that both erosions and JSN must be assessed for damage, and that a more abbreviated joint count cannot be used for radiographic scoring.
Radiographic damage is an important outcome in rheumatoid arthritis (RA)1, because it is the consequence of joint inflammation2 and is independently associated with irreversible impairment of physical function3. In order to grant a claim of prevention of structural damage, regulatory authorities require data proving that a particular medicine can retard the progression of radiographic damage during a 6 to 12-month period. Radiographic progression is measured in hands and feet for 2 reasons: (1) RA classically involves the small joints of hands, wrists, and feet; and (2) radiographs of hands and feet cover the highest number of involved joints using the lowest number of radiographs. At the group level, radiographic damage in hand and foot joints associates acceptably with damage in large weight-bearing joints with a dominant influence on physical function4.
In order to measure radiographic progression in clinical trials, several instruments have been developed. These instruments can largely be divided into Larsen-based methods and Sharp-based methods. Larsen proposed his method in 19775, and modifications were subsequently proposed by Scott6 and by Rau7. Sharp proposed his Sharp score in 1971, with modifications proposed by himself8, van der Heijde (van der Heijde-modified Sharp score)9, and Genant (Genant-modified Sharp score)10.
The main difference between Larsen-based and Sharp-based methods is that Larsen applies a global grading of joint damage, while Sharp quantifies number and size of erosions per joint and amount of joint space narrowing (JSN). The measurement of radiographic progression in the context of a short-term randomized controlled trial (RCT) requires that a method have a high level of reliability and sensitivity to change, and Sharp-based methods have been shown to outperform Larsen-based methods, in that they have a better “signal-to-noise ratio”11. As a consequence, radiographic progression in modern clinical trials is usually assessed by Sharp-based methods, mostly the van der Heijde-modified Sharp score and the Genant-modified Sharp score. While both methods have been successful in demonstrating differences in radiographic progression in separate trial arms, some concerns remain. Scoring is time-consuming, especially in the case of significant damage or a high level of progression. Measurement error, reflected by disagreement among independent readers, jeopardizes precision and may decrease the signal-to-noise ratio, which in turn leads to impaired discrimination. Theoretically, time to score could be decreased and precision could be increased by deleting those joints that contribute neither to damage nor to progression in RCT, by deleting joints that inappropriately add to measurement variation, or by leaving out either the domain of erosions or the domain of joint space narrowing.
The aim of this study was to evaluate and compare the frequency of both involvement and progression of radiographic erosions and JSN in individual joints. By including various datasets with populations with different levels of severity, generalizability of results could be tested.
MATERIALS AND METHODS
Radiographic data from baseline and 12-month timepoints from 4 RA RCT that employed van der Heijde-modified Sharp scoring were used for this analysis after permission was kindly granted by the sponsors. The following trial databases were included: (1) The Combination of Methotrexate (MTX) and Etanercept in Active Early Rheumatoid Arthritis (COMET) trial12, which included patients with early RA naive to MTX with a disease duration < 2 years (mean duration 0.8 years). (2) The Trial of Etanercept and Methotrexate with Radiographic Patient Outcomes (TEMPO)13, which included patients with established RA, the majority being MTX-naive, whose mean disease duration was 6.7 years at inclusion. (3) The RA Prevention of Structural Damage-1 (RAPID-1) trial14, comparing certolizumab plus MTX and MTX monotherapy, which included patients with active, established RA despite a stable dose of MTX who had a mean disease duration at baseline of 6.2 years. (4) The RA Prevention of Structural Damage-2 (RAPID-2) trial15, comparing certolizumab plus MTX and MTX monotherapy, which included patients with active RA despite MTX. The mean disease duration at baseline was 6.2 years. All trials were scored by panels of different readers.
Analysis
The percentage of patients with involvement by erosions or JSN per joint was extracted from each database. Data were extracted per reader. Some databases (COMET, TEMPO) allowed a distinction between left and right sides while for others (RAPID-1 and RAPID-2) left and right joints were pooled. Per-joint data were extracted for baseline only (containing data of all treatment arms) and for change between baseline and endpoint. For the latter analysis, use was made of the treatment arm that did not contain tumor necrosis factor (TNF)-blocking treatment (usually the MTX monotherapy arm) because under TNF-blocking therapy, the normal relationship between disease activity and radiographic progression is lost16 and radiographic progression is almost completely blocked.
In order to visualize erosion and JSN involvement on a per-joint basis, histograms were plotted for all 4 trials. These histograms express all individual joints on the x-axis and the percentage of patients with involvement on the y-axis. Joint patterns of involvement were clustered according to 4 subgroups: proximal interphalangeal (PIP) joints, metacarpophalangeal (MCP) joints, wrist joints, and metatarsophalangeal joints.
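The per-joint tabulation described above can be sketched in a few lines. This is a minimal illustration rather than the analysis code used in the study: the record layout, the names `RECORDS`, `cluster_of`, and `involvement_percent`, and all scores are assumptions made up for the example, and a joint is counted as involved when its score is greater than zero.

```python
from collections import defaultdict

# Hypothetical records: (patient_id, joint, erosion_score). All values are invented.
RECORDS = [
    (1, "MTP-5", 2), (1, "MCP-2", 0), (1, "PIP-2", 0), (1, "radiocarpal", 1),
    (2, "MTP-5", 1), (2, "MCP-2", 1), (2, "PIP-2", 0), (2, "radiocarpal", 0),
    (3, "MTP-5", 0), (3, "MCP-2", 0), (3, "PIP-2", 0), (3, "radiocarpal", 0),
    (4, "MTP-5", 0), (4, "MCP-2", 0), (4, "PIP-2", 0), (4, "radiocarpal", 0),
]


def cluster_of(joint):
    """Assign a joint label to one of the 4 clusters (assumed naming scheme).

    Labels starting with PIP, MCP, or MTP map to that cluster; any other
    label is treated as a wrist site.
    """
    prefix = joint.split("-")[0]
    return prefix if prefix in ("PIP", "MCP", "MTP") else "wrist"


def involvement_percent(records):
    """Return the percentage of patients with a score > 0, per joint."""
    patients = set()
    joints = set()
    involved = defaultdict(set)  # joint -> set of patients with involvement
    for patient, joint, score in records:
        patients.add(patient)
        joints.add(joint)
        if score > 0:
            involved[joint].add(patient)
    return {j: 100.0 * len(involved[j]) / len(patients) for j in joints}


percent = involvement_percent(RECORDS)
# Print joints grouped by cluster, as they would be ordered on a histogram x-axis.
for joint in sorted(percent, key=lambda j: (cluster_of(j), j)):
    print(f"{cluster_of(joint):5s} {joint:12s} {percent[joint]:5.1f}%")
```

The same function could be applied to change scores (endpoint minus baseline) in the arm without TNF-blocking treatment to obtain the percentage of patients with change per joint.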
RESULTS
In general, all different readers showed similar patterns across the trials, both with respect to joint involvement at baseline, and with respect to change from baseline in the MTX monotherapy arms.
The RAPID-2 trial was scored by 3 instead of 2 readers as in the other trials, with each read overlapping one-third of the radiographic reading; this made it impossible to compare the performances of these readers within the trial, but the general pattern that emerged was entirely similar. Since RAPID-1 and RAPID-2 were very similar trials in design, we chose to include only RAPID-1 data here. For reasons of brevity, we chose to show the data of only the COMET trial in this article as a representative example while mentioning essential data of the other trials in the text.
Figure 1 shows the percentage of involvement of joints with erosions at baseline (entire trial population). In general, involvement of the left side (dark bars) is similar to involvement of the right side (light bars), although individual joints show some variation. Foot joints were most frequently involved (14%–16% of all patients had MTP-5 involvement), followed by MCP joints, wrist joints, and PIP joints. Within the clusters and across all trials, MTP-5 was most frequently involved at baseline, followed by other MTP joints, although at different rates. For example, MTP-5 involvement was far higher in RAPID-1 (37%) and TEMPO (45%) compared to COMET (18%), reflecting the far longer disease duration in the former 2 trials.
Looking at the 3 trials on a per-cluster basis, MCP-2 was the most frequently involved joint in the MCP cluster [11% in COMET (Figure 1), 30% in RAPID-1, and 28% in TEMPO]. In the wrist cluster and the PIP cluster, a dominant pattern of joint involvement could not be identified. Of note, every joint that was scored at baseline was involved in at least some patients in each of the COMET, RAPID-1, and TEMPO trials.
Figure 2 (COMET) shows the percentage of joints with change in erosions from baseline. Change was predominantly seen in the foot cluster, and to a lesser extent in the MCP cluster. In line with patterns of involvement at baseline, change in the COMET trial was most frequently seen in MTP-3 to MTP-5 followed by MCP-2 to MCP-3 (Figure 2). The picture in the RAPID-1 and TEMPO trials was slightly different: MTP-5 was scored as changed in approximately 10% of the patients in the COMET trial (Figure 2), but in only 7% of the patients in the TEMPO trial, and in slightly more than 1% of the patients in the RAPID-1 trial. Although change was rare in some individual joints, we could not find joints in any trial in which change in erosions was never detected.
Figure 3 (COMET) shows the percentage of joints with JSN at baseline per joint and per cluster. Here, the cluster of the wrist was primarily affected, followed by foot joints, MCP joints, and PIP joints. This pattern was more or less recognizable in the other trials, although the most frequently involved joint in the wrist cluster (and in other clusters) differed across trials. In the COMET trial (Figure 3), the joint between os multangulum and os naviculare was most frequently involved (approximately 10% of the patients). The radiocarpal joint was most frequently affected in the RAPID-1 trial (23%) and in the TEMPO trial (40%). Importantly, JSN was occasionally found in all possible sites, although differences in the rate of involvement were found. The picture of change per cluster (Figure 4) followed the picture of involvement at baseline, in that the wrist sites most frequently showed change in the COMET trial (up to 5% in the joint between the os capitatum and the os naviculare) and in the RAPID-1 trial (up to 1.5% in the joint between the os capitatum and the os naviculare). In the TEMPO trial, change in JSN was seen slightly more frequently in MCP-2 (approximately 4% of the patients).
DISCUSSION
The results of this analysis showed a remarkably consistent picture that can be summarized as follows: irrespective of different readers, different patient populations, and different disease durations in each trial, similar patterns emerged. Both erosions and JSN occur in all sites that are scored according to van der Heijde’s modification of the Sharp score. Erosions occur most frequently in the feet, and MTP-5 is preferentially involved. JSN occurs most frequently in the wrist. In general, changes in erosions and JSN follow the pattern of involvement at baseline, so that MTP-5 and to a lesser extent MTP-3 and MTP-4 preferentially show progression in erosive damage. Similarly, the joints in the wrist that are preferentially affected at baseline show the highest tendency to worsen over time. Such a pattern of individual joint involvement has been described with the same joint preferences in an observational cohort setting17.
The main aim of this study was to determine if particular joints could be excluded from scoring due to lack of disease involvement and/or limited progression. In this analysis we could not find joints that were consistently negative in terms of involvement in damage and/or change over time. We had considered the hypothesis that erosions rarely, if ever, are seen in the PIP joints, and that change in erosions was also rare in these joints. Although involvement and change in erosions were definitely less prevalent in the PIP joints in comparison to other sites, all trials showed change in these joints in a proportion of patients. It was also hypothesized that the carpometacarpal joints of the 3rd, the 4th, and the 5th ray, which are considered joints belonging to the wrist area, rarely if ever show change. However, we found that these joints also contributed to change in JSN in all trials studied.
Although some joints contributed to change in less than 2% of a particular trial population, it is important to realize that the total Sharp score is a sum score of all separate joint erosion scores and JSN scores, and that the change in total Sharp score in modern RCT with effective treatments rarely, if ever, exceeds 2 units per year. That means that every joint that shows change contributes importantly to the individual total Sharp score change. It is to be expected that future clinical trials, in which a new standard of care in the control group may imply an even lower progression rate, will show less contrast between treatment arms, which poses a greater challenge to the scoring methods in terms of discriminative power. A limitation of this study obviously is that we have not investigated to what extent the removal of particular joints would influence performance of the score in terms of sensitivity to change and discrimination.
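The arithmetic behind this argument can be illustrated with a short sketch. The names `EROSION_CHANGE`, `JSN_CHANGE`, `total_change`, and `joint_share` and all values are hypothetical and chosen only to mirror a yearly change of 2 units in one patient; they are not data from the trials.

```python
# Hypothetical per-joint change scores for one patient over 12 months.
EROSION_CHANGE = {"MTP-5": 1, "MCP-2": 0, "PIP-2": 0}
JSN_CHANGE = {"radiocarpal": 1, "MTP-5": 0}


def total_change(erosion_change, jsn_change):
    """Change in total score: sum of all per-joint erosion and JSN changes."""
    return sum(erosion_change.values()) + sum(jsn_change.values())


def joint_share(joint, erosion_change, jsn_change):
    """Fraction of the total change that is attributable to a single joint."""
    total = total_change(erosion_change, jsn_change)
    if total == 0:
        return 0.0
    contribution = erosion_change.get(joint, 0) + jsn_change.get(joint, 0)
    return contribution / total


print(total_change(EROSION_CHANGE, JSN_CHANGE))
print(joint_share("MTP-5", EROSION_CHANGE, JSN_CHANGE))
```

With a total change of 2 units, a single joint that worsens by 1 unit accounts for half of the patient's measured progression, which is why dropping even a rarely changing joint could remove a large part of the signal in an individual patient.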
An interesting observation in this descriptive study was that change in erosions and JSN was highest in the COMET trial with early DMARD-naive patients, whereas it was much lower in the RAPID trials and in the TEMPO trial, which included patients with more longstanding disease, despite a far higher rate of involvement at baseline. Although the general statement that the pattern of joint change follows the pattern of joint involvement is true, our results seem to confirm that the overall rate of progression per joint in early DMARD-naive RA is higher than in more advanced RA, with the proviso that an appropriate comparison between trials read by different readers is limited. Translated to the level of the individual patient, this means a confirmation of the parabolic configuration of the radiographic damage-over-time curve, with steepest progression in the early phase and leveling off thereafter. This consistent finding has always been explained by a truly higher progression score in early disease18.
In summary, we have shown here in different RCT databases, using individual joint scores for erosion and JSN, that all scored joints contribute to some extent in quantifying change over time. In light of the need to detect subtle changes in future clinical trials in RA, we do not recommend reducing the current joints scored by modified Sharp methodologies.