Abstract
Composite disease outcome measures have been used in rheumatology for some time, but a disease-specific composite measure for psoriatic arthritis (PsA) has not yet been validated. Currently, instruments developed for use in rheumatoid arthritis are employed in PsA and include the American College of Rheumatology response criteria (ACR20, 50, and 70) and the Disease Activity Score for 28 and 44 joints (DAS28 and DAS44); however, these instruments do not cover the full spectrum of psoriatic disease. A composite measure is one way of incorporating an assessment of all relevant clinical outcomes into one single measure. By definition, it incorporates several dimensions of disease status, often by combining these different domains into a single score, which in the case of PsA includes joints, skin, entheses, dactylitis, and axial disease. New indices that combine these diverse clinical manifestations of PsA are under development and, in some cases, in the validation phase. The Group for Research and Assessment of Psoriasis and Psoriatic Arthritis (GRAPPA) established the GRAPPA Composite Exercise (GRACE) project to compare existing and emerging composite measures and to develop a new index. At the GRAPPA 2010 meeting, initial results from this project were presented, and existing and new candidate measures were compared.
Psoriatic arthritis (PsA) is a heterogeneous disorder that affects peripheral and axial joints and has other features such as dactylitis, enthesitis, and skin and nail disease. Although these clinical features may not all be present at any one time, it is important to be able to assess each of them in order to gauge its influence on the patient and its response to treatment, which may not be consistent across features.
A composite measure is one way of assessing all relevant clinical outcomes in a single instrument. By definition, it incorporates several dimensions of disease status, often by combining these different domains into a single score. Such instruments are well established in rheumatoid arthritis (RA), and several have been adopted for use in clinical trials involving patients with PsA. Measures adopted from RA include the American College of Rheumatology (ACR) responder index [1] and the Disease Activity Score for 28 joints (DAS28) [2]. The ACR responder index measures improvement in tender and swollen joint counts plus improvement in at least 3 of the following 5 measures: acute-phase reactant, patient global assessment of disease activity by visual analog scale (VAS), physician global assessment of disease activity by VAS, pain by VAS, and physical function using the Health Assessment Questionnaire (HAQ). The ACR20, 50, and 70 scores refer to ≥ 20%, ≥ 50%, and ≥ 70% improvements in these measures, respectively [1].
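The ACR response rule described above can be sketched in Python. This is a minimal illustration only, assuming that every measure is recorded so that lower values are better and that improvement is the percentage reduction from baseline; the function and field names are hypothetical and are not part of the ACR definition.

```python
from typing import Mapping

# Hypothetical field names for the 5 additional core measures.
OTHER_MEASURES = ("acute_phase", "patient_global", "physician_global", "pain", "haq")


def pct_improvement(baseline: float, followup: float) -> float:
    """Percentage reduction from baseline (lower scores are assumed to be better)."""
    if baseline <= 0:
        return 0.0  # assumption: no measurable improvement from a zero baseline
    return 100.0 * (baseline - followup) / baseline


def acr_response(baseline: Mapping[str, float], followup: Mapping[str, float],
                 threshold: float = 20.0) -> bool:
    """True if both joint counts and at least 3 of the 5 other measures
    improve by at least `threshold` percent."""
    joints_ok = all(
        pct_improvement(baseline[k], followup[k]) >= threshold
        for k in ("tender_joints", "swollen_joints")
    )
    n_other = sum(
        pct_improvement(baseline[k], followup[k]) >= threshold
        for k in OTHER_MEASURES
    )
    return joints_ok and n_other >= 3
```

Calling `acr_response(baseline, followup, threshold=50.0)` or `threshold=70.0` applies the same rule at the ACR50 and ACR70 levels.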
In PsA, joint assessment optimally uses a 68-tender, 66-swollen joint count, which includes the distal interphalangeal joints of the fingers [3]. The DAS28 in RA includes 28-joint tender and swollen counts, patient global, and either erythrocyte sedimentation rate or C-reactive protein (CRP). Although the DAS28 has been shown to distinguish PsA patients treated with anti-tumor necrosis factor agents from those receiving placebo, it was noted that 25% of the patients would not have been included in that study because their primary involved joints were below the knees, which are not assessed as part of the DAS28 [4].
A number of additional composite measures for assessing disease activity in PsA have been proposed. A composite measure for defining “minimal disease activity” (MDA) has been validated and includes assessments of joints, skin, entheses, and physical function [5]. The MDA criteria define a low disease state and can be used as a responder index as well as a target for treatment interventions. Three other disease-specific measures have been suggested. First, an adaptation of the Disease Activity index for Reactive Arthritis (DAREA) [6] has been renamed the Disease Activity index for PSoriatic Arthritis (DAPSA), developed from a clinical cohort [7] and validated using clinical trial data [8]. Second, a weighted articular responder index, the Psoriatic Arthritis Joint Activity Index (PsAJAI), has been developed from pooled data from randomized clinical trials of biologic agents in PsA [9,10]. A response, according to this measure, is defined as a 30% improvement in core measures, with weights of 2 given to tender and swollen joint counts, CRP, and physician global assessment of disease activity, and weights of 1 given to patient pain and global scores and the Health Assessment Questionnaire. In both of these instruments, the analytic method used in their development led to factoring out skin disease, which was therefore recommended to be measured separately from the musculoskeletal components, as has been the case with the ACR and DAS scoring systems.
Third, a domain-based approach has been proposed with the development of a composite measure known as the Composite Psoriatic Disease Activity Index (CPDAI) [11]. In the CPDAI, disease involvement is assessed in up to 5 domains: peripheral joints, skin, entheses, dactylitis, and spinal manifestations. For each domain, instruments are used to assess both the extent of disease activity and the effect of involvement in that domain on patient function and health-related quality of life. Domains are scored 0–3, with empirical cutoffs for disease severity/activity proposed in each, largely based on the literature. Individual domain scores are summed to give an overall composite score (range 0–15) [11]. This instrument has also been validated in a large clinical trial dataset, the Psoriasis Randomized Etanercept Study in Subjects with Psoriatic Arthritis (PRESTA) [12]. In an open-label period of the PRESTA study in which 2 dose regimens of etanercept were administered, the CPDAI was able to demonstrate the differential response to the 2 doses, whereas the DAPSA was unable to discern this difference.
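The summation step of the CPDAI can be illustrated with a short Python sketch. It covers only the final addition of domain scores; the per-domain cutoffs that convert clinical assessments into a 0–3 score are not reproduced here, and the domain keys and the treatment of a missing domain as 0 are assumptions made for illustration.

```python
# Hypothetical keys for the 5 CPDAI domains named in the text.
CPDAI_DOMAINS = ("peripheral_joints", "skin", "entheses", "dactylitis", "spine")


def cpdai_total(domain_scores: dict) -> int:
    """Sum the domain scores (each 0-3) into a composite score in the range 0-15."""
    total = 0
    for domain in CPDAI_DOMAINS:
        # Assumption: a domain that is not involved or not recorded contributes 0.
        score = domain_scores.get(domain, 0)
        if score not in (0, 1, 2, 3):
            raise ValueError(f"{domain} score must be 0, 1, 2, or 3; got {score!r}")
        total += score
    return total
```

Because the input is kept as one score per domain, the individual domain values remain available alongside the total.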
The GRAPPA Exercise to Develop a Composite Measure for PsA
The process of developing a composite measure for psoriatic disease started at the 8th meeting of OMERACT (Outcome Measures in Rheumatology) and was further developed at the annual Group for Research and Assessment of Psoriasis and Psoriatic Arthritis (GRAPPA) conferences in Leeds in 2008 and Stockholm in 2009 [13,14]. The methodological approach followed the one used in the development of the Ankylosing Spondylitis Disease Activity Score (ASDAS) [15] and that suggested by Fransen, et al [16]. Further details of the assessments and instruments used in the clinical record form are given in a previous publication [14]. In brief, we aimed for a sample size of 300 with baseline, 3-, 6-, and 12-month data. At each timepoint, the surrogate for disease activity was a change in disease-modifying medication. At the time of the GRAPPA meeting in December 2010, 471 baseline subject case report forms had been received from the GRAPPA Composite Exercise (GRACE) project, and data on 268 of these were available for the 3-month visit. Table 1 gives the baseline characteristics of these subjects divided into 2 groups according to the disease activity construct.
Table 1. Baseline characteristics of patients in the GRACE (GRAPPA Composite Exercise) study.
Analysis of the baseline data using methodology similar to that used in the development of the ASDAS (transformation of variables, factor analysis, discriminant function analysis, and linear regression) revealed that just 3 variables explained over 90% of the variance in scores. This putative index is given as: PsA index = [0.539 × patient global (mm)] + [0.194 × patient skin (mm)] + [0.438 × physician global (mm)].
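As a worked illustration, the putative index can be computed directly from the three VAS values. The Python sketch below assumes that each VAS enters the expression untransformed, in millimeters on a 0–100 scale, exactly as the expression is printed; the function and parameter names are hypothetical.

```python
def psa_index(patient_global_mm: float, patient_skin_mm: float,
              physician_global_mm: float) -> float:
    """Weighted sum of three 100-mm VAS scores using the published coefficients."""
    for name, value in (("patient global", patient_global_mm),
                        ("patient skin", patient_skin_mm),
                        ("physician global", physician_global_mm)):
        if not 0 <= value <= 100:
            raise ValueError(f"{name} VAS must be between 0 and 100 mm")
    return (0.539 * patient_global_mm
            + 0.194 * patient_skin_mm
            + 0.438 * physician_global_mm)
```

For example, VAS values of 50, 20, and 40 mm give 26.95 + 3.88 + 17.52 = 48.35. Because the coefficients sum to 1.171, the expression as written ranges from 0 to 117.1 rather than from 0 to 100.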
In the GRACE study, the patient global scores were identical to those proposed by Cauli, et al [17], which used the following questions:
- Global VAS: In all the ways in which your PSORIASIS and ARTHRITIS as a whole affects you, how would you rate the way you felt over the past week?
- Skin VAS: In all the ways your PSORIASIS affects you, how would you rate the way you felt over the past week?
No specific question was given for the physician global VAS; the respondent was simply asked to mark on a 100-mm line the global assessment from “not active at all” to “extremely active.”
To establish the desirability function (DF) cutoffs for disease activity, disease states were derived from members of the GRAPPA group using an online survey technique. These cutoffs were used to derive a series of linear functions for each variable so that the variable was rescaled to give a value from 0 (completely unacceptable) to 1 (no better outcome possible). These individual functions were then combined into a single measure to give the arithmetic mean (AM), again with a score range of 0 to 1. Two separate composite scores were developed:
- AM_DF1: includes swollen joint count, tender joint count, patient skin VAS, patient joints VAS, patient global VAS, HAQ, and a quality of life measure, the PsAQoL [18].
- AM_DF2: the same as AM_DF1 with the addition of the Psoriasis Area and Severity Index (PASI).
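The desirability-function approach can be sketched as follows. This is a minimal Python illustration, assuming that each variable is rescaled by straight-line interpolation between survey-derived cutoffs and that the rescaled values are then averaged. The actual GRAPPA cutoffs are not given in this report, so the cutoff values in the usage example are invented placeholders, as are all function and variable names.

```python
from bisect import bisect_right
from statistics import mean


def desirability(value, cutoffs):
    """Piecewise-linear map from a raw score to a desirability between 0 and 1.

    `cutoffs` is a list of (raw_value, desirability) pairs sorted by raw_value.
    Values outside the cutoff range take the desirability of the nearest end.
    """
    xs = [x for x, _ in cutoffs]
    ds = [d for _, d in cutoffs]
    if value <= xs[0]:
        return ds[0]
    if value >= xs[-1]:
        return ds[-1]
    i = bisect_right(xs, value)  # index of the first cutoff above `value`
    x0, x1, d0, d1 = xs[i - 1], xs[i], ds[i - 1], ds[i]
    return d0 + (d1 - d0) * (value - x0) / (x1 - x0)


def am_df(values, cutoff_table):
    """Arithmetic mean of the desirabilities of every variable in `cutoff_table`."""
    return mean([desirability(values[name], cutoff_table[name])
                 for name in cutoff_table])


# Placeholder cutoffs for illustration only (not the GRAPPA survey results).
example_cutoffs = {
    "swollen_joints": [(0, 1.0), (10, 0.0)],
    "patient_global_vas": [(0, 1.0), (100, 0.0)],
}
example_score = am_df({"swollen_joints": 5, "patient_global_vas": 20}, example_cutoffs)
```

With these placeholder cutoffs, 5 swollen joints maps to 0.5 and a patient global VAS of 20 mm maps to 0.8, giving an arithmetic mean of 0.65.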
The performance of these 3 new measures (PsA Index, AM_DF1, AM_DF2) was compared to that of existing composite measures, and the results are given in Table 2. In this table, the new measures are compared to the DAS28, DAPSA, CPDAI, and a new version of the CPDAI in which the cutoffs for each domain have been revised to be consistent with those used in the derivation of the DF. Two statistics are used to compare the measures: the z statistic of the Mann-Whitney U test and the area under the receiver operating characteristic curve. In both cases, the larger the figure, the better the measure discriminates between those with “active” disease and those with stable disease. For both statistics, the largest figure is found for the PsA Index. Similar results were obtained in a subgroup analysis restricted to patients with oligoarthritis (< 5 swollen/tender joints). Selecting for severe skin involvement (PASI score > 10) gave statistics of smaller magnitude such that, in the case of the new version of the CPDAI and the DAPSA, the z statistic did not reach the usual level of significance (0.05).
Table 2. Comparison of measures at baseline, based on decision to change treatment (GRACE study; patients who had treatment change at baseline).
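The two discrimination statistics reported in Table 2 are closely related, as the following Python sketch shows. It assumes that higher scores indicate more active disease and uses the normal approximation for the z statistic without a correction for ties, so it is an illustration rather than the analysis code used in GRACE; the function name and inputs are hypothetical.

```python
from math import sqrt


def mann_whitney_auc(active, stable):
    """Return (U, z, AUC) comparing scores in the 'active' and 'stable' groups.

    U counts the (active, stable) pairs in which the active patient scores
    higher, with ties counted as one half. AUC is U divided by the number of pairs.
    """
    n1, n2 = len(active), len(stable)
    u = sum(1.0 if a > s else 0.5 if a == s else 0.0
            for a in active for s in stable)
    auc = u / (n1 * n2)
    # Normal approximation; no tie correction is applied to the variance.
    z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, z, auc
```

An AUC of 0.5 (z of 0) means the measure does not separate the groups at all, and an AUC of 1.0 means every patient with a treatment change scores higher than every patient without one.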
Table 3 gives the GRACE data for those who had a treatment change at baseline and for whom 3-month followup data were available. The scores for each measure are given and compared using the t statistic. Two other statistics were calculated: Cohen’s effect size [19] and the standardized response mean [20]. Both are measures of responsiveness: the larger the value, the better the measure is able to record a response to the intervention. Once more the PsA Index performs well, as do the DF measures and the CPDAI.
Table 3. Changes in composite measures from baseline to 3 months in those subjects who had treatment change at baseline (GRACE study; n = 158).
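The two responsiveness statistics can be sketched in Python using their usual definitions: the effect size is the mean change divided by the standard deviation of the baseline scores, and the standardized response mean is the mean change divided by the standard deviation of the change scores. The sketch assumes paired baseline and 3-month scores for the same patients and defines change as baseline minus followup; for measures in which higher scores are better, the sign is reversed and the magnitude is what matters. The function name is hypothetical.

```python
from statistics import mean, stdev


def responsiveness(baseline, followup):
    """Return (effect size, standardized response mean) for paired scores."""
    changes = [b - f for b, f in zip(baseline, followup)]
    effect_size = mean(changes) / stdev(baseline)  # scaled by baseline spread
    srm = mean(changes) / stdev(changes)           # scaled by spread of the changes
    return effect_size, srm
```

For baseline scores of 10, 20, and 30 and followup scores of 5, 10, and 25, the mean change is 6.67, giving an effect size of about 0.67 and a standardized response mean of about 2.31.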
Results of Breakout Group Discussions
Following presentation of the data, GRAPPA delegates split into a number of small groups charged with discussing the content and performance of the measures. Delegates were asked to consider the measures in terms of the OMERACT filter: truth, discrimination, and feasibility. At a plenary session, group representatives discussed these deliberations and the main issues are listed below:
- Cultural differences may be apparent with response measures. Should we do different analyses by continent?
- What would be the influence of fibromyalgia and other comorbidities on the VAS scores?
- With reference to the proposed PsA Index, it was felt that the measure lacked face validity. In addition, one group questioned the inter- and intrarater reliability of the physician VAS scores. Another group reported recent results indicating good intrarater reliability for patient VAS scores, with an intraclass correlation coefficient (ICC) of 0.87 for global disease activity [17].
- An assessment of enthesitis should be included in the DF scores.
- Will the composite measures truly measure the full spectrum of disease?
- In the development, the measures should be tested for their prognostic ability.
- The concept of a unidimensional instrument for a multidimensional disease was questioned. In particular, the conceptual model of combining skin and joint measures into one measure was thought to lack validity. It was suggested that one solution to this issue might be to express the measures within a 2-dimensional framework, with skin on one axis and musculoskeletal (possibly a composite of joints, spine, dactylitis, and enthesitis) on the other.
- The measure should be useful in the clinic and in the research setting.
- The skin component needs to include all the different ways and sites that can be affected by psoriasis, such as the scalp and genital areas.
- Much discussion was devoted to the physician global VAS. Many people felt that this was an unreliable estimate of the true state of the patient’s disease as assessed by the physician. Others countered that to make the judgment about global VAS, the physician must have examined the joints and the skin, so that it was equivalent to performing the objective measures.
- In order to measure the possible differential response of the different domains to different treatments, the composite measure must have representation of individual domains nested within it. Otherwise, valuable information would potentially be lost.
- To measure low disease activity states and remission, a composite measure must include all relevant domains of disease, and scores using this measure must reflect this, such that a low score cannot be obtained if any one domain still demonstrates significant disease activity.
- Some discussants suggested that a comorbidity component be added to the composite measure, e.g., a measure of risk factors for cardiovascular disease.
Some discussants argued that it is conceptually flawed to attempt to combine assessments of skin and joints into one measure. Following the suggestion of one of the breakout groups, a new concept for recording disease activity in 2 dimensions was developed (Figure 1). Disease activity in skin is recorded along one axis and disease activity in the musculoskeletal system on the other. Whatever measure of disease activity is used, a division is made such that high and not-high disease activity can be defined. The resulting 4-quadrant graph allows the observer to identify the disease activity of the patient at a glance, and this can be quantified using vector mathematics.
Figure 1. A proposed method for combining disease activity data for skin and musculoskeletal systems. For a particular patient, disease activity is represented by a vector. Changes in activity from visit to visit can be tracked by vector mathematics, but the observer can see at a glance the activity in each component of the disease. PsA: psoriatic arthritis; PSO: psoriasis; Q: quadrant.
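The 2-dimensional representation can be illustrated with a small Python sketch. It assumes that skin and musculoskeletal activity are each summarized by a single number, that "high" means at or above a chosen cutoff, and that change between visits is expressed as the difference of the two position vectors. The cutoffs, function names, and quadrant labels are hypothetical; the quadrant numbering used in the figure is not reproduced.

```python
from math import hypot


def quadrant(skin, msk, skin_cutoff, msk_cutoff):
    """Classify a patient as ('high' or 'not high') on the skin and musculoskeletal axes."""
    skin_label = "high" if skin >= skin_cutoff else "not high"
    msk_label = "high" if msk >= msk_cutoff else "not high"
    return skin_label, msk_label


def change_vector(previous, current):
    """Return (change in skin, change in musculoskeletal, length of the change vector).

    `previous` and `current` are (skin, musculoskeletal) activity pairs.
    """
    d_skin = current[0] - previous[0]
    d_msk = current[1] - previous[1]
    return d_skin, d_msk, hypot(d_skin, d_msk)
```

For example, a patient who moves from (8, 6) to (5, 2) has a change vector of (-3, -4) with length 5, so the direction shows which component improved and the length gives the overall magnitude of change.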
Voting Procedures
At the end of the plenary sessions, delegates were asked to vote on each of the measures tested; the questions and results are provided in Table 4. The CPDAI ranked highest in the domains of truth and discrimination, and the proposed PsA Index in the feasibility domain. The attendees voted to consider the following measures for further study: PsA Index, AM_DF2, and CPDAI.
Table 4. Voting responses from the attendees. The figures denote the percentage of positive (“yes”) responses to each question.
Additional Remarks
The GRAPPA initiative to develop a new composite measure for PsA has been under way for 3 years and is reaching the point where candidate measures are being adopted for further study in new interventional studies. In addition, if the data are appropriate, these measures can be tested in existing databases, as described [14]. At this time, GRAPPA is not committed to any single measure. Development of a single composite measure that has linear properties will enable the determination of cutoffs for low, moderate, and high disease activity, as well as the magnitude of any change. A single index also permits an assessment of disease activity at a glance. However, as was highlighted in the plenary session, it is important to retain the information on the different domains of disease, if only to provide reassurance to observers who have traditionally relied on these individual assessments to guide their therapeutic choices.
Conclusions
The search for a better composite measure for PsA has advanced but requires more developmental and validation work. Incorporating clinical assessment into a single composite measure presents several challenges. Existing composite measures mostly assess the articular component of the disease, but combining all manifestations of this heterogeneous disease may be conceptually incorrect. Once candidate measures are available, their performance will need to be compared against that of existing measures before final acceptance by the rheumatology community.
Footnotes
- Supported by GRAPPA.