Abstract
A composite measure is one way of incorporating an assessment of all relevant clinical outcomes into one single measure. By definition it incorporates several dimensions of disease status often by combining these different domains into a single score. Such instruments are well established in rheumatoid arthritis (RA), and these RA-specific measures have successfully been adopted for use in clinical trials involving patients with psoriatic arthritis (PsA). However, the need for a more PsA-specific composite measure has led to a number of proposals, which, for the large part, incorporate only peripheral articular disease activity. New indices that combine the diverse clinical manifestations of PsA are now under development. These issues were discussed at the 2009 annual meeting of GRAPPA (Group for Research and Assessment of Psoriasis and Psoriatic Arthritis) in Stockholm, Sweden, and are summarized here.
Psoriatic arthritis (PsA) is a heterogeneous disorder affecting peripheral and axial joints as well as having other features such as dactylitis and enthesitis; the majority of cases also have involvement of the skin. Although not all these clinical features may occur together at any one time, it is important to be able to consider them all in order to assess the complexity of the disease and its effects on the individual patient.
A composite measure is one way of assessing all relevant clinical outcomes in one single instrument. By definition, it incorporates several dimensions of disease status often by combining these different domains into a single score. Such instruments are well established in rheumatoid arthritis (RA), and these have been adopted for use in clinical trials involving patients with PsA.
Two types of composite index have been developed: first, instruments that measure the response to a treatment intervention (the responder indices); second, instruments that measure disease state (the disease activity measures). A disease activity measure can be adapted to function as a responder index. Ideally, a composite index should be simple to use and apply in the clinical setting. It should combine practicability and feasibility with validity and clinical relevance. It should be able to guide clinical treatment decisions by providing an absolute measure of disease activity as well as a measure of response to therapy.
Measures adopted from rheumatoid arthritis
In RA, the commonly used composite tools are the American College of Rheumatology (ACR) responder index1 and the Disease Activity Score for 28 joints (DAS28)2. The ACR responder index measures improvement in tender and swollen joint counts plus improvement in at least 3 of the following 5 measures:
- acute-phase reactant: erythrocyte sedimentation rate (ESR) or C-reactive protein (CRP)
- patient global assessment of disease activity by visual analog scale (VAS)
- physician global (MD global) assessment of disease activity by VAS
- pain by VAS
- physical function questionnaire, the Health Assessment Questionnaire (HAQ)
In PsA, the number of joints assessed can be augmented to a 68-tender, 66-swollen joint count, which includes the distal interphalangeal joints of the fingers3. In the ACR responder index, the ACR20, 50, and 70 refer to ≥ 20%/50%/70% improvements, respectively1. The ACR20 has become the standard outcome measure for new interventions in RA and has been adopted in intervention trials with biologic drugs for PsA; however, it is not easily utilized in clinical practice. In PsA, the ACR responder index works best in polyarticular disease, and although most patients enrolled in randomized controlled trials (RCT) have polyarticular disease, it may not be appropriate for those with lower joint counts typically seen in day-to-day practice4.
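The responder logic described above can be sketched in a few lines. This is a minimal illustration, not a validated implementation: the `AcrVisit` container, its field names, and the rule that a zero baseline counts as no improvement are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AcrVisit:
    """One assessment visit (hypothetical container; lower values are better)."""
    tender_joints: float
    swollen_joints: float
    acute_phase: float       # ESR or CRP
    patient_global: float    # VAS
    physician_global: float  # VAS
    pain: float              # VAS
    haq: float


def pct_improvement(baseline: float, followup: float) -> float:
    """Percentage decrease from baseline; 0 if baseline is 0 (assumption)."""
    if baseline <= 0:
        return 0.0
    return 100.0 * (baseline - followup) / baseline


def acr_response(baseline: AcrVisit, followup: AcrVisit, threshold: float = 20.0) -> bool:
    """True if both joint counts and at least 3 of the 5 other measures
    improve by at least `threshold` percent (20, 50, or 70)."""
    joints_ok = (
        pct_improvement(baseline.tender_joints, followup.tender_joints) >= threshold
        and pct_improvement(baseline.swollen_joints, followup.swollen_joints) >= threshold
    )
    other = ("acute_phase", "patient_global", "physician_global", "pain", "haq")
    n_improved = sum(
        pct_improvement(getattr(baseline, name), getattr(followup, name)) >= threshold
        for name in other
    )
    return joints_ok and n_improved >= 3
```

Calling `acr_response(baseline, followup, threshold=50.0)` checks for an ACR50 response; the same function covers ACR20 and ACR70 by changing the threshold.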
The DAS28 in RA includes 28-joint tender and swollen joint counts, patient global, and either ESR or CRP2. A formula is used to calculate a single score, with a range of 0–10. Improvement is categorized by low, moderate, and good responses based on baseline as well as change scores, including cutoff levels for remission and low disease activity states in RA5. Although the DAS28 utilizes the 28-joint count, which does not include the ankle or foot, it appears to function well in polyarticular PsA6; however, some have questioned its psychometric properties7.
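The text above does not give the formula itself. As an illustration, the sketch below uses the widely published ESR-based form of the DAS28; the coefficients come from that general literature rather than from this report, and the function and parameter names are chosen for the example.

```python
import math


def das28_esr(tjc28: int, sjc28: int, esr: float, patient_global: float) -> float:
    """DAS28 (ESR version) in its commonly published form.

    tjc28, sjc28   -- tender and swollen joint counts (0-28)
    esr            -- erythrocyte sedimentation rate in mm/h, must be > 0
    patient_global -- patient global assessment on a 0-100 mm VAS
    """
    if not (0 <= tjc28 <= 28 and 0 <= sjc28 <= 28):
        raise ValueError("joint counts must lie between 0 and 28")
    if esr <= 0:
        raise ValueError("ESR must be positive for the log transform")
    return (
        0.56 * math.sqrt(tjc28)
        + 0.28 * math.sqrt(sjc28)
        + 0.70 * math.log(esr)      # natural logarithm
        + 0.014 * patient_global
    )
```

The square-root and logarithm terms are the reason the score is hard to compute at the bedside without a calculator.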
Both the ACR responder index and the DAS28 emphasize a single dimension of disease in PsA: the articular component, that is, peripheral joint inflammation and its influence on pain and function. This may be why they work well in RCT: trials are powered on this basis, and other aspects of the disease, such as the skin, are assessed as secondary outcomes.
Measures of clinical response in psoriatic arthritis
The PsA Response Criteria (PsARC) was specifically designed for use in an RCT in PsA and thus acquired the label of a PsA-specific response measure8. However, it remains largely a peripheral articular responder index. Improvement is required in at least 2 of 4 criteria: ≥ 20% improvement in MD global, ≥ 20% improvement in patient global, ≥ 30% improvement in tender joint count, and ≥ 30% improvement in swollen joint count. At least one of the improved criteria must be a joint count, and there should be no worsening of any component. The PsARC discriminates well between effective treatment and placebo in RCT3. However, a relatively high placebo response rate (up to 45% in some studies) may be a disadvantage in powering studies that use this instrument as the main outcome measure9.
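A short sketch may help make the PsARC rule concrete. It assumes the usual reading of the criteria, in which at least one of the improved criteria must be a joint count. Treating any increase over baseline as "worsening" is a simplifying assumption for the example, as are the dictionary keys.

```python
# Minimum percentage improvement per criterion, as given in the text.
PSARC_THRESHOLDS = {
    "tender_joints": 30.0,
    "swollen_joints": 30.0,
    "physician_global": 20.0,
    "patient_global": 20.0,
}


def pct_improvement(baseline: float, followup: float) -> float:
    """Percentage decrease from baseline; 0 if baseline is 0 (assumption)."""
    if baseline <= 0:
        return 0.0
    return 100.0 * (baseline - followup) / baseline


def psarc_response(baseline: dict, followup: dict) -> bool:
    """True if >= 2 of 4 criteria improve, one of them a joint count,
    and no criterion worsens (worsening = any increase, an assumption)."""
    improved = {
        name: pct_improvement(baseline[name], followup[name]) >= cutoff
        for name, cutoff in PSARC_THRESHOLDS.items()
    }
    worsened = any(followup[name] > baseline[name] for name in PSARC_THRESHOLDS)
    joint_improved = improved["tender_joints"] or improved["swollen_joints"]
    return sum(improved.values()) >= 2 and joint_improved and not worsened
```

Under this sketch a patient whose two global scores improve while both joint counts are unchanged is not a responder, which reflects the articular emphasis of the instrument.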
The Psoriatic Arthritis Joint Activity Index (PsAJAI) was developed in a project led by Gladman and Mease, assisted by statisticians Farewell and Tom, in which data from 3 trials of anti-tumor necrosis factor (anti-TNF) agents in PsA were analyzed to create models, based primarily on statistical considerations and some clinical input, that best distinguished active drug from placebo10. Note that in this analysis, addition of a skin assessment, the Psoriasis Area and Severity Index (PASI), was problematic in that not all patients in these trials could be assessed for PASI given low skin scores. Indeed, inclusion of the PASI reduced the ability to discriminate between placebo and treatment. Anti-TNF therapy had a large effect on the PASI score; therefore, it was recommended that skin be scored separately. From the same data, response criteria currently used for PsA were examined, and logistic regression models based on the individual components of these response criteria were analyzed. The PsAJAI, modeled as ACR30, performed better than the ACR20 and PsARC, and was comparable to previously developed models. The PsAJAI is a weighted sum of 30% improvement in 6 measures with weights of 2 given to tender joint count, CRP, and physician global assessment of disease activity. Weights of 1 are given to the remaining 30% improvement measures including pain, patient global assessment of disease activity, and HAQ11.
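The weighting scheme can be illustrated as follows. This is only a sketch of the weighted sum as described above; the dictionary keys are illustrative, and no responder cutoff is applied because none is stated in this text.

```python
# Weights as described in the text: 2 for tender joint count, CRP, and
# physician global; 1 for pain, patient global, and HAQ.
PSAJAI_WEIGHTS = {
    "tender_joints": 2,
    "crp": 2,
    "physician_global": 2,
    "pain": 1,
    "patient_global": 1,
    "haq": 1,
}


def pct_improvement(baseline: float, followup: float) -> float:
    """Percentage decrease from baseline; 0 if baseline is 0 (assumption)."""
    if baseline <= 0:
        return 0.0
    return 100.0 * (baseline - followup) / baseline


def psajai_score(baseline: dict, followup: dict) -> int:
    """Weighted count of measures showing >= 30% improvement (range 0-9)."""
    return sum(
        weight
        for name, weight in PSAJAI_WEIGHTS.items()
        if pct_improvement(baseline[name], followup[name]) >= 30.0
    )
```

With these weights the maximum score is 9, reached when all 6 measures improve by at least 30%.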
In addition, a composite measure for defining “minimal disease activity” (MDA) has been developed. Coates led this project, which was assisted by GRAPPA members and involved a review of hypothetical cases, culminating in the definition of MDA criteria for PsA (Table 1)12. These criteria were validated by assessing patients in Gladman’s patient cohort in Toronto13 and in interventional trial datasets14. The development of this instrument is a step toward “treatment to target” in PsA.
Following a recent GRAPPA PsA-specific workshop at the eighth international consensus conference on Outcome Measures in Rheumatology Clinical Trials (OMERACT 8)15, 2 additional disease-specific measures have been proposed in addition to the PsAJAI described above. First, a Viennese group collected cross-sectional clinical and laboratory data on 105 patients with PsA and performed principal-component analysis on those clinical and laboratory variables recommended by the OMERACT workshop16. Four principal components were derived: patient global and pain VAS scores, tender and swollen joint counts, acute-phase reactant (CRP), and skin, although the latter did not reach statistical significance. The group then studied the existing composite measures and determined that these domains were best served by using the Disease Activity Index for the Assessment of Reactive Arthritis (DAREA)17. This measure, now renamed the Disease Activity in Psoriatic Arthritis (DAPSA) score, performed reliably, demonstrated sensitivity to change, and compared favorably to the DAS28 and SDAI (Simple Disease Activity Index) when applied retrospectively to a PsA trial dataset18. Further comparison of this measure with other existing instruments for PsA is awaited.
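The text lists the domains captured by the DAPSA but not its arithmetic. The sketch below assumes the commonly published form of the score, a plain sum of the 68-tender and 66-swollen joint counts, patient global and pain on 0 to 10 VAS, and CRP in mg/dL. The function name, parameter names, and units are assumptions for the example and should be checked against the original DAREA/DAPSA publications.

```python
def dapsa(tjc68: int, sjc66: int, patient_global_cm: float,
          pain_cm: float, crp_mg_dl: float) -> float:
    """Unweighted sum of joint counts, two 0-10 cm VAS items, and CRP (mg/dL).

    Assumed form of the score; units and ranges are assumptions.
    """
    if not (0 <= tjc68 <= 68 and 0 <= sjc66 <= 66):
        raise ValueError("joint counts out of range")
    if not (0 <= patient_global_cm <= 10 and 0 <= pain_cm <= 10):
        raise ValueError("VAS items are expected on a 0-10 cm scale")
    if crp_mg_dl < 0:
        raise ValueError("CRP cannot be negative")
    return tjc68 + sjc66 + patient_global_cm + pain_cm + crp_mg_dl
```

Because no transformation or weighting is involved, a score of this kind can be added up by hand, in contrast to the DAS28.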
Second, a new approach to constructing a disease assessment measure in PsA is based on a grid originally proposed by the GRAPPA group to guide treatment decisions in PsA19. Disease involvement is assessed in up to 5 domains: peripheral joints, skin, entheses, dactylitis, and spinal manifestations. For each domain, instruments are used to assess the extent of disease activity as well as the effect of involvement in that domain on patient function and health-related quality of life. Domains are scored 0–3, with empirical cutoffs for disease severity/activity proposed in each, largely based on the literature (Table 2). Individual domain scores are summed to give an overall, composite score (range 0–15). The Composite Psoriatic Disease Activity Index (CPDAI) shows significant correlation with patient and physician global assessments and appears to discriminate well between patients requiring treatment escalation and those in whom treatment is deemed to be effective20. In a cohort of 25 patients in whom treatment was changed, the median CPDAI score had decreased from 8.5 at baseline to 5.5 at 3 months of followup (p = 0.02), with the standardized response mean of 0.60.
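The summation step can be sketched as below. The sketch takes domain grades that have already been assigned (the cutoffs in Table 2 are not reproduced here) and treats a domain that is not listed as uninvolved, scoring 0; the domain keys are illustrative names.

```python
CPDAI_DOMAINS = ("peripheral_joints", "skin", "entheses", "dactylitis", "spine")


def cpdai(domain_scores: dict) -> int:
    """Sum of per-domain grades (each 0-3) over the 5 domains, range 0-15.

    Domains missing from `domain_scores` are treated as uninvolved (0),
    which is an assumption of this sketch.
    """
    unknown = set(domain_scores) - set(CPDAI_DOMAINS)
    if unknown:
        raise ValueError(f"unknown domains: {sorted(unknown)}")
    total = 0
    for domain in CPDAI_DOMAINS:
        grade = domain_scores.get(domain, 0)
        if grade not in (0, 1, 2, 3):
            raise ValueError(f"{domain} grade must be 0, 1, 2, or 3")
        total += grade
    return total
```

Keeping the per-domain grades alongside the total preserves the information about which domains are driving the score.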
The CPDAI has also recently been compared to the DAPSA using the large dataset obtained as part of the PRESTA study (Psoriasis Randomized Etanercept STudy in Subjects with Psoriatic Arthritis)21. During the first study period in the PRESTA trial, 752 patients were randomized to a double-blind, 2-period study that evaluated the safety and efficacy of 2 dosages of etanercept on skin and joint disease in psoriasis subjects with active PsA. Both CPDAI and DAPSA are effective in determining treatment response in patients treated with etanercept for active psoriasis and PsA. Joint responses were equally reflected by both composite scores; however, CPDAI, which better reflects other domains such as skin, enthesitis, and dactylitis, is the only composite score that can distinguish global treatment response between the 2 etanercept doses.
Development of a disease activity measure for ankylosing spondylitis
The methodological challenges of developing a composite disease activity measure are exemplified by recent publications in ankylosing spondylitis (AS)22. Currently available measures, including the composite Bath Ankylosing Spondylitis Disease Activity Index (BASDAI)23, do not correlate with structural damage and may not accurately reflect the full spectrum of disease where both axial and peripheral joints can be involved. The Assessment of SpondyloArthritis international Society (ASAS) initiated this process by conducting a Delphi exercise to determine important domains and instruments for measuring disease activity in AS. This was followed by a larger study where rheumatologists determined which patients required initiation of TNF inhibitors, a treatment decision that differentiated between active and inactive disease in this group of patients. To derive the principal component of the composite index, discriminant function and finally regression analyses were performed, using the discriminant function as the dependent variable and clinical and laboratory variables as independent variables. Each variable was weighted by the derived regression coefficient, and the following items were selected: fatigue (BASDAI question 1), back pain by VAS (BASDAI question 2), morning stiffness by VAS (BASDAI question 6), patient global by VAS, peripheral joint complaints (BASDAI question 3), and either CRP or ESR. Four candidate indices were proposed with CRP, ESR, fatigue, or patient global excluded22. All 4 indices performed well and better than the BASDAI. Subsequently, one index was selected by consensus within the ASAS group24. Based on feasibility, the ASAS-endorsed disease activity score (ASDAS) included back pain, duration of morning stiffness, patient global assessment, peripheral joint complaints, and CRP (substituting ESR if CRP unavailable). Further development is in progress.
OMERACT and the development of composite outcome measures
Since 1992, the international consensus conference, OMERACT, has led the way in the development of outcome measures for RCT and longitudinal observational studies in rheumatologic diseases. This is a data-driven process where candidate responder indices and their components are proposed and subsequently demonstrated to satisfy the “OMERACT filter” of truth, discrimination, and feasibility. Composite responder indices facilitate assessment of multiple domains of disease involvement, comparison of efficacy across populations, disease indications, and therapies, and may lead to a tiered approach to label indications. Composite responder and disease activity indices increase statistical power, particularly if they include domains that are not closely correlated. Such indices do not function well if they are not developed with data from RCT, and ideally they should be validated in both longitudinal studies and RCT.
The GRAPPA exercise to develop a composite measure for psoriatic arthritis (GRACE project)
The process of developing a truly composite measure for psoriatic disease started at OMERACT 8 and was developed further at the September 2008 annual GRAPPA conference in Leeds, UK25. The methodological approach followed that used in the development of the ASDAS index22. At OMERACT 8, domains and instruments appropriate to assess these domains were selected. When several instruments were available for certain domains, such as health-related quality of life, it was decided to be inclusive rather than exclusive and evaluate all, both generic and disease-specific. The construct used to determine active disease was change in treatment due to uncontrolled disease or adverse effects. Completed clinical research forms were circulated to all members of GRAPPA (n = 400), including dermatologists and rheumatologists. Collaborators were asked to include consecutive patients with PsA irrespective of whether they were undergoing treatment changes; it was hypothesized that change in treatment would indicate either failure of medication or change in activity of disease. It was anticipated that about one-third would have active disease as defined in the protocol. Data are being collected prospectively at baseline and at 3, 6, and 12 months. The target sample size was 300 subjects; at the time of the GRAPPA 2009 Stockholm meeting, the group had recruited 220 patients (123 male, 91 female, 6 as yet unknown), with mean age 46 years, mean duration of arthritis 8.6 years, and mean duration of psoriasis 16.1 years. Treatment change due to active disease occurred in 84 patients (38%). Analyses are in progress.
It should be noted that GRAPPA is not committed to any single approach in developing a composite measure of response. Data collected will permit a number of approaches and facilitate comparison of any proposed index with existing ones discussed here. A similar analysis is planned to that used in the development of the ASDAS index, designed to develop a measure represented by a single score. This approach has several advantages: it permits an assessment of disease activity at a glance, and appropriately defined cutoffs for high and low disease activity and remission can provide quantitative estimates of improvement according to both baseline score and change. There are disadvantages: a single score may underestimate improvements in some domains and deterioration in others that may be of importance in RCT of therapies that improve disease domains in different ways. Examples include agents that may improve skin manifestations more than articular, or those that benefit peripheral joints but not spinal manifestations or involvement of the enthesis. A single score may have the advantage of qualifying a patient for further treatment when they may not qualify based on disease activity in a single domain. Thus, a composite measure in an individual patient should also indicate which domains are affected.
If a single-score composite measure that includes all important domains of disease involvement is adopted, it will be important to keep it from becoming a cumbersome tool that is difficult to apply in clinical practice because of the calculations required to generate the score. Both the DAS28 and ASDAS, developed using similar methodology, require mathematical calculations and involve transformed variables (e.g., the natural logarithm of ESR). Nonetheless, the DAS28 remains widely used in clinical practice, partly because of the commercial distribution of small calculators. The requirement to have an acute-phase response as part of the score may also preclude immediate treatment decisions, and clinicians are relying on more simplified scoring systems to make their treatment decisions26.
Another proposal is to examine disease assessment in a modular fashion: a hybrid of the approach used by FitzGerald and colleagues16 and the organ-system approach utilized by the British Isles Lupus Assessment Group (BILAG) scoring system27. In such an instrument, each domain would be assessed individually for “no” to “maximal” involvement. Domains involved would be assessed numerically or on the basis of intention to treat, the latter approach represented either numerically or by using a letter as in the original BILAG index (A to E), where category A indicated severe disease thought to be sufficiently active to require disease-modifying treatment, and category E indicated that the domain had never been involved. Such an instrument might look like that given in Table 3, where the cutoffs for each domain have been selected arbitrarily.
The GRACE database will not only permit development of new indices, but will also enable a comparison of these new indices with existing instruments. The proposal is to compare these using receiver-operating characteristic analysis, standardized response means, and effect sizes. The longitudinal nature of the database will permit an examination of responsiveness, magnitude of change scores, cutoffs for high and low disease activity, and estimates of a patient-acceptable symptom state (PASS). Data will allow further examination of metric properties of existing and proposed instruments used to assess PsA. As an example, data should permit development of new tools to assess enthesitis, as the case report forms for GRACE include most of the points used in existing enthesitis-scoring indices such as MASES (Maastricht Ankylosing Spondylitis Enthesitis Score), LEI (Leeds Enthesitis Index), and SPARCC (Spondyloarthritis Research Consortium of Canada)28,29,30. These and new indices would be tested in other databases from RCT and existing treatment registries.
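Two of the responsiveness statistics named above have simple standard definitions: the standardized response mean is the mean change divided by the standard deviation of the change, and the effect size is the mean change divided by the standard deviation at baseline. The sketch below assumes paired scores in which a lower value means less disease activity, so change is computed as baseline minus followup; the function names are chosen for the example.

```python
from statistics import mean, stdev


def _changes(baseline: list, followup: list) -> list:
    """Per-patient improvement (baseline minus followup) for paired scores."""
    if len(baseline) != len(followup):
        raise ValueError("baseline and followup must be paired")
    return [b - f for b, f in zip(baseline, followup)]


def standardized_response_mean(baseline: list, followup: list) -> float:
    """Mean change divided by the sample SD of the change scores."""
    changes = _changes(baseline, followup)
    return mean(changes) / stdev(changes)


def effect_size(baseline: list, followup: list) -> float:
    """Mean change divided by the sample SD of the baseline scores."""
    changes = _changes(baseline, followup)
    return mean(changes) / stdev(baseline)
```

Both functions need at least 2 patients and some variation in the data, since the sample standard deviation is otherwise undefined or zero.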
Conclusions
Psoriatic arthritis, perhaps better termed psoriatic disease, is a complex heterogeneous condition with diverse clinical manifestations. Incorporating clinical assessment into a single composite measure presents several challenges. Existing composite measures mostly assess the articular component of the disease. Composite measures of response that reflect the full spectrum of disease are currently in the developmental stage.