Abstract
Objective. The Psoriatic Arthritis (PsA) Core Domain Set for randomized controlled trials and longitudinal observational studies has recently been updated. The joint counts are central to the measurement of the peripheral arthritis component of the musculoskeletal (MSK) disease activity domain. We report the Outcome Measures in Rheumatology (OMERACT) 2018 meeting’s approaches to seek endorsement of the 66/68 swollen and tender joint count (SJC66/TJC68) for inclusion in the PsA Core Outcome Measurement Set (COS).
Methods. Using the OMERACT Filter 2.1 Instrument Selection Process, the SJC66/TJC68 was assessed for (1) domain match, (2) feasibility, (3) numerical sense (construct validity), and (4) discrimination (test retest reliability, longitudinal construct validity, sensitivity in clinical trials, and thresholds of meaning). A protocol was designed to assess the measurement properties of the SJC66/TJC68 joint count. The results were summarized in a Summary of Measurement Properties table developed by OMERACT. OMERACT members discussed and voted on whether the strength of the evidence supported that the SJC66/TJC68 had passed the OMERACT Filter as an outcome measurement instrument for the PsA COS.
Results. OMERACT delegates endorsed the use of the SJC66/TJC68 for the measurement of the peripheral arthritis component of the MSK disease activity domain. Among patient research partners, 100% voted for a “green” endorsement, whereas among the group of other stakeholders, 88% voted for a “green” endorsement.
Conclusion. The SJC66/TJC68 is the first fully endorsed outcome measurement instrument using the OMERACT Filter 2.1 and the first instrument fully endorsed within the PsA COS.
Psoriatic arthritis (PsA) is a chronic inflammatory musculoskeletal and skin disease that is clinically heterogeneous with distinct manifestations including peripheral arthritis, spondylitis, enthesitis, and dactylitis, as well as skin and nail features. Additionally, the disease affects many domains of patients’ lives including fatigue, participation, and emotional well-being. The Group for Research and Assessment of Psoriasis and Psoriatic Arthritis (GRAPPA)-Outcome Measures in Rheumatology (OMERACT) working group developed a core domain set (Figure 1) to specify which key domains should be measured in randomized controlled trials (RCT) and longitudinal observational studies (LOS) for PsA. This was endorsed at the 2016 OMERACT meeting1,2. Since that time, many work streams have been initiated as part of the Core Outcome Measures for Psoriatic Clinical Trials (COMPACT) study3,4. The GRAPPA-OMERACT working group has been evaluating the measurement properties of multiple outcome measurement instruments to develop a PsA Core Outcome Measurement Set (COS) that would assist in standardizing what is measured in RCT and how it is measured (domains and instruments)5,6.
Among the domains included in the COS, musculoskeletal (MSK) disease activity is considered one of the most important for both patients and clinicians1. The MSK disease activity domain includes peripheral joints, enthesitis, dactylitis, and spine symptoms. The tender and swollen joint counts are central to the measurement of the peripheral arthritis element of MSK disease activity. While several joint counts exist7, there are no existing recommendations about which joint count to use in RCT or LOS measuring peripheral arthritis in PsA, and none have moved through the instrument selection process described by OMERACT.
The goal of the PsA workshop at OMERACT was to seek endorsement of the 66/68 swollen and tender joint counts (SJC66/TJC68; Figure 2) as one of the instruments for the PsA COS. In this paper, we describe the instrument selection process as recommended by OMERACT, summarize the plenary presentation, and present the voting results and discussion points from the PsA workshop and breakout groups at the OMERACT 2018 meeting.
METHODS AND RESULTS
Patient engagement in the working group
One of the key tenets of the OMERACT process is involving patient research partners (PRP) in the process of developing core outcome sets. In the work presented in this paper, PRP have been involved in all aspects of the project: 3 PRP are part of the GRAPPA-OMERACT working group steering committee. They have reviewed and provided feedback on protocols, prereading materials, and presentations, and helped plan the workshop. Further, PRP from GRAPPA and OMERACT have participated in small groups and were involved in surveys and Web-based seminars.
Instrument selection process
Using the OMERACT Filter 2.1 Instrument Selection Process (Figure 3), an instrument is first assessed for “Truth: domain match” and “Feasibility,” and if these 2 steps are met, the instrument may progress to the subsequent steps, “Truth 2: Numerical Sense” (i.e., construct validity) and “Discrimination” (measured by test-retest reliability, longitudinal construct validity, ability to distinguish between treatment and placebo groups in clinical trials, and thresholds of meaning)8,9. To seek endorsement of an instrument, the working group assembles the evidence for the instrument, appraises it, and provides an overall assessment of the instrument using a Summary of Measurement Properties (SOMP) table. In the absence of evidence in the available literature, new studies may be performed by the working group to fill the evidence void. The working group makes a recommendation for endorsement and the attendees then vote on whether they agree with this recommendation. At OMERACT, the voting groups are split into PRP and others to ensure that the patient voice is adequately represented. At least 70% agreement among voting attendees at the session from both groups suggests consensus with the working group recommendation6. For a more in-depth review of the instrument selection process, see the OMERACT Handbook8. The research protocol was reviewed and approved by the Institutional Review Board (IRB) of the University of Pennsylvania (IRB PROTOCOL#: 829776) for the PRP surveys and Webinars, while the rest of the project components were deemed exempt from IRB review. Trial participants in the original trials completed informed consent prior to participation. Patients who did not give consent for their data to be used for other studies were excluded from the additional trial analyses.
Evaluation of joint counts using the OMERACT process
A systematic literature search (SLR) was first performed to identify instruments that had been used to measure MSK disease activity, which includes peripheral joint activity, enthesitis, dactylitis, and spine symptoms in PsA, and to assess their measurement properties10,11. In our report, we focused on the evidence evaluating the SJC28/TJC28, SJC66/TJC68, and SJC76/TJC78. We addressed domain match and feasibility at the GRAPPA meeting in 2017 (Amsterdam, the Netherlands), as well as with the working group and PRP (described in more detail below). We assessed the measurement properties of the joint counts in the literature (and applied the OMERACT Good Methods Checklist to assess data quality) and analyzed measurement properties in clinical trial and LOS datasets (obtained from companies and principal investigators). The working group requested data from phase III trials published between 2010 and 2017, and data were included from 7 phase III RCT, the Tight Control of Inflammation in Psoriatic Arthritis trial, and 1 LOS, the Psoriatic Arthritis Research Consortium. A priori, a standardized protocol was designed to address content validity, construct validity, responsiveness, and discrimination.
We used these data to complete the SOMP table and presented this to the working group for a final recommendation. The results were then presented at the OMERACT meeting in Terrigal, Australia.
The PsA OMERACT Core Set Workshop at the GRAPPA 2017 meeting: domain match and feasibility of the joint counts as discussed by clinicians and other stakeholders
Domain match and feasibility for the SJC66/TJC68 were addressed at GRAPPA 2017 in a breakout group discussion, and following the meeting, among working group members using a Web-based survey. During the GRAPPA meeting, content validity and feasibility were addressed within a small group with clinicians, 2 patients, and a patient advocate; the voting sheets were completed by 22 people12. There was consensus (20/22, 91%) among the group that the SJC66/TJC68 was a match for the MSK disease activity/peripheral arthritis domain and that there was adequacy of content and no redundancies. Regarding feasibility, all the voters agreed that the SJC66/TJC68 was feasible.
Eighteen working group members completed a follow-up online survey. This survey documented the reasons for selecting the SJC66/TJC68 over the comparators (28 and 76/78 joint counts). The 28-joint count is a core measure for rheumatoid arthritis (RA) and is frequently performed in clinical practice. The 76/78 joint count is performed in some trials. Other joint counts beyond the 28, 66/68, and 76/78 (i.e., 32, 44, Ritchie index) were not sufficiently used in RCT or LOS to merit inclusion7.
Concerns have been raised about these joint counts in PsA: the 28-joint count does not include the joints of the feet, and those joints are frequently affected in PsA; this concern was raised by both PRP and clinicians. The 76/78 joint count includes the carpometacarpal (CMC) joints, typically involved in osteoarthritis (OA), and thus tenderness in this joint is difficult to attribute to PsA, and it separately includes the toe proximal and distal interphalangeal joints; these joints are difficult to decipher individually on examination, decreasing feasibility.
The 28-joint count did not meet domain match (does not cover key joints) and the 76/78-joint count had lower feasibility (difficult to distinguish between toe joints) and reduced domain match (CMC joint more often an OA joint) compared to the SJC66/TJC68. Given the results of the above discussions and surveys with all stakeholders, the working group decided to move forward only the SJC66/TJC68 through the OMERACT Filter (Figure 4).
Domain match and feasibility of the joint counts: PRP
To assess domain match and feasibility from the PRP perspective, a Web-based survey was designed with an embedded video of a clinician (AO) performing the SJC66/TJC68. Respondents were asked to note whether they agreed that the SJC66/TJC68 measured their perception of “peripheral arthritis disease activity” and whether it was feasible to complete within RCT or LOS visits. PRP representatives of GRAPPA and OMERACT were invited to participate in the survey, and 14 responded. Among those who responded, 9 voted green, 3 voted amber, and 1 voted white. For feasibility, 13 voted green and 1 voted white. After completion of the survey, 2 Web-based seminars were held with the participating PRP to discuss the results. The main point of confusion was that several patients did not endorse “domain match” because the SJC66/TJC68 did not include the entheses or the spine. AO reminded the group that enthesitis and spine symptoms are assessed using separate measures, and this explanation was satisfactory to those who voted “no” (although the group did not re-vote because the vote was mainly used to start the discussion). Some patients advocated for inclusion of the CMC joint as a common source of pain. PRP also noted that the feet and ankles are essential for inclusion in assessing peripheral arthritis in PsA.
Regarding feasibility, all patients felt that the SJC66/TJC68 is feasible. The only concern raised was that when patients are in a lot of pain, getting shoes on and off is uncomfortable and can decrease feasibility. Additionally, the patient needs sufficient time to respond during examination (i.e., if the SJC66/TJC68 is performed too quickly, there will not be sufficient opportunity to say “yes” to a tender joint). It was also noted by several PRP that for the joint count to be a valid assessment of peripheral arthritis, particularly tenderness, there needs to be communication between the physician and patient. There was discussion about the fact that the joint examination may miss a joint that was active within the past week but is not active that day. Finally, patients said that there was no clear meaning for “tender” and that the physician needs to explain the term before the joint count is performed.
Numerical sense (construct validity) and discrimination
We addressed numerical sense and discrimination through an SLR and analysis of RCT datasets. In the SLR, 1921 unique references regarding the 4 components of the MSK disease activity domain were identified, 159 were eligible for full-text article assessment, and 87 of these were excluded in this phase. Fifty-nine of the 72 remaining were excluded because they involved other components of the MSK disease domain (e.g., dactylitis) that were not pertinent for this report or because of a lack of enough data regarding the SJC66 and TJC68. Thirteen SJC66/TJC68 unique references were included in the good methods analysis. The good methods checklist is applied at the level of the instrument and measurement property tested rather than the level of the study; in our case, no study had some red and other evidence that was amber or green. Three studies had all their evidence as red and therefore were excluded, leaving 10 studies for inclusion (Figure 5).
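The screening counts reported above can be checked with a few lines of arithmetic. The following is a minimal sketch and not part of the original analysis: the function name `screening_flow` and the stage labels are illustrative assumptions, while the counts are those given in the text.

```python
def screening_flow(start, exclusions):
    """Return the number of records remaining after each exclusion step.

    start      -- number of records entering the first step
    exclusions -- list of (label, number_excluded) tuples, applied in order
    """
    remaining = start
    flow = []
    for label, excluded in exclusions:
        if excluded > remaining:
            raise ValueError(f"cannot exclude {excluded} from {remaining} at '{label}'")
        remaining -= excluded
        flow.append((label, remaining))
    return flow


# Counts as reported in the text; the labels are illustrative.
steps = [
    ("full-text exclusions", 87),
    ("other MSK components or insufficient SJC66/TJC68 data", 59),
    ("all evidence rated red", 3),
]
flow = screening_flow(159, steps)
for label, remaining in flow:
    print(f"after {label}: {remaining}")
```

Starting from the 159 full-text articles, the three steps leave 72, 13, and 10 references, which matches the numbers in the paragraph above.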
The list of articles and the summary of findings are included in Table 1 (references 13–27). The results suggest that SJC66 and TJC68 have construct validity. TJC68 has adequate interrater reliability while SJC66 does not (intraclass correlation coefficient [ICC] < 0.75)28. Regarding responsiveness and discrimination, SJC66 and TJC68 change over time in response to treatment (values in the placebo group also changed, but less), and the change in SJC66 and TJC68 can distinguish between patients receiving an effective therapy and those receiving placebo. We similarly addressed measurement properties including responsiveness and discrimination in RCT datasets (manuscript in progress, data presented at the OMERACT meeting). Standardized response means ranged from −0.9 to −0.5 for the SJC66 and −0.9 to −0.4 for the TJC68, thus mostly in the moderate effect range. Standardized mean differences (treatment compared to placebo) ranged from −0.7 to −0.2 for the SJC66 and −0.6 to −0.2 for the TJC68.
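For readers unfamiliar with the two effect-size statistics, the sketch below shows one common way to compute them. It is an assumption, not the working group's code: the text does not state the exact formulas used, so the standardized response mean is taken here as the mean change divided by the standard deviation of the change scores, and the standardized mean difference as the between-arm difference in mean change divided by the pooled standard deviation. The input values are hypothetical and are not meant to reproduce the reported ranges.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_response_mean(changes):
    """SRM: mean change from baseline divided by the SD of the change scores."""
    return mean(changes) / stdev(changes)


def standardized_mean_difference(treated_changes, placebo_changes):
    """SMD: difference in mean change between arms divided by the pooled SD."""
    n_t, n_p = len(treated_changes), len(placebo_changes)
    var_t = stdev(treated_changes) ** 2
    var_p = stdev(placebo_changes) ** 2
    pooled_sd = sqrt(((n_t - 1) * var_t + (n_p - 1) * var_p) / (n_t + n_p - 2))
    return (mean(treated_changes) - mean(placebo_changes)) / pooled_sd


# Hypothetical change-from-baseline values for a joint count
# (negative values indicate improvement); not data from any trial.
treated = [-6, -4, -8, -2, -5]
placebo = [-2, 0, -3, -1, -4]

srm = standardized_response_mean(treated)
smd = standardized_mean_difference(treated, placebo)
print(f"SRM (treated arm): {srm:.2f}")
print(f"SMD (treated vs placebo): {smd:.2f}")
```

With this sign convention, a negative value means the joint count fell, and a more negative standardized mean difference means a larger separation between treatment and placebo.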
The working group concluded that the SJC66/TJC68 meets the OMERACT criteria for domain match, feasibility, truth, and discrimination. The instruments’ shortfalls are relatively low interrater reliability for the SJC only and a lack of studies addressing intrarater reliability of the TJC/SJC in PsA (Table 1).
OMERACT 2018 PsA Workshop: plenary presentation and breakout group discussions
In the plenary presentation, we presented the evidence that addressed each of the 4 steps of the OMERACT Filter 2.1 Instrument Selection Process for SJC66/TJC68. Data from these studies were summarized in the SOMP table (Table 1).
After the plenary presentation, 8 breakout groups were asked to discuss the 4 measurement properties (content validity/domain match, feasibility, construct validity, and discrimination) and vote on agreement with the working group’s assessment of green (good to go), amber (some concerns raised), or red (not endorsed). Breakout groups were facilitated by 1 OMERACT-trained facilitator and 1 reporter; reporters were part of the working group or experienced researchers. During breakout groups, the participants had the option to raise concerns regarding the working group assessment of green. Overall, most participants agreed with our assignment of green for content validity/domain match, feasibility, and construct validity. Feasibility concerns came up for some groups in that the SJC66/TJC68 takes longer than the reduced joint counts, but overall, the majority felt that the SJC66/TJC68 is feasible in the setting of an RCT or LOS. In some groups, concerns were raised about discrimination, mainly centered around the insufficient data for test-retest reliability and thresholds of meaning (both with only 1 unpublished study available in PsA). Additionally, the concern about the relatively low interrater reliability of the SJC was raised. This was countered by the argument that in most RCT, the assessor is the same throughout the study and test-retest reliability, or intrarater reliability, in a single unpublished study was found to be quite high (ICC 0.8–0.9; Tillett, et al, unpublished). Further, clinicians are generally asked to undergo training prior to trial participation to increase interrater reliability13. Reasons for endorsement of the SJC66/TJC68 that were raised included sending a clear message that this is the preferred joint count based upon evidence to assist in standardizing joint counts among RCT.
A broader discussion was raised in the small groups regarding the meaning of full endorsement of an instrument (green) or provisional endorsement (amber). Some wondered whether a green instrument would then become mandatory, similar to the inner circle of a core domain set. However, in the PsA workshop, green was used to denote measurement properties sufficient to say confidently that the instrument is good, and amber was used to indicate that although this is a good instrument that could be used, further research is still required on its measurement properties. It is possible that multiple instruments for the same domain will pass through the filter at a green level, thus requiring a subsequent consensus process to identify the best instrument. Discussion then turned to defining “good enough.” We assigned an amber to test-retest reliability and thresholds of meaning because only 1 unpublished study was available for each. The OMERACT Handbook suggests that the instrument should then be amber. However, the working group felt that the instrument (SJC66/TJC68) should be endorsed as “green” because the data in all other domains were collectively excellent; the studies evaluating test-retest reliability and thresholds of meaning were sufficient; and further research on these domains, though supportive, is not critical to inform the preferred use of the SJC66/TJC68 over other joint counts.
Vote for the 66/68 joint counts
Following report back from the groups and discussion, a vote was held for the endorsement of the instrument. Among PRP, 100% (of 14 patient votes in total) voted for a green endorsement. Among all other stakeholders, 88% (84 of 96 votes in total) voted for a green endorsement.
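The consensus rule described under the instrument selection process (at least 70% agreement in both voting groups) can be applied to these counts as follows. This is a minimal sketch under stated assumptions: the function names and the dictionary layout are illustrative, and the PRP tally is taken as 14 of 14 because the text reports 100% of 14 votes.

```python
CONSENSUS_THRESHOLD = 0.70  # at least 70% agreement, as stated in the text


def agreement(votes_for, total_votes):
    """Proportion of votes agreeing with the working group recommendation."""
    if total_votes <= 0:
        raise ValueError("total_votes must be positive")
    return votes_for / total_votes


def consensus_reached(groups, threshold=CONSENSUS_THRESHOLD):
    """True only if every voting group meets the threshold."""
    return all(agreement(v, n) >= threshold for v, n in groups.values())


# Vote counts as reported in the text: (green votes, total votes).
votes = {
    "patient research partners": (14, 14),
    "other stakeholders": (84, 96),
}
for group, (green, total) in votes.items():
    print(f"{group}: {agreement(green, total):.1%} green")
print("consensus:", consensus_reached(votes))
```

The stakeholder proportion is 84/96 = 87.5%, which the text rounds to 88%; both groups exceed the 70% threshold.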
DISCUSSION
Through the years, the lack of standardization of the instruments to measure peripheral arthritis in PsA has resulted in the use of different instruments in RCT and LOS.
After a careful assessment by PRP, clinicians, methodologists, representatives of the pharmaceutical industry, and other stakeholders, and in accordance with the OMERACT Filter 2.1, the evidence supporting the measurement properties of the SJC66/TJC68 was assessed and resulted in full endorsement (green) by OMERACT as an instrument to measure MSK disease activity/peripheral arthritis in PsA. The SJC66/TJC68 is the first green instrument to enter the PsA COS.
The MSK disease activity domain includes the heterogeneous disease manifestations of PsA: enthesitis, dactylitis, spondylitis/axial arthritis, and peripheral arthritis. An ongoing program will assess and eventually seek endorsement of the optimal instruments that measure the other components of the MSK disease activity domain. While the joint count is the first to go through the filter for this domain, others will be added in the future as the additional work streams proceed through additional systematic literature reviews and consensus processes. In the meantime, PRP, regulatory agencies, investigators developing protocols for RCT and LOS, and other stakeholders can be confident with the SJC66/TJC68, and adoption of the SJC66/TJC68 will be monitored in published RCT and LOS.
Acknowledgment
We thank Kathleen Bush and Christina Burgese for administrative support. We also thank Janssen Scientific Affairs LLC for its assistance in identifying access to trial data through the YODA (Yale Open Data Access) Project. We would like to thank UCB, Novartis, and Pfizer for their scientific partnership by analyzing their clinical data of the certolizumab RAPID-PsA, secukinumab FUTURE I & II, and tofacitinib PsA OPAL studies, respectively, to support the OMERACT-GRAPPA working group.
Footnotes
Several parts of this study were funded by the Rheumatology Research Foundation Innovative Research Award (PI Ogdie) and some by R01-AR072363 (PI Ogdie). The Parker Institute, Bispebjerg and Frederiksberg Hospital (RC) is supported by a core grant from the Oak Foundation (OCAY-13-309). A.M. Orbai is a Jerome L. Greene Foundation Scholar and is supported in part by a research grant from the US National Institute of Arthritis and Musculoskeletal and Skin Diseases of the National Institutes of Health under award number P30-AR070254 (Core B), a Rheumatology Research Foundation Scientist Development award, and a Staurulakis Family Discovery award. A. Duarte-García is supported by the Mayo Clinic Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery, which receives no industry funding.
Accepted for publication January 30, 2019.