Abstract
Objective. To describe the development and validation of the Systemic Lupus Erythematosus Disease Activity Index 2000 (SLEDAI-2K) Responder Index 50 (SRI-50), an index to measure improvement in disease manifestations on followup visits.
Methods. We proposed 50% improvement of SLEDAI-2K scores as this was felt by clinicians to reflect a clinically important improvement. We determined the best definitions of 50% improvement in each of the SLEDAI-2K descriptors. The SRI-50 data retrieval form was developed to standardize the documentation of the descriptors. The new assigned scores for the descriptors of SRI-50 were derived by dividing the score of SLEDAI-2K by 2. To evaluate the construct validity of SRI-50, all patients attending the Lupus Clinic from September 2009 to December 2009 were studied. Patients were assessed initially and on a followup visit according to both SLEDAI-2K and SRI-50 along with physician response assessment on a Likert scale (LS), which was considered the external construct.
Results. SRI-50 is a 2-page document comprising 24 descriptors. The scoring method is simple, intuitive, and cumulative, and the score can be derived during the patient visit. Of the 298 patients enrolled in this study, 141 had a followup visit and were studied further. SRI-50 scores decreased more in patients with LS-rated improvement ≥ 50% than in those with improvement < 50%, with a decrease of > 3. The decrease in SRI-50 scores was statistically and clinically more significant than the decrease in SLEDAI-2K scores. SRI-50 detected incomplete improvement, which would not have been discerned using SLEDAI-2K.
Conclusion. SRI-50 has construct validity and is able to demonstrate incomplete, but clinically significant, improvement in disease activity between visits in patients with lupus.
- SYSTEMIC LUPUS ERYTHEMATOSUS DISEASE ACTIVITY INDEX 2000
- SLEDAI-2K
- SRI-50
- RESPONDER INDEX
- DISEASE ACTIVITY
- SYSTEMIC LUPUS ERYTHEMATOSUS
- OUTCOME MEASURES
Systemic lupus erythematosus (SLE) is a complex disease characterized by various clinical manifestations that can be related to acute disease activity or to chronic damage, which makes the disease difficult to monitor. In 1998, Outcome Measures in Rheumatoid Arthritis Clinical Trials (OMERACT) 4 recommended that 5 domains be assessed in all SLE clinical trials and longitudinal observational cohort studies: disease activity, damage resulting from lupus activity or its therapy, health related quality of life, adverse events, and economic costs including health utilities1,2.
Disease activity is defined as a reversible manifestation of the underlying inflammatory process and is a reflection of the type and severity of organ involvement at each point in time3. The assessment of disease activity depends on the use of standardized, reliable, and validated indices. For this purpose several indices for scoring disease activity in SLE are currently used3,4,5,6,7,8,9,10,11,12,13,14,15. Of those that have been validated, 2 indices, Systemic Lupus Erythematosus Disease Activity Index (SLEDAI) and British Isles Lupus Assessment Group Index (BILAG), have been used most in clinical trials and have undergone changes to assure optimal performance in clinical and research settings3,8.
SLEDAI is a global index that was developed, validated, and introduced in 1985 as a clinical index for disease activity. This index was modeled on clinicians’ global judgment. It was developed with a panel of experienced rheumatologists with expertise in SLE, using well established group techniques and index development methods. SLEDAI is based on the presence of 24 features in 9 organ systems and measures disease activity in SLE patients in the previous 10 days3. SLEDAI has been used successfully by both expert clinicians and trainees, and has been adopted in both research and clinical settings16,17,18,19. SLEDAI is sensitive to change in disease activity over time20. In 2002 a revised version of SLEDAI, SLEDAI-2000 (SLEDAI-2K), was introduced, in which persistent active disease in the items rash, alopecia, mucosal ulcers, and proteinuria is scored, as opposed to only new occurrences as in the original SLEDAI4. SLEDAI-2K was validated against the original SLEDAI and was shown to describe disease activity at various activity levels in a manner comparable to the original SLEDAI4. However, both SLEDAI and SLEDAI-2K document findings in the 10 days prior to the visit4. Since patients in drug trials are followed at 30-day intervals, the SLEDAI-2K was validated for a 30-day window both in a cross-sectional study and longitudinally over 1 year6,7.
SLEDAI-2K records features of disease activity in lupus as present or absent2,3,4. Thus its utility in clinical trials is limited as it cannot reflect partial improvement in a disease manifestation. This led us to develop the SLEDAI-2K Responder Index-50 (SRI-50), which could document a minimum 50% improvement in disease manifestations among lupus patients.
Our aims were: (1) to describe the development of SRI-50, an index derived from SLEDAI-2K to measure at least 50% improvement in disease activity; (2) to describe the development of the SRI-50 data retrieval form that would standardize the method of scoring of the descriptors; and (3) to test the construct validity of SRI-50 as a responder index measuring global disease activity improvement.
MATERIALS AND METHODS
Derivation of SRI-50 definitions, SRI-50 data retrieval form, and SRI-50 scores
We used SLEDAI-2K to develop the new responder index, SRI-50 (ref. 3). A minimum of 50% improvement was felt by clinicians to reflect a significant improvement.
SRI-50 definitions
We searched the literature and generated definitions to identify 50% improvement in each of the 24 descriptors of SLEDAI-2K. The agreed-upon descriptor definitions of SRI-50 and the SRI-50 data retrieval form were evaluated first by 3 rheumatologists (ZT, DDG, and MBU) on several occasions. These definitions were then discussed with and refined by other lupologists and rheumatologists at a series of inter-divisional meetings to establish areas for improvement, mostly concerning the wording of the index. Where appropriate, changes were made in accord with the suggestions.
Rules for ascertainment were provided for each of the descriptors, whether they were based on physical findings, laboratory results, patient self-evaluation, physician evaluation of variables such as cognitive dysfunction, or diagnostic tools (Table 1). Each descriptor refers to the preceding 30 days, as in the 30-day version of SLEDAI-2K, and descriptors are measured in a generally accepted way (Table 2)6,7.
The SRI-50 data retrieval form was developed to standardize the recording of SLEDAI-2K descriptors in an efficient way to allow calculation of SRI-50 scores (Table 3).
SRI-50 score is evaluated at the followup visit and corresponds to the sum of each of the 24 descriptors’ scores found on the SRI-50 data retrieval form. For patients with multiple followup visits, we recommend determining the score of the SRI-50 using the baseline visit scores. For patients who become worse after a period of improvement in a specific manifestation, or if a new manifestation develops in a followup visit, a subgroup analysis can be conducted to include that visit in the determination of SRI-50.
Assessment of construct validity
Construct validity of the SRI-50 definitions and the SRI-50 data retrieval form was assessed.
Patient selection
We conducted a cross-sectional study on all patients who attended the Lupus Clinic from September 2009 to December 2009. Of the 298 patients enrolled, 141 had a followup visit and were studied further.
Patient assessment
Patients were assessed initially (at an anchor visit) and then reassessed, after treatment was initiated or adjusted, in 1 to 3 months. SLEDAI-2K 0 (anchor visit) was determined at the baseline visit and the SRI-50 data retrieval form was completed. SLEDAI-2K 1 (followup visit) and SRI-50 scores were determined at a followup visit at 1 to 3 months. During the first visit a physician global assessment was recorded on a 100-mm visual analog scale (VAS), with anchors of 0 for “no disease activity” and 10 for “very active disease.” During the followup visit a physician response assessment was recorded on a 7-point Likert scale (LS): 7 = much improved, 6 = moderately improved, 5 = slightly improved, 4 = unchanged, 3 = slightly worse, 2 = moderately worse, and 1 = much worse. We defined a 50% improvement as LS score ≥ 6.
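The Likert scale coding and the ≥ 50% improvement threshold described above can be written down compactly. The following is a minimal sketch only; the names `LIKERT_LABELS`, `RESPONDER_THRESHOLD`, and `is_responder` are hypothetical and are not part of the SRI-50 data retrieval form.

```python
# Physician response assessment on the 7-point Likert scale (LS),
# coded as described in the text.
LIKERT_LABELS = {
    7: "much improved",
    6: "moderately improved",
    5: "slightly improved",
    4: "unchanged",
    3: "slightly worse",
    2: "moderately worse",
    1: "much worse",
}

# The study defined a 50% improvement as an LS score >= 6.
RESPONDER_THRESHOLD = 6


def is_responder(ls_score: int) -> bool:
    """Return True if the LS score meets the study's >= 50% improvement cutoff."""
    if ls_score not in LIKERT_LABELS:
        raise ValueError(f"LS score must be an integer from 1 to 7, got {ls_score!r}")
    return ls_score >= RESPONDER_THRESHOLD
```

Under this coding, LS 6 and LS 7 count as ≥ 50% improvement, while LS 5 (slightly improved) does not.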
Clinician scoring of disease activity
A clinician who did not know the patients and who was not aware of the SLEDAI-2K scores evaluated each patient’s record and assigned a clinical activity score for each assessment as improved, same, or worse, using standardized predefined definitions. “Improved” was defined as one of the following: (1) stopping treatment in the presence of improvement of an already active system or in response to complete remission of an active system; (2) a decrease in medication dosage for the above reason; (3) indication of improvement in SLE disease activity in the physician’s notes; (4) use of the term improvement in the physician’s notes. “Worse” was defined as one of the following: (1) introduction of new treatment in the presence of worsening of an already active system, or in response to activation of a new system; (2) increase in medication dosage for the above reasons; (3) indication of concern regarding SLE disease activity in the physician’s notes: arrangement for an earlier appointment/investigation to assess SLE disease activity; (4) the use of the term flare/worsening in the physician’s notes; (5) new diagnosis of SLE (new presentation, not just the accumulation of American College of Rheumatology criteria21)22. “Same” was defined as no change in disease activity in patients who did not qualify for the definitions of improved or worse.
Method and analysis
Descriptive statistics were used to describe the characteristics of the patients. We determined the mean change of SRI-50 scores (Δ SRI-50 = SLEDAI-2K 0 – SRI-50) and the mean change of SLEDAI-2K scores (Δ SLEDAI-2K = SLEDAI-2K 0 – SLEDAI-2K 1) among patients who improved, got worse, and remained unchanged as determined by the external physician.
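As an illustration of the two change scores defined above, the sketch below computes Δ SRI-50 and Δ SLEDAI-2K per patient and averages them within each external-physician category. The record layout (keys `group`, `sledai2k_0`, `sledai2k_1`, `sri50`) and the function name are assumptions made for this example, and the sample values are invented, not study data.

```python
from collections import defaultdict
from statistics import mean


def mean_changes_by_group(records):
    """Mean change scores per external-physician category.

    Each record is a dict with hypothetical keys:
      group      -- "improved", "same", or "worse"
      sledai2k_0 -- SLEDAI-2K at the anchor (baseline) visit
      sledai2k_1 -- SLEDAI-2K at the followup visit
      sri50      -- SRI-50 at the followup visit
    """
    deltas = defaultdict(lambda: {"sri50": [], "sledai2k": []})
    for r in records:
        # Delta SRI-50 = SLEDAI-2K 0 - SRI-50
        deltas[r["group"]]["sri50"].append(r["sledai2k_0"] - r["sri50"])
        # Delta SLEDAI-2K = SLEDAI-2K 0 - SLEDAI-2K 1
        deltas[r["group"]]["sledai2k"].append(r["sledai2k_0"] - r["sledai2k_1"])
    return {
        group: {
            "mean_delta_sri50": mean(d["sri50"]),
            "mean_delta_sledai2k": mean(d["sledai2k"]),
        }
        for group, d in deltas.items()
    }


# Invented example values, for illustration only.
sample = [
    {"group": "improved", "sledai2k_0": 8, "sledai2k_1": 8, "sri50": 4},
    {"group": "improved", "sledai2k_0": 6, "sledai2k_1": 4, "sri50": 4},
    {"group": "same", "sledai2k_0": 4, "sledai2k_1": 4, "sri50": 4},
]
summary = mean_changes_by_group(sample)
```

In the first invented record the SLEDAI-2K score does not move (Δ SLEDAI-2K = 0) while the SRI-50 registers a change of 4, which is the kind of partial improvement the index is meant to capture.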
External construct
The external construct was the LS. We further divided the patients who improved, as determined by the external physician, into 4 groups: LS 4 (unchanged), LS 5 (slightly improved), LS 6 (moderately improved; ≥ 50%), and LS 7 (much improved; ≥ 50%). We determined the mean change of SRI-50 (Δ SRI-50) scores and the mean change of SLEDAI-2K (Δ SLEDAI-2K) scores in each of the 4 groups. Spearman correlation coefficients were determined between LS (4, 5, 6, and 7) and each of Δ SRI-50 and Δ SLEDAI-2K. The paired t test was used to compare the mean Δ SRI-50 and the mean Δ SLEDAI-2K scores in patients with LS ≥ 6 to those with LS 4–5. We hypothesized that patients who had ≥ 50% improvement (LS 6–7) would be identified better by SRI-50 than by SLEDAI-2K, and that the change in their SRI-50 scores would meet the definition of improvement by SLEDAI-2K (decrease > 3).
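For readers who want to reproduce the rank correlation step, the following is a minimal standard-library sketch of a Spearman coefficient, computed as the Pearson correlation of average ranks so that tied LS values are handled. It is not the statistical software used in the study, the function names are hypothetical, and it does not compute a p value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))
```

A call such as `spearman(ls_scores, delta_sri50_scores)` would take one LS score and one Δ SRI-50 value per patient; both argument names are placeholders for this example.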
Written consent was obtained from all patients and the study was approved by the Institutional Review Board at the University of Toronto, Toronto Western Hospital.
RESULTS
Derivation of SRI-50 definitions, SRI-50 data retrieval form, and SRI-50 score
The SRI-50 definitions were developed as a 2-page document including 24 definitions for the descriptors of SLEDAI-2K to define ≥ 50% improvement (Table 2).
The SRI-50 data retrieval form is a 2-page document to standardize the recording of descriptors to allow the calculation of SRI-50 (Table 3). The assigned scores for the descriptors of SRI-50 were derived by dividing the score of the corresponding SLEDAI-2K descriptor by 2.
For the individual descriptors, separate approaches are utilized by both physician and patient to evaluate improvement between visits. The physician analyzes the results of physical findings and laboratory and diagnostic results (radiological, electrocardiogram, and others), all based on hard, well defined outcomes, to complete the SRI-50 data retrieval form (Table 1). For the descriptors that are more subjective and require patient self-evaluation (namely cranial nerve disorder, headache, the pain of pleurisy and pericarditis, and diffuse alopecia), the SRI-50 data retrieval form records the patient self-evaluation on a numerical scale ranging from 1 to 10 (1 is mild and 10 is most severe). For descriptors requiring a health professional’s evaluation, e.g., ≥ 50% improvement in cognitive dysfunction, the percentage improvement discerned by the professional is recorded on the data retrieval form. The physician collects and records the patients’ information on the SRI-50 data retrieval form (Table 3).
For the descriptors related to neurolupus, and more specifically psychosis and organic brain syndrome, we left it to the rheumatologist to determine if there is ≥ 50% improvement or not. Presumably the rheumatologist will confer/consult with other healthcare providers with expertise in this area, e.g., neuropsychologists or psychiatrists, to help judge percentage improvement. In trials looking specifically at these outcomes such expertise could be included in the design. As an example, in a trial of therapy for the treatment of acute cognitive dysfunction, evaluation by a neurocognitive expert would be included.
Practical applicability, administration, scoring
Administration
The SRI-50 data retrieval form is completed by the physician during the visit based on the history and clinical and laboratory findings. A complete history and physical examination are required in addition to the laboratory results related to the index. Similarly to SLEDAI-2K, for most patients it takes a couple of minutes to complete the form.
Scoring
The method of scoring is simple, cumulative, and intuitive as in the original SLEDAI-2K. In general, if required, one session of training is enough to become familiar with the definitions of SRI-50.
When a descriptor is recorded as present at the initial visit, one of 3 situations can follow: (1) the descriptor achieves complete remission at followup, in which case the score would be “0”; (2) the descriptor does not achieve a minimum of 50% improvement at followup, in which case the score would be identical to its corresponding SLEDAI-2K value; or (3) the descriptor improves by ≥ 50% (according to the SRI-50 definition) but has not achieved complete remission, in which case the score is evaluated as one-half the score that would be assigned for SLEDAI-2K. If a descriptor was not present at the initial visit, the value for SRI-50 at the followup visit will be the same as that for SLEDAI-2K. This process is repeated for each of the 24 descriptors. Finally, the SRI-50 score at followup is evaluated as the sum of the 24 individual descriptor scores.
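The scoring rules above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the SRI-50 data retrieval form itself: the names `FollowupStatus`, `descriptor_sri50_score`, and `sri50_total` are hypothetical, the judgment of whether a descriptor improved by ≥ 50% is taken as an input (it comes from the SRI-50 definitions in Table 2), and only 4 of the 24 descriptors are shown in the example.

```python
from enum import Enum


class FollowupStatus(Enum):
    ABSENT = "absent"                  # not present at followup (complete remission if present at baseline)
    IMPROVED_AT_LEAST_50 = "improved"  # still present, but improved by >= 50% per the SRI-50 definition
    PRESENT = "present"                # present with < 50% improvement, or newly present


def descriptor_sri50_score(weight, present_at_baseline, status):
    """SRI-50 score for one descriptor, given its SLEDAI-2K weight."""
    if status is FollowupStatus.ABSENT:
        return 0.0
    if status is FollowupStatus.IMPROVED_AT_LEAST_50:
        if not present_at_baseline:
            raise ValueError("a descriptor absent at baseline cannot be scored as improved")
        return weight / 2  # one-half of the SLEDAI-2K score
    # No >= 50% improvement, or not present at baseline: same as SLEDAI-2K.
    return float(weight)


def sri50_total(descriptors):
    """Sum descriptor scores; each item is (weight, present_at_baseline, status)."""
    return sum(descriptor_sri50_score(w, b, s) for w, b, s in descriptors)


# Illustrative patient (invented), showing 4 of the 24 descriptors.
example = [
    (4, True, FollowupStatus.IMPROVED_AT_LEAST_50),  # arthritis: improved >= 50%  -> 2
    (2, True, FollowupStatus.ABSENT),                # rash: complete remission    -> 0
    (4, True, FollowupStatus.PRESENT),               # proteinuria: < 50% improved -> 4
    (2, False, FollowupStatus.PRESENT),              # mucosal ulcers: new         -> 2
]
print(sri50_total(example))  # 8.0
```

For this invented patient the followup SLEDAI-2K would be 10 (4 + 0 + 4 + 2), whereas the SRI-50 is 8, the difference being the half-credit for the partially improved arthritis.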
As recommended by the US Food and Drug Administration (FDA) in clinical trials, landmark analyses are important for comparing current patient scores versus scores recorded at their baseline visit. These landmark comparisons can be made at a series of intervals, e.g., at 2, 4, and 6 months (vs anchor visit), even though the primary outcome may be 6 months. Any deterioration in SRI-50 at 2 or 4 months would indicate a worsening in the original disease manifestation or the development of a new disease manifestation. Such occurrences could be secondary outcomes. In clinical practice, the physician is interested in how the patient is today compared to the last visit and here the comparison to the last visit may be appropriate.
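A landmark comparison of this kind could be tabulated as follows. This is a sketch under assumptions: the function name and output layout are hypothetical, the scores are invented, and "deterioration" is operationalized here as an SRI-50 score higher than the score at the preceding landmark (or than the baseline SLEDAI-2K for the first landmark), which is one possible reading of the text rather than a rule stated in it.

```python
def landmark_changes(baseline_sledai2k, sri50_by_month):
    """Compare SRI-50 at each landmark month with the anchor (baseline) visit.

    sri50_by_month maps a month number (e.g., 2, 4, 6) to the SRI-50 score
    recorded at that visit.
    """
    results = []
    previous = baseline_sledai2k
    for month in sorted(sri50_by_month):
        score = sri50_by_month[month]
        results.append({
            "month": month,
            # Positive values indicate improvement relative to baseline.
            "change_vs_baseline": baseline_sledai2k - score,
            # Assumption: a rise since the preceding landmark counts as deterioration.
            "deteriorated": score > previous,
        })
        previous = score
    return results


# Invented scores, for illustration only.
landmarks = landmark_changes(10, {2: 6, 4: 8, 6: 4})
```

In this invented series the month 4 visit would be flagged as a deterioration even though the month 6 primary outcome still shows improvement over baseline, which is the situation the text suggests treating as a secondary outcome.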
Testing of concurrent construct validity
Between September 2009 and December 2009, of the 298 patients enrolled, 141 patients had followup visits and were studied further. The majority of the 141 patients were female (89.4%). The patients’ ethnic distribution was Caucasian 57.4%, Black 16.3%, Asian 9.9%, and other 16.3%. The mean age at diagnosis of SLE was 29.1 ± 11.4 years and age at first visit in this study was 44.5 ± 12.8 years. Patient characteristics are presented in Table 4.
Change in SLEDAI-2K and SRI-50 scores in patients as determined by external physician
The external physician rated patients as follows: 14 patients were worse, 65 the same, and 62 improved. SRI-50 scores did not decrease significantly below their presenting SLEDAI-2K score in patients who remained stable or worsened. In patients who improved as determined by the external physician, the SRI-50 score decreased by a mean of 2.40 ± 3.11, while SLEDAI-2K scores decreased by 1.65 ± 2.91 (Table 5). This decrease in the score of SRI-50 reflects partial improvement in the descriptors that was not determined by SLEDAI-2K on followup visit.
Change in SLEDAI-2K and SRI-50 scores in patients who improved in association with the external construct
The external construct, LS, correlated moderately with the SLEDAI-2K (r1 = 0.39; p = 0.02) and with the SRI-50 (r2 = 0.48; p < 0.0001). It is not surprising that SLEDAI-2K detected improvement when there was complete resolution of a feature, which could happen with LS improvement ≥ 6. Moreover, SLEDAI-2K scores decreased with LS 4–5 (0.69 ± 2.40) and to a greater extent with LS ≥ 6 (2.89 ± 3.09) (p = 0.03). However, SRI-50 scores decreased to a greater extent with both LS 4–5 (1.06 ± 2.48) and LS ≥ 6 (4.15 ± 3.01) (p < 0.0001). Importantly, in patients with LS ≥ 6 the decrease in SRI-50 scores on followup visit was statistically and clinically more significant than the decrease in SLEDAI-2K scores, meeting the definition of improvement by SLEDAI-2K, with a reduction > 3 (Table 6).
DISCUSSION
In the development of SLEDAI and its updated version, SLEDAI-2K, investigators were focused on describing disease activity and documenting descriptors as present or absent3,4,5. In clinical trials and observational studies it is very important to identify improvement related to treatment between visits. The improvement need not be total resolution to suggest that a therapeutic agent is useful. Recognition that SLEDAI-2K and other disease activity measures adopted in clinical trials have limited ability to identify partial improvement led us to consider developing alternative measures for monitoring disease activity of lupus patients23. The development of a new measurement based on a simple known index is an important advance.
We describe development and initial validation of a novel clinical index measuring improvement of SLE disease activity between visits, the SRI-50. SLEDAI-2K is a reliable and valid index that has been adopted in clinical trials and observational studies3,24. Our goal was to modify SLEDAI-2K to allow it to record partial improvement in disease activity, which would be useful to detect response to treatment in both clinical trials and observational studies. A minimum 50% improvement was felt by clinicians to reflect a clinically important improvement. The SRI-50 comprises the same 24 descriptors and covers the 9 organ systems found in the original SLEDAI-2K. SRI-50 reflects disease activity over the previous 30 days. The SRI-50 data retrieval form, which standardizes the documentation of the descriptors, performed extremely well in all descriptors; this is especially relevant for multicenter studies that form the backbone of any therapeutic evaluation for SLE.
As a first effort toward validating SRI-50, we assessed its content validity, face validity, practical applicability including administration and scoring, and concurrent construct validity. Content and face validity are qualitative assessments of a measure that rely on understanding how the items or individual questions in a measure were derived. Since the SRI-50 is derived from the SLEDAI-2K, face and content validity related to selection of the 24 descriptors to study disease activity were assumed to be present3,4. Moreover, content validity and face validity of the SRI-50 definitions and the SRI-50 data retrieval form were confirmed according to the methodology adopted in the development of the SRI-50. The agreed-on SRI-50 descriptor definitions and the SRI-50 data retrieval form were thoroughly revised as described in Materials and Methods.
The traditional way to validate a new measure is to determine its correlation with some other measure of the trait, ideally a “gold standard” (concurrent criterion validity). In the absence of a gold standard measure, correlation is assessed against the most commonly adopted measure in the field. In our initial validation we studied the concurrent construct validity of SRI-50 against the physician response assessment, as determined by LS, both obtained at the same time25,26. A moderate (0.30–0.70) or, preferably, strong (> 0.70) correlation is desirable in this step25,27,28. We evaluated the performance of SRI-50 on 141 patients seen in our lupus clinic and determined its correlation with physician response assessment determined by LS. We showed that the SLEDAI-2K and SRI-50 scores on followup visit correlated with LS score ≥ 6. Importantly, the decreases in SRI-50 scores were clinically significant, meeting the accepted definition of improvement of a decrease in SLEDAI-2K of > 3, which was not achieved with SLEDAI-2K at the followup visit. Indeed, this reflected the ability of the SRI-50 to determine partial improvement between visits in patients who improved, while the SLEDAI-2K did not discern this improvement. This confirmed the SRI-50 concurrent construct validity.
In the early stage of the development of SLEDAI in 1985, the authors retained the 24 “most important” descriptors along with their corresponding weighted scores to constitute what we know today as SLEDAI. In the SRI-50, descriptors are scored as present, absent, or improved by ≥ 50%. Similar to SLEDAI-2K, the weighted scores of the descriptors in SRI-50 are not affected by their severity but are weighted by their status. A 50% improvement in certain severe features might not have a great influence on the score when compared to a 50% improvement in certain milder features. However, in a moderate to large size study, the effect of such instances is likely to be relatively small.
A number of new agents have been introduced and are in various phases of drug development in lupus; nevertheless, none have to date been approved by the FDA, and few achieved their primary outcomes in clinical trials. Although the results of these studies were disappointing, it would be premature to conclude that these therapeutic strategies cannot be effective in SLE. Several aspects of clinical design could have affected the outcomes of these trials, namely, inclusion criteria and difficulty ensuring the enrollment only of patients with active disease, the choice of primary outcomes, and use of concomitant drugs. More importantly, the lack of a robust responder index for global disease activity in SLE patients is a serious limitation when designing clinical trials. The use of the SRI-50 has the potential to overcome these problems.
In the initial validation of the SRI-50, we used the data available at baseline and at one followup visit. We are currently analyzing our data on a larger sample size and multiple followup visits for each patient. This will allow us to evaluate how scores compare when referenced to the baseline visit as opposed to the last visit or the visit with worsening. Nevertheless, we recommend determining the score of the SRI-50 by using the baseline visit scores, whereas in a subgroup analysis, the visit that includes the worsening can be used.
Additional validation for the SRI-50 will be necessary. The minimal clinically important difference of the SRI-50 and responsiveness in clinical trials have yet to be determined. Studies are currently under way to evaluate these aspects. This validation of the SRI-50 enables it to be used as an outcome measure in clinical trials.
Footnotes
- Dr. Touma is a recipient of the Geoff Carr Award provided by Lupus Ontario and the Arthritis Centre of Excellence Fellowship provided by the University of Toronto.
- Accepted for publication September 27, 2010.