Abstract
Objective. To determine the most effective immunosuppressive therapy for the longterm management of proliferative lupus nephritis (PLN) based on the outcome of renal failure.
Methods. A systematic review of randomized controlled trials (RCT) was conducted. MEDLINE and EMBASE were searched. RCT designed to examine the maintenance treatment effectiveness of immunosuppressive agents for PLN were included. A Bayesian network metaanalysis of 2-arm and 3-arm trials was used. A skeptical prior assumption was used in sensitivity analysis. Four immunosuppressive agents were evaluated: cyclophosphamide (CYC), azathioprine (AZA), mycophenolate mofetil (MMF), and prednisone alone. The outcome of interest was renal failure during the study period, defined by serum creatinine (sCr) > 256 µmol/l, doubling of sCr from baseline, and/or endstage renal disease.
Results. The OR (95% credible interval) of developing renal failure at 2–3 years was 0.72 (0.11, 4.49) for AZA versus CYC, 0.32 (0.04, 2.25) for MMF versus CYC, 2.40 (0.22, 36.94) for prednisone alone versus CYC, and 0.45 (0.11, 1.48) for MMF versus AZA. The probability (95% credible interval) of developing renal failure at 2 years as expected for each agent was 6% (0.7%, 24%) for MMF, 12% (2%, 37%) for AZA, 16% (5%, 33%) for CYC, and 31% (5%, 81%) for prednisone alone. After applying a skeptical prior in the Bayesian analysis, there was no evidence of benefit for 1 therapy over another.
Conclusion. Although the data suggest that MMF may be superior to other treatments for the maintenance treatment of PLN, the evidence is not conclusive.
- SYSTEMIC LUPUS ERYTHEMATOSUS
- LUPUS NEPHRITIS
- METAANALYSIS
- IMMUNOSUPPRESSIVE AGENTS
- DRUG THERAPY
- CLINICAL TRIALS
Proliferative lupus nephritis (PLN) is an important cause of endstage renal disease (ESRD) in patients with systemic lupus erythematosus1. The longterm course of PLN is characterized by frequent relapses during the maintenance treatment phase that lead to a progressive deterioration of renal function2,3. The aim of maintenance treatment is therefore to achieve and sustain renal remission by preventing relapses, leading to the best longterm outcome.
Aggressive immunosuppressive treatment is associated with improved renal survival of patients with PLN4. However, the longterm use of cyclophosphamide (CYC) is associated with a wide spectrum of toxic effects5. Infections associated with longterm use of CYC often require hospitalization and may be fatal6. Amenorrhea has been reported in up to 37% of treated patients, with permanent amenorrhea occurring in 15%, varying with cumulative dose of CYC and age at treatment7. Renal relapses have been reported in about 40% of the patients treated with CYC8.
Alternatives, such as mycophenolate mofetil (MMF) and azathioprine (AZA), are associated with a better safety profile and are therefore used increasingly frequently to attain a better overall prognosis9. As with CYC, the effectiveness of these immunosuppressive agents, especially for longterm use, is not clear10. Treatment regimens vary and the “standard of care” is debated. This is best exemplified by the differing guidelines of the American College of Rheumatology and the European League Against Rheumatism/European Renal Association–European Dialysis and Transplant Association11,12.
In a large trial [the Aspreva Lupus Management Study (ALMS) trial]1, MMF was suggested to be more effective than AZA in maintaining renal remission and in preventing renal relapses or progression to renal death during the 3-year study period. However, previous studies did not find a difference between these 2 agents at 18 months13 or at 3 years14. Therefore, there is no consistent evidence to suggest the best choice of maintenance therapy.
The primary aim of our study was to evaluate the relative effectiveness of the most commonly used immunosuppressive agents as maintenance therapies in PLN by determining which agent is associated with the highest probability of preserving renal function over a prolonged time period. The secondary aim was to estimate the 2-year probability of developing ESRD or chronic renal failure with each treatment.
MATERIALS AND METHODS
We used the same methods as reported in our review of induction treatments15. In brief, we conducted the review following the Cochrane Handbook for Systematic Reviews of Interventions16, performed synthesis following the guidelines of the Indirect Treatment Comparisons Good Research Practices Task Force17,18, and reported results following the PRISMA Statement (Preferred Reporting Items for Systematic reviews and Meta-Analyses)19.
Eligibility criteria
Eligibility criteria were defined a priori. The study population was adult or pediatric patients with definite PLN who had been given an induction treatment and were receiving a maintenance treatment. PLN was defined by diagnostic biopsy proof of class III or IV (or III/V, IV/V) using either the World Health Organization Classification Criteria20 or the International Society of Nephrology/Renal Pathology Society 2003 Classification Criteria21 (for studies done after 2003). Induction treatment was defined as an initial intensified immunosuppression treatment used for at least 6 months that was intended to induce renal remission. Maintenance treatment was defined as an additional immunosuppression treatment phase immediately following the induction treatment that was intended to sustain renal remission over time and reduce treatment toxicity.
CYC, AZA, MMF, and/or prednisone alone was begun at the maintenance phase or continued from the induction treatment. A comparator was any therapy directly compared with any 1 of the interventions (including those listed above). Outcome was the number of patients who developed renal failure during the maintenance study period, which was defined by (1) serum creatinine (sCr) > 256 μmol/l, and/or (2) doubling of baseline sCr, and/or (3) ESRD that necessitated renal replacement therapy (dialysis or transplant).
The timing was longterm use of immunosuppression beyond the induction phase, longer than 12 months, and at 24, 36, and 60 months, respectively, as counted from the start point of the maintenance treatment phase until the end of the study period. The study design was randomized controlled trials.
Search strategy
We performed a comprehensive literature search using an optimized search strategy that was reported in our first review15. Relevant trials were retrieved from MEDLINE (from 1946 to the first week of July 2012) and EMBASE (from 1947 to 2012 week 27). The Cochrane Central Register of Controlled Trials (CCTR; up to third quarter – June 2012) was also searched for additional trials. Trial registries (clinicaltrials.gov, eudract.ema.europa.eu) were searched for unpublished clinical trials. Expanded search terms were used on the basis of the following key terms: “lupus nephritis”, “clinical trial”, “cyclophosphamide”, “azathioprine”, “mycophenolate mofetil”, “glucocorticoids”, and “tacrolimus”. In addition, a manual search was conducted on the reference lists of included studies and published reviews. Tables of contents of major journals in the field for the past 5 years were also searched.
Study selection
After abstracts of the retrieved references were reviewed, full texts of eligible studies were independently reviewed by 2 authors (SYT and EDS). A flowchart of study selection is outlined in Appendix A (available online at jrheum.org). Disagreement was resolved by consensus, or by an adjudicator (BMF). Studies that were deemed irrelevant to the study purposes were excluded.
Critical appraisal
Two reviewers (SYT, EDS) independently assessed risk of bias (ROB) using the Cochrane Collaboration tool16. Consensus was reached before excluding a study, and an adjudicator (BMF) was used when necessary.
Data extraction
Data extraction was performed in an independent duplicate manner. Some outcome data were extracted from plots, for example the 3-year outcome data of Donadio, et al22. An adjudicator (BMF) was used for confirmation. In particular, the following data were extracted:
Treatment. We evaluated the effectiveness of 4 therapies used for maintenance treatment: CYC, AZA, MMF, and/or prednisone alone. Table 1 lists the treatment arms of each study and whether the treatment assignment was continued from the induction phase or rerandomized at the maintenance phase. In the table, regardless of the induction phase, the maintenance phase was about 18–36 months in length, and the therapies used in 3 out of the 6 studies were a continuation from the induction treatment.
Outcome. The outcome of interest was the number of patients in each arm who developed renal failure as defined in our eligibility criteria. Table 2 presents the number of outcomes and sample size of each study. Study features and definitions for renal failure are also presented in this table.
Table 1. Therapies and their durations used in the 6 included studies.
Table 2. Outcomes of the 6 included studies.
Evidence synthesis
A Bayesian approach to network metaanalysis was used, which facilitates all possible pairwise comparisons, whether direct or indirect25,26. Figure 1 shows the schematic rationale. To our knowledge, a head-to-head comparison of MMF versus prednisone alone has never been studied in a trial at the maintenance phase. This indirect comparison, denoted by a dashed line on the network, was nevertheless made possible by contrasting the direct comparisons that were available in the literature, denoted by solid lines. We treated CYC as the common comparator in our synthesis because it has been accepted as a standard of care in many centers in most countries.
Figure 1. Network of pairwise comparisons. A solid line denotes a direct comparison between 2 basic nodes. A dashed line denotes an indirect comparison between 2 functional nodes. The number of all possible pairwise comparisons in this case is “4 choose 2” = 6. Numbers denote the study numbers in Table 2. Cyclo: cyclophosphamide; MMF: mycophenolate mofetil; Pred: prednisone alone; Aza: azathioprine.
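As a minimal sketch of the network structure described above, the following Python snippet enumerates the “4 choose 2” pairwise comparisons and separates those with head-to-head trial evidence from those that rest on indirect evidence only. The set of direct pairs is taken from the five comparisons reported in the conventional metaanalysis; the variable names are illustrative and not part of the original analysis code.

```python
from itertools import combinations

# Treatment nodes of the network described in the text.
treatments = ["CYC", "AZA", "MMF", "Pred"]

# Pairs with head-to-head trial evidence (the five comparisons reported
# in the conventional metaanalysis); frozensets make the pairs unordered.
direct = {
    frozenset(p)
    for p in [("AZA", "CYC"), ("MMF", "CYC"), ("Pred", "CYC"),
              ("MMF", "AZA"), ("Pred", "AZA")]
}

all_pairs = [frozenset(p) for p in combinations(treatments, 2)]
indirect = [sorted(p) for p in all_pairs if p not in direct]

print(len(all_pairs))  # number of possible pairwise comparisons
print(indirect)        # pairs that rely on indirect evidence only
```

Under these inputs the only pair without direct evidence is MMF versus prednisone alone, which is the comparison the network synthesis has to estimate indirectly.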
As a sensitivity analysis, analyses were performed under 2 sets of prior assumptions. First, an analysis with a minimum of prior assumptions was undertaken using a flat or noninformative prior distribution. Second, a skeptical analysis was undertaken using an informative prior distribution expressing a subjective belief that there was no difference between any pair of immunosuppressive agents in preserving renal function over time25. A normal prior distribution on the log odds variable was used for this purpose (Appendix B, available online at jrheum.org).
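To illustrate how a skeptical prior centered on no difference pulls an estimate toward OR = 1, the sketch below applies a simple normal-normal update on the log OR scale. This is an approximation for intuition only, not the WinBUGS model used in the study: the standard error is back-calculated from the reported 95% credible interval, and the prior standard deviation `prior_sd = 0.5` is a hypothetical value, not the one specified in Appendix B.

```python
import math

def shrink_log_or(or_hat, ci_low, ci_high, prior_sd):
    """Normal-normal update of a log OR with a skeptical prior centered at 0."""
    log_or = math.log(or_hat)
    # Approximate standard error recovered from the width of the 95% interval.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    data_precision = 1.0 / se**2
    prior_precision = 1.0 / prior_sd**2
    post_var = 1.0 / (data_precision + prior_precision)
    post_mean = post_var * data_precision * log_or  # prior mean is 0
    half_width = 1.96 * math.sqrt(post_var)
    return (math.exp(post_mean),
            math.exp(post_mean - half_width),
            math.exp(post_mean + half_width))

# MMF versus CYC as reported in the abstract: OR 0.32 (0.04, 2.25).
# prior_sd = 0.5 is a hypothetical choice, not the value used in Appendix B.
post_or, post_low, post_high = shrink_log_or(0.32, 0.04, 2.25, prior_sd=0.5)
print(round(post_or, 2), round(post_low, 2), round(post_high, 2))
```

Because the trial evidence is sparse (a wide interval), even a moderately tight skeptical prior dominates and moves the estimate most of the way back toward 1, which is the qualitative behavior reported in the sensitivity analysis.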
Heterogeneity
Between-study heterogeneity was measured using the Q and I² statistics, as shown in Figure 2. The chi-square value reported there is the Q statistic; I² is a more informative measure than Q because it corrects for the number of studies combined16.
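The relation between the two statistics can be sketched as follows, using the standard definition I² = max(0, (Q − df)/Q) × 100%, where df is the number of studies minus 1. The Q values below are hypothetical and chosen only to show the two regimes; they are not taken from Figure 2.

```python
def i_squared(q, n_studies):
    """I^2 (%) from Cochran's Q; negative values are truncated to 0."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical Q values, for illustration only (not taken from Figure 2).
print(i_squared(q=0.8, n_studies=2))   # Q below its degrees of freedom
print(i_squared(q=10.0, n_studies=6))  # Q well above its degrees of freedom
```

Whenever Q is no larger than its degrees of freedom, I² is reported as 0, which is why a value of 0 from a comparison with very few studies carries limited information.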
Figure 2. Caterpillar plots of conventional (frequentist) metaanalysis of the 6 included studies, with 5 pairwise comparisons. The outcome is renal failure at a mean of 18–36 months of maintenance treatment. For the Mantel-Haenszel method used to combine OR, see the Cochrane Handbook16. Data are also presented in Table 2. Aza: azathioprine; Cyclo: cyclophosphamide; MMF: mycophenolate mofetil; Pred: prednisone alone.
Direct and indirect evidence needs to be examined for consistency before study results can be combined using a network metaanalysis17. A consistency evaluation was conducted as shown in Appendix C (available online at jrheum.org).
Publication bias
We also searched trial registries (clinicaltrials.gov, eudract.ema.europa.eu) and CCTR for unpublished and potentially negative trial results (see Search Strategy, above). We used a funnel plot to detect this bias. In sensitivity analysis, we also used a skeptical prior to correct for potential publication bias27.
Statistical analysis
Bayesian network metaanalysis of 3-arm trials was used28. Because 2 of the 6 included trials had 3 comparison arms, we used a model for 3-arm trials to adjust for within-trial covariance when synthesizing data29,30. A random-effects synthesis was chosen a priori to incorporate between-study heterogeneity16,17. Results from the Bayesian network synthesis were compared with those from a conventional (frequentist) approach. We also performed sensitivity analysis using different sets of prior assumptions to examine the consistency and robustness of the synthesized results. OR was used as the effect measure. Results were interpreted from a Bayesian perspective, and a 95% credible interval was calculated for each effect measure.
The analysis was done in R (ver. 2.15.1)31 using the R2WinBUGS package (ver. 2.1-18)32 to communicate with WinBUGS (ver. 1.4.3)33. RevMan (ver. 5.1)34 was used to generate standard caterpillar plots.
RESULTS
A total of 2004 abstracts was identified using the search strategy outlined, and a review of the title and/or abstract eliminated 1823. After review of the full texts of the remaining 181 studies, 135 were excluded for reasons outlined in Appendix A (available online at jrheum.org). ROB assessment of the remaining 46 studies eliminated 40 studies for reasons presented in Appendix D (available online at jrheum.org). The remaining 6 studies were graded as low ROB and constituted the analysis.
The duration of the maintenance phase varied among and within the 6 studies, with a mean followup time ranging from 18–36 months (Table 1). We first used a conventional metaanalysis, followed by a Bayesian approach. The primary outcome variable was renal failure as defined in the eligibility criteria of the methods section.
Conventional analysis
In the comparisons against CYC for the development of renal failure (Figure 2), AZA and CYC appeared likely to be equivalent when used as maintenance therapies, MMF appeared to be superior to CYC in preventing renal failure over time, and prednisone alone appeared to be inferior to CYC. However, none of the differences was statistically significant: all of the 95% CI were wide and crossed 1, and the sample sizes were generally quite small for all comparisons.
MMF seemed likely to be superior to AZA, and prednisone alone seemed inferior to AZA in 2 small studies, and as stated, CYC and AZA seemed to be equivalent.
Both AZA and CYC appeared to be inferior to MMF (as described). Because there were not any studies that met the eligibility criteria for the comparison of prednisone alone to MMF, an indirect comparison was performed. This comparison was empirically calculated using the 2 OR of MMF versus AZA contrasted to prednisone versus AZA, which is given by 0.56 ÷ 1.85 = 0.30, or MMF versus CYC contrasted to prednisone versus CYC, which is given by 0.16 ÷ 2.45 = 0.07, and the average between the 2 measures, with similarity (or combinability) assumed, is given by (favoring MMF):
As described above, both AZA and CYC appeared to be superior to prednisone alone while there were no eligible studies comparing MMF to prednisone alone.
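The arithmetic behind the indirect comparison of MMF with prednisone alone can be sketched as below: each indirect OR is the ratio of two direct OR that share a comparator. The input OR are those quoted in the text; how the authors averaged the 2 indirect estimates is not shown here, so the arithmetic and geometric means are both computed purely as assumptions.

```python
import math

def indirect_or(or_a_vs_c, or_b_vs_c):
    """OR of A versus B through a shared comparator C (ratio of the two ORs)."""
    return or_a_vs_c / or_b_vs_c

via_aza = indirect_or(0.56, 1.85)  # MMF vs AZA contrasted with Pred vs AZA
via_cyc = indirect_or(0.16, 2.45)  # MMF vs CYC contrasted with Pred vs CYC

# Two candidate ways to average the estimates; which one the authors used
# is not shown in the text, so both are given here as assumptions.
arithmetic_mean = (via_aza + via_cyc) / 2
geometric_mean = math.exp((math.log(via_aza) + math.log(via_cyc)) / 2)

print(round(via_aza, 2), round(via_cyc, 2))
print(round(arithmetic_mean, 2), round(geometric_mean, 2))
```

Both routes give an OR well below 1, favoring MMF, but the two estimates differ several-fold, which is one reason a formal network model with credible intervals is preferable to this kind of hand calculation.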
For heterogeneity, I² = 0 indicated a negligible quantity of between-study heterogeneity16. However, interpretation of this value was limited by the small number of studies used. A random-effects synthesis was chosen a priori to incorporate any potentially existing heterogeneity.
In consistency evaluation, it was shown that there was no evidence to reject the null hypothesis that the direct and indirect evidence to be combined is consistent (Appendix C, available online at jrheum.org), which justified the use of a network metaanalysis.
In checking for publication bias, search results using trial registries and CCTR did not show additional trials that satisfied our eligibility criteria. The fact that the majority of the trial results we found were negative (null) suggests that publication bias was unlikely. Published results were distributed roughly symmetrically around the null effect (OR = 1), especially at larger sample sizes (where the standard error is small), as shown in the funnel plot of Appendix E (available online at jrheum.org), again suggesting that publication bias was unlikely to be a major concern in our synthesis.
Network metaanalysis
Following the conventional metaanalysis, a Bayesian network synthesis of the 6 included studies was performed. Table 3 presents OR and associated 95% credible intervals (caterpillar plot shown in Appendix F, available online at jrheum.org). This analysis allowed for rigorous indirect comparisons of treatments, such as prednisone alone versus MMF. In most cases, the available evidence was limited and therefore more uncertainty was associated with those comparisons.
Table 3. Expected OR (95% credible interval). Agent in columns is the numerator and agent in rows is the denominator.
The first row of Table 3 shows that MMF was the therapy with the lowest odds of renal failure over the study period for the maintenance treatment, followed by AZA, and then CYC, while prednisone alone had the highest odds of renal failure. The OR of 0.72 for renal failure with AZA versus CYC means that the odds of developing renal failure for patients treated with AZA were 0.72 times the odds for those treated with CYC during the study period. The associated 95% credible interval (0.11, 4.49) was wide, indicating limited evidence and a high level of uncertainty. The odds of developing renal failure for MMF were 0.32 times those for CYC, and 0.45 times those for AZA. Prednisone alone was inferior to all of the other agents, with the highest OR of developing renal failure, 7.56, when compared with MMF. The associated uncertainty, however, was large, because all of the 95% credible intervals covered 1.
As shown in Table 4, MMF had the highest probability, at 81%, of being the best therapy while the other therapies had low probabilities (CYC: 10%, AZA: 6%, and prednisone alone: 4%). AZA had the highest probability of ranking second at 55% while CYC had a 23% probability of ranking second. CYC had the highest probability of ranking third at 48% while AZA had a 32% chance of ranking third, and prednisone alone had the highest probability of ranking fourth at 73% while CYC and AZA had low probabilities of ranking fourth at 19% and 7%, respectively.
Table 4. Expected probability of ranks for each therapy (95% credible interval). Agent in columns is ranked. Rank in rows indicates the probability for a rank. Rank is a comprehensive measure.
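Rank probabilities of this kind are typically obtained by ranking the treatments within each posterior draw and tabulating how often each treatment lands in each position. The sketch below shows that bookkeeping with simulated draws; the means and standard deviations in `params` are invented for illustration and are not fitted to the included trials, so the printed probabilities will not match Table 4.

```python
import random

random.seed(0)
treatments = ["MMF", "AZA", "CYC", "Pred"]

# Hypothetical posterior draws of each treatment's log odds of renal failure;
# the means and SDs are invented for illustration, not fitted to the trials.
params = {"MMF": (-2.8, 0.9), "AZA": (-2.0, 0.8),
          "CYC": (-1.7, 0.5), "Pred": (-0.8, 1.0)}
n_draws = 5000

# rank_counts[t][k] counts draws in which treatment t has rank k+1 (1 = best).
rank_counts = {t: [0] * len(treatments) for t in treatments}
for _ in range(n_draws):
    draw = {t: random.gauss(mu, sd) for t, (mu, sd) in params.items()}
    ordered = sorted(treatments, key=lambda t: draw[t])  # lowest odds first
    for rank, t in enumerate(ordered):
        rank_counts[t][rank] += 1

rank_prob = {t: [c / n_draws for c in counts] for t, counts in rank_counts.items()}
for t in treatments:
    print(t, [round(p, 2) for p in rank_prob[t]])
```

Each row and each rank position sums to 1 by construction, so a treatment with a wide posterior can hold a noticeable probability of both the best and the worst rank at the same time.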
We also calculated the expected probability (95% credible interval) of developing renal failure over the study period for each agent, which was given by (1) MMF 6% (0.7%, 24%), (2) AZA 12% (2%, 37%), (3) CYC 16% (5%, 33%), and (4) prednisone alone 31% (5%, 81%). Therefore, about 12% and 16% of the patients would be expected to develop renal failure at 2 years if treated with AZA or CYC, respectively, while only 6% would develop renal failure if treated with MMF. However, the 95% credible intervals overlapped.
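The link between the reported OR and these expected probabilities can be checked with a short point-estimate calculation: convert the CYC probability to odds, multiply by the OR versus CYC, and convert back. This is only a plug-in sketch using the published summary values; the study's figures are posterior summaries from the Bayesian model, so exact agreement is not guaranteed in general.

```python
def prob_from_or(baseline_prob, odds_ratio):
    """Apply an OR to a baseline probability and return the new probability."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)

cyc_prob = 0.16  # expected probability of renal failure with CYC
or_vs_cyc = {"MMF": 0.32, "AZA": 0.72, "Pred": 2.40}  # OR versus CYC

probs = {agent: prob_from_or(cyc_prob, o) for agent, o in or_vs_cyc.items()}
for agent, p in probs.items():
    print(agent, round(p, 2))
```

With these inputs the plug-in values round to 6% for MMF, 12% for AZA, and 31% for prednisone alone, in line with the expected probabilities reported above.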
Sensitivity analysis
We also used a skeptical prior in the Bayesian analysis to examine the robustness of results (see Appendix B, available online at jrheum.org, for the specification of prior assumptions). No therapy was likely to be superior in this analysis. The results seem to be reasonably equivalent as shown in the second part of Table 3 and Table 4.
DISCUSSION
High-dose glucocorticoids with the addition of another immunosuppressive agent are the mainstay in the treatment of PLN5,35. Several groups have published independent guidelines for the management of PLN11,12,36,37, but the optimal regimen is unclear because there are insufficient data to allow for a high level of certainty regarding the recommendation of the best therapy. This study suggests that the addition of a second-line immunosuppressive agent was superior to prednisone alone during the maintenance phase of treatment, and expands on current knowledge by deriving a hierarchy of the probability of preventing renal failure with longterm use.
Studies suggest that MMF therapy is associated with the best longterm outcome of PLN1,23. AZA is recommended by some, and in particular when MMF is not tolerated and for women who are pregnant or pursuing pregnancy37,38. Our analysis demonstrated that MMF therapy was associated with the greatest chance of preventing ESRD (it had the highest ranking when used as maintenance therapy for PLN). We showed that MMF was likely superior to CYC; however, the associated uncertainty was wide and rank orders changed when incorporating a skeptical prior belief in the analysis. These facts suggest that trial evidence was insufficient to allow for strong evidence-based recommendations and more studies are needed. Further, we have shown that AZA and CYC were equivalent when the therapy was continued into the extension phase. This suggests that AZA should be considered as the maintenance therapy of choice over CYC and when MMF is contraindicated, such as during pregnancy or following failure of MMF therapy. Although Bayesian and conventional approaches differ fundamentally, our results using these 2 different approaches were consistent.
The authors of the ALMS trial1, the largest trial of maintenance therapies, concluded that MMF was better than AZA because there was a greater chance of maintaining renal remission over a 36-month study period38. However, by combining this result with evidence from other well-conducted trials, we showed that the 2 agents could not be clearly ranked when the outcome was renal failure prevention. Considering the fact that AZA is often underdosed (2 mg/kg/day)39, it may be potentially more useful if used at a higher dose; however, this is purely speculative because there were no studies using this dose. Although not studied here, safety is also important to consider in determining medication use. AZA has been considered safe for women who are pregnant (as discussed above) and has a good longterm safety profile37,38; it may therefore be potentially very useful in a vulnerable population with this disease, e.g., young women of child-bearing age or perhaps children, especially in the longterm management.
There were limitations to our study. Included trials were few and sample sizes were small, which limited the internal and external validity of our analysis; as a result, the uncertainty associated with the effect measures was large. The durations of 2 to 3 years of included studies were likely too short to fully examine the development of ESRD, and these studies did not report renal remission data because they were designed and conducted for maintenance treatment. We did not examine safety profiles, although adverse events and costs may be important in driving clinical decisions on longterm maintenance therapy in this disease. This should be addressed in future studies. Because of the sparseness of the available evidence, we could not use metaregression to adjust for any background discrepancy, such as differential distribution of demographic features, disease severity, treatment dose, or induction treatment used; moreover, this information was not available in many of the included studies. A few patients with pure class V or II lupus nephritis were enrolled in some of the included studies, and we had to allow for up to 15% of patients to have histological classes other than PLN, or no biopsy proof, in studies to be included. Some included studies might have design issues. For example, the induction phase was shorter than 6 months, patients enrolled in the maintenance phase had responded differentially to induction treatment (e.g., responders entered the ALMS trial, but all patients entered the MAINTAIN trial), or treatment assignment was not rerandomized for the maintenance phase and therapies were switched too often or applied in a nonstandard way. However, heterogeneity among combined trials may enhance the external validity of a metaanalysis16,17. Also, rather old trials (e.g., from the 1970s) were included, and concomitant treatments may have changed over time, particularly the current standard use of antihypertensive agents. However, it was the rigor of the trial conduct that was important to consider, and well-conducted old trials should be valued as well because longterm controlled trials are very difficult to perform successfully in this rare disease.
Finally, outcome measures were not defined uniformly or consistently among the included trials. For example, various sCr cutoffs were used for renal function insufficiency. Therefore, in our synthesis, we had to use a composite outcome definition that may have led to an inaccurate count of outcomes of individual trials. Better-designed studies are needed in the future for maintenance treatment of this disease.
We have shown that MMF may be the better therapy in preventing renal failure at 2–3 years, but there is insufficient evidence to support a firm conclusion about the relative treatment effectiveness for PLN, and therefore the longterm safety profile and cost of each medication should be an important consideration for maintenance therapy. Clearly, more studies are needed. We suggest that (1) future trials consider comparing MMF with high-dose AZA for 2–3 years for the outcome of renal remission15, (2) future maintenance trials consider a study period longer than 5 years (ideally 10 years) for the outcome of renal failure, and (3) future trials have a standardized conduct protocol for methods, outcome measures, and reporting40,41. More collaborative work is needed in designing and conducting these trials.
ONLINE SUPPLEMENT
Supplementary data for this article are available online at jrheum.org.
Footnotes
S.Y. Tian’s PhD research is funded in part by the Queen Elizabeth II–Edward Dunlop Foundation Scholarships in Science and Technology, administered at the University of Toronto.
- Accepted for publication March 24, 2015.