Abstract
Objective. To estimate the probability of early remission with conventional treatment for each child with juvenile idiopathic arthritis (JIA). Children with a low chance of remission may be candidates for initial treatment with biologics or triple disease-modifying antirheumatic drugs (DMARD).
Methods. We used data from 1074 subjects in the Research in Arthritis in Canadian Children emphasizing Outcomes (ReACCh-Out) cohort. The predicted outcome was clinically inactive disease for ≥ 6 months starting within 1 year of JIA diagnosis in patients who did not receive early biologic agents or triple DMARD. Models were developed in 200 random splits of 75% of the cohort and tested on the remaining 25% of subjects, calculating expected and observed frequencies of remission and c-index values.
Results. Our best Cox logistic model, which combined 18 clinical variables assessed a median of 2 days after diagnosis, had a c-index of 0.69 (95% CI 0.67–0.71), better than using JIA category alone (0.59, 95% CI 0.56–0.63). Children in the lowest probability decile had a predicted 20% chance of remission and 21% attained remission; children in the highest decile had a predicted 69% chance of remission and 73% attained remission. The model identified 14% of subjects as having a low chance of remission (probability < 0.25), of whom 77% failed to attain remission, compared to 5% of subjects identified by JIA category alone.
Conclusion. Although the model did not meet our a priori performance threshold (c-index > 0.70), it identified about 3 times as many subjects with a low chance of remission as did JIA category alone (14% vs 5%), and it may serve as a benchmark for assessing value added by future laboratory/imaging biomarkers.
- JUVENILE IDIOPATHIC ARTHRITIS
- PREDICTION
- COHORT STUDIES
- RISK STRATIFICATION
- PROGNOSIS
The prognosis of children with juvenile idiopathic arthritis (JIA) has improved with modern treatments1. With stepwise treatment escalation consistent with the 2011 Treatment Recommendations of the American College of Rheumatology (ACR)2, herein referred to as conventional treatment, 45% of children attain inactive disease within 1 year of diagnosis1. However, there is ongoing concern that children who do not attain early remission may miss a hypothesized window of opportunity to alter the disease trajectory, and pilot studies of initial aggressive therapy with biologic agents or triple disease-modifying antirheumatic drug (DMARD) therapy have reported encouraging results3,4.
Most prognostic research on JIA has focused on identifying baseline predictors associated with a subsequent outcome5,6, rather than on combining predictors into a clinical prediction tool that estimates the likelihood of the outcome for individual patients. Clinical prediction tools that calculate individual risk, such as the Acute Physiology and Chronic Health Evaluation (APACHE) score7 or the Framingham score8, have been available for decades, but JIA prediction tools are a recent development. In 2012, Bulatovic, et al reported a prediction model for nonresponse to methotrexate [MTX; area under the receiver-operating characteristic curve (AUC) 0.65]9. More recently, van Dijkhuizen, et al reported a model to predict MTX intolerance (c-index 0.667)10 and our group reported a model to predict a severe JIA disease course (c-index 0.85)11. Harrell’s c-index (equivalent to the AUC for a binary outcome) is the most-quoted performance measure when testing clinical prediction tools12. A value of 0.5 corresponds to chance alone, while 1.0 means perfect prediction. In the cardiovascular literature, values > 0.70 are considered to indicate useful prediction and values > 0.80 are considered excellent12.
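To make the c-index concrete: for a binary outcome it is the proportion of (remitter, non-remitter) pairs in which the model assigns the higher predicted probability to the remitter, with ties counted as half. The following is a minimal Python sketch of that definition, not the code used in any of the cited studies; the function name `c_index` and the toy data are illustrative assumptions.

```python
def c_index(probs, outcomes):
    """Concordance for a binary outcome (1 = event, 0 = no event).

    Counts every (event, non-event) pair; a pair is concordant when the
    event subject has the higher predicted probability. Ties count 0.5.
    """
    concordant = 0.0
    pairs = 0
    for p_event, y_event in zip(probs, outcomes):
        if y_event != 1:
            continue
        for p_other, y_other in zip(probs, outcomes):
            if y_other != 0:
                continue
            pairs += 1
            if p_event > p_other:
                concordant += 1.0
            elif p_event == p_other:
                concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / pairs


# Toy data (hypothetical): four subjects, two of whom attained the outcome.
toy_probs = [0.9, 0.8, 0.3, 0.2]
toy_outcomes = [1, 0, 1, 0]
toy_c = c_index(toy_probs, toy_outcomes)  # 3 of 4 pairs are ranked correctly
```

A model that gives every subject the same probability scores 0.5 under this definition, which is the "chance alone" value described above.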
We recently used data from the Research in Arthritis in Canadian Children Emphasizing Outcomes (ReACCh-Out) prospective inception cohort to develop a prediction model for a severe JIA disease course with remarkable accuracy11. We hypothesized that similar methods would lead to an accurate model (c-index > 0.70) that could be used at diagnosis to predict attainment of early clinical remission with conventional treatment. Such a model would help target initial treatment with biologics or triple DMARD to children with a low chance of remission with conventional treatment, and avoid such treatment in children who do not need it.
MATERIALS AND METHODS
The ReACCh-Out study1,13 recruited children newly diagnosed with JIA from 2005 to 2010 at 16 Canadian pediatric rheumatology centers and followed them for up to 5 years or until May 2012. The study received ethics approval at each of the 16 centers with primary ethics approval at Montreal Children’s Hospital, McGill University Health Centre (no. PED-04_065). Study visits at 0, 6, 12, 18, 24, 36, 48, and 60 months after enrollment included a full complement of physician-reported and patient-reported measures1,13. During other visits to the clinic (interim visits), the following variables were recorded: a physician’s global assessment (PGA) of disease activity, active and restricted joint counts, enthesitis count, current medications, and erythrocyte sedimentation rate (ESR) or C-reactive protein (CRP) levels (if clinically indicated). Information from all study visits and interim visits was analyzed to determine attainment of remission as defined below.
Subjects were included if (1) they were enrolled within 90 days of diagnosis, (2) they did not receive biologics or triple DMARD therapy within 6 months of diagnosis, and (3) the outcome was known. The outcome was early remission while taking medication, defined as 6 or more months of clinically inactive disease14 starting within 1 year of diagnosis. Specifically, there had to be 2 or more recorded study or interim visits in the database at least 182 days apart with no evidence of active disease, including the 12-month study visit. None of the recorded visits during this time could indicate any of the following: an active joint, enthesitis, a PGA of 1 or more on a 10-cm visual analog scale (VAS), systemic JIA manifestations, active uveitis, use of corticosteroid eye drops, morning stiffness > 15 min, ESR > 20 mm/h, or CRP > 5 mg/L. Attainment of early remission was adjudicated by review of the data by 3 pediatric rheumatologists (JG, AMH, KO), who were asked to exercise clinical judgment using all the available information and to consider missing data; 155 subjects in whom the outcome could not be adjudicated by the panel (mostly subjects who missed the 12-month visit) were excluded. Subjects who discontinued treatment during the 182-day period and remained inactive were still deemed to have attained remission on medication. The date of remission was the date of the first recorded visit with no evidence of active disease.
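The visit-level criteria above can be read as a screening rule. The Python sketch below is a simplified illustration only: in the study the outcome was adjudicated by 3 pediatric rheumatologists using clinical judgment, and the requirement that the inactive period include the 12-month study visit is not encoded here. The `Visit` fields, function names, and the treatment of unmeasured ESR/CRP as "no evidence of activity" are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Visit:
    day: int                           # days since diagnosis
    active_joints: int = 0
    enthesitis_count: int = 0
    pga_cm: float = 0.0                # physician global, 10-cm VAS
    systemic_features: bool = False
    active_uveitis: bool = False
    steroid_eye_drops: bool = False
    morning_stiffness_min: float = 0.0
    esr_mm_h: Optional[float] = None   # None = not measured (assumption)
    crp_mg_l: Optional[float] = None   # None = not measured (assumption)


def is_inactive(v: Visit) -> bool:
    """True when the visit shows none of the listed signs of activity."""
    if v.active_joints > 0 or v.enthesitis_count > 0:
        return False
    if v.pga_cm >= 1.0:
        return False
    if v.systemic_features or v.active_uveitis or v.steroid_eye_drops:
        return False
    if v.morning_stiffness_min > 15:
        return False
    if v.esr_mm_h is not None and v.esr_mm_h > 20:
        return False
    if v.crp_mg_l is not None and v.crp_mg_l > 5:
        return False
    return True


def screens_as_early_remission(visits: List[Visit],
                               min_span_days: int = 182,
                               window_days: int = 365) -> bool:
    """Look for an unbroken run of inactive visits spanning >= 182 days
    whose first visit falls within 1 year of diagnosis."""
    run_start = None
    for v in sorted(visits, key=lambda x: x.day):
        if not is_inactive(v):
            run_start = None           # any active visit breaks the run
            continue
        if run_start is None:
            run_start = v.day
        if run_start <= window_days and v.day - run_start >= min_span_days:
            return True
    return False
```

For instance, inactive visits on days 100, 200, and 290 would pass this screen (a 190-day run), whereas an active joint recorded on day 200 would reset the run.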
An acceptable prediction model was defined a priori as having a c-index > 0.70, the threshold recommended for clinical prediction tools12. If the initial approach failed to produce this, our study protocol called for time-to-event analysis (Cox regression) of all subjects recruited within 90 days of diagnosis who had followup information.
Candidate predictor variables
Eighty-seven variables assessed at enrollment were considered for inclusion in prediction models. They were selected because there was (1) a reported association with remission5,6, (2) an association with early inactive disease in our previous study13, or (3) a plausible association with remission in the authors’ opinion.
Among these 87 variables, 51 were associated with early remission with a p value < 0.2 in univariable screening with logistic regression. The correlation among pairs of variables was assessed using the Pearson correlation coefficient, and correlated variables that were duplicate measures of the same construct or subdomains within a measure were excluded (n = 17). Fifteen variables with a p value > 0.2 were included because of strong support in previous studies and because most clinicians would expect them to be considered. Thus, a total of 49 variables (51 − 17 + 15) were included in modeling. These were grouped by pediatric rheumatologists (RAB, JG, AMH, KO, NJS, LBT) as easy, moderate, or hard to ascertain in routine pediatric rheumatology practice. In general, variables such as which joints were involved, routine laboratory results, and PGA were considered easy; variables derived from the Childhood Health Assessment Questionnaire (CHAQ) were considered moderate; and variables derived from the Juvenile Arthritis Quality of Life Questionnaire were considered hard. Supplementary Tables 1–5 (available with the online version of this article) list all 87 variables and their disposition.
Development and testing of prediction models
Prediction models were developed using the methods reported by Guzman, et al11, with the modifications described here. All modeling was done with R software (www.r-project.org) and descriptive statistics were calculated with STATA 12 (StataCorp LLC).
Missing data on predictors were imputed by multiple imputation using the mice package in R15. With about 11% missing data, we opted for 20 imputed datasets as recommended by Bodner16.
For each imputed dataset, we created 10 random training (75%) and test (25%) splits. Models were developed in the training set, and their performance assessed in the test set. This allowed an honest assessment of models, preventing any overfitted models from giving an inflated impression of model performance. Models were fit using easy variables only, easy and moderate variables, or all the variables. In each of the training sets, we fitted a logistic model, a classification tree, neural networks, and several types of random forest (with various tuning parameters). Model results were calculated in the test sets averaging the 10 splits of the data in the 20 imputed datasets.
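The resampling design described above (10 random 75%/25% splits in each of 20 imputed datasets, 200 splits in total) can be sketched as follows. The authors did their modeling in R; this Python fragment only enumerates the splits and is an illustration, not their code. The function name, the seed, and the use of simple unstratified shuffling are assumptions, since the text does not say how the splits were drawn.

```python
import random


def train_test_splits(n_subjects, n_imputations=20, n_splits=10,
                      train_frac=0.75, seed=0):
    """Yield (imputation, split, train_idx, test_idx) for every
    combination of imputed dataset and random split."""
    rng = random.Random(seed)
    n_train = round(train_frac * n_subjects)
    for imp in range(n_imputations):
        for split in range(n_splits):
            idx = list(range(n_subjects))
            rng.shuffle(idx)                 # unstratified (assumption)
            yield imp, split, idx[:n_train], idx[n_train:]


# 916 subjects were available for the binary analyses.
splits = list(train_test_splits(916))
n_total_splits = len(splits)                 # 20 imputations x 10 splits
_, _, first_train, first_test = splits[0]
```

In this scheme a model would be fit on each `train_idx` and scored on the matching `test_idx`, with the 200 test-set results averaged at the end.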
Our main metric to assess model performance was the c-index, supplemented by the logarithmic scoring rule and a method based on a chi-square goodness of fit test11. We also calculated the proportion of subjects identified as having a low chance of remission (probability < 0.25) and explored whether subjects at risk of a severe disease course using our previous model11 had a low chance of early remission. While the cutoff for a low chance of remission is arbitrary (people have different opinions of what is too low), we believe that a < 1-in-4 chance of remission with conventional treatment would prompt most clinicians and families to consider aggressive initial treatment.
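For readers unfamiliar with the logarithmic scoring rule mentioned above, here is a minimal Python sketch. It rewards a model for assigning high probability to what actually happened: each subject contributes log(p) if remission occurred and log(1 − p) otherwise. The averaging, the sign convention (higher is better), and the clipping constant `eps` are assumptions; the text does not specify the exact variant the authors used.

```python
import math


def log_score(probs, outcomes, eps=1e-12):
    """Mean logarithmic score; values closer to 0 indicate better forecasts."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)      # avoid log(0)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total / len(probs)


# Hypothetical forecasts for two subjects: one remitted, one did not.
uninformative = log_score([0.5, 0.5], [1, 0])
sharper = log_score([0.8, 0.2], [1, 0])
```

Unlike the c-index, which depends only on how subjects are ranked, this score also penalizes probabilities that are poorly calibrated.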
Because the initial models had a c-index < 0.70, we proceeded to time-to-event analysis in an extended dataset of 1074 subjects. A test set of 184 randomly selected subjects with known remission status was reserved for testing models only. Data from the remaining subjects were used to develop the Cox regression models. Time to event was the elapsed time from the date of diagnosis to the date of the first visit with no evidence of active disease. The time-to-remission and time-to-inactive disease Cox models each produce a prediction of how long it would take a patient to attain the outcome after diagnosis. We used those predictions as additional covariates in the logistic model for probability of remission at 1 year. In essence, the Cox predictors summarize the same patient baseline data in a different way for the logistic model to consider. In our final model, including them offered a statistically significant improvement in performance. The advantage is that Cox models use information from 158 additional subjects and that time to attainment of an outcome may be more informative than a dichotomous yes/no. Because competing models are tested in the same set of subjects, we used a paired testing procedure to estimate improvements in c-index associated with Cox logistic models.
We conducted sensitivity analyses in patient subgroups to assess whether the heterogeneity among patients with JIA interfered with our ability to find a model with greater prediction accuracy. We repeated analyses for patients with oligoarthritis and rheumatoid factor (RF)–negative polyarthritis together, and for children presenting with 4 or fewer joints versus 5 or more joints involved, after excluding patients with active sacroiliitis or systemic JIA (i.e., the first 2 treatment groups in the ACR treatment recommendations)2.
RESULTS
Of 1497 subjects with JIA recruited into the ReACCh-Out cohort, 5 were excluded owing to unknown JIA category, 353 for enrollment > 90 days from diagnosis, 65 because they attended only the enrollment visit, 155 because the outcome could not be adjudicated, and 3 because they received biologic agents within 6 months of diagnosis (2 received anakinra, 1 infliximab; no subject received early triple DMARD therapy). This left 916 subjects in our binary analyses. Subjects were enrolled a median of 2 days after diagnosis, and two-thirds were female (Table 1). Patients with oligoarthritis, RF-negative polyarthritis, or enthesitis-related arthritis formed 74% of the cohort. The median PGA was 2.9 and the median active joint count was 2; 405 subjects (44.2%) attained early remission. Many variables were associated with early remission in univariable analysis (Table 1).
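The cohort flow above can be checked with simple arithmetic. The short Python tally below uses only the counts reported in this paragraph; the variable names are illustrative.

```python
recruited = 1497
exclusions = [
    ("unknown JIA category", 5),
    ("enrolled > 90 days from diagnosis", 353),
    ("attended only the enrollment visit", 65),
    ("outcome could not be adjudicated", 155),
    ("biologic agent within 6 months of diagnosis", 3),
]

remaining = recruited
flow = []                                   # running total after each step
for reason, n_excluded in exclusions:
    remaining -= n_excluded
    flow.append((reason, remaining))

binary_n = remaining                        # subjects in the binary analyses
remission_pct = round(100 * 405 / binary_n, 1)
```

The running total after the first 3 exclusions is 1074, which appears consistent with the number of subjects reported as eligible for Cox regression, and the final total of 916 gives the reported 44.2% early remission rate (405/916).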
There were 1074 subjects eligible for Cox regression. Characteristics of excluded subjects and subjects included only in Cox analyses, details of the multiple imputation method, and the univariable associations with early remission for all 87 candidate predictors are reported in Supplementary Tables 1–5 (available with the online version of this article).
Prediction models
In our binary analyses of 916 patients, the best-performing model was a random forest using 49 variables (c-index 0.65, 95% CI 0.62–0.68), a modest improvement compared to using JIA category alone (0.59, 95% CI 0.56–0.63). Using high risk of a severe disease course calculated by our previous model11 as indicative of a low probability of remission, the c-index was also 0.59 (95% CI 0.56–0.61).
Because binary models did not attain the target c-index of > 0.70, we proceeded to Cox analyses of 1074 subjects. This resulted in a best-performing Cox logistic model with a c-index of 0.69 (95% CI 0.67–0.71). The Cox logistic model improved the c-index by 0.04 (95% CI 0.02–0.06) relative to a simple logistic model and by 0.09 (95% CI 0.07–0.12) relative to using JIA category alone.
Table 2 reports the expected and observed frequencies of early remission with conventional treatment for subjects in each decile of risk for 3 models (Cox logistic, binary random forest, JIA category alone); these were calculated by ordering subjects in the test set from lowest to highest probability of remission as assigned by each model and dividing them into 10 groups with equal numbers of subjects (deciles).
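The decile comparison described above can be sketched in a few lines of Python: sort test-set subjects by predicted probability, cut them into 10 equal-sized groups, and compare the mean predicted probability (expected) with the proportion who actually remitted (observed) in each group. This is an illustration of the procedure as described, not the authors' R code; the function name and the handling of group sizes that do not divide evenly are assumptions.

```python
def decile_table(probs, outcomes, n_groups=10):
    """Return rows of (group, n, expected, observed), ordered from the
    lowest to the highest predicted probability of remission."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    n = len(order)
    rows = []
    for g in range(n_groups):
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not members:
            continue                         # fewer subjects than groups
        expected = sum(probs[i] for i in members) / len(members)
        observed = sum(outcomes[i] for i in members) / len(members)
        rows.append((g + 1, len(members), expected, observed))
    return rows


# Hypothetical test set of 20 subjects, 2 per decile.
demo_probs = [i / 20 for i in range(20)]
demo_outcomes = [0, 1] * 10
demo_rows = decile_table(demo_probs, demo_outcomes)
```

A well-calibrated model shows expected and observed values that track each other across the rows, which is what Table 2 is intended to display.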
The final Cox logistic model used 18 variables including PGA, JIA category, the pattern of joint involvement, and other routine measures that were assessed a median of 2 days after diagnosis. Also included were the patient/parent global assessment, pain in last week, French ethnicity, and joint swelling reported by parents (Table 3). Using this final model, a child’s probability of early remission while taking medications with conventional treatment, expressed as a percentage, is given by $100 \times e^{A}/(1 + e^{A})$, where $e^{A}$ is the natural antilogarithm of $A$ and $A$ is calculated as follows: $A = -0.23 + 0.91 \times (\text{Cox predictor for time to remission on medications}) + 0.12 \times (\text{Cox predictor for time to inactive disease})$. The probability of early remission with conventional treatment for any child with JIA can be obtained using the online calculator available at andrew-j-henrey.shinyapps.io/JIA_Remission_Calc.
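The final step of this calculation is an ordinary logistic transformation and can be sketched in Python as below. The coefficients (−0.23, 0.91, 0.12) come from the formula above; how the 2 Cox predictors are derived from the 18 baseline variables is not given in this text, so the inputs here are placeholders and the function name is an assumption. The online calculator remains the intended way to obtain a prediction for a real patient.

```python
import math


def remission_probability_pct(cox_time_to_remission, cox_time_to_inactive):
    """Percent probability of early remission from the 2 Cox predictors.

    Implements 100 * e^A / (1 + e^A), written in the algebraically
    equivalent form 100 / (1 + e^(-A)).
    """
    a = -0.23 + 0.91 * cox_time_to_remission + 0.12 * cox_time_to_inactive
    return 100.0 / (1.0 + math.exp(-a))


# Placeholder inputs only: with both predictors at 0, A is the intercept.
baseline_pct = remission_probability_pct(0.0, 0.0)
```

Because both coefficients are positive, a larger value of either Cox predictor raises the estimated probability of early remission.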
If instead of considering the probability of remission, one wished to use model results as a dichotomous diagnostic test for non-remission, the cutoff of < 0.5 probability of remission results in a sensitivity of 0.71 and specificity of 0.57, while the cutoff of < 0.25 results in a sensitivity of 0.20 and specificity of 0.76. If one accepts that a probability of remission of < 0.25 with conventional treatment would justify aggressive initial treatment, JIA category alone identifies 5% of subjects as candidates for aggressive treatment (primarily subjects with RF-positive polyarthritis) and 70% of those subjects did not attain early remission with conventional treatment. Our top Cox logistic model identifies 14% of subjects as candidates for aggressive treatment, and 77% of them did not attain early remission. For reference, our previously published model11 identifies 13% of subjects in the current sample as high risk for a severe disease course and 77% of them did not attain early remission. The overlap of subjects identified by the 2 models is 39%.
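When the model is used as a dichotomous test for non-remission, a subject is "test positive" if the predicted probability of remission falls below the cutoff, and the condition being detected is failure to attain remission. The Python sketch below shows how sensitivity and specificity follow from that framing; the function name and the toy data are hypothetical and do not reproduce the study's figures.

```python
def sens_spec_nonremission(probs, remitted, cutoff):
    """Sensitivity and specificity for detecting NON-remission.

    probs    : predicted probabilities of remission
    remitted : 1 if the subject attained early remission, else 0
    cutoff   : subjects with prob < cutoff are flagged
    """
    tp = fn = tn = fp = 0
    for p, r in zip(probs, remitted):
        flagged = p < cutoff
        if r == 0:                 # did not remit: the target condition
            if flagged:
                tp += 1
            else:
                fn += 1
        else:                      # remitted
            if flagged:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): 3 non-remitters followed by 3 remitters.
toy_probs = [0.1, 0.2, 0.4, 0.6, 0.8, 0.3]
toy_remitted = [0, 0, 0, 1, 1, 1]
sens_25, spec_25 = sens_spec_nonremission(toy_probs, toy_remitted, 0.25)
sens_50, spec_50 = sens_spec_nonremission(toy_probs, toy_remitted, 0.50)
```

Lowering the cutoff flags fewer subjects, so sensitivity for non-remission cannot rise and specificity cannot fall, which is the trade-off clinicians face when choosing a threshold for aggressive initial treatment.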
Our sensitivity analyses in JIA patient subgroups did not improve prediction accuracy; c-index values ranged from 0.64 to 0.68 (Supplementary Tables 1–5, available with the online version of this article).
DISCUSSION
We used clinical and routine laboratory data from a large prospective inception cohort of children with JIA to develop a model to predict attainment of early disease remission with conventional treatment. With a c-index of 0.69, our best performing model was better than using JIA category alone but fell short of the threshold of > 0.70 recommended for clinical prediction tools12. However, this model identified 3 times as many children with a low chance of remission compared to using the JIA category alone (14% vs 5%). This may well be the limit of prediction accuracy attainable with routine clinical and laboratory variables, and novel biomarkers may be required to improve prediction accuracy. Our model can serve as a benchmark to evaluate the value added by potential JIA laboratory/imaging biomarkers in future research.
Our model includes 2 patient-reported outcomes, the parent’s/patient’s global assessment of well-being and pain severity in the last week. Thus, in addition to their value as outcomes, these 2 components of the CHAQ17 help predict response to conventional treatment, even after accounting for clinical and laboratory variables.
The model includes 2 unexpected variables that increase the likelihood of early remission: French ethnicity and joint swelling observed by parents. This may mean that JIA is less severe in children with French ethnicity, but it could also be a reflection of differences in treatment approach or physicians’ assessment of attainment of remission in French Canada relative to English Canada. Easily visible swollen joints such as knees may be more frequently involved in children with mild disease (e.g., oligoarthritis) or prompt earlier referral and treatment. A longer time from onset to diagnosis decreased the chances of early remission.
The interaction between treatment intensity and attainment of early remission merits careful consideration. In our study, patients who did not attain remission were more often prescribed early DMARD and systemic corticosteroids. This suggests that physicians identified their disease as severe, but the increased treatment was not uniformly successful in attaining early remission. Incorporating initial use of DMARD or corticosteroids as predictors in our current study did not improve the accuracy of the models. It could be argued that our excluding patients who received early aggressive treatment left only subjects with relatively benign disease in the study. The advantage of using the ReACCh-Out dataset in our study is that patients were diagnosed in 2005–2010, a time when early aggressive treatment was infrequent in Canada. There were only 3 subjects excluded because of use of a biologic agent within 6 months of diagnosis and no patient was excluded because of early triple DMARD therapy. Because this represents < 0.5% of eligible subjects, we believe their exclusion did not significantly bias the population toward benign disease. It is likely that including response to treatment at 3 or 6 months after diagnosis will improve prediction of remission, but we chose not to include this information in our present study because our goal was to predict attainment of early remission at diagnosis to make the most of the hypothesized window of opportunity to change disease trajectory.
JIA is a heterogeneous group of disorders and current JIA categories may not be the best way to categorize subjects, thus we conducted sensitivity analyses in alternative subject groupings, excluding JIA category as a predictor. These sensitivity analyses found no increased prediction accuracy relative to our main model.
The strength of our study is that we used prospectively collected data from a large cohort of patients enrolled shortly after diagnosis and analyzed candidate predictors usually available in routine clinical settings, always testing model accuracy in subjects not included in model development to prevent overfitting. However, our study has some limitations. First, it is conceivable that other information not collected in the ReACCh-Out cohort may improve prediction. Second, missing data are unavoidable in large cohorts in regular practice; we have addressed missing data with multiple imputation, and by having a panel of 3 pediatric rheumatologists adjudicate remission. Third, our definition of remission while taking medications may have missed some subjects who had inactive disease for 6 months shortly after diagnosis, if they had active disease at the 12-month visit. We felt that such short-lived episodes of disease control were not very meaningful. Fourth, our findings may not be generalizable to other countries (e.g., French ethnicity may only be relevant in some countries). Fifth, the patient/parent global and pain severity assessments included in the model were completed by the patient or a parent (for younger children) and this mix of patient and parent-reported scores may be problematic; reassuringly, although parent reports may over- or underrepresent the child’s pain, on average the difference is only 0.04 cm in a 10-cm VAS18. The fact that some patients/parents answered questionnaires in French should not have interfered with our outcome assessment, because pain scores and parent’s global assessments did not form part of our definition of inactive disease. Differences between French and English may have increased variability in assessment of those variables as predictors in the models; however, the linguistic equivalence of the questions was judged adequate by bilingual speakers, and the French version of the CHAQ has been validated by Pouchot, et al19. 
Last, conventional treatment in our study was consistent with the 2011 ACR treatment recommendations2, but treatment evolves over time and treatment recommendations are updated.
Because our model did not reach conventional thresholds for accuracy of clinical prediction tools and it is unclear how it will perform in other settings, routine adoption in clinical practice is not advisable at this time. An alternative is to use our previously published model, which predicts a severe disease course11, recognizing that it does not predict early remission directly. The argument in its favor is that the earlier model is very accurate and children at high risk of a severe disease course may benefit from more aggressive initial treatment even if their short-term response to conventional treatment is less predictable. It will be important to assess how both prediction models fare when tested in other inception cohorts.
Some physicians may find it helpful to use the current model’s predictions or the univariable associations reported in Table 1 in selected cases to augment their clinical judgment. Because the likelihood of remission estimated by the model closely paralleled the observed frequency of remission, the model estimates could be shared with families in Canada as a starting point for discussing the choice of initial treatment.
By combining clinical and laboratory findings at diagnosis, we developed a model that estimates the probability of early remission with conventional treatment for each child with JIA and helps identify children with a low chance of remission. The model was superior to using JIA category alone but fell short of accepted thresholds for performance of clinical prediction tools. Importantly, this constitutes proof of principle that systematic study of existing JIA cohorts can generate new information to assist with treatment decision making in individual patients. Further research is critical to improve the accuracy of predictions and may include the use of novel laboratory and imaging (ultrasound, magnetic resonance imaging) biomarkers and advanced modeling techniques; our results can be used as a benchmark to evaluate the value added by those novel approaches.
ONLINE SUPPLEMENT
Supplementary material accompanies the online version of this article.
Acknowledgment
Our greatest appreciation goes to the Canadian children and their families who volunteered their time and information to make the ReACCh-Out study possible.
APPENDIX
List of study collaborators. We appreciate the contribution of the following additional investigators in the Research in Arthritis in Canadian Children emphasizing Outcomes Study (ReACCh-Out): Roxana Bolaria, Katherine Gross, Stuart E. Turvey, David Cabral, Ross Petty, University of British Columbia, Vancouver, British Columbia; Janet Ellsworth, the Stollery Children’s Hospital and University of Alberta, Edmonton, Alberta; Nicole Johnson, Paivi Miettunen, Heinrike Schmeling, the Alberta Children’s Hospital and University of Calgary, Calgary, Alberta; Maggie Larché, McMaster University, Hamilton, Ontario; Deborah M. Levy, Ronald M. Laxer, Debbie Feldman, Lynn Spiegel, Rayfel Schneider, Shirley M.L. Tse, Earl Silverman, Bonnie Cameron, Rae S.M. Yeung, Hospital for Sick Children and University of Toronto, Toronto, Ontario; Johannes Roth, Michele Gibbon, Children’s Hospital of Eastern Ontario and University of Ottawa, Ottawa, Ontario; Anne-Laure Chetaille, Jean Dorval, Centre Hospitalier Universitaire de Laval and Université Laval, Quebec City, Quebec; Gilles Boire, Centre Hospitalier Universitaire de Sherbrooke and Université de Sherbrooke, Sherbrooke, Quebec; Sarah Campillo, Claire LeBlanc, McGill University Health Centre and McGill University, Montréal, Quebec; Elie Haddad, Claire St. Cyr, CHU Ste. Justine and Université de Montréal, Montréal, Quebec; Suzanne E. Ramsey, Elizabeth Stringer, IWK Health Centre and Dalhousie University, Halifax, Nova Scotia, Canada.
- Accepted for publication September 13, 2018.
REFERENCES
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- 7.
- 8.
- 9.
- 10.
- 11.
- 12.
- 13.
- 14.
- 15.
- 16.
- 17.
- 18.
- 19.