Abstract
Objective. To identify clinical features that define disease activity in pediatric localized scleroderma (LS), and determine their specificity and importance.
Methods. We conducted a multicenter prospective study of patients with active and inactive LS skin lesions. A standardized evaluation of a single designated study lesion per subject was performed at 3 visits. We evaluated the pattern and correlation between assessed features and physician’s global assessments of activity (PGA-A).
Results. Ninety of 103 subjects had evaluable data; 66 had active and 24 inactive disease. Subjects had similar age of onset, sex, and disease patterns. Linear scleroderma was the most common subtype. Features specific for active disease included erythema, violaceous color, tactile warmth, abnormal skin texture, and disease extension. Scores for these variables changed over time and correlated with PGA-A of the lesion. Active and inactive lesions could not be distinguished by the presence or level of skin thickening, either of lesion edge or center. However, in active lesions, skin thickening scores did correlate with PGA-A scores. Regression analysis identified the combination of erythema, disease extension, violaceous color, skin thickening, and abnormal texture as predictive of PGA-A at study entry. Damage features were common irrespective of activity status.
Conclusion. We identified variables strongly associated with disease activity, expanding upon those used in current measures, and determined their relative importance in physician activity scoring. Skin thickening was found to lack specificity for disease activity. These results will help guide development of a sensitive, responsive activity tool to improve care of patients with LS.
Localized scleroderma (LS) is the most common childhood form of scleroderma, an autoimmune disease whose pathology includes inflammation, vasculopathy, and fibrosis1,2,3. The disease often lasts throughout childhood either with a persistently active or a remitting and relapsing course, and many continue to have active disease as adults4,5,6,7. Deep tissue and extracutaneous involvement is common2,8, and pediatric-onset LS is associated with higher damage levels than adult-onset disease7. In contrast to the widely held impression that pediatric LS is a “benign disease” with a “good outcome,” its chronicity and extracutaneous involvement cause important longterm morbidity2,6,9. Sequelae include limb and facial hemiatrophy, severe disfigurement, arthritis, seizures, uveitis, psychosocial complications, and rarely, death2,10,11,12.
There are few options for treating fibrosis, so treatment aims to minimize progression of fibrosis and other damage by controlling inflammation. While systemic immunosuppression is considered the standard of care for moderate to severe pediatric LS13,14,15, comparative effectiveness and other large treatment studies are needed to identify the most effective regimens16. Longterm disease monitoring is needed because active disease can persist for decades4,6, and relapses are common (15–53% of cases)17,18,19,20,21. A sensitive disease activity tool to track disease status and measure treatment response is therefore needed.
Assessment of pediatric LS disease activity is difficult because routine laboratory tests are not reliable biomarkers of active disease in most patients22. While imaging modalities can aid evaluation, especially of deeper tissue involvement8,23,24,25,26, routine evaluation relies on clinical assessment. Clinical measures to evaluate disease activity or severity (activity and damage) include the Modified Skin Score (MSS, modeled after the modified Rodnan skin score); Dyspigmentation, Induration, Erythema, and Telangiectasia measure (DIET); LS Severity Index (LoSSI, and its modification, mLoSSI), and Computerized Skin Score (CSS)27,28,29,30,31,32. All these measures score skin thickening (ST) or induration. Additional scored features include erythema (DIET, LoSSI), disease extension (LoSSI, CSS), dyspigmentation (DIET), telangiectasia (DIET), and the extent of the scored feature(s). The CSS precisely assesses the extent of a single lesion using serial tracings, while other measures include a limited extent assessment across all affected anatomic regions (MSS, LoSSI)27,29,31. All these measures can track treatment response, but the specificity, sensitivity, and relative importance of individual lesion features as indicators of activity have not been assessed.
To identify features associated with activity, we conducted a multicenter study of 90 pediatric LS subjects classified by their treating physician as having either active or inactive disease. Our study focused on detailed clinical assessment of a single lesion (study lesion) per subject that was prospectively tracked in 84 subjects to determine the correlation of features with physician’s global assessment (PGA) of disease activity. Our study design allowed us to identify features specific for active disease, and to determine each one’s relative importance in physician activity assessment.
MATERIALS AND METHODS
This prospective observational multicenter cohort study was conducted by the Localized scleroderma Clinical and Ultrasound Study group (LOCUS), a multidisciplinary collaboration organized through the Childhood Arthritis and Rheumatology Research Alliance (CARRA). Ten pediatric rheumatologists and 2 dermatologists from 9 academic centers participated. Numerous face-to-face meetings and conference calls were held to establish and standardize scoring methods. These included reviews of photographs of pediatric LS lesions that had erythema and violaceous color, and skin thickness assessment workshops. Inclusion criteria were a diagnosis of pediatric LS confirmed by a pediatric rheumatologist or dermatologist, and classified according to the Padua Preliminary Classification Criteria33, with disease onset before the 16th birthday.
Physicians classified their subjects as having active or inactive disease, and enrolled them at a minimum ratio of 2:1 active:inactive. Treatment was at the discretion of the subject’s physician, without restriction. Study visits coincided with clinical care visits, which took place every 3 ± 1 months (baseline, 3, and 6 months). Each site obtained institutional ethics approval for the study, assent, and consent forms, which included our intent to publish the results of the study and measures to protect confidentiality. Data identified only by subject number were analyzed at the coordinating center (Hackensack University Medical Center, ethics approval number 07.02.055).
Because LS lesions vary in their features, a single lesion per subject was selected to serve as the study lesion for all visits. For active subjects, the investigator designated the most active, readily evaluable lesion, and specified the features that indicated that the study lesion was active. For subjects with inactive disease, any lesion could be designated as the study lesion.
PGA are reliable, quantitative measures that are sensitive to change when physicians are well-trained and experienced34,35. The study physicians have extensive experience evaluating and treating pediatric LS, collectively following over 500 patients. Because of the lack of an objective measure to determine disease activity, we used PGA assigned by this group of physicians as the gold standard. To avoid interrater variability in scoring, the same investigator examined a given subject’s study lesion at all visits, scoring 11 features (Table 1) and 4 PGA (visual analog scales, 0–100 mm). The scored PGA were (1) activity of study lesion (anchors: not active, very active), (2) activity of subject’s overall disease (anchors: not active, very active), (3) level of damage of study lesion (anchors: no loss, severe loss), and (4) level of subject’s overall disease damage (anchors: no chronic change, severe chronic change).
Table 1. Scored lesion features.
Abnormal skin texture was defined as representing lesions that had an altered appearance and texture, such as an abnormally smooth, shiny, and/or waxy appearance. A composite variable, STmax, was created to represent the maximum score of skin thickening of the edge or center in some analyses. For each variable, a set of dichotomous variables was created for the regression analysis [normal or none (yes/no), mild (yes/no), moderate (yes/no), and severe (yes/no)]. Medical history, family history, and demographics were also collected.
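The composite and dichotomous variables described above could be derived as in the following minimal Python sketch. It assumes that the ordinal scores 0, 1, 2, and 3 correspond to none, mild, moderate, and severe. The function and constant names are illustrative and are not taken from the study's analysis code.

```python
# Assumed mapping of ordinal scores 0-3 to severity levels (illustrative).
LEVELS = ("none", "mild", "moderate", "severe")


def st_max(ste, stc):
    """Composite skin thickening score: the larger of the edge and center scores."""
    return max(ste, stc)


def dichotomize(score):
    """Expand one ordinal score into a set of yes/no (1/0) indicator variables."""
    if score not in range(len(LEVELS)):
        raise ValueError("score must be an integer from 0 to 3")
    return {level: int(score == i) for i, level in enumerate(LEVELS)}


# Hypothetical lesion: edge thickening scored 1, center thickening scored 3.
composite = st_max(1, 3)
indicators = dichotomize(composite)
```

In a regression, one level (typically "none") would be left out as the reference category so that each remaining indicator is interpreted relative to it.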
The 90 evaluated subjects had complete data at the first visit; some lacked data for later visits. Subjects with missing values were excluded from the analysis of those visits, but included in analyses for which they had complete data to maximize our available study sample. The sample size for Visit 1 analysis was 90; Visit 2, 83; and Visit 3, 79. From Visit 2 to 3, 78 subjects were evaluated.
Statistical analysis
Disease activity was assessed by initial physician classification of the subject as active or inactive, and PGA of activity (PGA-A). Disease activity scores were compared between active and inactive groups at each visit, and within each group across visits. For categorical variables, changes between groups at each visit were assessed using chi-square or Fisher’s exact test; comparisons within groups across time were assessed using Friedman’s test. Continuous data were assessed for normality using the Shapiro-Wilk test. Two-sided 2-sample t tests and/or Wilcoxon rank sum tests (non-normal analog) were used to evaluate differences in variables between groups at each visit, and 2-sided paired t tests or Wilcoxon signed-rank tests (non-normal analog) were used to evaluate differences within groups across visits. The relationship between variables was assessed by Pearson’s correlation coefficient. The correlation of variable scores with PGA-A scores was examined in bee-swarm box plots.
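For the categorical between-group comparisons, Fisher's exact test on a 2 × 2 table can be computed directly from the hypergeometric distribution. The sketch below uses only the Python standard library. The study does not state which software was used, and the counts in the example are hypothetical rather than study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g., active, inactive); columns are feature present, absent.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical counts: feature present in 40 of 66 active and 5 of 24 inactive lesions.
p_value = fisher_exact_two_sided(40, 26, 5, 19)
```

The same fixed-margin logic underlies library implementations. A chi-square test would usually be preferred when all expected cell counts are large.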
Multivariate linear regression was performed to examine the relationship between lesion features and PGA-A, the dependent variable. Incremental model building was used to determine the final model (evaluated by F test and adjusted R2). Regression coefficients allowed us to determine the relative ability (or weight) of each variable to explain the variation in PGA-A scores. For all analyses, p < 0.05 was considered significant.
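In standard notation, a linear model of this kind and the adjusted R2 used to compare candidate models can be written as below. The symbols are illustrative and are not defined in the study: x_ik are the dichotomous lesion feature indicators for lesion i, p is the number of indicators in the model, and n is the number of lesions.

```latex
\begin{align}
\text{PGA-A}_i &= \beta_0 + \sum_{k=1}^{p} \beta_k x_{ik} + \varepsilon_i \\
R^2_{\text{adj}} &= 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
\end{align}
```

Because the adjusted R2 penalizes each added indicator, it increases during incremental model building only when a new variable explains more variation than would be expected by chance.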
RESULTS
Characteristics of the study subjects
Of the 103 patients with pediatric LS enrolled, 13 were excluded because of incomplete data. Of the remaining 90 subjects, 66 had an active study lesion and 24 had an inactive study lesion. Most study lesions were located on a limb, with linear scleroderma the most common subtype (Table 2). Features most commonly cited by physicians as indicative of activity in the active study lesions at enrollment were erythema 41/66 (62%), induration 34/66 (51.5%), tactile warmth 24/66 (36%), and lesion extension 22/66 (33%). Distinct border 16/66 (24%), development of a new lesion 13/66 (20%), violaceous color 12/66 (18%), white color or hypopigmentation 12/66 (18%), and hyperpigmentation 10/66 (15%) were also cited.
Table 2. Study population characteristics.
Seventy-eight subjects completed 3 visits, 6 two visits, and 6 one visit. Most subjects were white (70/90) and female (71/90; Table 2). Mean age at disease onset was 7.7 years (SD 3.6). Groups did not differ in age at disease onset, race, antinuclear antibody (ANA) positivity, or extracutaneous morbidity (Table 2).
Active subjects had a shorter median disease duration than inactive subjects, lower frequency of prior treatment with methotrexate (MTX; Table 2), and shorter duration of prior MTX treatment (0.17 vs 2 yrs, p < 0.001). They were more likely to receive corticosteroid treatment during the study (active 67%, inactive 25%, p < 0.001). Active subjects had higher scores for PGA-A of the study lesion, while scores for PGA of damage (PGA-D) of the study lesion did not differ significantly among subjects with active versus inactive disease (Figure 1A). As expected, a similar difference between groups was also found at all visits in PGA-A of overall disease (data not shown; median active, inactive scores: Visit 1, 30.0, 0.25 mm; Visit 2, 18.1, 0.52 mm; Visit 3, 13.8, 0.50 mm; p < 0.001 at all visits). As was the case for damage assessment in the individual study lesions, PGA-D of overall disease at Visit 1 were similar (active 25.1, inactive 21.0 mm). However, in active subjects, PGA-D of overall disease scores remained stable across visits (Visit 2, 27.7; Visit 3, 24.8), but declined in inactive subjects (Visit 2, 15.1, p NS; Visit 3, 11.5, p = 0.007).
Figure 1. Physician’s global assessment (PGA) and lesion feature scores for active and inactive lesions. The median PGA of study lesion (Panel A, mm, range 0–100) and frequency of assessed lesion features at each visit (Panels B–D) are shown for active (dark columns) and inactive (light columns) study lesions. Significant differences between the active and inactive lesions are indicated with an * (p < 0.001), ^ (p < 0.01), or + (p < 0.05) above the column. Not shown is new lesion, which could only be scored at Visit 1 (active 13, inactive 0, p < 0.001). Subject numbers at visits (V): Visit 1, 66 active, 24 inactive; Visit 2, 60 active, 23 inactive; Visit 3, 58 active, 21 inactive. PGA-Activity: PGA of activity of study lesion; PGA-Damage: PGA of damage of study lesion; Eryth: erythema; Viol: violaceous color; Warmth: tactile lesion warmth; Larger: enlargement of study lesion; Abnl Text: abnormal skin texture; STE: skin thickening of lesion edge; STC: skin thickening of lesion center; Border: distinct lesion border (visible or palpable); Palp B: distinct palpable lesion border; Visible B: distinct visible lesion border; DA: dermal atrophy; SQA: subcutaneous atrophy; Dyspig: dyspigmentation (hyper- or hypopigmentation).
Frequency of lesion features
The frequencies of assessed lesion features were calculated for active and inactive study lesions, with differences found for the following features: development of a new lesion, enlargement of study lesion, erythema, violaceous color, tactile warmth, abnormal skin texture, and distinct lesion border (Figure 1, B–D). Development of a new lesion, erythema, and violaceous color were more frequent in active lesions at all visits (a new lesion could be scored only at Visit 1), with new lesion and violaceous color exclusive to active lesions (Figure 1, B–D; new lesion at Visit 1, active 13, inactive 0, p < 0.001). Tactile warmth and abnormal skin texture were more frequent in active lesions at 2 visits, with tactile warmth approaching significance at the remaining visit (tactile warmth, Visits 1 and 3, p < 0.001; Visit 2, p = 0.052). Enlargement of the study lesion was more frequent in active lesions at Visit 1 (Figure 1B, p < 0.01), while distinct lesion border, either palpable or visible, occurred at a lower frequency in active lesions at Visit 3 (Figure 1D, p < 0.01).
The frequency of skin thickening of the lesion edge (STE), skin thickening of lesion center (STC), dermal atrophy, subcutaneous atrophy, and dyspigmentation did not differ between active and inactive lesions at any visit (Figure 1, B–D). STC was present in most study lesions (active 56.9–68.7%, inactive 52.4–58.3%), while STE was scored in up to half the study lesions (active 27.6–52.2%, inactive 23.8–37.5%). Dermal atrophy, subcutaneous atrophy, and dyspigmentation were present in the majority of lesions at all visits (Figure 1, B–D).
Lesion feature scores in active and inactive study lesions
We evaluated whether there were differences between groups in the distribution of scores for features scored on an ordinal (0–3) scale at each visit (Table 3). At all visits, active and inactive lesion groups differed in the distribution of erythema and violaceous color scores (Table 3, erythema p = 0.016 to < 0.001; violaceous p = 0.013 to 0.042). Groups did not differ in the distribution of STE, STC, subcutaneous atrophy, or dyspigmentation scores at any visit.
Table 3. Distribution of lesion feature scores by visit and activity status.
Changes in the mean value of lesion feature scores across visits were compared between active and inactive study lesions. Active and inactive lesions differed in the level of change in scores for erythema, violaceous color, tactile warmth, abnormal skin texture, and distinct lesion border (p < 0.001 to 0.043). Active and inactive lesions did not differ in the level of change in scores for STE, STC, dermal atrophy, subcutaneous atrophy, or dyspigmentation.
Lesion feature scores in active study lesions
Within the active study lesion group, scores for erythema, violaceous color, tactile warmth, abnormal skin texture, and presence of a distinct border changed across visits (p < 0.001 to 0.035). In contrast to the lack of difference between active and inactive lesions in the level of skin thickening score change over time, scores for both STE and STC changed from Visit 1 to Visit 3 within the active study lesion group (p = 0.008, 0.011, respectively). Scores for dermal atrophy, subcutaneous atrophy, and dyspigmentation did not change across visits in the active study lesion group (data not shown). None of the lesion features had a change in scores across visits in the inactive study lesion group (data not shown).
We examined the correlations among disease features to evaluate their uniqueness. In active lesions, erythema and violaceous color had low correlation coefficients at all visits, suggesting they represent distinct features (visits 1, 2, and 3, Pearson’s r 0.237, −0.122, −0.176, respectively). Moderate correlations were found between skin thickening of lesion edge and center at all visits (visits 1, 2, and 3, Pearson’s r 0.440, 0.416, 0.452, respectively), suggesting that they are related. We evaluated the correlation of STE, STC, and a composite skin thickening variable representing the maximum score of STE or STC (STmax) with PGA-A of the study lesion. All 3 variables had similar levels of correlation with PGA-A of the study lesion at all visits (STmax, STE, STC Pearson’s r: Visit 1 — 0.222, 0.227, 0.151; Visit 2 — 0.389, 0.284, 0.272; Visit 3 — 0.455, 0.285, 0.462, respectively). We therefore used STmax to evaluate the correlation between ST and PGA-A.
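The correlation of the composite skin thickening score with PGA-A can be sketched as follows, using only the Python standard library. The scores are hypothetical and serve only to show the calculation. The variable and function names are illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for five lesions (not study data).
ste = [0, 1, 2, 0, 3]                    # skin thickening of lesion edge, 0-3
stc = [1, 1, 1, 2, 2]                    # skin thickening of lesion center, 0-3
pga_a = [5.0, 20.0, 35.0, 30.0, 60.0]    # PGA-A of the study lesion, mm

# STmax is the per-lesion maximum of the edge and center scores.
st_max = [max(e, c) for e, c in zip(ste, stc)]
r = pearson_r(st_max, pga_a)
```

Pearson's coefficient treats the 0 to 3 scores as interval data. A rank-based coefficient such as Spearman's is a common alternative for ordinal scores, although the study reports Pearson's r.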
At all visits, active study lesions with erythema, abnormal skin texture, or skin thickening (STmax) had higher PGA-A scores than those without these features (p < 0.05 to < 0.001, Figure 2). Active study lesions with disease extension (new or larger lesion), violaceous color, tactile warmth, or distinct border had higher PGA-A scores than lesions without these features for at least 1 visit (Figure 2; violaceous color, Visit 1, p = 0.003; warmth, Visit 2, p = 0.088, and Visit 3, p = 0.012; distinct lesion border, Visit 1, p < 0.001, and Visit 2, p = 0.020). Neither the presence nor the degree of dyspigmentation or subcutaneous atrophy was associated with PGA-A scores (data not shown).
Figure 2. Box plots of lesion feature scores versus PGA-A of the study lesion. The lesion feature scores for all active lesions were plotted against their PGA-A of the study lesion (mm, range 0–100) for each visit. Erythema (Panel A) and skin thickening (maximal skin thickening, Panel C) could be scored from 0 to 3, while new or larger lesion [new lesion or larger lesion size (disease extension), Panel B] and abnormal skin texture (Panel D) were scored as 0 (none) or 1 (present). Boxes indicate the 25th and 75th percentiles, the horizontal line marks the mean, and T brackets indicate the 5th and 95th percentiles of the PGA-A of the study lesion scores. Significant differences for PGA-A between lesion feature scores of 0 versus > 0 are indicated with an * (p < 0.001), ^ (p < 0.01), or + (p < 0.05) on the visit number (V). There were 66 active lesions at Visit 1, 60 at Visit 2, and 58 at Visit 3. PGA-A: physician’s global assessment of activity (of study lesion).
Ability of lesion features to predict PGA-A of the study lesion
Multivariate linear regression was used to predict PGA-A of the study lesion at each visit. We used STmax to evaluate the contribution of ST, and disease extension to represent the contribution of development of a new lesion and enlargement of an existing lesion. The best model contained the following variables: erythema, disease extension, violaceous color, maximal skin thickening, and abnormal skin texture [F (5,90) = 15.1, p < 0.0001; adjusted R2 = 0.588 at Visit 1, Table 4].
Table 4. Regression model for physician’s global assessment of activity (PGA-A) of study lesion.
At Visit 1, the variables in the regression model explained 58.8% of the variation in PGA-A scores. More severe levels of disease features were associated with larger increases in PGA-A scores (Table 4). The effects of erythema and skin thickening were disproportionately higher at more severe levels of disease. Erythema had the largest effect on PGA-A score; mild and moderate disease levels were associated with 12- and 28-point increases in PGA-A scores, respectively (Table 4). Patients with mild and moderate violaceous color had PGA-A scores that were, on average, 13 and 19 points higher than patients lacking violaceous color, while disease extension, abnormal skin texture, and severe skin thickening were associated with 12-, 11-, and 22-point increases in PGA-A scores, respectively.
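Because the model is linear, the increments quoted above add up for a lesion that shows several features, assuming the model contains no interaction terms (the text does not mention any). The Python sketch below illustrates this with the rounded Visit 1 values given in the paragraph. The intercept and the coefficients for levels not quoted in the text are omitted, so the result is an increase relative to a lesion lacking these features, not a predicted PGA-A score. The names are illustrative.

```python
# Rounded Visit 1 increases in PGA-A (points) as quoted in the text,
# each relative to a lesion lacking that feature.
VISIT1_INCREMENTS = {
    ("erythema", "mild"): 12,
    ("erythema", "moderate"): 28,
    ("violaceous color", "mild"): 13,
    ("violaceous color", "moderate"): 19,
    ("disease extension", "present"): 12,
    ("abnormal skin texture", "present"): 11,
    ("skin thickening", "severe"): 22,
}


def summed_increment(features):
    """Sum the quoted increments for a dict of {feature name: level}.

    Raises KeyError for any feature level whose coefficient is not quoted
    in the text, rather than guessing a value.
    """
    return sum(VISIT1_INCREMENTS[(name, level)] for name, level in features.items())


# Hypothetical lesion with moderate erythema and disease extension.
increase = summed_increment({"erythema": "moderate", "disease extension": "present"})
```

For this hypothetical lesion the summed increase is 28 + 12 = 40 points above a lesion with none of the modeled features.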
At Visit 2, the variables in the regression model explained 32.5% of the variation in PGA-A scores. Erythema was the strongest predictor of PGA-A scores, followed by disease extension and skin thickening (Table 4). At Visit 3, the variables in the regression model explained 49.3% of the variation in PGA-A scores. Erythema remained the strongest predictor of PGA-A, with more severe levels associated with a disproportionately higher effect, followed by disease extension (Table 4). The effect of moderate levels of skin thickening on PGA-A scores increased slightly from Visit 2.
DISCUSSION
This is the first study, to our knowledge, to prospectively evaluate LS skin features for their association with active disease. Because we had physicians classify subjects at study entry as having active or inactive disease, we could identify features specific for activity. In addition, because scoring focused on a single lesion per subject, we could more accurately determine correlation between features and lesion PGA-A. No single feature was found ubiquitous to all active lesions. Instead, erythema, new disease extension, violaceous color, tactile warmth, and abnormal skin texture were identified as specific activity features, each present in a subset of active lesions. These features were more prevalent and scored at higher levels (for erythema and violaceous color) in active lesions, and similar to PGA-A scores, their scores declined over time. Scores for these features correlated with PGA-A, suggesting they can be used to track changes in activity level. As expected, dyspigmentation, dermal atrophy, and subcutaneous atrophy were not specific to active disease, and their scores did not correlate with PGA-A.
Skin thickening, a feature used in all current LS measures, was not found to be specific for active disease. It is possible our study period was too short to allow us to identify differences between active and inactive lesions in the level of skin thickening score change over time, which warrants future study. However, we were also not able to differentiate active from inactive lesions by either the level of skin thickening scores, or the frequency of skin thickening of the lesion edge or center, even at the first visit when inactive subjects had a median 2 years longer disease duration than active subjects. We suspect that the lack of specificity reflects the dual nature of skin thickening, representing induration in the active, inflammatory phase of LS, and fibrosis in the later damage stage36. We therefore suggest that skin thickening should be used in conjunction with more specific activity features when assessing disease activity level.
At the highest activity levels, the combination of erythema, disease extension, violaceous color, abnormal skin texture, and skin thickening explained more than half of the variation in PGA-A in multiple regression models. More severe levels of lesion features were related to higher PGA-A scores. Erythema was the strongest predictor across all visits. The regression model was less successful in explaining the variation in PGA-A scores for the second visit compared to the first and third.
Limitations of our study include that there is no gold standard for activity. We relied on PGA scores, as has been done for the development of other rheumatic disease measures34,35,37,38,39. While the global assessments of activity from all physicians clearly differentiated active from inactive subjects, physicians may have differed in their weighting of certain features at different activity levels. This may have contributed to loss of significance of some features in regression analysis at later visits, especially for uncommon features such as violaceous color and tactile warmth. Given their specificity, further study of these features for assessing activity is warranted.
Poorer regression performance at later visits may be related to our scoring methodology. The abnormal skin texture feature was intended to capture waxy lesions, but our definition could have led us to include some types of dermal atrophy. At later visits, the frequency of waxy lesions, unlike that for dermal atrophy, could have declined in response to treatment, resulting in poorer model performance. For future studies, more specific descriptors should be tested.
Another potential scoring problem was that we scored the worst rather than the average level of each feature in the study lesion, which may have led to large stepwise drops in scores rather than gradual decreases, thereby limiting our ability to detect intermediate levels of activity. We did not assess the overall extent of active skin involvement, nor did we use disease extent as a score multiplier as is done in the Psoriasis Area and Severity Index measure40,41. These decisions may have produced lower model performance at later visits.
Our study provides important information for clinicians caring for patients with pediatric LS, including the lack of a universal activity feature and lack of specificity of skin thickness for active disease. Accurate identification and assessment of activity requires the evaluation of several lesion features. Damage, both atrophy and dyspigmentation, was found to commonly coexist with active disease even at study entry. The presence of damage features should therefore not be interpreted as indicating inactive disease. Because relapses are common, ongoing careful assessment of these patients for disease activity is an essential part of clinical care. The findings from this study should facilitate development of a sensitive, specific, and responsive activity tool. Such a tool should improve care, enable controlled treatment trials to be conducted, and improve longterm outcomes for this often severely damaging disease.
Acknowledgment
We thank Drs. Thaschawee Arkachaisri and Thomas Medsger for providing formal instruction in methodology for assessing skin thickness at our pre-study meetings. We thank Eseng Lai for help with data analysis and manuscript preparation, Benjamin Li for title suggestions, and Mary Ellen Riordan and Justine Zasa for their help in conducting our study.
Footnotes
Supported by the Arthritis Foundation New Jersey Chapter Grant and the Childhood Arthritis and Rheumatology Research Alliance.
Accepted for publication June 25, 2018.