Abstract
Objective We aimed to evaluate the psychometric properties of the Scleroderma Skin Questionnaire (SSQ), a novel patient-reported outcome (PRO) to assess systemic sclerosis (SSc)–related skin symptoms.
Methods Participants were recruited from the SSc Collaborative National Quality and Efficacy Registry (CONQUER). Internal consistency was determined using Cronbach α and McDonald ω total (ωt). Correlations of the SSQ with the modified Rodnan skin score (mRSS), physician global assessment (PGA), Scleroderma Health Assessment Questionnaire, 29-item Patient-Reported Outcomes Measurement Information System (PROMIS-29), and patient global assessment were computed to assess criterion, convergent, and divergent validity. Correlations were also assessed between patients’ self-reported recall of skin changes over the past 6 months (“SSQ 6-Month”) and 6-month change in mRSS.
Results The SSQ was administered to 799 adults (mean age 52.7; 83% female) enrolled in CONQUER. Cronbach α was 0.90 and ωt was 0.92, indicating high internal consistency. The SSQ was moderately correlated with mRSS (r 0.56), with stronger correlations in diffuse (r 0.54) vs limited cutaneous subtypes (r 0.24; all P < 0.05). The SSQ was also moderately-to-strongly correlated with PROMIS-29 physical function (r −0.50) and pain interference subscales (r 0.61), strongly with Health Assessment Questionnaire score (r 0.63) and severity subscale (r 0.62), and moderately with PGA SSc activity score (r 0.48; all P < 0.05). SSQ 6-Month correlated weakly with the 6-month change in mRSS (r 0.26; P < 0.05).
Conclusion SSQ demonstrated high reliability and moderate correlation with mRSS and legacy PROs. This study provides initial support for SSQ, but not SSQ 6-Month, to assess skin symptoms in patients with SSc.
Systemic sclerosis (SSc) is characterized by vascular, fibrotic, and immunologic manifestations in multiple organ systems.1 Skin thickening, a hallmark feature, is assessed using the semiquantitative modified Rodnan skin score (mRSS).2 Worsening mRSS is associated with higher morbidity and mortality.3 However, mRSS is limited as an outcome measure because it has only moderate interrater reliability,4 does not capture all relevant clinical features of SSc skin disease, and must be assessed by an expert at point of care. Patient-reported outcomes (PROs), by contrast, do not require in-person visits and capture an individual’s experience with symptoms and quality of life (QOL).5 In SSc, both disease-specific and generic PROs are used to understand global symptom burden, and validated, organ-specific PROs are used to study organ-targeted treatments.6 There is an ongoing need for a skin-specific PRO that captures subjective skin symptoms among patients with SSc.
The purpose of this study is to assess the psychometric properties of the Scleroderma Skin Questionnaire (SSQ), a 2-part instrument that includes cross-sectional and retrospective components. This novel PRO was initially developed in 2012 for the Prospective Registry of Early Systemic Sclerosis to assess SSc skin symptoms, but its performance characteristics have not yet been described.7 Part 1 of this PRO, which we will refer to as SSQ, includes questions related to a patient’s current symptoms, and part 2, which we refer to as SSQ 6-Month, asks subjects to retrospectively report how their symptoms have changed over the previous 6 months. The full SSQ has been administered to participants enrolled in the Collaborative National Quality and Efficacy Registry (CONQUER), a US-based multicenter registry of adults with early limited cutaneous (lc) and diffuse cutaneous (dc)SSc.8
METHODS
Study population. Subjects were recruited from CONQUER, a registry that enrolls patients ≥ 18 years old who fulfill the 2013 American College of Rheumatology (ACR)/European Alliance of Associations for Rheumatology (EULAR) classification criteria for SSc9 and have a disease duration < 5 years from onset of the first non-Raynaud phenomenon (RP) symptom. Our analysis includes 18 sites across 13 US states and the District of Columbia. Nine generic and organ-specific PROs, including the full SSQ, are collected every 6 months, and participants have an mRSS assessment performed at each visit.8 Study data are housed in a REDCap (Research Electronic Data Capture) system at the University of Utah Data Coordinating Center.8
Measures. Measures included mRSS, physician global assessment (PGA), Scleroderma Health Assessment Questionnaire (SHAQ), 29-item Patient-Reported Outcomes Measurement Information System (PROMIS-29), and patient global assessment (PtGA). The validity and scoring of these PROs have been described previously.6,10-14 mRSS and PGA were chosen to evaluate whether the SSQ is correlated with physician assessment of skin disease and global disease burden. SHAQ, PROMIS-29, and PtGA were chosen to evaluate whether SSQ is correlated with self-assessment of physical disability, health status, and global disease burden.
The PtGA comprises a single question about overall health in the past week, with scores ranging from 0 to 10, in which a higher score signifies worse health. The PGA includes questions about overall health, SSc disease activity, and global assessment of damage from SSc, each scored from 0 to 10, in which higher scores indicate worse physician-rated outcomes.
The SHAQ includes the Health Assessment Questionnaire (HAQ) and visual analog scales for RP attacks, finger ulcers, intestinal symptoms, breathing symptoms, and overall disease severity. Each symptom is scored from 0 to 3, with higher scores indicating worse symptoms.6,10-12
The PROMIS-29 includes 7 core domains and an 11-point rating scale for pain intensity. Norm-based scores were calculated such that a score of 50 ± 10 represents the mean (SD) of the general US population. On symptom-oriented domains (anxiety, depression, fatigue, pain interference, and sleep disturbance), higher scores represent worse symptoms. On the function-oriented domains (physical functioning and social role), higher scores represent better functioning.6,13,14 Data were included in the current study if all questions for each domain were answered.
SSQ and SSQ 6-Month. The full SSQ is a 2-part questionnaire. The first part (SSQ) asks respondents to recall 6 skin symptoms (tight, painful, red, hard, itchy, rigid/stiff) over the past 7 days. The SSQ 6-Month asks respondents to compare the same 6 symptoms over the past 7 days to 6 months prior. Both SSQ and SSQ 6-Month are assessed using a 5-point Likert scale, and item responses are averaged to obtain scores ranging from 0 to 4, in which higher scores indicate worsening symptoms (Supplementary Figure S1, available with the online version of this article). At least 5 items must be answered to calculate a score.
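The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the registry's scoring code: it assumes items are coded 0 to 4, that a missing item is represented as `None`, and that the score is the mean of the answered items when exactly 5 of 6 are answered (the article does not state how a single missing item is handled). The names `SSQ_ITEMS`, `MIN_ANSWERED`, and `ssq_score` are hypothetical.

```python
from typing import Optional, Sequence

# Hypothetical item labels, following the six symptoms listed in the text.
SSQ_ITEMS = ("tight", "painful", "red", "hard", "itchy", "rigid_stiff")
MIN_ANSWERED = 5  # at least 5 of the 6 items must be answered


def ssq_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Return the mean of the answered items (each coded 0-4).

    Returns None when fewer than MIN_ANSWERED items were answered.
    Assumption: a single missing item is simply left out of the mean.
    """
    if len(responses) != len(SSQ_ITEMS):
        raise ValueError("expected one response per SSQ item")
    answered = [r for r in responses if r is not None]
    if any(r not in range(5) for r in answered):
        raise ValueError("responses must be integers from 0 to 4")
    if len(answered) < MIN_ANSWERED:
        return None
    return sum(answered) / len(answered)
```

For example, `ssq_score([0, 1, 2, 3, 4, 2])` averages all six items, whereas a questionnaire with only 4 answered items yields `None` and would be excluded from analysis.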
Statistical analysis. Demographics and baseline characteristics were compared between SSc subtypes. Continuous variables were summarized using means and SDs and compared between groups using 2-sample t tests. Categorical variables were summarized using counts and percentages and compared between groups using chi-square tests.
The response frequency distribution of each of the 6 symptoms of the SSQ was analyzed to understand the range of symptom severity. Floor effect was defined as > 15% of participants achieving the lowest (best) score, and ceiling effect was defined as > 15% of participants having the highest (worst) score.15
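As a minimal sketch of the floor and ceiling definitions above, the following Python function flags an item when more than 15% of responses fall at the lowest or highest score. It assumes responses are coded 0 (best) to 4 (worst); the function name `floor_ceiling` and the returned dictionary keys are hypothetical and are not taken from the study's SAS code.

```python
from typing import Dict, Sequence

THRESHOLD = 0.15  # an effect is flagged when > 15% of respondents sit at an extreme


def floor_ceiling(item_responses: Sequence[int],
                  lowest: int = 0,
                  highest: int = 4) -> Dict[str, object]:
    """Summarize floor and ceiling effects for one questionnaire item."""
    n = len(item_responses)
    if n == 0:
        raise ValueError("no responses supplied")
    floor_prop = sum(1 for r in item_responses if r == lowest) / n
    ceiling_prop = sum(1 for r in item_responses if r == highest) / n
    return {
        "floor_proportion": floor_prop,
        "ceiling_proportion": ceiling_prop,
        "floor_effect": floor_prop > THRESHOLD,
        "ceiling_effect": ceiling_prop > THRESHOLD,
    }
```

Applied to each of the 6 symptoms separately, this reproduces the per-item check described in the text.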
Internal consistency of SSQ was evaluated using Cronbach α and McDonald ω total (ωt). Cronbach α was chosen because it is one of the most commonly used tests for evaluating internal consistency.16 However, given the nonnormal distribution of the dataset, the conditions for Cronbach α were not met and therefore, a more robust ωt was calculated to confirm the findings of Cronbach α.16,17 Participants who completed ≥ 5/6 SSQ questions were used to assess internal consistency and reliability of the instrument. Factor analysis was conducted to determine redundancy of questions.
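For readers unfamiliar with Cronbach α, the standard formula is α = k/(k − 1) × (1 − Σ item variances / variance of the total score), where k is the number of items. The sketch below implements that formula in Python under simplifying assumptions: every row is a complete set of item responses (the study also included respondents with 5 of 6 items, and how the missing item was handled is not stated), and sample variances are used. McDonald ωt is not shown because it requires fitting a factor model. The function name `cronbach_alpha` is hypothetical.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach alpha for complete-case item responses.

    rows: one sequence per respondent, one value per item.
    """
    if len(rows) < 2:
        raise ValueError("need at least 2 respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least 2 items")
    if any(len(row) != k for row in rows):
        raise ValueError("all rows must have the same number of items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed score across respondents.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total score has zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

Items that move together perfectly give α = 1, and mutually uncorrelated items give α near 0, which is why values such as 0.90 are read as high internal consistency.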
At baseline, Pearson correlation coefficient (r) was computed between SSQ and mRSS, PGA, and each PRO (SHAQ, PROMIS-29, and PtGA), with stratified analyses by SSc subtype (lcSSc vs dcSSc) and disease duration (< 2 years vs ≥ 2 years since first non-RP symptom). Criterion validity was assessed by the correlation of SSQ with mRSS; convergent and divergent validity18,19 was assessed by the correlation of SSQ with PGA, SHAQ, PROMIS-29, and PtGA. Coefficients with an absolute value of > 0.6 indicated strong correlations and good convergent validity, and coefficients with an absolute value between 0.4 and 0.6 were considered moderate. Moderate and strong correlations were considered supportive of the validity assessments.20 Analyses included participants who had ≥ 5/6 SSQ questions completed and who had the associated score of interest (eg, PGA, mRSS) at the same visit.
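The correlation thresholds above can be illustrated with a short Python sketch that computes Pearson r and labels its strength. The cutoffs for strong (absolute value > 0.6) and moderate (0.4 to 0.6) follow the text; the label "weak" for values below 0.4 and the treatment of values exactly at 0.4 or 0.6 are assumptions made here for illustration. The function names `pearson_r` and `classify_strength` are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient for paired observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def classify_strength(r: float) -> str:
    """Label a coefficient using the absolute-value cutoffs in the text.

    Assumption: values below 0.4 are labeled "weak".
    """
    a = abs(r)
    if a > 0.6:
        return "strong"
    if a >= 0.4:
        return "moderate"
    return "weak"
```

Because the label depends on the absolute value, a negative coefficient such as the one reported for PROMIS-29 physical function is classified by its magnitude, not its sign.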
Longitudinal associations between the cross-sectional SSQ and PGA, each PRO, and mRSS were calculated using mixed effects modeling with an unstructured visit correlation matrix and a bias-corrected 95% CI determined through cluster bootstrapping. All observed follow-ups for each participant were used in the modeling.
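The idea behind cluster bootstrapping is that participants, not individual visits, are resampled, so that repeated visits from the same person stay together. The Python sketch below illustrates only that resampling step and is deliberately simpler than the analysis described above: it pools visits and computes a plain Pearson correlation instead of fitting a mixed effects model, and it reports a percentile interval, not a bias-corrected one. All names (`Visit`, `cluster_bootstrap_ci`, the `(ssq, mrss)` pairing) are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Visit = Tuple[float, float]  # hypothetical pairing: (ssq, mrss) at one visit


def pearson_r(pairs: Sequence[Visit]) -> float:
    """Pearson correlation over (x, y) pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def cluster_bootstrap_ci(visits_by_participant: Dict[str, List[Visit]],
                         n_boot: int = 1000,
                         alpha: float = 0.05,
                         seed: int = 0) -> Tuple[float, float]:
    """Percentile CI for a pooled correlation, resampling whole participants."""
    rng = random.Random(seed)
    ids = sorted(visits_by_participant)
    estimates: List[float] = []
    for _ in range(n_boot):
        # Draw participants with replacement; keep all visits of each draw.
        sample_ids = rng.choices(ids, k=len(ids))
        pooled = [v for pid in sample_ids for v in visits_by_participant[pid]]
        try:
            estimates.append(pearson_r(pooled))
        except ValueError:
            continue  # skip degenerate resamples with zero variance
    if not estimates:
        raise ValueError("no valid bootstrap resamples")
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[min(len(estimates) - 1,
                          int((1 - alpha / 2) * len(estimates)))]
    return lower, upper
```

Resampling at the participant level keeps the within-person dependence between visits intact in every resample, which resampling individual visits would break.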
The association between change in mRSS and change in SSQ from baseline to 6 months was evaluated to assess SSQ’s responsiveness to change.19 Associations between change scores were computed using participants who had an enrollment and a 6-month follow-up visit, each with SSQ and mRSS available.
In order to assess recall of skin changes, we computed the correlation of SSQ 6-Month with objective measurements over a 6-month period (baseline to 6 months, and 6 to 12 months) using the previously described mixed modeling with the same weight given to both 6-month periods. This analysis was limited to patients who completed their baseline, 6-month, and 12-month visits.
Statistical significance was prespecified as a 2-sided P value < 0.05. Analyses were performed using SAS version 9.4 (SAS Institute).
RESULTS
Participants. As of March 15, 2024, 799 adults with SSc were enrolled in CONQUER, 536 (67%) with dcSSc and 263 (33%) with lcSSc. Most were female (83%) and non-Hispanic White (71%). Mean disease duration since first non-RP symptom was 2.5 (SD 1.4) years. Mean (SD) age at enrollment for participants with lcSSc and dcSSc was 54.2 (14.1) years and 51.9 (13.5) years, respectively. The dcSSc group had more male individuals (20% vs 13%) and non-Hispanic Black participants (13% vs 5%) compared to the lcSSc group. mRSS scores were significantly higher in those with dcSSc vs lcSSc (16.5 vs 3.9; P < 0.001). Similarly, SSQ was higher (ie, worse symptoms) in those with dcSSc compared with lcSSc (1.6 vs 0.9; P < 0.001; Table 1).
Table 1. Patient demographics and baseline characteristics.
Psychometric properties.
• Internal consistency reliability. Cronbach α of SSQ was 0.90 and ωt was 0.92, indicating high internal consistency. There was redundancy based on the exploratory factor analysis (Supplementary Information, available with the online version of this article).
• Response frequency distributions. Histograms were positively skewed, showing possible floor effects, as many reported the lowest score (not at all) on hard (40%), itchy (36%), painful (46%), red (44%), rigid/stiff (36%), and tight (19%) skin symptoms at enrollment (Supplementary Figure S2, available with the online version of this article). The same trend was observed across responses from all visits (Supplementary Figure S3). There was no evidence for ceiling effects.
• Correlation with mRSS. At baseline, SSQ was moderately correlated with mRSS (r 0.56). Correlations were stronger among individuals with dcSSc (r 0.54) than lcSSc (r 0.24; Table 2). Similar moderate correlations were observed in participants with different disease duration (r 0.54 and r 0.52 for < 2 years and ≥ 2 years, respectively; Table 3). All correlations were significant (P < 0.05).
Table 2. Correlations of SSQ with mRSS and legacy PROs by SSc subtype.
Table 3. Correlations of SSQ with mRSS and legacy PROs by disease duration.
• Correlations with legacy PROs. At enrollment, SSQ was moderately correlated with the PtGA (r 0.43), SSc activity component of the PGA (r 0.48), PGA overall health (r 0.42), and PGA SSc damage (r 0.39; Table 2). Higher (worse) SSQ was moderately correlated with worse PROMIS-29 physical function (r −0.50) and ability to participate in social roles and activities (r −0.49). SSQ was moderately-to-strongly correlated with worse pain interference (r 0.61) and fatigue (r 0.43). SSQ was weakly correlated with worse PROMIS-29 depression/sadness (r 0.34), anxiety/fear (r 0.23), and sleep disturbance (r 0.10). Higher (worse) SSQ was strongly correlated with worse HAQ score (r 0.63) and severity score (r 0.62), and weakly correlated with worse SHAQ intestinal symptoms (r 0.27), breathing symptoms (r 0.26), RP attacks (r 0.39), and finger ulcers (r 0.29). Longitudinal correlations showed similar trends to correlations at baseline. The correlations at enrollment were also sustained when stratified by SSc subtype and disease duration. All correlations were significant (P < 0.05; Table 2 and Table 3).
• SSQ’s relationship with change in mRSS. There was a weak but statistically significant correlation between change in SSQ and the change in mRSS from baseline to 6 months (r 0.30; P < 0.001).
• SSQ’s minimal clinically important difference. The SSQ minimal clinically important difference, which was estimated to be 1.4 (95% CI 1.29-1.51), corresponded to a change of 5 in the mRSS. However, based on a contingency table examining all visits (Supplementary Table S3, available with the online version of this article), the change in SSQ was not closely associated with the actual changes in mRSS.
• SSQ 6-Month. SSQ 6-Month recall had a weak correlation of r 0.26 (P < 0.05) with the 6-month change in mRSS. It also had a weak correlation with the 6-month changes in HAQ (r 0.17), SHAQ severity score (r 0.21), PROMIS-29 fatigue, pain interference, social role, and physical function, and all the PGA subscales (all P < 0.05; Supplementary Table S4, available with the online version of this article). Other legacy PROs did not have statistically significant correlation coefficients.
DISCUSSION
We report initial evidence to support the validity of the SSQ in assessing skin symptoms in patients with SSc. The SSQ is a brief self-report questionnaire that minimizes respondent burden. Further, it demonstrates high internal consistency based on Cronbach α and ωt. Review of response frequencies showed a positively skewed distribution, with many participants reporting the lowest score on each item, suggesting a possible floor effect. However, this may be owing to the large proportion of CONQUER patients for whom skin disease is less severe. SSQ is moderately correlated with mRSS; the correlation was stronger in patients with dcSSc compared to lcSSc, supporting criterion validity. We observed moderate correlations of SSQ with multiple legacy PROs. SSQ had higher correlations with items that were expected to correlate with skin symptoms (eg, PROMIS-29 pain interference and HAQ), which supports convergent validity. SSQ showed weak correlation with items that would not be expected to correlate with skin (eg, SHAQ intestinal VAS and breathing VAS), supporting divergent (discriminant) validity.
The second part of the survey, SSQ 6-Month, was used to assess how accurately and reliably patients with SSc recall changes in their skin disease. SSQ 6-Month showed weak correlation with change in mRSS, suggesting either poor recall over a 6-month timespan or a disconnect between the patient's perception of change and the objectively measured change in mRSS. This is not surprising, given that prior studies have shown limited accuracy in patients’ recall of symptoms beyond 1 to 2 weeks.21 The SSQ 6-Month also showed weak correlation with other PROs. Based on these findings, we recommend removing this retrospective component in future iterations of the full SSQ.
The lack of a fully validated, self-reported skin assessment for patients with SSc is an unmet need in SSc research. After the inception of CONQUER, the Scleroderma Skin Patient-Reported Outcome (SSPRO) was developed to capture SSc skin symptoms and skin-related QOL. The SSPRO is a longer 18-item questionnaire that has been used as an outcome measure for SSc skin disease in 1 clinical trial.22 SSPRO was weakly correlated with physician-assessed mRSS (r 0.38; P < 0.05) in an SSc cohort of 140 patients, in which it was validated and was moderately-to-strongly correlated with other PROs.23 Whether the shorter, skin symptom–focused SSQ is more acceptable to patients with SSc, and whether it is a stronger predictor of clinical outcomes than the SSPRO, are important areas of future research. Unlike the SSPRO, the SSQ was developed by clinicians without patient partnership. Refinement of the SSQ would benefit from incorporating the perspectives of individuals with SSc.
There are some limitations to consider. Among CONQUER participants, 71% are non-Hispanic White, 67% have the diffuse cutaneous subtype, and the mean disease duration is 2.5 years. These factors potentially affect the generalizability of our findings. Although in most cases the mRSS is performed by the same physician at each visit, this is not prespecified in the protocol and could affect the findings. We were not able to assess SSQ test-retest reliability because, under the CONQUER protocol, participants complete the SSQ only every 6 months, during which time their skin disease could have changed significantly; test-retest reliability will be assessed in future studies. Finally, adjustments were not made for multiple comparisons, and since this was an exploratory study, no a priori prediction was made.
Our study has several important strengths. First, CONQUER includes a large sample of individuals with early SSc who fulfill the 2013 ACR/EULAR classification criteria, which is important in developing an outcome measure to be used in clinical trials.7 CONQUER has low rates of missing data in follow-up assessments (Supplementary Table S5, available with the online version of this article). The SSQ’s internal consistency was demonstrated through 2 measures with similarly strong values. The SSQ was validated against rigorous clinical anchors, including mRSS and other validated metrics that capture SSc disease activity and QOL. The SSQ shows moderate correlation with mRSS, particularly in participants with dcSSc, as well as evidence of convergent and divergent validity.
In conclusion, this study provides evidence that the SSQ has strong psychometric properties when used cross-sectionally but shows the SSQ 6-Month is likely not a valid PRO. Patient partnership and further validation studies are planned.
ACKNOWLEDGMENT
The work described in this manuscript was completed while JY was employed at The Hospital for Special Surgery. The opinions expressed in this article do not reflect the view of the US Food and Drug Administration, the Department of Health and Human Services, or the US government. The work described in this manuscript was completed while VDS was employed at The George Washington University Medical Faculty Associates. The opinions expressed in this article do not reflect the view of the National Institutes of Health, the Department of Health and Human Services, or the US government.
Footnotes
CONTRIBUTIONS
Conceptualization, investigation, project administration, validation, writing - original draft, writing - review & editing: JY; data curation, formal analysis, methodology, software, supervision, validation, visualization, writing - review & editing: JMV; data curation, formal analysis, methodology, software, validation, visualization, writing - review & editing: AC, JSA; conceptualization, methodology, resource, writing - review & editing: LAM, LCP; data curation, writing - review & editing: SA, EJB, FVC, LC, TMF, FNH, LKH, DK, KSL, DLO, YL, AM, JAM, DFM, CR, NS, AAS, AS, VKS, BS, VDS, ERV; data curation, funding acquisition, writing - review & editing: LE; conceptualization, data curation, investigation, methodology, project administration, resource, supervision, validation, writing - review & editing: JKG.
FUNDING
The authors declare no funding or support for this research.
COMPETING INTERESTS
The authors declare no conflicts of interest relevant to this article.
ETHICS AND PATIENT CONSENT
All patients provided written informed consent and the study protocol was approved by the ethics committees at each participating institution. CONQUER IRB numbers: Columbia University (AAAR6143[M00Y07]), Duke Medical Center (Pro00108280-AMD-2.0), Georgetown University (CR00003663), Hospital for Special Surgery (2018-0165-CR4), Johns Hopkins (CR00050447/IRB00170405), Mass General (2018P001506), Mayo Clinic (21-005952), Medical University of South Carolina (MS14_Pro00080285), Northwestern University (STU00207506), Stanford University (45849), University of California Los Angeles (21-001711-AM00002), University of Michigan (HUM00149153), University of Minnesota (STUDY00014622), University of Pennsylvania (833629), University of Texas Houston (HSC-MS-18-0359), University of Utah (IRB_00111276), Vanderbilt University (210639).
- Accepted for publication December 9, 2024.
- Copyright © 2025 by the Journal of Rheumatology
This is an Open Access article, which permits use, distribution, and reproduction, without modification, provided the original article is correctly cited and is not used for commercial purposes.
REFERENCES
SUPPLEMENTARY DATA
Supplementary material accompanies the online version of this article.






