Abstract
Objective. To evaluate the expression profile of cell-free circulating microRNA (miRNA) in systemic sclerosis (SSc), healthy controls (HC), and systemic lupus erythematosus (SLE).
Methods. Total RNA was purified from plasma and 45 different, mature miRNA were measured using quantitative PCR assays after reverse transcription. Samples (n = 189) were from patients with SSc (n = 120), SLE (n = 29), and from HC (n = 40). Expression data were clustered by principal components analysis, and diagnostically specific miRNA profiles were developed by leave-one-out cross-validation. Diagnostic probability scores were derived from stepwise logistic regression.
Results. Thirty-seven miRNA specificities were consistently detected and 26 of these were unaffected by SSc sample age and present in more than two-thirds of SSc samples. SSc cases showed a distinct expression profile with 14/26 miRNA significantly decreased (false discovery rate < 0.05) and 5/26 increased compared with HC. A 21-miRNA classifier gave optimum accuracy (80%) for discriminating SSc from both HC and SLE. The discrimination between HC and SSc (95% accuracy) was strongly driven by miRNA of the 17∼92 cluster and by miR-16, -223, and -638, while SLE and SSc differed mainly in the expression of miR-142-3p, -150, -223, and -638. Except for a weak correlation between anti-Scl-70 and miR-638 (p = 0.048), there were no correlations with other patient variables.
Conclusion. Circulating miRNA profiles are characteristic for SSc compared with both HC and SLE cases. Some of the predicted targets of the differentially regulated miRNA are of relevance for transforming growth factor-β signaling and fibrosis, but need to be validated in independent studies.
Systemic sclerosis (SSc) is a rare autoimmune disease characterized by chronic inflammation of connective tissues associated with cutaneous and organ fibrosis, occlusive vasculopathy, specific types of circulating autoantibodies, and a highly variable clinical course. The etiology of SSc is unknown, but perivascular inflammation is an early and central pathogenetic event and leads to aberrant activation or increased synthesis of various collagen cross-linking enzymes and subsequent increased synthesis/cross-linking and decreased degradation of collagen1,2,3,4. The class of noncoding small RNA called microRNA (miRNA) regulates the expression of most protein-coding genes at the post-transcriptional level by interfering with translation of target mRNA. Typically, miRNA have multiple mRNA targets in inflammatory and fibrosis pathways (including growth factors, enzymes, and cytokines), are themselves regulated by inflammatory cytokine/cytokine receptor networks5, and are dysregulated in cells of the innate and adaptive immune system from patients with autoimmune disorders6,7,8. Thus, specific miRNA have been shown to be differentially expressed in fibrotic diseases such as SSc (miRNA-29)9 and idiopathic pulmonary fibrosis10 (the miRNA-17∼92 cluster). The miRNA-17∼92 cluster targets, for example, fibrotic and antiangiogenic genes and DNA methyltransferase (DNMT)-1 expression and is downregulated in fibrotic lung tissue. miRNA-29, which also targets DNMT-1 expression, is depressed in SSc dermal fibroblasts and regulates type I and III collagen expression9. Differential expression of miRNA was also noted in a study of sclerodermic skin biopsies11. Expression profiles of miRNA may thus be associated with diagnosis and classification of SSc and constitute potential therapeutic targets. 
From a diagnostic viewpoint, the presence of a population of stable, circulating miRNA in the blood12,13 that may include specific contributions from pathological cells and tissues in cancer and autoimmunity is valuable, and studies have demonstrated specific profiles associated with systemic autoimmune disease such as systemic lupus erythematosus (SLE)14. Because of the unique dominance of fibrosis in SSc, we hypothesized that the circulating miRNA profile in this condition might be characteristic and different from those in other systemic autoimmune diseases and possibly related to subtypes of SSc. To address this hypothesis, we investigated levels of a panel of 45 specific miRNA in plasma samples collected under controlled conditions from a cohort of 120 patients with SSc, 29 patients with SLE, and 40 healthy controls (HC).
MATERIALS AND METHODS
Patients and methods
Our cross-sectional study included 120 patients with SSc with a median age of 57 years (range 22–79 yrs) fulfilling the American College of Rheumatology criteria15 for SSc (Table 1). Patients were included consecutively over 2 years. Skin involvement was graded according to the modified Rodnan skin score16. Seven patients with SSc were receiving disease-modifying antirheumatic drugs (DMARD; azathioprine, methotrexate, or anti-malarials) and 6 were taking prednisolone (together with DMARD in 4 cases) at the time of inclusion. None of the patients had ever received therapy with biological drugs or prostanoid infusions within 6 months of inclusion. Concomitant treatment included calcium channel blockers, proton pump inhibitors, angiotensin-converting enzyme inhibitors, diuretics, statins, and others. None of the patients had known cancer or systemic infections, and none were pregnant at the time of inclusion. With the exception of 1 patient, all were white. As disease controls, 29 white patients with SLE (4 of these were men) were included from another study14. The SLE samples were collected within a few months and were subsequently stored as citrated platelet-poor plasma at −80°C for about 3.5 years before being used in our present study. Finally, 40 unmedicated, healthy individuals (33 women and 7 men) with a median age of 46 years (range 24–71 yrs) were included as HC and sampled within a few weeks in 2 rounds (36 HC in the first round) separated by a year. All participants were included after giving written informed consent. The study was approved by the local ethics committee (approval no. H-B-2008-131) and carried out in accordance with the principles of the Declaration of Helsinki.
Table 1. Demographic and clinical characteristics of the 120 patients with SSc. Arterial hypertension: blood pressure above 140/90 mmHg at study inclusion.
Clinical, biochemical, and serological assessment
The clinical characteristics of the patient population are summarized in Table 1. Data on disease history including clinical, biochemical, and pharmacological data were obtained by review of medical records, patient examination, and interview. Subsequently, the data were registered in a database. Based on the extent of cutaneous involvement, patients were classified as having limited cutaneous SSc (lcSSc; n = 79) or diffuse cutaneous SSc (dcSSc; n = 41)17. Serological assessment of antinuclear antibodies, including anticentromere antibodies (ACA), was performed by indirect immunofluorescence on human epithelial type 2 cell line cells, and anti-Scl-70 antibodies were demonstrated by ELISA.
Blood sampling and isolation of platelet-poor plasma
Venipuncture was performed using a 21-gauge needle, and after release of the tourniquet, the first tube obtained was always reserved for autoantibody analyses. Next, for microRNA analysis, the blood was collected into 3 × 9 ml citrate tubes (Vacuette sodium citrate 3.8%; Greiner Bio-One) and gently mixed 5 times. Immediately after collection, the blood cells were removed by centrifugation. All samples underwent a 2-step centrifugation procedure: 1800 g, 10 min, 21°C, followed by centrifugation of the supernatant at 3000 g, 10 min, 21°C, to obtain platelet-poor plasma (PPP). The PPP was aliquoted, snap-frozen in liquid nitrogen, and stored at −80°C until analysis.
RNA isolation and miRNA profiling
Total RNA was purified from 100 μl of citrated PPP using the Norgen Total RNA Purification Kit (Norgen Biotek Corp.), and a panel of miRNA was analyzed after reverse transcription using stem-loop primers, preamplification, and quantitative PCR (qPCR) using specific assays from Applied Biosystems. For the qPCR, we used a BioMark MX microfluidic platform (Fluidigm Corp.), which allows duplicate assays for 48 miRNA in 96 samples on 96.96 dynamic arrays in 1 operation, as detailed elsewhere18. The 48 miRNA assays included assays for 45 human and 3 Caenorhabditis elegans miRNA (cel-miR-39, -54, and -238; Supplementary Table 1 available online at jrheum.org). The mixture of 3 synthetic cel-miRNA was added to the lysis buffer for use as spike-in controls for technical normalization, as previously reported18.
Data handling
Only assays yielding Cq values below 30 were included, and assays for 3 miRNA (miR-129-5p, -210, and -31-5p) consistently failed, leaving a total of 42 miRNA for data analysis. Technical normalization (using the cel-miRNA average) and row normalization (using the average of the 20 miRNA detected in all samples) were performed. The resulting ΔΔCq values increase with increasing miRNA levels. Individual miRNA levels were expressed in a linear form using the 2^(−ΔΔCq) transformation, and SLE, HC, and SSc cases were compared using 1-way ANOVA with Tukey posttest, using p = 0.05 as the significance level. For multiple comparisons, significance levels were corrected using false discovery rates (FDR)19. Principal components analysis (PCA) was based on row-normalized ΔΔCq values using Genesis (v.1.7.6; genome.tugraz.at/genesisclient/genesisclient_description.shtml).
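The detection filter, row normalization, and linear transformation described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the spike-in (technical) normalization step is left out, the Cq values and the choice of reference miRNA are invented for illustration, and the sign convention (ΔΔCq taken as the Cq minus the reference mean) is an assumption, because the exact formula is not given here.

```python
from statistics import mean

CQ_CUTOFF = 30.0  # assays with Cq at or above this value are treated as not detected


def normalize_sample(cq_by_mirna, reference_mirnas):
    """Return linear expression values (2^-ddCq) for one sample.

    cq_by_mirna: dict mapping miRNA name -> raw Cq value for this sample.
    reference_mirnas: names used for row normalization (hypothetical choice);
    each must be detected in the sample, otherwise a KeyError is raised.
    """
    # Keep only assays below the Cq cutoff.
    detected = {m: cq for m, cq in cq_by_mirna.items() if cq < CQ_CUTOFF}
    # Row normalization: average Cq of the reference miRNA in this sample.
    ref_mean = mean(detected[m] for m in reference_mirnas)
    # Assumed sign convention: ddCq = Cq - reference mean, so a lower ddCq
    # means a higher miRNA level and 2^-ddCq grows with abundance.
    return {m: 2.0 ** -(cq - ref_mean) for m, cq in detected.items()}


# Invented Cq values for a single sample.
sample = {"miR-16": 20.0, "miR-223": 22.0, "miR-638": 27.0, "miR-210": 34.0}
levels = normalize_sample(sample, reference_mirnas=["miR-16", "miR-223"])
```

In this toy sample the reference mean is 21, so an assay one cycle below the reference mean comes out at twice the reference level, and the assay with Cq 34 is dropped by the filter.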
Risk scores for SSc were derived from optimized gene profiles, i.e., the set of miRNA that resulted in maximal balanced accuracy (mean of specificity and sensitivity) for the SSc diagnosis. Three miRNA profiles were developed, distinguishing SSc from SLE, SSc from HC, and SSc from the rest (SLE and HC). The profiles were obtained using logistic regression classification from which optimal profiles were selected by leave-one-out cross-validation (LOOCV). Briefly, in this procedure, a single sample serves as a test sample and the remaining samples as a training set. This is repeated until all samples have been left out once and the accuracy of the classifier is determined by the correctly classified samples. In the training set, feature selection is necessary to avoid a small sample-per-feature ratio and provide better classification. The feature selection procedure consisted of 3 steps: (1) testing the miRNA in the training set for significance using univariate logistic regression; (2) ranking the miRNA features by their significance; and (3) finding the optimal number of features by subsequently adding 1 feature at a time in a “top-down” forward wrapper approach starting with the top 2 features of the ranked list; at each increment, the classification accuracy of the training samples was assessed using LOOCV. All calculations were performed using the open source R-environment (cran.r-project.org). The function “generalized linear model” was used for logistic regression classification, and regression coefficients were extracted from the summary (model) function.
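The selection loop described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the R code used in the study: a plain gradient-descent logistic regression stands in for R's generalized linear model, the ranked feature list is taken as given instead of being derived from univariate regressions within each training set, and the toy data are invented.

```python
import math


def fit_logistic(X, y, steps=500, lr=0.5):
    """Fit logistic regression by plain gradient descent (stand-in for R's glm)."""
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            z = max(-30.0, min(30.0, z))  # guard against overflow in exp
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def predict(w, xi):
    """Classify as 1 when the predicted probability exceeds 0.5 (z > 0)."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
    return 1 if z > 0 else 0


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def loocv_accuracy(X, y, features):
    """Leave each sample out once, train on the rest, predict the left-out one."""
    preds = []
    for i in range(len(X)):
        train_X = [[row[f] for f in features] for k, row in enumerate(X) if k != i]
        train_y = [label for k, label in enumerate(y) if k != i]
        w = fit_logistic(train_X, train_y)
        preds.append(predict(w, [X[i][f] for f in features]))
    return balanced_accuracy(y, preds)


def forward_select(X, y, ranked_features):
    """Grow the feature set along a ranked list and keep the best-scoring size."""
    best_feats, best_acc = None, -1.0
    for k in range(2, len(ranked_features) + 1):  # start with the top 2 features
        feats = ranked_features[:k]
        acc = loocv_accuracy(X, y, feats)
        if acc > best_acc:  # strict comparison keeps the smaller set on ties
            best_feats, best_acc = feats, acc
    return best_feats, best_acc


# Invented toy data: columns are 3 hypothetical miRNA features, 1 = SSc, 0 = other.
X = [[2.0, 1.0, 0.3], [1.8, 0.5, -0.4], [2.2, 0.8, 0.1], [1.9, 1.2, -0.2],
     [-2.0, -0.9, 0.2], [-1.7, -0.6, -0.3], [-2.1, -1.1, 0.4], [-1.9, -0.7, -0.1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
best_feats, best_acc = forward_select(X, y, ranked_features=[0, 1, 2])
```

Keeping the smaller feature set when accuracies tie is a design choice of this sketch; the tie-breaking rule used in the study is not stated.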
RESULTS
Expression data and PCA
Five of the 42 analyzable hsa-miRNA (let-7a-5p, miR-10a, -101, -125a-3p, -196a) were inconsistently detected (i.e., present in fewer than a third of all samples) and were therefore disregarded in further analysis. The expression values of the remaining 37 miRNA (sorted according to abundance in HC in Supplementary Table 2, available online at jrheum.org) were used as input for a PCA. In the PCA score plot (Figure 1), HC (x) clustered together with a subset of the SSc samples (white dots) at high PC2 scores (i.e., PC2 > 0.05), whereas the remaining SSc samples were below this level and distinct from HC samples. PC2 is the second principal component, i.e., the weighted linear combination of all variables that explains the second-largest share of the variance in the multivariable data. No apparent separation of SSc and SLE samples was observed in this plot along either the PC1 or PC2 axis. Overall, the PCA suggested that the data contained information that discriminated some SSc (and SLE) samples from HC, but that a distinct subset of SSc samples grouped with HC. Both HC and SLE samples were collected over short time intervals (the HC over 2 short periods), whereas the SSc samples were collected over a considerably longer interval. Therefore, we next correlated PC2 scores with sampling date (Figure 2). From this analysis, it was apparent that SSc samples with PC2 > 0.05 were the oldest and that there was a significant (p < 0.0001, R² = 0.64) linear correlation between sample age and PC2. Interestingly, both HC sample groups were located at PC2 > 0.05, while the SLE samples that were as old as the oldest SSc samples predominantly grouped below PC2 = 0.05. Thus, the combination of variables (miRNA expression values) that constitutes PC2 did not correlate with the age of the SLE samples.
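For readers who want to reproduce this kind of check, the coefficient of determination for a straight-line fit of PC2 score against sample age can be computed as in the following Python sketch. The sample ages and PC2 scores are invented for illustration and are not study data; ordinary least squares is assumed as the fitting method.

```python
def linear_fit_r2(x, y):
    """Ordinary least-squares line y = intercept + slope * x and its R^2."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    # For a simple linear fit, R^2 equals the squared Pearson correlation.
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r2


# Invented example: sample age in months and the corresponding PC2 scores.
age_months = [1, 6, 12, 18, 24]
pc2 = [-0.10, -0.05, 0.00, 0.04, 0.11]
slope, intercept, r2 = linear_fit_r2(age_months, pc2)
```

A positive slope with R² close to 1 in such a fit would indicate that most of the PC2 variation tracks storage time, which is the pattern described for the SSc samples.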
Figure 1. Principal components analysis score plot to visualize the degree of sample expression data variation that correlates with sample grouping (HC, SSc, and SLE). PC1 and PC2, together accounting for more than 60% of the variation of the data, are shown. A line is indicated at PC2 = 0.05, above which all normal samples, except 1, are located. HC: healthy controls; SSc: systemic sclerosis; SLE: systemic lupus erythematosus; PC: principal component.
Figure 2. The contribution of sample age to the PC2 scores. SSc samples display a systematic linear correlation (R² = 0.64) between sample age and PC2 scores, indicating a dependency of some miRNA levels on sample age. PC: principal component; SSc: systemic sclerosis; HC: healthy controls; SLE: systemic lupus erythematosus; miRNA: microRNA.
Because of the noted influence of sample age on the SSc data variance, we next analyzed data for each miRNA independently to ascertain which miRNA contributed most to the sample age-dependent variability. Results are summarized in Table 2 and Supplementary Figure 1 (available online at jrheum.org). These data show that 14 miRNA were significantly decreased (FDR < 0.05; marked with downward arrows in Table 2) in SSc samples irrespective of sample age in comparison with HC, while 5 miRNA were significantly increased (upward arrows). The remaining miRNA either showed clear SSc sample age-related correlations (miR-184, -21, -27a-3p, -29a, -29c-3p, -203, -375, -409-3p, -423-5p) affecting significance levels or did not differ significantly between the sample sets. In 2 cases (miR-18a-5p, -181a), too many data values were missing in the SSc group (1/3 or more) to warrant inclusion in Table 2.
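The FDR threshold used here can be illustrated with a short Python sketch of the Benjamini-Hochberg step-up adjustment. That this is the procedure applied in the study is an assumption, since the method is identified in the text only by its reference number, and the p values below are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented p values for 4 hypothetical miRNA comparisons.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in q_values]
```

A miRNA would then be reported as significantly changed when its adjusted value falls below 0.05.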
Table 2. Overview of 35 miRNA in SSc and HC samples, ranked according to fold-change (SSc/HC). miR-18a-5p and miR-181a are not included because of too few data. Rank is numbered from most (1) to least significant (35).
A circulating miRNA profile for SSc
Of the 35 miRNA included in Table 2, we excluded the 9 miRNA with expression levels that were clearly influenced by sample age. The remaining dataset of 26 miRNA was systematically tested for the combination of miRNA that best discriminated between the SSc, SLE, and HC groups, using logistic regression and optimizing the number of features (miRNA) by adding 1 at a time from the top of the significance list. Each step was validated for the highest diagnostic accuracy (best mean of sensitivity and specificity) by an LOOCV approach, in which each sample in turn serves as the test sample while the remaining samples form the training set, and the accuracy of the classifier is determined by the correctly classified samples. Risk scores (disease probability) were developed for distinguishing SSc from SLE, SSc from HC, and SSc from the rest (SLE and HC). Probability plots based on the optimized classifiers are shown in Figure 3, and lists of miRNA profiles (listed in order of decreasing univariate significance) and equations for risk scores are included in Supplementary Table 3 (available online at jrheum.org). The results showed that the optimal combination of miRNA for discriminating SSc cases from both SLE cases and HC consists of a 21-miRNA classifier yielding an analytical accuracy of 80% (Figure 3A). Looking at SSc versus HC and SSc versus SLE separately (Figure 3B and Figure 3C), the corresponding miRNA panels consisted of 22 and 14 miRNA, respectively. Using the respective scores in each case gave receiver-operating characteristic (ROC) curves with an area under the curve (AUC) of > 0.95 (data not shown), and diagnostic accuracies of 0.95 for SSc versus HC and 0.84 for SSc versus SLE.
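The two performance measures quoted above can be illustrated with a small Python sketch. The AUC is computed here as the fraction of case-control pairs in which the case has the higher risk score, which is equivalent to the area under the ROC curve; the risk scores and the 0.5 cutoff are invented for illustration and are not taken from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a random case outscores a random control."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5  # ties count half
    return wins / (len(scores_pos) * len(scores_neg))


def balanced_accuracy_at(scores_pos, scores_neg, cutoff):
    """Mean of sensitivity and specificity when scores above cutoff count as cases."""
    sensitivity = sum(1 for s in scores_pos if s > cutoff) / len(scores_pos)
    specificity = sum(1 for s in scores_neg if s <= cutoff) / len(scores_neg)
    return 0.5 * (sensitivity + specificity)


# Invented risk scores (predicted probabilities) for cases and controls.
ssc_scores = [0.9, 0.8, 0.6, 0.4]
hc_scores = [0.7, 0.3, 0.2, 0.1]
auc = roc_auc(ssc_scores, hc_scores)
acc = balanced_accuracy_at(ssc_scores, hc_scores, cutoff=0.5)
```

In this toy example, 14 of the 16 case-control pairs are ordered correctly, and 1 case and 1 control fall on the wrong side of the cutoff.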
Figure 3. Probability plots showing the optimized diagnostic classification based on miRNA combinations derived by logistic regression to give the highest balanced diagnostic accuracy. Risk score equations are listed in Supplementary Table 3 (available online at jrheum.org). A. SSc versus all other samples (HC and SLE) based on 21 miRNA. B. SSc versus HC based on 22 miRNA. C. SSc versus SLE based on 14 miRNA. Specificity, sensitivity, and mean accuracy are given for all 3 models. miRNA: microRNA; SSc: systemic sclerosis; HC: healthy controls; SLE: systemic lupus erythematosus.
The main contributors to the diagnostic classifications were the miRNA-17∼92 cluster members (miR-17, -20a, -92a, and -106a, down in SSc) that strongly separate SSc and HC (Figure 4A) with an AUC of the ROC curve of 0.92. The opposite directions of the changes in miR-142-3p and miR-223 (down in SSc, up in SLE) and miR-150 and miR-638 [up in SSc, down (miR-150) or unaltered (miR-638) in SLE] made these 4 miRNA strong classifiers for SSc compared with SLE (Figure 4B) with an AUC of the ROC curve of 0.87. Compared to the optimized profiles, these simplified 4 miRNA models lost comparatively little diagnostic power (accuracies of 0.84 vs 0.95 and 0.81 vs 0.84 for SSc-HC and SSc-SLE, respectively).
Figure 4. Simplified risk score plots based on 4 miRNA for (A) HC versus SSc and (B) SLE versus SSc. Mean accuracies are 0.84 and 0.81, respectively. The corresponding values for the optimized models (Figure 3) are 0.95 and 0.84, respectively. The stated p values are based on Mann-Whitney U tests. Optimal cutoff levels based on ROC curves (not shown) are indicated by dashed lines, and give the specificity and sensitivity performance indicated in the figure. miRNA: microRNA; HC: healthy controls; SSc: systemic sclerosis; SLE: systemic lupus erythematosus; ROC: receiver-operating characteristic.
Correlation of circulating miRNA profiles with clinical and paraclinical variables
We found no differences in miRNA levels between patients with SSc with different autoantibody profiles (ACA-positive patients, n = 48/120; anti-Scl-70-positive patients, n = 15/120), except for a weakly significant decrease (p = 0.048, Mann-Whitney U test) in the level of miR-638 in the anti-Scl-70–positive group compared with the anti-Scl-70–negative group. The lcSSc and the dcSSc groups could not be discriminated on the basis of their miRNA expression profiles. The cohort includes a total of 9 patients in treatment with DMARD and/or prednisolone. We did not observe any specific miRNA expression differences in this small subset of cases compared with the rest of the cohort (data not shown).
DISCUSSION
To the best of our knowledge, our present study is the first investigation of circulating miRNA profiles in plasma samples from a cohort of more than 100 (n = 120) well-characterized patients with SSc compared with both HC and SLE cases. More than 2500 human miRNA are now known (www.mirbase.org), but fewer than 25% of these have been found in the circulation20, and the costs and amount of RNA required to analyze each sample on individual printed arrays are prohibitive. Instead, we used a dynamic array for qPCR that enabled the analysis of many samples for a subset of miRNA. We chose a panel of 45 different miRNA based on previous studies of cellular and tissue miRNA in SSc and fibrosis and circulating miRNA in SLE and inflammation5,9,10,11,14,21,22, and found 35 of these to be present at measurable levels across samples. Despite a clear correlation of some of these miRNA with sample age-related factors in the SSc cohort, we were able, after eliminating these miRNA, to delineate circulating miRNA profiles that were highly characteristic of our SSc population. In the absence of an independent validation cohort, we applied a data analysis strategy with systematic internal cross-validation (leave-one-out) to optimize models. In this way, highly diagnostically accurate models based on miRNA combinations could be extracted from the data. In simplified terms, a combination of the miRNA-17∼92 cluster, which is significantly decreased in SSc and SLE, and miR-142-3p and -223, which in our study are decreased in SSc but increased in SLE, clearly discriminates between HC and the disease groups and between SSc and SLE.
Our finding of a significantly decreased level of miR-142-3p in SSc is in disagreement with a study of Japanese patients with SSc where the level of this miRNA was found to be increased even in comparison with SLE cases21. A very important difference, however, is that the study used sera while we here used plasma samples obtained under stringent and consistent conditions. While the overall ranking of abundance may correlate reasonably well for a number of miRNA in plasma and serum12, it has been shown in several studies that the levels of individual miRNA in serum and plasma samples from the same patients do not correlate well18,23 and indeed, that cellular miRNA content overwhelmingly influences the levels of supposedly cell-free miRNA (including the exosomal miRNA) in samples experiencing hemolysis or cell contamination or platelet activation, as will occur in coagulating blood20,24,25.
Interestingly, the miR-17∼92 cluster has been linked with fibrosis in models of idiopathic pulmonary fibrosis, where it was shown that enhanced expression of the cluster attenuated fibrosis10. This cluster of miRNA is also known to be important for oncogenesis26 and suppresses proliferation and angiogenesis in some types of cancers, but is carcinogenic in others27,28. A role in attenuating fibrosis and angiogenesis is consistent with our finding of decreased levels of 4 members of this cluster in the circulation of patients with SSc. Evidently, this cluster regulates the expression of numerous other genes in health and disease29, and its expression is also decreased in other autoimmune diseases such as SLE. Nevertheless, 2 very interesting targets of miR-17∼92 were found in a study of the effects of miR-17∼92 transfection on lung epithelial cell lines30. The most downregulated gene (with an mRNA predicted target sequence for miR-17-5p) was procollagen-lysine, 2-oxoglutarate 5-dioxygenase 2, the gene encoding collagen telopeptide lysyl hydroxylase 2 (LH2). The oxygen-dependent transcription factor HIF-1α was established as an additional target. LH2 has been established as a key profibrotic enzyme responsible for hydroxylation of collagen telopeptides participating in the increased pyridinoline cross-links in collagen in fibrosis2, and HIF-1α is responsible for inducing the expression of a number of collagen prolyl and lysyl hydroxylases in fibroblasts under hypoxic conditions31. Decreased inhibition of the expression of these genes as a consequence of decreased miR-17∼92 would thus be associated with increased fibrosis, including an increased fibrotic response to tissue hypoxia. While consistent hypotheses may thus be generated by the findings of our study, we do not, however, prove that the observed systemic decrease in the levels of the miR-17∼92 cluster reflects the pathophysiological events in the skin and other sclerosing tissues and organs in SSc.
For miR-29a, which was previously reported to be downregulated in fibroblasts from patients with SSc9, we found an overall pattern of upregulation in the circulation of patients with SSc and SLE, even though to a markedly lesser extent in old SSc samples; this miRNA is not included in the diagnostic panel in our present study. The miR-29 family (29a, 29b, 29c) is involved in the regulation of DNA methylation32 and it may be hypothesized that its systemic increase in patients with SSc is a response aimed at silencing gene expression important for SSc tissue pathology. Support of this notion requires further experimental work, and the decrease of miR-29 observed in fibroblasts may simply be a secondary effect of suppressive regulation of miR-29 by transforming growth factor (TGF)-β9.
The differential response of miR-142-3p (up in SLE, down in SSc compared with HC) is interesting, especially because of its known involvement in the regulation of expression of proteins important for TGF-β signaling. Together with other growth factors and cytokines, TGF-β is a skin fibroblast activator and a major inducer of fibrosis in SSc33. The expression of TGF-β receptor I and a number of other genes related to TGF-β signaling is predicted and experimentally verified to be directly regulated by miR-142-3p34,35. An increased TGF-β signaling activity in SSc fibroblasts36,37 and a decreased production of TGF-β by lymphocytes from patients with SLE38 may be consistent with the notion of a dysregulation of miR-142-3p in opposite directions in SSc and SLE, but this also requires further studies to be verified.
In the setting of early or undifferentiated connective tissue disease, the miRNA-based biomarker profile reported here may serve to identify patients likely to progress to SSc. However, the potential practical utility of the findings is limited because intra- and interindividual variability, reference ranges, longitudinal stability, preanalytical variables, and analytical variability are still largely unknown for circulating miRNA. Also, a simple method for direct quantitative real-time PCR analysis of miRNA in plasma does not yet exist. In addition, there is a need for validation in independent studies.
A specific profile of circulating miRNA is found in SSc plasma. Common denominators for the targets of these miRNA are TGF-β signaling and enzymes participating in collagen cross-linking. Involvement of these miRNA in the pathogenesis of SSc should be experimentally verified, including studies of expression in affected versus unaffected tissues, and their validity as markers of SSc requires investigation in independent cohorts of patients with SSc and controls.
ONLINE SUPPLEMENT
Supplementary data for this article are available online at jrheum.org.
Footnotes
- Supported by the Foundation for the Advancement of Medical Research, Bang’s Foundation, and the Danish Rheumatism Association (R99-A1937).
- Accepted for publication October 1, 2014.