Abstract
Objective. Findings from previous genome-wide association studies indicated an association of the NOTCH4 gene with systemic sclerosis (SSc). This is a followup study to fine-map exonic variants of NOTCH4 in SSc.
Methods. All exons of NOTCH4 were sequenced and analyzed in a total of 1006 patients with SSc and 1004 controls of US white ancestry with the Ion Torrent system. Identified SSc-associated variants were confirmed with Sanger sequencing, and then examined in a Chinese Han cohort consisting of 576 patients with SSc and 574 controls. The NOTCH4 variants were analyzed for association with SSc as a whole and with SSc clinical and autoantibody subtypes with and without the influence of specific HLA-class II alleles that had been previously identified as major genetic factors in SSc.
Results. A total of 12 SSc-associated and SSc subtype–associated exonic variants of NOTCH4 were identified in the US cohort. Three of them are nonsynonymous single-nucleotide polymorphisms and 1 is a CTG tandem repeat that encodes a poly-leucine, all of which are located in the NOTCH4 extracellular domain (NECD). Conditional logistic regression analysis on SSc-associated HLA-class II alleles indicated an independent association of the NOTCH4 variants with SSc autoantibody subtypes. Analysis of the Chinese cohort supported a genetic contribution of NOTCH4 to SSc and its subtypes.
Conclusion. Multiple NOTCH4 exonic variants were associated with SSc and/or SSc subtypes. Several of these variants encode nonsynonymous sequence changes occurring in the NECD, which suggests a potential functional effect on NOTCH4.
Systemic sclerosis (SSc) is an immune-mediated fibrotic disease with complex genetic features. It can be classified clinically by the extent of skin fibrosis into limited cutaneous (lcSSc) and diffuse cutaneous (dcSSc) forms1. Most patients with SSc (about 94%) have antinuclear autoantibodies (ANA)2. Some of these autoantibodies are SSc-specific, such as antibodies to topoisomerase I (ATA), centromere (ACA), and RNA polymerase III (ARA)3. Individual patients with SSc rarely have more than 1 of these autoantibodies3. Therefore, SSc can also be subgrouped by the presence of these autoantibodies, which are associated with specific clinical features4.
Although the precise etiopathogenesis of SSc is still unknown, genetic predisposition is clearly an important factor. Multiple genetic loci have been associated with SSc5. Genome-wide association studies (GWAS) have indicated that the strongest SSc-associated loci fall within the HLA class II region6,7. Some specific alleles of classic HLA-II genes, such as DRB1*01:01, DRB1*04:04, DRB1*07:01, DRB1*11:04, DQA1*01:01, DQA1*02:01, DQA1*05:01, DQB1*02:02, DQB1*05:01, and DPB1*13:01, have been established as SSc risk or protective factors8. In addition to these classic HLA-II genes, the SSc GWAS also showed that 2 synonymous single-nucleotide polymorphisms (SNP) in NOTCH4, a non-classical HLA gene in the HLA class II region, are associated with SSc9.
NOTCH4 stands for neurogenic locus notch homolog protein 4, which is a member of the NOTCH family10. The NOTCH4 protein is a single-pass transmembrane receptor containing extracellular (NECD), transmembrane, and intracellular (NICD) domains. The NECD contains 29 epidermal growth factor (EGF)-like repeats and mediates ligand and calcium binding. Upon binding of ligands (including Jag 1 and 2, and Delta-like 1, 3, and 411), proteolysis of NOTCH releases the NICD from the cell membrane. The NICD is then translocated into the nucleus, where it activates expression of a group of downstream genes such as Hes and Hey12. NOTCH signaling is an essential pathway involved in cell proliferation, differentiation, and apoptosis13.
As a followup fine-mapping study of the GWAS, we sequenced all NOTCH4 exons to identify NOTCH4 sequence changes that could contribute to the pathogenesis of SSc.
MATERIALS AND METHODS
Study subjects
A total of 1006 patients with SSc and 1004 controls of European ancestry were examined in the first cohort. A Chinese Han cohort composed of 576 patients and 574 controls was also examined to assess ethnic differences. All patients met either the 1980 American College of Rheumatology (ACR) criteria14 or the 2013 ACR/European League Against Rheumatism criteria15 for SSc. Ethical approval for the studies was obtained from the institutional review boards at The University of Texas McGovern Medical School (approval number: HSC-MS-10-0451). Written informed consent was obtained from each participant.
Autoantibody tests
Patients’ sera were tested for ANA by indirect immunofluorescence using HEp-2 cells as antigen substrate (Antibodies Inc.). ATA was detected by passive immunodiffusion against calf thymus extracts (Inova Diagnostics Inc.). ACA was determined by the pattern of indirect immunofluorescence using HEp-2 cells. ARA was detected using commercially available kits16.
HLA-class II genotyping
HLA-DRB1, HLA-DQA1, HLA-DQB1, and HLA-DPB1 genotyping was performed with oligotyping8 or the sequence-based typing (SBT) method using SeCore Kits (Life Technologies)17. The HLA SBT uTYPE 6.0 program (Life Technologies) was used in the sequencing analysis and in assigning HLA alleles.
Next-generation sequencing (NGS)
Genomic DNA was extracted from the peripheral blood of each subject. DNA concentrations were measured with the ABI TaqMan RNase P kit on a 7900 real-time PCR machine (Applied Biosystems). Whole-exon sequencing of NOTCH4 was performed with the Ion Torrent Personal Genome Machine (PGM) sequencer system (Applied Biosystems). The primer sets were designed with AmpliSeq Designer 2.0 to cover the full length of the exons and their flanking intron regions at 100× coverage and 100% mapping rate. The following reagents and consumables were used for sequencing: Ion AmpliSeq Library Kit 2.0, Ion Library Equalizer kit, Ion 318 V2 Chip, Ion PGM template OT2 200 kit, and Ion PGM Sequencing 200 kit v2. The experiments were performed following the manufacturer's protocols. Base calling, alignment, mapping, and variant calling were performed in Torrent Suite 4.0 with the high-stringency default settings. The human reference genome was hg19.
Confirmation of variants with Sanger sequencing (SS)
SS was used to verify the findings from the NGS and to examine the Chinese cohort. It was performed on an ABI3130xl Genetic Analyzer (Applied Biosystems), and data were analyzed using Sequencing Analysis v.5.2 software (Applied Biosystems). The read success rate of SS was 100%.
Statistical analysis
The Hardy-Weinberg equilibrium (HWE) test was performed in the control and SSc samples, and SNP were filtered out if the HWE p value was < 0.05. The chi-square test was used to analyze the differences in variant counts between patients with SSc and controls. Exact p values (Fisher’s test) were obtained from 2 × 2 tables of genotype counts and disease status if the count in any cell was < 5. Conditional logistic regression analysis was applied to adjust for the risk or protective effects of known SSc-associated HLA alleles on NOTCH4 polymorphisms. P values < 0.05 after false discovery rate (FDR)–based multiple test correction were considered statistically significant. Stratified association analyses were conducted by ATA, ACA, and lcSSc/dcSSc status to check for associations of SNP in different subgroups. Haplotype analyses were conducted with Haploview software (4.2)18. All other association analyses were conducted in R (3.2.0)19 with our custom scripts and Plink (1.07). CTG triplet repeat (located at chr6:32191659-32191690 in the human hg19 genome) association analyses were conducted by comparing candidate alleles or genotypes with the most frequent allele or genotype in control samples.
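The HWE filter and the FDR correction described above can be illustrated with a minimal sketch. It is not the authors' R or Plink code: the function names are hypothetical, the HWE test is assumed to be the standard 1-degree-of-freedom chi-square test, and the FDR correction is assumed to be the Benjamini-Hochberg procedure, which the text does not specify.

```python
import math


def chi2_sf_1df(stat):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square HWE test from genotype counts (AA, AB, BB).

    Hypothetical helper; assumes a biallelic variant.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2_sf_1df(stat)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    Assumption: the paper's FDR correction is BH; the source does not say.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up genotype counts, for illustration only.
keep_variant = hwe_p_value(25, 50, 25) >= 0.05       # exactly in HWE, so kept
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Under these assumptions a variant is kept when its HWE p value is at least 0.05, and an association is called significant when its adjusted p value is below 0.05.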
RESULTS
NGS results and Sanger sequence verification in US cohort
Clinical and autoantibody information of patients with SSc is summarized in Supplementary Table 1 (available from the authors on request). A total of 143 exonic variants were reported from the NGS. Among them, 20 variants, including 12 known and 8 new variants, were associated with SSc at FDR-adjusted p < 0.05 and were therefore further examined with SS. However, all 8 new variants reported by the NGS proved to be false positives when analyzed by SS. All 12 known variants, including 11 SNP and 1 triplet repeat (CTG, n = 5, 6, 9, 10, 11, 12, 13), were consistent between NGS and SS results and met the HWE test (Supplementary Table 2, available from the authors on request).
The allele-based associations with SSc are shown in Table 1. In addition, dominant model, genotype model, recessive model, and trend model also were tested in the studies, and these are shown in Supplementary Table 3 (available from the authors on request).
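As an illustration of how the allelic, dominant, and recessive models differ, the sketch below collapses genotype counts into the 2 × 2 table that each model tests and computes an OR and a Pearson chi-square p value. The function names and the counts are hypothetical, the trend model is omitted, and Fisher's exact test (used in the study for small cells) is not implemented here.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p value).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    stat = n * (a * d - b * c) ** 2 / denom
    return stat, math.erfc(math.sqrt(stat / 2.0))


def odds_ratio(a, b, c, d):
    """OR for a table of (case exposed, case unexposed, control exposed, control unexposed)."""
    return (a * d) / (b * c)


def model_tables(case, control):
    """Collapse genotype counts (n_AA, n_AB, n_BB) into one 2 x 2 table per model.

    B is the tested allele. Each table is ordered as
    (case exposed, case unexposed, control exposed, control unexposed).
    """
    ca, cb, cc = case
    ka, kb, kc = control
    return {
        # Allelic model: count chromosomes, not people.
        "allelic": (cb + 2 * cc, 2 * ca + cb, kb + 2 * kc, 2 * ka + kb),
        # Dominant model: at least one copy of B versus none.
        "dominant": (cb + cc, ca, kb + kc, ka),
        # Recessive model: two copies of B versus fewer.
        "recessive": (cc, ca + cb, kc, ka + kb),
    }


# Made-up counts, for illustration only.
tables = model_tables(case=(50, 40, 10), control=(60, 35, 5))
results = {
    name: (odds_ratio(*t), chi2_2x2(*t)[1]) for name, t in tables.items()
}
```

With these illustrative counts the allelic table counts 200 chromosomes per group, whereas the dominant and recessive tables count 100 individuals per group, which is why the same data can give different ORs under different models.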
Analysis for association between NOTCH4 polymorphisms and clinical subtypes showed that the strongest associations with lcSSc and dcSSc were rs204987 (p = 1.5 × 10−7, OR = 0.23) and 3 linkage disequilibrium (LD)-linked SNP (rs520803/rs520692/rs520688; p = 0.0013, OR 1.35; Supplementary Table 4, available from the authors on request).
In analysis of SSc autoantibody subtypes, ATA+ patients with SSc were positively associated with 11CTG, 6CTG, and 6 SNP, including rs415929, rs520803/rs520692/rs520688, rs423023, and rs422951. ACA+ patients were negatively associated with 9CTG, 13CTG, and 5 SNP including rs443198, rs204987, rs8192579, rs915894, and rs1044507. Among them, the p values of 11CTG, rs415929, rs520803/rs520692/rs520688, and rs423023 passed the genome-wide significance thresholds of 5 × 10−8 (Table 2A). ARA+ patients with SSc did not show significant association with any variants of NOTCH4.
Because the NOTCH4 gene is adjacent to HLA class II genes, potential LD from the latter may mask the association of NOTCH4 with SSc. We therefore performed conditional logistic regression analysis based on the known SSc-associated HLA II genes including DRB1*11:01 and DPB1*13:01 for ATA+ SSc, and DRB1*01:01, DRB1*04:04, DQA1*01:01, and DQB1*05:01 for ACA+ SSc8. All SNP, except rs422951 to ATA and rs1044507 to ACA, remained significantly associated with their corresponding autoantibody subtypes (Table 2B).
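The reason for conditioning on HLA alleles can be shown with a simpler stand-in technique. The sketch below is not the conditional logistic regression used in the study; it computes a Mantel-Haenszel OR stratified by carriage of a single HLA allele, which conveys the same idea of removing an association that arises only through LD. The function name and all counts are hypothetical.

```python
def crude_or(strata):
    """OR from the pooled table, ignoring HLA status."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel OR across strata.

    Each stratum is (a, b, c, d): case carriers, case non-carriers,
    control carriers, and control non-carriers of the NOTCH4 allele,
    within one HLA stratum.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        num += a * d / n
        den += b * c / n
    return num / den


# Made-up counts in which the NOTCH4 allele is in LD with an HLA risk allele
# but has no effect of its own (OR = 1 within each stratum).
strata = [
    (80, 20, 40, 10),    # HLA risk-allele carriers
    (10, 90, 20, 180),   # HLA risk-allele non-carriers
]
unadjusted = crude_or(strata)            # inflated by LD with HLA
adjusted = mantel_haenszel_or(strata)    # association disappears
```

In this constructed example the crude OR is about 2.6 while the stratified OR is 1, so the apparent signal is explained entirely by the HLA allele. A variant whose OR stays away from 1 after adjustment behaves like the NOTCH4 variants reported as independent in Table 2B.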
Association of NOTCH4 variants with SSc in the Chinese cohort
To investigate whether the associations found in the US SSc cohort varied by ethnicity, a Chinese SSc cohort was examined for these 12 NOTCH4 variants. SNP rs1044507 achieved the nominal significance threshold of p < 0.05 for association with SSc (p = 6.8 × 10−4, OR 0.56; Table 3). Allelic associations of CTG tandem repeats with SSc were found at 6CTG (p = 3 × 10−4, OR 1.67) and 9CTG (p = 0.001, OR 1.42). The former was concordant with the US cohort (p = 0.015, OR 1.3), while the latter was in the opposite direction from the US cohort.
For autoantibody subtypes, in agreement with the US cohort results, ATA+ SSc was positively associated with rs415929, rs520803, rs520692, rs520688, and rs423023. ACA+ SSc was positively associated with 6CTG, and negatively associated with rs1044507, which achieved nominal significance (p < 0.05), also consistent with the US cohort (Table 3).
The numbers of Chinese patients in each autoantibody subset who carried specific SSc-associated HLA genotypes were too small for a conclusive association analysis of NOTCH4 alleles.
Because the number of study subjects in the Chinese cohort is limited, we could not test whether the associations of the NOTCH4 alleles were independent of the HLA alleles. However, we performed LD analysis using 157 CHB (Han Chinese in Beijing) and CHS (Southern Han Chinese) samples obtained from the 1000 Genomes Project. Tag-SNP were selected (from GPX5 to ZBTB9) to cover the whole HLA region after removing redundant variants in high LD within windows of 5 adjacent SNP (step = 3 and variance inflation factor = 2). LD was estimated between pairs of Tag-SNP, and the distribution of r2 between NOTCH4 and adjacent regions was calculated (Supplementary Figure 1, available from the authors on request). The results indicated that LD among the NOTCH4 gene variants is very strong, but that it decays markedly as the genomic distance extends beyond the NOTCH4 gene in the HLA region in the Chinese population. The r2 between NOTCH4 and HLA-DP, -DR, and -DQ appeared to be relatively weak (r2 < 0.045).
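For readers unfamiliar with the r2 statistic, the sketch below computes it for one pair of biallelic SNP from phased haplotypes, using r2 = D2 / [pA(1 − pA)pB(1 − pB)] with D = pAB − pApB. It is an illustration only: the function name and data are hypothetical, and the software actually used for the LD analysis may estimate r2 from unphased genotypes instead.

```python
def r_squared(haplotypes):
    """Pairwise LD (r2) between two biallelic SNP.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) pairs coded 0/1,
    one pair per phased chromosome.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n   # frequency of allele 1 at SNP 1
    p_b = sum(h[1] for h in haplotypes) / n   # frequency of allele 1 at SNP 2
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p_ab - p_a * p_b                      # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when either SNP is monomorphic")
    return d * d / denom


# Made-up haplotypes, for illustration only.
perfect_ld = [(1, 1)] * 5 + [(0, 0)] * 5            # alleles always travel together
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]            # alleles are independent

strong = r_squared(perfect_ld)
weak = r_squared(no_ld)
```

An r2 near 1 means that one SNP almost fully predicts the other, whereas values as low as the r2 < 0.045 reported between NOTCH4 and HLA-DP, -DR, and -DQ mean that one locus carries little information about the other.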
DISCUSSION
Some HLA class II gene alleles, such as DRB1*07:01, DRB1*11:04, DQA1*02:01, DQA1*05:01, DQB1*02:02, DQB1*05:01, and DPB1*13:01, are established SSc risk or protective factors8. The genetic association of some HLA alleles appeared more significant in SSc autoantibody subsets. For instance, HLA-DRB1*11:04 and DPB1*13:01 confer strong risk of ATA-positive SSc, and DRB1*04:01 and DQB1*03:01 confer strong risk of ACA-positive SSc8. The extensive LD across the HLA region has made it hard to distinguish true associations between SSc and other HLA region genes or loci from the LD effect of these HLA alleles. We addressed this confounding issue in this post-GWAS sequencing study of the NOTCH4 gene with conditional logistic regression analysis in a large SSc cohort of US white subjects, followed by analysis in a Chinese cohort that has a different ancestral history from US subjects and different LD patterns17,20,21. A total of 12 exonic variants of the NOTCH4 gene were found to be associated with SSc in the US cohort. There was a stronger association of NOTCH4 variants in the autoantibody subtypes of SSc. A CTG triplet repeat (11CTG) and 5 SNP (rs415929, rs520803/rs520692/rs520688, rs423023) located in the NECD of NOTCH4 conferred a strong risk (p values passed the genome-wide significance threshold) in ATA+ patients with SSc in the US cohort. The CTG triplets encode poly-leucines (number from 5 to 13), and the number of leucines determines the length of the signal peptide domain, which then affects the binding of signal peptidase and the maturation and translocation of NOTCH410,11. SNP rs520692 encodes a nonsynonymous polymorphism switching between aspartic acid and glycine. On the other hand, all identified ACA+ associations with NOTCH4 variants (9CTG, 13CTG, and 5 SNP including rs443198, rs204987, rs8192579, rs915894, and rs1044507) were negative, except 6CTG.
Among them, rs915894 encodes a nonsynonymous polymorphism switching between lysine and glutamine located in NECD. Importantly, most of these associations seemed independent from an LD effect of the HLA-class II genes that have been shown to be strongly associated with SSc8. It is worth noting that the synonymous SNP rs443198, previously reported in the SSc-GWAS9, showed a consistent association with SSc here, especially with ACA+ patients.
Although not all associations of the NOTCH4 variants with US SSc were concordant in the Chinese cohort, the strong negative association of rs1044507 and the positive association of 6CTG were consistent between the 2 cohorts. In addition, there was a noteworthy trend of association of the NOTCH4 variants with SSc autoantibody subtypes. These findings support NOTCH4 as an important genetic factor in SSc, and this is in agreement with the previous GWAS and family studies of SSc9,22.
It is also worth noting that NOTCH4 variants have been associated with other rheumatic and/or immune-mediated diseases including rheumatoid arthritis23, sporadic inclusion body myositis24, ulcerative colitis25, alopecia areata26, and age-related macular degeneration27. The CTG of NOTCH4 form an α/β horseshoe fold on NECD believed to be involved in protein-protein interactions, which has also been implicated in patients with schizophrenia28.
The functional significance of these NOTCH4 variants remains to be elucidated. Several identified SSc-associated NOTCH4 variants, especially nonsynonymous polymorphisms, are located in the part of the NECD that forms the EGF-like repeats, and may influence molecular structure and the binding of NOTCH4 ligands and Ca2+. On the other hand, activated NOTCH signaling has been reported in SSc29 and other fibrotic disorders such as keloids and kidney fibrosis30,31, and NOTCH signaling appeared to contribute to fibroblast activation and collagen overproduction in SSc and fibrotic animal models29,32.
Several NOTCH4 exonic variants were associated with SSc and/or SSc subtypes in studies of both US and Chinese cohorts. The results support the previous findings of a genetic contribution of NOTCH4 to SSc. Our results indicated that these associations are independent of HLA II associations. This may provide novel insight into SSc-associated and specific changes of coding region sequence of the NOTCH4 gene, some of which are nonsynonymous variants located in the NECD. The functional effect of these SSc-associated variants, especially on ligand binding and activation of NOTCH4 signaling, needs to be further explored.
Footnotes
Supported by the US National Institutes of Health (NIH) National Institute of Allergy and Infectious Diseases 1U01AI09090-01 and NIH P01-052915-01; the Major National Science and Technology Program of China, grant number 2008ZX10002-002; and the Science and Technology Committee of Shanghai Municipality (11410701800).
- Accepted for publication August 8, 2018.