Abstract
Objective. Polymorphisms in human major histocompatibility complex (MHC) are the strongest genetic associations with rheumatoid arthritis (RA). Epigenome-wide methylation studies suggest DNA methylation changes within MHC may contribute to disease susceptibility. We profiled MHC-specific methylated CpG (5′–C–phosphate–G–3′) in autoantibody-positive patients with RA and matched unaffected anticitrullinated protein antibodies–negative first-degree relatives (ACPA−/FDR) from an indigenous North American (INA) population that is known to have prevalent RA.
Methods. DNA was isolated from whole blood and targeted bisulfite sequencing was used to profile methylated CpG in patients with RA and ACPA−/FDR. Differentially methylated CpG loci (DML) were mapped and gene annotated. Ingenuity pathway analysis (IPA) was used for curating biomolecular networks of mapped genes. Transcript abundance was determined by quantitative (q)PCR.
Results. We identified 74 uniquely methylated CpG sites within the MHC region that were differentially methylated in patients with RA (p < 0.05), compared to ACPA−/FDR. Of these, 32 DML were located on 22 genes. IPA showed these genes are involved in regulating the nuclear factor–κB complex and processes involved in antigen presentation, and immune cell crosstalk in autoimmunity. Pearson correlation analysis demonstrated a negative association between differentially methylated CpG in the C6ORF10 gene and risk factors associated with RA. Analysis by qPCR confirmed differential abundance of C6ORF10, TNXB, and HCG18 mRNA in patients with RA compared to ACPA−/FDR.
Conclusion. Our results confirm the presence of differential methylation at specific gene loci within the MHC region of INA patients with RA. These epigenetic signatures may precede disease onset, or alternatively, may be a result of developing RA.
- C6ORF10 gene
- DNA methylation
- Indigenous North American
- MHC
- rheumatoid arthritis
- targeted bisulfite sequencing
Rheumatoid arthritis (RA) is a chronic, progressive systemic autoimmune inflammatory disease primarily affecting the peripheral joints. It is now well established that the risk of developing RA involves a complex interplay between genetic and environmental factors1,2. Although RA prevalence worldwide has been estimated between 0.5–1%, we and others have demonstrated a disproportionately high RA prevalence in indigenous North American (INA) populations3. Risk factors for RA development in several INA populations include a high prevalence of predisposing HLA-DRB1 shared epitope (SE) encoding alleles, familial clustering of cases, high smoking rates, prevalent periodontal disease, and a high frequency of rheumatoid factor (RF) and anticitrullinated protein antibodies (ACPA) in the unaffected family members of patients with RA4–10. Further, studies of the Dene INA people suggest a unique clustering of biological features that may affect innate and adaptive immunity11,12,13,14.
Located on chromosome 6 (6p21.3), the MHC encompasses 3 major gene clusters (class I, II, and III) and encodes ∼252 genes, the majority of which are involved in antigen processing, presentation, and immune regulation15. Genetic variants in the MHC region are strongly associated with susceptibility to RA, both in seropositive and seronegative patients, although different variants are associated with these 2 major subsets16,17. In understanding the basis for this key association, a potential role of epigenetic changes in the MHC has been demonstrated in seminal studies that suggested that differential DNA methylation in the MHC region may be involved in mediating the genotype-phenotype associations in RA18,19. Independent studies in another, predominantly white population have supported this finding20. The potential role of MHC epigenetic changes in predisposing INA populations to RA has not been studied despite the high prevalence of disease in these populations. We therefore studied the presence of differential methylation signatures in a cohort of INA seropositive patients with RA and unaffected seronegative first-degree relatives (FDR) in an INA population in central Canada where we have shown that RA is particularly prevalent and severe.
MATERIALS AND METHODS
Study design
INA study participants were recruited from Cree, Ojibway, and Ojicree communities in central Canada6,11. The Biomedical Research Ethics Board of the University of Manitoba approved the overall design of the study and consent forms (Ethics: 2005:093, Protocol: HS14453). The conduct of the study was guided by the principles of Community Based Participatory Research, a cornerstone of the Canadian Institutes of Health Research guidelines for Aboriginal health research (www.cihr-irsc.gc.ca/e/29134.html). The study participants provided informed consent after the study was explained to them in detail, with the help of an INA translator from their community, where necessary. The following 2 groups were included: (1) anticitrullinated protein antibody (ACPA)–positive patients with RA, and (2) unaffected FDR (ACPA−/FDR) who were seronegative for ACPA and RF. Patients with RA were unrelated to the ACPA−/FDR. The demographics of the study groups are summarized in Table 1. RA diagnosis was made based on fulfilling the 2010 American College of Rheumatology/European League Against Rheumatism classification criteria, as determined by a rheumatologist (HEG).
Sample collection and genomic DNA isolation
Venous blood was collected into Paxgene Blood DNA tubes (Qiagen Inc.), and processed using QIAamp DNA Blood Maxi Kit (Qiagen Inc.) per the manufacturer’s instructions to isolate genomic DNA. Quality and concentration of isolated DNA was determined by Nanodrop and Qubit high sensitivity dsDNA assays. For total RNA isolation, venous blood collected into Paxgene Blood RNA tubes (Qiagen Inc.) was used.
Targeted bisulfite sequencing (TBSeq)
We used CATCH-Seq target enrichment technology (Ubiquity Genomics) to perform bisulfite sequencing of coding and noncoding regions encompassing ∼3.8 Mb around the MHC locus located within chromosome 6 (chr6:29530000-33500000). Library creation, bisulfite treatment, and Illumina sequencing for each DNA sample and β value estimation for each CpG (5′–C–phosphate–G–3′) across the MHC region were performed by Ubiquity Genomics Inc. according to their standardized methodologies21. According to the protocol optimized by Ubiquity Genomics, all the samples had a bisulfite conversion rate > 97%. Briefly, DNA samples were sheared, end-repaired, hybridized to streptavidin-coated Dyna beads, and ligated to uniquely barcoded custom methylated/nonmethylated adapters. For target enrichment, pooled DNA samples bound to streptavidin-coated Dyna beads were collected with biotinylated probes generated from the human bacterial artificial chromosome library. Following collection, target enriched library was bisulfite-treated (EpiTect Bisulfite Kit, Qiagen Inc.), amplified by PCR, and sequenced using an Illumina HiSeq2500. Bowtie 2 was used for mapping and sequence alignment against hg19 version of reference genome and Bismark was used to calculate β scores indicating methylation level at each CpG locus across the target region after performing batch normalization. Additional information on the sequencing results is included as Supplementary Table 1 (available with the online version of this article).
Immunoassays
Serum C-reactive protein (CRP) levels were tested in serum samples using a human high-sensitivity CRP (hsCRP) ELISA kit (Biomatik) per the manufacturer’s instructions. ACPA status was determined using the BioPlex 2200 System Anti-CCP reagent kit (Bio-Rad).
Genotyping of HLA-DRB1
High-resolution HLA-DRB1 genotyping was performed at the Dalhousie University HLA Typing Laboratory (Halifax, Nova Scotia, Canada) using a sequence-specific PCR method.
Total RNA extraction and quantitative real-time PCR (qRT-PCR)
Total RNA was isolated from whole blood using the miRVANA isolation kit (Ambion, AM1561) per manufacturer’s instructions. RNA quality was determined on Agilent Bioanalyzer using the Agilent RNA 6000 Nano kit (Agilent Technologies). Total RNA with an A260/280 > 2.0 and an RNA integrity number > 7.0 was used for downstream processing. For mRNA amplification, first-strand cDNA was synthesized from total RNA (1 μg) using Superscript VILO Mastermix (Thermo Scientific) per manufacturer’s instructions. Target mRNA was amplified using Applied Biosystems Power SYBR Green Mastermix (Thermo Scientific) per manufacturer’s instructions. Primers used for mRNA amplification are listed in Supplementary Table 2 (available with the online version of this article).
Data analysis and statistics
GraphPad Prism version 7.0 and SPSS v25.0 (IBM) were used for data analysis and generating volcano plots, scatter plots, and bar graphs. We performed data control as follows: first, only CpG with a minimum depth coverage > 15× were included and sites containing missing values were excluded. Upon filtering based on depth, CpG included for analysis comprised 26% of total reads. Filtered β values were log2 transformed and differential methylation at each CpG site was calculated (RA-ACPA−/FDR). Benjamini-Hochberg correction was used to perform multiple correction (FDR > 5%) and differentially methylated loci (DML) with FDR-corrected p values < 0.05 were considered significant. To annotate and determine the genomic distribution of identified DML, we used diffReps script region_analysis.pl22. Ingenuity pathway analysis (IPA) was used to perform gene network and pathway analyses. Gene expression by qPCR was determined in samples after normalization using 18S ribosomal RNA as endogenous control, and relative fold changes were calculated using the ΔΔCt method23.
RESULTS
Demographics of the study population
Participants were age-matched, ethnically homogeneous individuals, and about 80% of the study participants were women (Table 1). As expected, patients with RA in the TBSeq cohort (n = 17) demonstrated higher hsCRP (mean ± SD 32.23 ± 72.73 µg/ml) and ACPA levels (mean ± SD 173.56 ± 59.32 µg/ml) compared to ACPA−/FDR (n = 17). The demographics of ACPA−/FDR (n = 12) and patients with RA (n = 15) in the qPCR cohort were identical to those in the TBSeq cohort and have been described previously4. All the patients with RA were taking stable disease-modifying antirheumatic drugs, although the disease activity levels as determined by 28-joint count Disease Activity Score varied among the study subjects (Table 1, and Supplementary Table 3, available with the online version of this article).
Because of the known association with RA, we performed high-resolution HLA-DRB1 typing on samples from ACPA−/FDR and patients with RA included in the TBSeq cohort. Of the HLA-DRB1 alleles, there was a predominance of the SE encoding *04:04 and *14:02 alleles in both the study groups compared to other alleles. Even in this modestly sized cohort of patients and controls, bearing a single copy of SE allele was significantly associated with being an ACPA−/FDR, while a higher frequency of patients with RA had 2 copies of SE alleles (Table 2A).
Binary logistic regression analysis showed a significant association between presence of any SE alleles with body mass index (p = 0.017), while presence of 2 copies of SE associated significantly with CRP levels in this population (p = 0.021). Further, presence of *14:02 alleles associated significantly with autoantibodies [anticyclic citrullinated protein antibodies (anti-CCP) and RF] and CRP levels, suggesting that *14:02 positivity may be an important factor underlying autoantibody production and inflammation observed in patients with RA (Table 2B). The *04:04 positivity did not show association with any of the RA risk factors in our analysis.
Differences in MHC-region specific methylation between RA patients and ACPA−/FDR
Using CATCH-Seq-based complete MHC collection kits and targeted bisulfite sequencing, we examined the single-CpG site methylation levels specifically within the 3.8 Mb human MHC region (from 29.6 Mb to 33.2 Mb) of ACPA−/FDR and patients with RA. First, we identified the overall genomic distribution of all methylated CpG loci identified within the region. While 50% of the identified CpG sites were localized to intergenic regions, 31% CpG were within the gene body, and the remaining were localized to proximal and distal promoter regions (Figure 1A). No significant difference was observed in mean methylation levels within the MHC cluster between patients with RA and ACPA−/FDR (Figure 1B; p = 0.9798; unpaired t test with Welch’s correction). However, volcano plot analyses revealed the presence of 74 FDR-corrected DML out of 9782 filtered CpG loci (0.76%; p < 0.05, q < 0.05; Benjamini-Hochberg correction; Supplementary Table 4, available with the online version of this article), among which 74% of regions were hypomethylated and 26% were hypermethylated CpG sites (Figures 1C and 1D). Further, unsupervised hierarchical clustering based on these 74 DML identified within the MHC region demonstrated a distinctly independent clustering of patients and controls with minor overlap between the groups (Figure 1E). It is interesting to note that a majority of these DML could not be annotated to any specific gene in the MHC region and are located within the intronic regions. Overall, our analyses revealed the presence of intrinsic methylation differences at unique CpG loci within the MHC cluster in INA patients with RA compared to ACPA−/FDR from the same population.
IPA analysis of hypermethylated and hypomethylated genes
We performed IPA analyses on the annotated genes mapped to differentially methylated CpG to characterize the biological mechanisms predictably influenced by aberrant DNA methylation in patients with RA. The IPA biomolecular network identified nuclear factor (NF)-κB complex as a common hub regulated by both hypermethylated and hypomethylated DML-harboring genes (Figure 2A and 2B). Diseases predicted to be regulated included metabolic and immunological disorders, particularly insulin-dependent diabetes mellitus and RA (Fisher’s exact test; log p value > 2.0; threshold value = 0.05). As expected, IPA also revealed that antigen presentation pathway was the most represented canonical pathway, while interferon (IFN)-α receptor was the most common upstream regulator. This suggests that differential epigenetic methylation in patients with RA can potentially target antigen presentation and type 1 IFN pathways (Supplementary Table 5, available with the online version of this article), which can together influence NF-κB complex and modulate RA pathogenesis. We also observed a representation of pathways involved in immune cell crosstalk and hematological system development, which are probably overrepresented in chronic diseases.
Verification of DML-annotated genes
Using UCSC genome browser, we mapped 32 out of 74 DML to 21 annotated genes (hg19). For this, we created a bed file using the genomic location of DML and obtained coordinates following addition and subtraction of 1000 bases to the genomic location. It is interesting to note that the majority of DML were located within the gene body (17 CpG loci; Table 3). Among the 32 gene-annotated DML, hypermethylated DML were harbored in 6 genes and hypomethylated DML were harbored in 15 genes (Table 3).
We downloaded a public GEO dataset (GSE42861) to verify the findings from our TBSeq data. GSE42861 is an epigenome-wide association study (EWAS) measuring the DNA methylome in the peripheral blood of 354 ACPA+ patients with RA and 337 healthy controls using Illumina HumanMethylation450 arrays. We specifically downloaded DNA methylome data of chromosome 6 and filtered it by using coordinates of the MHC region (chr6:29530000–33500000). The verification was applied to only 74 DML identified in our primary analysis. Of these, the locations of 4 DML matched exactly with the positions identified by Liu, et al to be differentially methylated in patients with RA19. However, the CpG positions of the remaining 70 DML were in close proximity (within ∼20 Mb–1 kb) to the differentially methylated regions mapped in the public dataset, but in the same gene (Supplementary Table 6, available with the online version of this article). The discrepancies in methylated CpG locations can be attributed to the use of different methodologies and the sample size, which makes it difficult to demonstrate a 100% alignment between the CpG. However, identification of differentially methylated regions within the same gene in the GEO dataset indirectly confirms our findings.
Methylation differences in CpG located within the C6ORF10 gene were previously shown to be strongly associated with risk of developing RA. Therefore, we performed a Pearson correlation analysis between RA risk factors and methylation values of C6ORF10 DML identified in our study. Of the 4 DML, methylation at only 1 location (CpG32341351) demonstrated a significantly strong negative correlation with the anti-CCP levels (r = (−0.402; p = 0.018), RF antibodies (r = (−0.376; p = 0.029), the presence of 2 copies of HLA-DRB1 SE alleles (r = (−0.407; p = 0.017), and the presence of *0404 SE allele (r = (−0.360; p = 0.037).
Considering that CpG methylation in DNA is strongly implicated in regulating gene transcription24, it is important to understand the functional relationship between altered methylation and the relative abundance of genes associated with DML. Therefore, we validated bisulfite sequencing data by performing gene expression analyses by qPCR in mRNA isolated from whole blood samples collected from an independent subset of INA people (qPCR cohort). These individuals are nested within the same INA cohort and are independent from those on whom pyrosequencing was performed. Specifically, we focused on determining mRNA abundance of genes associated with multiple DML located either within the genebody or promoter1k regions (Figure 3). Based on this, we selected 5 genes for the qPCR analysis: C6ORF10 (4 DML), HLA-DOB (3 DML), HCG18 (3 DML), TNXB (3 DML), and TRIM40 (2 DML). Among the genes harboring multiple hypomethylated DML, mRNA abundance of C6ORF10 was significantly higher in patients with RA compared to ACPA−/FDR. Even though an increasing trend and a dichotomous pattern was observed in TNXB abundance, the effect was statistically nonsignificant. HLA-DOB transcript levels remained unaltered between patients with RA and ACPA−/FDR. A significantly lesser abundance of HCG18 transcript was observed in patients with RA relative to ACPA−/FDR, while TRIM40 abundance was unchanged. This suggests that differential methylation at specific loci on some of these genes can influence their gene expression.
DISCUSSION
In our present study, we compared the MHC-specific DNA methylation patterns in whole blood samples from a cohort of autoantibody-positive INA patients with RA and unrelated seronegative FDR with no clinically detectable autoimmunity. Using bisulfite sequencing, we identified distinct and differentially methylated CpG loci within MHC region of patients with RA, several of which were annotated to specific genes that regulate key pathways in RA pathogenesis, such as antigen presentation and immune cell crosstalk through the NF-κB complex.
DNA methylation, particularly in CpG islands embedded with gene promoter regions, is significantly associated with linking genetic susceptibility to phenotypic traits25. Microarray-based EWAS on European whites demonstrated directionally similar differential methylation patterns in the MHC region of patients with RA compared to healthy controls19,20. We integrated these observations and applied a high-throughput, cost-effective, and accurate pyrosequencing methodology (TBSeq) to profile methylated CpG loci specifically in and around the MHC region in a predisposed non-white population, aiming to replicate previously reported findings.
EWAS studies by Liu, et al and others underscore the importance of identifying altered DNA methylation loci within the MHC region, and describe their effect on gene expression and delineating molecular mechanisms that can eventually influence RA phenotype19,20. In our study, we observed a representation of multiple DML within the genebody of certain annotated genes (C6ORF10, TNXB, HLA-DOB, and HCG18), which indicates the presence of a higher methylation frequency and can possibly regulate their transcript expression. C6ORF10, TNXB, and HCG18 are novel RA-associated pleiotropic genes also identified in various genome-wide association studies and EWAS, whose physiological functions are yet to be deciphered19,26,27,28,29. A strong negative correlation between DML in C6ORF10 and RA-associated autoantibodies (ACPA and RF) suggests that hypomethylated C6ORF10 might be involved in amplifying the levels of circulating RA autoantibodies in genetically susceptible INA individuals with SE alleles. In fact, disease-related functional gene networks found using the mapped SNP database identify C6ORF10 as a functional hotspot interacting with at least 14 different genes that are implicated in autoimmune conditions such as systemic lupus erythematosus, type 1 diabetes, psoriasis, and RA26. Therefore, additional investigations are warranted to further unravel the role of C6ORF10 in mediating the onset of inflammatory arthritis in INA and other populations.
We applied computational predictive analyses using IPA to identify functional biological relevance and potentially common downstream signaling networks regulated by DML-harboring genes found in patients with RA. Emergence of NF-κB complex as a central node and autoimmune disorders as the top-related diseases associated with the differentially methylated MHC-specific genes is not surprising. Of the 98 gene loci established to have significant risk to RA susceptibility, a vast majority are involved in NF-κB signaling, thus establishing the role of NF-κB complex in RA pathogenesis30. Activation of NF-κB complex in the rheumatoid synovium, either canonically or noncanonically, upregulates proinflammatory cytokine expression, augments immune cell recruitment, and promotes proliferation of synovial fibroblasts leading to joint destruction31. Network analysis identified an overrepresentation of pathways, presumably regulated upstream by IFN-α receptor, and involved in immune cell crosstalk and hematological cell development in the interactome of both hypo- and hypermethylated genes (Supplementary Table 4, available with the online version of this article). While MHC class I–related chain A (MICA) is key to crosstalk between dendritic and NK cells and might play a key role in regulation of autoreactivity32, HLA-DPA1, HLA-DOB, and TAP1 are involved in multiple immune cell–associated pathways such as interleukin 4 signaling and antigen presentation33,34,35,36. Molecules such as AIF1, HLA-DPA1, MICA, and TNXB are involved in vasculogenesis, erythropoietic cell reprogramming, phosphoinositide-3-kinase–protein kinase B/Akt signaling, and T cell proliferation37,38,39.
Key limitations of our study include small sample size and use of whole blood DNA as a starting material for methylation profiling. Considering that the primary objective of our study was to validate the observations made in previously published EWAS, we partially addressed the issue of small sample size by using an MHC-specific TBSeq technique and validating our findings in an independent qPCR cohort using gene-specific primers. Also, the qPCR cohort was different from the TBSeq cohort because of logistical limitations. However, we attempted to minimize variability by selecting a comparable group of individuals belonging to the same population. We acknowledge our inability to address the association between cell-type heterogeneity to methylation levels at specific DML identified in INA patients with RA because of the unavailability of reference DNA methylation database tailored to chromosome 6 and our region of interest in different blood cell types. Our study findings may also be affected by exposure to RA medications and require future investigations to delineate their effect.
We present evidence that the presence of differential MHC-associated methylation profile is detectable in the whole blood of seropositive patients with RA from an INA population, with DML mapped to unique genes whose role in autoimmune RA is yet unknown. Multiple methylation studies, including ours, undeniably identify C6ORF10 as a novel gene that is significantly dysregulated and can potentially modulate inflammatory processes involved in RA pathogenesis and possibly other autoimmune conditions across populations. Additional research to identify the interaction network of C6ORF10 in FDR of patients with RA will delineate its functional contribution to RA onset in a high-risk INA population.
ONLINE SUPPLEMENT
Supplementary material accompanies the online version of this article.
Acknowledgment
We acknowledge the contribution of study participants from indigenous communities who donated blood for our study. We also recognize Chief and Band Councils of Norway House and St. Theresa Point, Manitoba, for their invaluable cooperation. Raw and processed data files have been submitted to NCBI Gene Expression Omnibus as FASTQ files (GSE134877). The datasets analyzed in the current study are available from the corresponding author on reasonable request.
Footnotes
Supported by individual grants to Dr. H. El-Gabalawy from the Canadian Institutes of Health Research. V. Anaparti received postdoctoral fellowships from Research Manitoba and the Arthritis Society of Canada.
- Accepted for publication October 21, 2019.