Abstract
Objective. Genetic studies have identified several rheumatoid arthritis (RA) susceptibility loci in European-derived populations. The same biological pathways may be involved in determining the RA risk in different population groups. We sought to replicate the association of 33 single-nucleotide polymorphisms (SNP) from 31 RA susceptibility loci confirmed among Europeans in a unique Pakistani population.
Methods. We genotyped 33 SNP in a sample of 366 Pakistanis that comprised related and unrelated cases and controls. Genotyping was performed using TaqMan assays and the results were analyzed with family case-control software.
Results. Twelve of the 33 SNP were replicated in this sample with significant p values ranging from 7.05E-06 to 3.72E-02, the most significant being the KIF5A-PIP4K2C/rs1678542 SNP.
Conclusion. Our observations suggest that a number of RA susceptibility loci and related pathways are shared across different populations.
- RHEUMATOID ARTHRITIS
- RELATED-UNRELATED ASSOCIATION STUDY
- SUSCEPTIBILITY
- AUTOIMMUNE DISEASE
Rheumatoid arthritis (RA) is a chronic, systemic, inflammatory autoimmune disease affecting about 0.5%–1% of European-derived populations and 0.2%–0.3% of Asian populations1,2. RA is characterized by inflammation of the synovial membrane, which leads to severe pain and bone destruction. The etiology of RA is complex and is thought to result from a combination of environmental, genetic, and hormonal factors3,4. RA has a strong genetic basis, with estimated heritability ranging from 50% to 60%5,6,7.
Since 2000, many studies have been conducted toward understanding genetic susceptibility to RA. Genome-wide association studies (GWAS) and their metaanalyses have confirmed more than 30 risk loci for RA, including HLA-DRB1, PTPN22, CD40, STAT4, OLIG3, TNFAIP3, TNFRSF14, CTLA4, CCL2, PADI4, CD2, CD58, FCGR2A, PTPRC, REL, SPRED2, AFF3, CD28, IL2, IL21, C5orf30, IRF5, CCL21, TRAF6, CCR6, and TRAF1/C5 8,9,10,11,12,13,14,15,16,17,18. Most initial genome-wide significant associations have been replicated in subsequent studies, with few exceptions19, the latter probably due to the complex etiology of RA.
We hypothesized that genetic background for RA susceptibility is shared, at least in part, across different populations. To examine this hypothesis, we genotyped 33 SNP from 31 RA loci confirmed in a recent European GWAS metaanalysis18, within a unique Pakistani population that previously had not been examined with respect to genetic risk for RA.
MATERIALS AND METHODS
Subjects
Two groups of subjects, a case-control sample of unrelated individuals (n = 214) and a family-based sample (n = 152), were included in our study. Blood samples and relevant information were collected from subjects recruited from different rheumatology centers located in 2 adjacent cities in Pakistan: the Military and Fauji Foundation Hospitals in Rawalpindi and the Kahota Research Laboratories Hospital in Islamabad. A total of 239 patients with RA satisfying the American College of Rheumatology (ACR) 1987 criteria20 were enrolled. The mean age at onset of disease was 39.1 ± 13.0 years in RA cases (63% females). The control group (n = 127; 54% females) had no history of autoimmune diseases and their mean age was 41.2 ± 12.0 years. The main characteristics of RA cases and controls are given in Table 1. All subjects were recruited after providing informed consent. The study was approved by the National University of Science and Technology Review Board in Pakistan and the University of Pittsburgh Institutional Review Board in the USA.
Clinical diagnosis of patients was made through standard methods of physical examination and review of medical screening test results. All participants were interviewed, and a screening questionnaire was completed for each under the supervision of a certified rheumatologist. Subjects with multiple affected individuals in their family were classified as having familial RA. The pedigrees of families with multiple patients with RA were constructed using a standard method21.
Genomic DNA was extracted from whole blood using either the standard phenol/chloroform method or the Qiagen genomic DNA extraction kit, following the manufacturer's instructions (Qiagen Inc.). DNA was quantified using the Quant-iT™ PicoGreen® dsDNA assay kit (Life Technologies).
Genotyping
A total of 33 single-nucleotide polymorphisms (SNP) from 31 confirmed RA risk loci (26 SNP at 24 previously known loci and 7 SNP at 7 newly identified loci) were selected from a previous GWAS metaanalysis performed in individuals of European ancestry18. Detailed information about these SNP is given in Table 2. All SNP were genotyped using TaqMan SNP genotyping assays (Applied Biosystems) following the manufacturer's protocols. A TaqMan assay was not available for one of the SNP initially selected (HLA-DRB1/rs6910071); therefore, another significant SNP from this locus (rs660895)22 was evaluated for association analysis. PCR amplification was performed in 384-well plates on a dual-block GeneAmp® PCR System 9700 (Applied Biosystems), and plates were read and analyzed on ABI-Prism 7900HT sequence detection systems.
Statistical analysis
Allele and genotype frequencies were calculated by allele counting, and deviations from Hardy-Weinberg equilibrium (HWE) were tested using the chi-square goodness-of-fit test. The family-based samples were examined for Mendelian inconsistencies in the pedigree data using the PedCheck program23 (Website: http://watson.hgen.pitt.edu). Associations of all SNP with RA were tested in the combined set of unrelated case-control and family-based samples using the family case-control (FamCC) software24 (Ver 1.0), a unified analysis approach based on principal components for family and unrelated samples. Specifically, FamCC consists of 3 sequential steps: (1) principal components are generated from the genotype data; (2) multiple linear regression on the top 10 principal components is performed for both the phenotypes and the markers in the unrelated individuals; and (3) the residuals of the phenotypes and the markers are calculated based on the coefficients estimated in the linear model in the second step, and association between phenotype and genotype is then assessed by testing the correlation between these residuals.
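The 3 steps described above can be illustrated with a minimal sketch. This is not the FamCC software or its test statistic: it treats every subject as unrelated, whereas FamCC estimates the regression coefficients in the unrelated individuals and handles family members separately, and the n·r² statistic with a 1-df chi-square reference is an assumption made here for illustration. The function name residual_association and all argument names are hypothetical.

```python
import math
import numpy as np

def residual_association(genotypes, phenotype, marker, n_pcs=10):
    """Illustrative residual-correlation test (NOT the FamCC statistic).

    genotypes : (n_samples, n_markers) allele counts used to build the PCs
    phenotype : (n_samples,) case/control status coded 1/0
    marker    : (n_samples,) allele counts of the SNP being tested
    Assumption: all samples are treated as unrelated.
    """
    G = np.asarray(genotypes, dtype=float)
    G = G - G.mean(axis=0)                      # center each marker

    # Step 1: principal components from the genotype data (via SVD).
    U, S, _ = np.linalg.svd(G, full_matrices=False)
    k = min(n_pcs, U.shape[1])
    pcs = U[:, :k] * S[:k]

    # Step 2: linear regression of phenotype and marker on the top PCs.
    X = np.column_stack([np.ones(G.shape[0]), pcs])

    def residuals(v):
        v = np.asarray(v, dtype=float)
        beta, *_ = np.linalg.lstsq(X, v, rcond=None)
        return v - X @ beta

    # Step 3: correlation between the two sets of residuals.
    ry, rg = residuals(phenotype), residuals(marker)
    r = float(ry @ rg / math.sqrt((ry @ ry) * (rg @ rg)))

    # Assumed reference distribution: n * r^2 ~ chi-square with 1 df.
    stat = len(ry) * r ** 2
    p_value = math.erfc(math.sqrt(stat / 2.0))  # 1-df chi-square upper tail
    return r, stat, p_value


# Simulated data only; no values here come from the study.
rng = np.random.default_rng(0)
n = 200
background = rng.integers(0, 3, size=(n, 50))
snp = rng.integers(0, 3, size=n)
status = (snp + rng.normal(0.0, 0.5, size=n) > 1).astype(float)
r, stat, p_value = residual_association(background, status, snp)
print(f"r = {r:.3f}, statistic = {stat:.1f}, p = {p_value:.2e}")
```

Regressing both phenotype and marker on the same principal components before correlating them is what removes ancestry-driven signal shared by the two; the residual correlation then reflects association not explained by those components.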
RESULTS
A total of 33 successfully genotyped SNP corresponding to 31 confirmed RA susceptibility loci in Europeans were analyzed in a Pakistani sample comprising 366 RA cases and controls that included 214 unrelated and 152 related individuals. The genotyping call rate was > 98% for all SNP. Genotyping error was estimated by repeating 10% of the samples and the discrepancy rate was found to be 0% for all but 2 SNP (RBPJ/rs874040 and SPRED2/rs934734), for which the error rate was 0.031%. All SNP were found to be in HWE in the unrelated case-control sample; similarly, no Mendelian inconsistency was found in the family-based sample. We combined the unrelated case-control and family-based data to obtain a larger effective sample size. FamCC, a unified method for unrelated case-control and family-based samples, was used for the association analysis. The association results and p values for the 33 evaluated SNP are shown in Table 3. Of the 33 SNP confirmed among Europeans, we successfully replicated 12 SNP in our combined sample (p < 0.05) with the same allele and same direction of association. The most significant SNP was KIF5A-PIP4K2C/rs1678542 (p = 7.05E-06), followed by CD2-CD58/rs11586238 (p = 4.56E-05), TNFAIP3/rs10499194 (p = 2.51E-04), and HLA-DRB1/rs660895 (p = 3.78E-04). Marginal associations were observed with 4 additional SNP (p = 5.70E-02 to 9.61E-02), and 7 further SNP (p = 1.06E-01 to 2.91E-01) showed the same direction of association as previously reported without reaching significance.
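The HWE check reported above can be sketched as follows: a minimal version of allele counting followed by a chi-square goodness-of-fit test with 1 degree of freedom. The function name hwe_chi_square and the genotype counts in the usage lines are hypothetical and are not taken from the study data.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb : observed counts of the three genotypes at one SNP.
    Returns (allele frequency of A, chi-square statistic, p value at 1 df).
    """
    n = n_aa + n_ab + n_bb
    # Allele counting: each AA carries two A alleles, each AB carries one.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value


# Hypothetical counts: the first set matches HWE exactly, the second has no heterozygotes.
freq_ok, chi2_ok, p_ok = hwe_chi_square(25, 50, 25)
freq_bad, chi2_bad, p_bad = hwe_chi_square(50, 0, 50)
print(f"in HWE:     chi2 = {chi2_ok:.2f}, p = {p_ok:.3f}")
print(f"out of HWE: chi2 = {chi2_bad:.2f}, p = {p_bad:.2e}")
```

One degree of freedom is used because three genotype classes are compared after estimating one allele frequency from the same data.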
DISCUSSION
Recent GWAS conducted mainly in European and American populations have identified or confirmed a number of risk loci for RA. Replication studies of known loci across various ethnic groups can indicate future directions by the identification of population-specific RA susceptibility loci/genes. Stahl, et al18 conducted a metaanalysis of 6 GWAS involving a total of 5539 RA cases and 20,169 controls of European descent, followed by replication in 6768 RA cases and 8806 controls (totaling 41,282 samples). They not only confirmed the 24 known RA loci (26 SNP) but also identified 7 new risk loci (genome-wide significance in the combined sample), 4 of which were previously implicated in other autoimmune diseases, while 3 were new RA and autoimmune risk loci.
To our knowledge, the genetic factors responsible for RA in Pakistanis have not been examined before. Pakistanis, along with North Indians, are believed to have considerable white ancestry25,26,27. Thus, we hypothesized that most, if not all, RA loci identified in white or white-derived populations might also be responsible for RA in Pakistanis. To test this hypothesis, we genotyped 33 SNP from 31 loci confirmed among Europeans in a Pakistani case-control and family-based sample.
When we analyzed our case-control and family-based samples separately, we found limited associations in each group (data not shown), most likely because of the relatively small sample size of each group. Therefore, to achieve a larger effective sample size and increase power, we combined the 2 groups (unrelated case-control and family-based samples) and analyzed the data using the family case-control (FamCC) method24. Unlike other statistical approaches, FamCC is a unified method that relies on principal components generated from genotype data, using both unrelated case-control and family-based data simultaneously while correcting for population stratification24,28. This method revealed replication of 12 of the 33 SNP examined, with p values ranging from 7.05E-06 to 3.72E-02. Because associations in these 12 genes (KIF5A-PIP4K2C, CD2-CD58, TNFAIP3, HLA-DRB1, STAT4, IRF5, REL, ANKRD55-IL6ST, TRAF6, TAGAP, C5orf30, BLK) have been reported in multiple studies, it is acceptable to consider p < 0.05 statistically significant in followup studies29. In addition to these 12 SNP, several others showed the same trend of association, albeit not significant at the 5% level.
Recently, Prasad, et al30 performed a replication study of RA loci identified in whites in about 2000 North Indian RA cases and controls, and reported replication of only 7 loci (7 SNP) of the 32 loci (35 SNP) tested, by using either index or surrogate SNP. They were able to identify additional associations with 10 more loci after including additional SNP in the analysis. A comparison of the replication of the 31 loci (33 SNP) examined in our study versus Prasad's study is given in Table 4. Of the 31 loci examined in our study, 29 were also tested by Prasad, et al30, using either the index SNP, a surrogate SNP, or additional SNP in a given gene region. Based on this comparison, 17 of the 29 regions (59%) were replicated in the North Indian sample, higher than the 39% replication rate (12/31) in our study, probably because additional SNP at certain loci were examined in the North Indian study and their sample size was larger than ours. Although many of the most closely associated signals in the North Indian sample are different from the index SNP reported in whites, the presence of other significant SNP in the same regions suggests the relevance of these gene regions in RA in multiple ethnicities. This is further supported by a recent multiancestry comparative analysis of GWAS data, in which 22 of the 40 European SNP (55%) were replicated in Japanese31. These data show that many of the RA loci identified to date among European or European-derived populations also confer RA risk in other population groups, including Pakistanis. It is also very likely that Pakistanis have novel RA variants and/or loci, which could be detected only by whole-genome studies.
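The replication rates compared above follow directly from the counts given in the text, and a short sketch makes the arithmetic explicit. The function name replication_rate and the dictionary labels are hypothetical; the counts (17/29, 12/31, 22/40) are those stated in the paragraph.

```python
def replication_rate(replicated, tested):
    """Percentage of tested loci or SNP that were replicated."""
    return 100.0 * replicated / tested

# Counts as stated in the text above.
rates = {
    "North Indian regions": replication_rate(17, 29),
    "Pakistani loci (this study)": replication_rate(12, 31),
    "Japanese SNP": replication_rate(22, 40),
}

for label, rate in rates.items():
    print(f"{label}: {rate:.1f}%")
```

These rates are not strictly comparable, because the denominators differ in kind (gene regions, loci, and individual SNP) and the studies differ in sample size.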
We have replicated 12 of the 31 European RA loci in Pakistanis, and our findings provide evidence that diverse populations share genetic susceptibility to RA. A limitation of our study was the relatively small size of our RA sample, and future studies using larger samples are needed to determine the genetic basis of RA in the Pakistani population. Despite the small sample size, however, we were able to replicate a number of reported significant signals in this unique population, which previously has not been characterized regarding genetic susceptibility to RA.
ACKNOWLEDGMENT
We thank all the physicians and rheumatologists for referring patients, and their supporting staff for collection of blood samples. We thank all the patients and family members for their participation and cooperation. We also acknowledge the control individuals who provided blood samples. We appreciate Dr. X. Zhu and Dr. T. Feng for their kindness in providing the FamCC software.
Supported by the Higher Education Commission (HEC) of Pakistan and by US National Institutes of Health grant HL092397.
- Accepted for publication December 4, 2012.