Abstract
Objective. To investigate the association of specific amino acid positions, residues, and haplotypes of HLA-DRB1 in black South Africans with autoantibody-positive rheumatoid arthritis (RA).
Methods. High-resolution HLA-DRB1 genotyping was performed in 266 black South Africans with autoantibody-positive RA and 362 ethnically and geographically matched controls. The alleles were converted to specific amino acid residues at polymorphic sites for downstream analyses. Logistic regression models were used to test whether variability at a site, specific amino acid residues, and haplotypes (constructed from positions 11, 71, and 74) were associated with RA.
Results. Of the 29 amino acid positions examined, positions 11, 13, and 33 (permutation p = 3.4e-26, 1.2e-27, and 2.1e-28, respectively) showed the strongest association with RA. Univariate analyses of individual amino acid residues showed valine at position 11 (OR 5.1, 95% CI 3.7–7.0) and histidine at position 13 (OR 6.1, 95% CI 4.2–8.6) conferred the highest risk. Among the haplotypes of positions 11, 71, and 74, the valine-containing haplotype V_K_A conferred the most risk (OR 4.52, 95% CI 2.68–7.61); conversely, the haplotype with serine at position 11, S_K_R, conferred the most protection (OR 0.83, 95% CI 0.61–1.15).
Conclusion. Autoantibody-positive RA in black South Africans is associated with histidine at position 13 and valine at position 11 of HLA-DRB1, and haplotypes with valine at position 11 conferred the highest risk; conversely, serine at position 11 conveyed protection.
The genetic association of rheumatoid arthritis (RA) with HLA-DRB1 alleles that include the shared epitope (SE), a conserved sequence of amino acids (QKRAA, QRRAA, or RRRAA) at positions 70–74 of DRB1, is well established in most populations1,2, including black South Africans3,4.
About 90% of black South Africans with RA carry at least 1 allele bearing the SE motif5. Moreover, in a genome-wide association study using the Immunochip single-nucleotide polymorphism (SNP) array, the HLA region showed the strongest association with RA in this population6. These findings contrast with studies in other populations of African ancestry, where the frequency of the SE in RA is much lower: 42% in African Americans7 and only 30% in West Africans from Cameroon8.
Using a more refined and novel approach to study the role of the MHC region in RA, Raychaudhuri, et al found that just 5 amino acids in 3 HLA proteins (HLA-DRB1, HLA-DPB1, and HLA-B) explain most of the genetic association in anticitrullinated protein antibody (ACPA)–positive white patients with RA9. The strongest association was with position 11 of HLA-DRB1; lesser associations were found with the previously associated SE positions 71 and 74. Certain haplotypes of amino acid residues at positions 11, 71, and 74 were shown to be associated with risk or protection. Specifically, haplotypes with a valine residue at position 11 were associated with a 4-fold increased risk of RA, and conversely, haplotypes with a serine residue at this position showed a reduced risk9.
In African Americans, as in whites, valine at amino acid position 11 of DRB1 conferred the strongest risk for RA10; in addition, an association with position 57 was observed, but no association was identified with positions 71 and 74.
In the absence of studies on amino acid substitutions in sub-Saharan African populations, we sought to determine the association of specific amino acid positions, residues, and haplotypes in the HLA-DRB1 region in black South Africans with antibody-positive RA.
MATERIALS AND METHODS
Patients with RA fulfilling the 1987 American College of Rheumatology classification criteria for RA, ≥ 18 years old at disease onset, and who were rheumatoid factor (RF)–positive and/or ACPA+ (n = 266) were recruited from the Rheumatology Clinic, Chris Hani Baragwanath Academic Hospital, Soweto, South Africa. The control participants (n = 362) were ethnically and geographically matched and consisted of either hospital staff members or patients presenting to the Accident and Emergency Department for minor trauma, but with no history of joint symptoms or autoimmune diseases. Black ethnicity was defined on the basis of participants self-reporting all 4 grandparents as being black South Africans. Written consent was obtained from all participants. The study was approved by the Human Research Ethics Committee (Medical) of the University of the Witwatersrand (M10707).
RF (composite IgM, IgG, IgA) was assayed by nephelometry (Siemens Healthcare Diagnostics, BN Prospec Nephelometer). The ACPA status was determined using the anticyclic citrullinated peptide antibodies (anti-CCP2) immunofluorimetric assay (Phadia AB). The tests were considered positive when the values were > 15 IU/ml for the RF test and > 10 U/ml for the anti-CCP2 test.
HLA-DRB1 genotyping and conversion of classical alleles to amino acid residues
Four-digit high-resolution HLA typing was performed by DNA sequencing of exon 2, using the AlleleSEQR HLA-DRB1 reagent kit and protocol (Atria Genetics) as previously described6. The sequences were analyzed using Assign software (Conexio Genomics), which enables assignment of genotypes based on a library file of HLA-DRB1 alleles11. This method detects all of the SE-positive alleles.
The SNP2HLA tool was used to impute the amino acid polymorphisms using a reference set of HLA alleles in the context of adjacent SNPs genotyped directly on the Immunochip array. Phased and imputed haplotypes generated using BEAGLE, implemented within the SNP2HLA program, were used to assign the amino acid residues at positions 11, 71, and 7412.
Statistical analysis
To determine whether a specific amino acid position containing “n” alternate residues was significantly associated with RA, logistic regression models were fit with “n-1” indicator variables for the presence/absence of particular amino acid residues. This was repeated for all positions with polymorphic residues (e.g., valine, serine, and leucine). To determine whether additional variation was explained by other positions after accounting for the most statistically significant positions (e.g., 11, 13, 33), conditional logistic regression was used. In these models, the p value for the test of whether a given position explained additional variation was obtained by comparing 2 models: (1) a base model with only the residues of the most significant position, and (2) the base model plus the additional residues from the position in question. The p value was determined by treating the difference in residual deviance between the 2 models as chi-squared distributed, with degrees of freedom equal to the difference in the number of residues between the 2 models. We also performed a sensitivity analysis using a dummy variable for sex (coded as 0, 1: male, female) for the positions (11, 13) corresponding to HLA-DRB1 allele *04:01, which was the only allele showing indications of population structure.
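The nested-model comparison described above can be sketched in a few lines. This is a minimal illustration and not the authors' analysis code: the function names `chi2_sf` and `lrt_p_value` are hypothetical, and the deviance values in the usage line are placeholders rather than results from this study. The upper-tail chi-squared probability is built from the closed forms for 1 and 2 degrees of freedom using the standard recurrence for the regularized incomplete gamma function.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Closed-form starting points: df = 1 (a = 1/2) or df = 2 (a = 1).
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    # Climb to a = df / 2 with Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def lrt_p_value(deviance_base, deviance_full, extra_params):
    """Compare nested models: the drop in residual deviance is referred to a
    chi-squared distribution with df = number of extra parameters."""
    statistic = deviance_base - deviance_full
    return statistic, chi2_sf(statistic, extra_params)


# Hypothetical deviances (NOT from the study): base model vs. base + 3 extra residues.
statistic, p_value = lrt_p_value(850.0, 838.2, 3)
```

In practice the two deviances would come from the fitted logistic models; only the comparison step is shown here.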
Logistic regression was used to fit the RA case-control data to the haplotype information. The log likelihood ratio test was applied to assess the signal from the haplotypes. Because 12 haplotypes were observed for positions 11, 71, and 74, a conservative correction for type 1 error was applied, so that only p values < 0.004 were considered statistically significant. Univariate logistic models were also fit to the data to test for frequency differences between cases and controls.
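The threshold of 0.004 is consistent with a Bonferroni-type correction for 12 tests. The sketch below assumes a nominal significance level of 0.05, which the text does not state explicitly, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def is_significant(p_value, alpha=0.05, n_tests=12):
    """True if p_value passes the corrected threshold (alpha = 0.05 is an assumption)."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# 12 haplotypes were observed for positions 11, 71, and 74.
threshold = bonferroni_threshold(0.05, 12)
```

Under this assumption the threshold is 0.05/12, or about 0.0042, which rounds to the reported cutoff of 0.004.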
RESULTS
Most patients were female (89.7%) with a mean (± SD) age of 55 (10.8) years and established disease with a mean (± SD) symptom duration of 7.2 (8.9) years. All patients were antibody-positive, either for RF (n = 234/248) or ACPA (n = 195/217). Not all patients were tested for both autoantibodies.
Twenty-eight different 4-digit HLA-DRB1 alleles were identified in the controls and 26 in the cases, of which 17 alleles occurred at low frequencies (< 0.05). Compared to the control group, the frequencies of 4 alleles were significantly higher in the patient group (*0401, *0404, *0405, and *1001), while the frequencies of 3 alleles were significantly lower (*1101, *1301, and *1302; Table 1). The only allele with frequency differences observed between males and females was *0401 (χ2 = 8.71, df = 1, p = 0.003).
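As a consistency check on the reported sex difference for *0401, the p value for a chi-squared statistic with 1 degree of freedom can be recovered from the complementary error function. This sketch uses only the reported statistic (8.71); the function name is hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability for a chi-squared statistic with 1 degree of freedom.

    A chi-squared variable with 1 df is the square of a standard normal, so
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# Reported statistic for the *0401 frequency difference between males and females.
p_value = chi2_sf_1df(8.71)
```

The result rounds to the reported p = 0.003.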
Of the 29 amino acid positions examined, positions 11, 13, and 33 were the most strongly associated with RA (p = 3.4e-26, 1.2e-27, and 2.1e-28, respectively). Position 71 was less significantly associated (p = 2.6e-05) and there was no significant association with position 74 (p = 0.15). The associations at positions 11 and 13 were adjusted for sex to account for the population structure effects at the *0401 allele, and after adjustment they remained highly significant (p = 1.3e-23 and 1.2e-23, respectively). After conditioning on position 11, the effects of the other positions were diminished (position 13: p = 0.015; position 33: p = 0.0043; Table 2). Conditioning on position 13 rendered all amino acid positions nonsignificant, which indicates that the strongest signal of association with RA in black South Africans arises from positions 11 and 13.
Using proline as a referent amino acid residue, valine at position 11 conferred the strongest risk (OR 5.1, 95% CI 3.74–6.97). Conversely, serine had a lower frequency in the RA group compared to the control group (Figure 1). At position 13, the strongest risk residue was histidine (OR 6.1, 95% CI 4.2–8.6) and the serine residue appeared protective.
The 2 haplotypes constructed based on positions 11, 71, and 74, with a valine residue at position 11 [V_K_A (OR 4.52, CI 2.68–7.61) and V_R_A (OR 4.44, CI 3.01–6.56)], were the most strongly associated with risk, and 3 haplotypes with serine at position 11 conferred protection (Table 3).
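The odds ratios and confidence intervals reported above can be cross-checked against each other. The sketch assumes the intervals are Wald-type, that is, symmetric on the log-odds scale with a 1.96 multiplier; the text does not state how the intervals were computed, so this is an assumption, and the function name is hypothetical. Under that assumption the point estimate is the geometric mean of the interval bounds, and the standard error of the log odds ratio follows from the interval width.

```python
import math


def or_and_se_from_ci(lower, upper, z=1.96):
    """Recover the odds ratio and SE(log OR) from a Wald-type confidence interval.

    Assumes the interval is exp(log_or +/- z * se), i.e. symmetric on the log scale.
    """
    log_or = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return math.exp(log_or), se


# Intervals as reported in the text.
valine_11 = or_and_se_from_ci(3.74, 6.97)   # reported OR 5.1
v_k_a = or_and_se_from_ci(2.68, 7.61)       # reported OR 4.52
v_r_a = or_and_se_from_ci(3.01, 6.56)       # reported OR 4.44
```

The recovered point estimates agree with the reported odds ratios after rounding, so the reported intervals are compatible with the Wald-type assumption.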
Analysis of haplotypes spanning positions 11 and 13 showed that histidine at position 13 is in perfect linkage disequilibrium with valine at position 11, but the valine residue at position 11 may occur with either the histidine residue [haplotype frequency (cases and controls) = 0.15] or the phenylalanine residue (0.038) at position 13.
DISCUSSION
In our present study, we validated several of the previously observed associations of HLA-DRB1 alleles, amino acid positions, residues, and haplotypes with antibody-positive RA in black South Africans. Amino acid positions 11, 13, and 33 were associated with the highest risk for RA. Owing to the close proximity of, and linkage among, the amino acid residue polymorphisms, it was difficult to locate the exact source of the signal. Using conditional tests and permutation p values, we evaluated the evidence that the signal came from a particular site. Conditioning on position 13 completely nullified the effects of position 33. However, conditioning on position 33 left some residual signal at positions 11 and 13. Finally, conditioning on position 11 left some residual signal at positions 13 and 33. These results suggest that the signal from position 33, a biallelic site, probably stems from linkage with the signal coming from positions 13 and 11.
Because variability remains in high-risk allele containing haplotypes at these amino acid positions, covariance between the positions and case-control status remains, even when conditioning on the highly associated position 11. Nevertheless, the residual signal remaining at many positions after conditioning is relatively weak, and conservatively is not statistically significant when considering multiple testing correction (this includes position 70). The strongest signals of association between RA risk and amino acid positions in black South Africans are from 11 and 13, which is consistent with all other populations studied to date.
Unlike the finding in patients with RA who are of white ancestry, we found no association with positions 71 and 74 after conditioning on position 11 (or 13 and 33). We also did not observe an association with position 57, previously reported in African Americans10. Possible reasons for a lack of association include interethnic differences in specific HLA-DRB1 allele frequencies, for example, the lack of association of a white risk allele, DRB1*0408, in our cohort. Association differences between black South Africans and African Americans are not totally unexpected. We previously showed that black South Africans are not only genetically distinct from whites but also from West Africans, who are ancestrally related to African Americans13. There are limited data on 4-digit HLA-DRB1 alleles in other patients with RA of African ancestry. In patients from Cameroon8, just 4 alleles were associated with RA: DRB1*0101, *0102, *0405, and *1001. To our knowledge there are no studies done in North Africans with RA; however, this population is genetically distinct from South Africans. There are also no data on East Africans, who have some genetic similarity to Southern Africans. Finally, our present study is relatively small compared to other studies, and thus underpowered to detect potentially small effect sizes of positions 71, 74, and 57.
As in previous studies, among the residues at position 11, valine conferred the highest risk for RA, when analyzed either in terms of individual amino acid residues at position 11 or as part of the position 11, 71, 74 haplotypes. Conversely, serine at position 11 or as part of the haplotype was protective. It should be noted that the absolute frequencies of valine at position 11 and valine-containing haplotypes were lower in both our patients and controls compared to those reported in whites, whereas the frequencies of serine at position 11 and serine-containing haplotypes were more common in our cohort. These differences are due to the lower frequency of HLA-DRB1*0401, *0404, *0405, *0408, and *1001, which have valine at position 11 and the 2-fold higher frequency of HLA-DRB1 *1301 and *1302 alleles, which have serine at position 11 in black South Africans.
Among the residues at position 13, histidine conferred a somewhat higher risk of RA (OR 6.1) than valine at position 11 (OR 5.1). The protective effect of serine was similar at amino acid positions 11 and 13 (Figure 2). Mechanistically, the role of positions 11 and 13 and the specific amino acid residues at these positions is not clear. However, position 11 is located in the peptide-binding region of the HLA class II molecule, suggesting a likely role in antigen presentation. It is the only position with variable residues in pocket 6 of the β chain. The increased risk associated with valine at position 11 is thought to be related to its hydrophobic, nonpolar nature, in contrast to serine, which is hydrophilic and highly polar14. Moreover, one study suggests that this interaction is mediated by ACPA-positive status15.
The importance of valine at position 11 has been further elucidated in studies showing worse radiographic16 and nonradiographic outcomes in patients with RA carrying this residue. Worse radiographic damage, higher all-cause mortality, and poorer response to tumor necrosis factor inhibitor therapy are associated with valine at position 11 of HLA-DRB117. In addition, there is an association with higher swollen joint counts and C-reactive protein levels15.
One of the limitations of our present study was the relatively small sample size. Although adequate to detect the large effects of amino acid positions 11 and 13, the sample size may not have been powered to detect the smaller effects of other amino acid positions.
Our work provides further evidence of the role of the HLA class II region in genetic susceptibility to RA in black South Africans, and more specifically of amino acid positions 11 and 13 of HLA-DRB1. Further studies are needed to validate these findings, to characterize the antigen(s) that bind to these sites, and to determine the role of the amino acid residues in the response to traditional and biologic disease-modifying antirheumatic drugs.
Footnotes
This work was made possible by grants from Carnegie Corporation of New York (B8749), the Connective Tissue Diseases Fund, the University of the Witwatersrand, and the Medical Research Council of South Africa to Dr. Tikly. Dr. Ramsay received financial support from the National Research Foundation of South Africa. Dr. Reynolds was supported by US National Institutes of Health grant K01 AR060848.
- Accepted for publication July 20, 2018.