Abstract
Objective The Māori and Pacific (Polynesian) population of Aotearoa New Zealand has a high prevalence of gout. Our aim was to identify potentially functional missense genetic variants in candidate inflammatory genes amplified in frequency that may underlie the increased prevalence of gout in Polynesian populations.
Methods A list of 712 inflammatory disease-related genes was generated. An in silico targeted exome set was extracted from whole genome sequencing data in people with gout of various ancestral groups (Polynesian, European, East Asian; n = 55, 780, 135, respectively) to identify Polynesian-amplified common missense variants (minor allele frequency > 0.05). Candidate functional variants were tested for association with gout by multivariable-adjusted regression analysis in 2528 individuals of Polynesian ancestry.
Results We identified 26 variants common in the Polynesian population and uncommon in the European and East Asian populations. Three of the 26 population-amplified variants were nominally associated with the risk of gout (rs1635712 [KIAA0319], ORmeta = 1.28, Pmeta = 0.03; rs16869924 [CLNK], ORmeta = 1.37, Pmeta = 0.002; rs2070025 [fibrinogen A alpha chain (FGA)], ORmeta = 1.34, Pmeta = 0.02). The CLNK variant, within the established SLC2A9 gout locus, was genetically independent of the association signal at SLC2A9.
Conclusion We provide nominal evidence for the existence of population-amplified genetic variants conferring risk of gout in Polynesian populations. Polymorphisms in CLNK have previously been associated with gout in other populations, supporting our evidence for the association of this gene with gout.
- association
- gene
- gout
- Māori
- Pacific
- Polynesian
Gout is an inflammatory arthritis triggered by an immune response to monosodium urate (MSU) crystals inside the joints when circulating urate levels are elevated (hyperuricemia). In New Zealand in 2016, the prevalence of gout was 2- to 3.5-fold higher in Māori (8%) and Pacific (14%) people than in those of non-Māori and non-Pacific ancestry (4%).1 While structural inequities contribute to the increased prevalence,2 research on population-specific genetic variants associated with gout is important to provide insights into the disparities between different populations.
Candidate gene studies and genome-wide association studies (GWAS) have identified hundreds of variants associated with urate levels and gout.3,4,5,6,7,8 Many of the identified loci include genes encoding renal and gut urate transporters, such as SLC2A9, ABCG2, SLC17A1, SLC22A11, and SLC22A12, that affect the risk of gout through regulation of excretion of urate.9 However, although hyperuricemia is essential in the pathogenesis of gout, it is not sufficient for the development of clinical gout.10 Activation of the NLRP3 inflammasome is crucial for the production of interleukin (IL)-1β in response to MSU crystals and the triggering of the inflammatory cascade. There is a need to understand the genetic factors influencing the development of gout in the presence of hyperuricemia; however, research progress has been limited. Several genes potentially involved in the progression from hyperuricemia to gout have been identified, including variants in PPARGC1B, CARD8, IL-1β, CD14, TLR4, APOA1, APOC3, and P2RX7.11,12,13,14,15,16
There is increasing evidence for the existence of population-specific variants for gout. For example, GWAS have found 3 East Asian–specific signals on chromosome 11 (SLC22A9, PLA2G16, AIP),3 ABCG2 Q126X in the Japanese population,17 IL-37 N182S and ABCC4 P1036L variants in Polynesian populations,18,19 and a Polynesian-specific mitochondrial haplogroup associated with gout.20 Combined with the existence of Polynesian-specific variants in the CREBRF and CETP genes that associate with other metabolic phenotypes,21,22 the identification of further Polynesian-specific variants in candidate gout genes may yield new insights into the pathogenesis of gout in Polynesian populations.
In this study, we focused on genes implicated in inflammatory diseases and identified 26 missense variants common in Polynesian populations and uncommon in the European and East Asian populations from whole genome sequencing (WGS) data. These variants were tested for association with gout in the Māori and Pacific populations of Aotearoa New Zealand.
METHODS
Study cohorts. Demographic and biochemical characteristics of the participants are summarized in Table 1. Gout cases were diagnosed according to the 1977 American Rheumatism Association preliminary criteria.23 WGS data of people with gout (Polynesian, European, East Asian; n = 55, 780, 135, respectively) from clinical trials for urate-lowering therapy24,25 and a gout and related diseases study21 were included. Follow-up association analysis was done in a cohort genome-wide genotyped on the Human CoreExome v24 (Illumina) array.21 For the East Polynesian (EP; New Zealand Māori, Cook Island Māori) group, 800 people with gout and 810 without gout participated, and for the West Polynesian (WP; Samoa, Tonga, Tuvalu, Tokelau, Niue) group, 481 people with gout and 437 without gout participated. A separate Māori sample set from the rohe (area) of the Ngāti Porou iwi (tribe) of the Tairāwhiti (East Coast, the North Island of New Zealand) region was also included in the EP group. This sample set was recruited in collaboration with the Ngāti Porou Hauora (Health Service) Charitable Trust. Because of genetic differences between the Eastern and Western Polynesian populations,21,26 the EP and WP groups were analyzed separately.
All participants provided written informed consent and ethical approval was obtained from the New Zealand Multiregional Ethics Committee (MEC05/10/130), the Northern Y Region Health Research Ethics Committee (Ngāti Porou Hauora Charitable Trust study; NTY07/07/074), and, in the US, the Schulman Central Institutional Review Board (201102890).
Identification of candidate genes. GWAS for immune-mediated conditions with an autoinflammatory aspect such as inflammatory bowel disease (Crohn disease and ulcerative colitis), systemic lupus erythematosus (SLE), systemic sclerosis, rheumatoid arthritis (RA), ankylosing spondylitis, and gout in European and East Asian populations were sourced from the literature prior to April 30, 2019 (Supplementary Material, available from the authors on request). From the various GWAS, genetically associated loci were defined using either the summary statistics, or reported lead single-nucleotide polymorphisms (SNPs). If summary statistics were available, the less significant variants (P > 0.0001) were filtered out and associated regions on each chromosome were clustered using a density-based clustering algorithm in R from the package dbscan.27 Each region was scored and ranked based on the minimum P value within the region, and the top 20 regions per GWAS were selected. Candidate genes selected were in the ± 100 kb region from the lead SNP. Four hundred fifty-one genes were identified from the GWAS approach.
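The region-definition steps described above (filter on P, cluster nearby hits, rank regions by minimum P, take a window around the lead SNP) can be sketched as follows. This is a minimal illustration only: a simple gap-based grouping stands in for the density-based algorithm from the R package dbscan, and the function names and the `max_gap` value are hypothetical, not taken from the study.

```python
from collections import defaultdict


def _summarise(chrom, hits):
    """Collapse a run of (pos, p) hits into one region record."""
    lead_pos, min_p = min(hits, key=lambda h: h[1])
    return {
        "chrom": chrom,
        "start": hits[0][0],
        "end": hits[-1][0],
        "lead_pos": lead_pos,
        "min_p": min_p,
    }


def cluster_regions(variants, p_max=1e-4, max_gap=250_000, top_n=20):
    """Group GWAS hits into regions and return the top-ranked ones.

    variants: iterable of (chrom, pos, p) tuples.
    p_max:    variants with P above this are filtered out (0.0001 in the text).
    max_gap:  hypothetical distance (bp) below which neighbours are merged;
              a stand-in for density-based clustering, not the study's setting.
    top_n:    number of regions kept per GWAS (20 in the text).
    """
    by_chrom = defaultdict(list)
    for chrom, pos, p in variants:
        if p <= p_max:
            by_chrom[chrom].append((pos, p))

    regions = []
    for chrom, hits in by_chrom.items():
        hits.sort()
        current = [hits[0]]
        for pos, p in hits[1:]:
            if pos - current[-1][0] <= max_gap:
                current.append((pos, p))
            else:
                regions.append(_summarise(chrom, current))
                current = [(pos, p)]
        regions.append(_summarise(chrom, current))

    # Rank regions by their minimum P value, most significant first.
    regions.sort(key=lambda r: r["min_p"])
    return regions[:top_n]


def candidate_window(region, flank=100_000):
    """Return the +/- 100 kb window around the lead SNP of a region."""
    start = max(0, region["lead_pos"] - flank)
    return (region["chrom"], start, region["lead_pos"] + flank)
```

Genes overlapping each returned window would then be collected as candidates; that lookup needs a gene annotation file and is left out of the sketch.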
Three hundred ninety-two genes were identified from a literature search using keywords “inflammation genes” or “genes involved inflammatory pathways” in PubMed, Google Scholar, and ScienceDirect. Genes linked to monogenic disorders with an innate immune basis were also included. There were 712 genes remaining after removing duplicated genes within the combined GWAS and literature search list (Supplementary Material, available from the authors on request).
WGS and in silico extraction. Genomes were sequenced at the Garvan Institute of Medical Research, Kinghorn Cancer Centre, New South Wales, Australia. The platform used was 30x WGS (TruSeq Nano) v2.5. The FASTQ format files of sequences were aligned to the human reference genome (GRCh37) and variant called following an implementation of the Genome Analysis ToolKit (GATK) best practices28 using the Burrows-Wheeler Aligner,29 Picardtools (https://broadinstitute.github.io/picard), and GATK v3.6.0.30
Only missense, nonsense, and splicing variants and insertions and/or deletions (indels) within the candidate genes that passed standard quality control criteria (mapping quality score > 20; heterozygosity > 0.001; indel heterozygosity = 1.25 × 10–4; genotype quality > 10; mapping quality > 30; and those that had high coverage of reads and genotype quality [> 90]) were included for further analysis. Genetic variations found in the discovery phase were prioritized based on those that were common (minor allele frequency [MAF] > 0.05) in the Polynesian sample set but uncommon (MAF < 0.01) in the European and East Asian sample sets. The ANNOVAR tool was used to annotate the genes and protein structural consequence of the variants.31 Only missense variants also present on the Human CoreExome v24 (Illumina) array were included in the association analysis.
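To make the frequency filter concrete, the sketch below applies the stated thresholds (MAF > 0.05 in the Polynesian set, MAF < 0.01 in the European and East Asian sets) to allele counts. It assumes diploid autosomal genotypes with no missing calls, so that each group contributes 2n alleles; the function names and the example counts are hypothetical.

```python
def allele_frequency(allele_count, n_individuals):
    """Frequency of an allele given its copy number among n diploid people."""
    return allele_count / (2 * n_individuals)


def is_population_amplified(counts, sizes, target="Polynesian",
                            common=0.05, uncommon=0.01):
    """True if the allele is common in `target` and uncommon in all other groups.

    counts: allele copies observed per group (hypothetical input format).
    sizes:  number of sequenced individuals per group.
    """
    freqs = {g: allele_frequency(counts[g], sizes[g]) for g in sizes}
    target_ok = freqs[target] > common
    others_ok = all(f < uncommon for g, f in freqs.items() if g != target)
    return target_ok and others_ok


# Sample sizes of the WGS discovery sets reported in the text.
sizes = {"Polynesian": 55, "European": 780, "East Asian": 135}

# Hypothetical allele counts sitting just inside the thresholds.
example = {"Polynesian": 6, "European": 15, "East Asian": 2}
print(is_population_amplified(example, sizes))
```

Under these assumptions, the thresholds translate into at least 6 copies among the 110 Polynesian chromosomes, at most 15 among the 1560 European chromosomes, and at most 2 among the 270 East Asian chromosomes.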
Statistical analysis. Multivariable-adjusted logistic regression analysis and linear regression analysis were used to test for association with gout and serum urate concentration (in controls), respectively, using the statistical software PLINK version 1.9.32 Allelic ORs and β (mmol/L) were both calculated for each SNP by adjusting for age, sex, and principal components (derived from the genome-wide genotype data as previously described)21 in the regression models. Individuals with missing data for any variable were excluded. A P value ≤ 0.05 was set for nominal statistical significance. The dataset was not sufficiently powered to correct for multiple testing (n = 26 variants, separate EP and WP analyses) and, in the absence of an available dataset for replication, the association evidence presented here can only be considered as nominal with external information required for validation (as we were able to do for the CLNK locus). Inverse variance-weighted fixed-effect metaanalysis was performed using the software METAL33 to combine independent sample sets. Heterogeneity between combined sample sets was determined through the Q statistic, and for variants with significant evidence for heterogeneity (PHet < 0.05), the fixed-effect model was replaced with a random-effect model. A LocusZoom plot including calculations of linkage disequilibrium (LD, measured in r2)34 was produced using the entire Polynesian sample set. The Combined Annotation-Dependent Depletion (CADD) score of the significant SNPs was sourced.35 CADD scores represent rankings compared to other protein-coding variants within the entire genome. For example, a CADD score of ≥ 10 indicates that the variant is predicted to be among the 10% most deleterious variants in the genome, whereas a score of ≥ 20 indicates that it is predicted to be among the 1% most deleterious.
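The inverse variance-weighted fixed-effect combination and the Q statistic described above follow standard formulas, sketched here for two sample sets such as EP and WP. This is not the METAL implementation; the function name is hypothetical, the inputs are per-cohort log odds ratios (or β) with their standard errors, and the heterogeneity P value is computed only for the two-cohort case (1 degree of freedom).

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse variance-weighted fixed-effect metaanalysis.

    betas: per-cohort effect estimates (log OR for gout, beta for urate).
    ses:   per-cohort standard errors of those estimates.
    """
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)

    # Two-sided P value from the normal approximation.
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))

    result = {"beta": beta, "se": se, "or": math.exp(beta), "p": p, "q": q}
    if len(betas) == 2:
        # With two cohorts Q has 1 degree of freedom, so the chi-square
        # tail probability reduces to a normal tail.
        result["p_het"] = math.erfc(math.sqrt(q / 2))
    return result


# Hypothetical per-cohort estimates, for illustration only.
print(fixed_effect_meta(betas=[0.35, 0.25], ses=[0.12, 0.18]))
```

When `p_het` falls below 0.05, the text indicates that a random-effect model was used instead; that step is not shown here.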
RESULTS
The study design is summarized in Figure 1. A total of 26 missense variants were extracted from the Polynesian, European, and East Asian WGS data within the exome of the 712 candidate genes and filtered according to being common (MAF > 0.05) in the Polynesian population but uncommon (MAF < 0.01) in the other populations (Table 2).
Metaanalysis of the Polynesian EP and WP subgroups supported association of rs1635712 (KIAA0319), rs16869924 (CLNK, cytokine-dependent hematopoietic cell linker), and rs2070025 (fibrinogen A α chain [FGA]) with the risk of gout (OR 1.28 [P = 0.03], OR 1.37 [P = 0.002], and OR 1.34 [P = 0.02], respectively; Table 3, Figure 2). KIAA0319 and CLNK were included in our study as GWAS loci, associated with SLE and RA, respectively (Supplementary Material, available from the authors on request). FGA was included in our study from the literature search as a candidate gene for inflammation (Supplementary Material). The variant in KIAA0319 (rs1635712) had a derived T allele frequency of 11.4% in EP and 14.4% in WP controls (Table 3). For rs16869924 (CLNK) the derived C allele frequency was 25.0% in EP and 11.6% in WP controls (Table 3). For rs2070025 (FGA) the frequency of the derived C allele was 13.0% in EP and 4.1% in WP controls (Table 3). The CADD score was 4.8 for rs16869924, 7.4 for rs1635712, and 14.6 for rs2070025.
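To help interpret the CADD scores reported above, the sketch below converts a scaled score into the approximate fraction of variants ranked at least as deleterious. It assumes the scores are on CADD's PHRED-like scale, where the score equals −10 × log10 of the rank fraction; the function name is hypothetical.

```python
def cadd_top_fraction(phred_score):
    """Approximate fraction of variants ranked at least as deleterious.

    Assumes a PHRED-like scaled score: score = -10 * log10(rank fraction).
    """
    return 10 ** (-phred_score / 10)


# Scores reported for the three gout-associated variants.
for rsid, score in [("rs16869924", 4.8), ("rs1635712", 7.4), ("rs2070025", 14.6)]:
    print(rsid, f"{cadd_top_fraction(score):.1%}")
```

On that scale, only the FGA variant (score 14.6) passes the ≥ 10 threshold, placing it in roughly the top 3.5% of ranked variants, whereas the KIAA0319 and CLNK variants fall in roughly the top 18% and 33%, respectively.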
Rs16869924 (CLNK) is located 526 kb downstream from SLC2A9, a locus with strong association with gout in Polynesian populations.7 Therefore, we tested for the possibility of linkage disequilibrium between rs16869924 and gout-associated variants in SLC2A9 (Figure 3), revealing rs16869924 not to be in LD with the maximal gout-associated variant of SLC2A9 in Polynesian populations (rs3733591, r2 = 0.05). Linear regression analysis using a random effects model showed that rs16869924 was not associated with serum urate concentrations (βmeta 0.0055 mmol/L, 95% CI –0.0063 to 0.0174, P = 0.36). Similarly, neither rs1635712 (βmeta 0.0098 mmol/L, 95% CI –0.0057 to 0.0253, P = 0.13) nor rs2070025 (βmeta –0.0024 mmol/L, 95% CI –0.0181 to 0.0134) was significantly associated with serum urate levels. Further, we included genotype at the lead gout-associated variant rs3733591 (Figure 3; previously reported as associated with gout in Polynesian36 and other8 populations) as a covariate in the logistic regression model testing for association with gout. Even with this adjustment, rs16869924 remained associated with gout (OR 1.37, P = 1.5 × 10–4), demonstrating that rs16869924 is associated with gout independent of genotype at SLC2A9.
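The LD measure r2 used above can be estimated in several ways. One common estimator is the squared Pearson correlation between genotype dosages (0, 1, or 2 copies of an allele) at the two variants, sketched below. Whether this matches the calculation behind Figure 3 is an assumption, and the function name and example data are hypothetical.

```python
def ld_r2(dosages_a, dosages_b):
    """Squared Pearson correlation between two variants' genotype dosages.

    Each list holds 0/1/2 allele counts per individual, in the same order;
    None marks a missing genotype and that individual is skipped.
    """
    pairs = [(a, b) for a, b in zip(dosages_a, dosages_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two individuals with both genotypes")

    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic variant")
    return cov * cov / (var_a * var_b)


# Hypothetical dosages for six individuals at two variants.
print(ld_r2([0, 1, 2, 1, 0, 2], [0, 1, 1, 2, 0, None]))
```

A value near 0, such as the r2 = 0.05 reported between rs16869924 and rs3733591, means genotype at one variant carries little information about genotype at the other, which is why the conditional regression is a useful complementary check.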
DISCUSSION
We carried out in silico resequencing using a targeted panel of candidate genes with subsequent analysis for association with gout in the Aotearoa New Zealand Polynesian population. The derived alleles of missense variants p.Gly243Asp (rs1635712, KIAA0319), p.Ser65Gly (rs16869924, CLNK), and p.Ile6Val (rs2070025, FGA) were associated with increased risk of gout. The 3 variants can be regarded as candidate etiological functional variants, requiring validation by replication, fine-mapping, and experimental evidence, although prior evidence (below) raises the likelihood that genetic variation in CLNK is causal for gout.
Previously, several articles have reported CLNK gene variants to be associated with gout or serum urate.8,37,38 In a Tibetan sample set including 315 people with gout there was nominal evidence for association with gout of intronic variants rs10033825 and rs17467273,37 and in a Chinese sample set including 145 people with gout there was nominal evidence for association with gout of intronic variants rs2041215 and rs1686947.38 In our study, the CLNK Ser65Gly-derived allele was associated with increased risk of gout (ORmeta 1.37, Pmeta = 0.0017) but not with serum urate concentrations (βmeta 0.0055 mmol/L, P = 0.36). The variant rs16869924 (CLNK) was associated with gout independent of genotype at SLC2A9 (Figure 3), indicating that it represents a separate risk signal for gout within the broader SLC2A9 locus. Combined with previous studies, our data support CLNK as a likely causal gene for gout. Given that rs16869924 was the maximally associated variant within the CLNK gene, this variant represents a candidate causal variant for gout. The CLNK gene is a member of the SLP76 (lymphocyte cytosolic protein 2) family of adaptors, which is expressed in several cell types, including T cells, natural killer cells, and mast cells. Its expression appears to be dependent on sustained exposure to cytokines such as IL-2 and IL-3, and it plays a role in regulating immunoreceptor signaling.39,40 CLNK may directly or indirectly affect gout development through the signal transducer and activator of transcription pathway and the concentration of circulating immune complexes in the blood.41,42 The role of CLNK in immune and inflammatory processes, together with previous reports of association of the gene with gout, suggests that CLNK is of particular interest in gout pathogenesis in the New Zealand Polynesian population.
We cannot identify any previous studies where variants of KIAA0319 and FGA are implicated in gout or hyperuricemia. In the Tin et al transancestral GWAS,8 there were no suggestive (P < 1 × 10–6) signals at the KIAA0319 locus for either serum urate or gout. The KIAA0319 gene has been associated with developmental dyslexia and is involved in neuronal migration.43,44 The deduced KIAA0319 protein contains several polycystic kidney disease domains, which may mediate the interaction between neurons and glial fibers during neuronal migration.45 Mutations in the FGA (fibrinogen A α) gene lead to several disorders, including dysfibrinogenemia, hypofibrinogenemia, afibrinogenemia, and renal amyloidosis.46,47,48
Of the genes harboring nonassociated variants, ALDH16A1, which codes for a member of the aldehyde dehydrogenase superfamily, deserves mention. ALDH16A1 is enzymatically inactive but interacts with a number of proteins,49,50 including the γ subunit of AMPK (the gene encoding this protein [PRKAG2] maps within a locus associated with serum urate levels) and GLUT4, which belongs to the glucose transporter family to which GLUT9 (encoded by SLC2A9) belongs. Further, the derived G allele at p.Pro476Arg (rs150414818) has been predicted to inhibit the interaction between ALDH16A1 and hypoxanthine-guanine phosphoribosyltransferase (HGPRT; encoded by HPRT1), which might contribute to hyperuricemia by disrupting the ability of HGPRT to salvage hypoxanthine from the urate-producing purine degradation pathway.50 A previous study showed the Arg allele of ALDH16A1 (rs150414818) to be a strong risk factor for gout (OR 3.12, P = 1.5 × 10–16) in the Icelandic population.51 In our study, the Polynesian common missense variant rs78635115 (p.Ser178Pro allele of ALDH16A1) was not associated with gout (ORmeta 1.01, P = 0.93; Table 3).
In conclusion, this study identified 3 nominally gout-associated missense genetic variants in the Aotearoa New Zealand Polynesian population. Our data provide further support for CLNK as a risk gene at the SLC2A9 locus, independent of SLC2A9.
ACKNOWLEDGMENT
The authors thank the many participants who so generously donated their genetic samples and other information for the study. The authors would also like to thank Jordyn Allan, Jill Drake, Roddi Laurence, Christopher Franklin, Meaghan House, Ria Akuhata-Brown, Nancy Aupouri, Carol Ford, and Gabrielle Sexton for recruitment, and NeSI (New Zealand eScience Infrastructure) for provision of compute resource.
Footnotes
This research was supported by the Health Research Council of New Zealand (Grant 14/527).
The authors declare no conflict of interest relevant to this article.
- Accepted for publication June 11, 2021.
- Copyright © 2021 by the Journal of Rheumatology