Fine scale human genetic structure in three regions of Cameroon reveals episodic diversifying selection

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Kevin K. Esoh ◽  
Tobias O. Apinjoh ◽  
Steven G. Nyanjom ◽  
Ambroise Wonkam ◽  
Emile R. Chimusa ◽  
...  

Inferences from genetic association studies rely largely on the definition and description of the underlying populations that highlight their genetic similarities and differences. The clustering of human populations into subgroups (population structure) can significantly confound disease associations. This study investigated the fine-scale genetic structure within Cameroon that may underlie disparities observed with Cameroonian ethnicities in malaria genome-wide association studies in sub-Saharan Africa. Genotype data of 1073 individuals from three regions and three ethnic groups in Cameroon were analyzed using measures of genetic proximity to ascertain fine-scale genetic structure. Model-based clustering revealed distinct ancestral proportions among the Bantu, Semi-Bantu and Foulbe ethnic groups, while haplotype-based coancestry estimation revealed possible longstanding and ongoing sympatric differentiation among individuals of the Foulbe ethnic group, and their Bantu and Semi-Bantu counterparts. A genome scan found strong selection signatures in the HLA gene region, confirming longstanding knowledge of natural selection on this genomic region in African populations following immense disease pressure. Signatures of selection were also observed in the HBB gene cluster, a genomic region known to be under strong balancing selection in sub-Saharan Africa due to its co-evolution with malaria. This study further supports the role of evolution in shaping genomes of Cameroonian populations and reveals fine-scale hierarchical structure among and within Cameroonian ethnicities that may impact genetic association studies in the country.

2017 ◽  
Vol 28 (7) ◽  
pp. 1927-1941
Author(s):  
Jiyuan Hu ◽  
Wei Zhang ◽  
Xinmin Li ◽  
Dongdong Pan ◽  
Qizhai Li

In the past decade, genome-wide association studies have identified thousands of susceptibility variants associated with complex human diseases and traits. Conducting follow-up genetic association studies has become a standard approach to validate the findings of genome-wide association studies. One problem of high interest in genetic association studies is to accurately estimate the strength of the association, which is often quantified by odds ratios in case-control studies. However, estimating the association directly from follow-up studies is inefficient since this approach ignores information from the genome-wide association studies. In this article, an estimator called GFcom, which integrates information from genome-wide association studies and follow-up studies, is proposed. The estimator includes both the point estimate and the corresponding confidence interval. GFcom is more efficient than competing estimators in terms of mean squared error (MSE) and the length of confidence intervals. The superiority of GFcom is particularly evident when the genome-wide association study suffers from severe selection bias. Comprehensive simulation studies and applications to three real follow-up studies demonstrate the performance of the proposed estimator. An R package, “GFcom”, implementing our method is publicly available at https://github.com/JiyuanHu/GFcom.
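For readers unfamiliar with the quantity being estimated, the following is a minimal sketch of the textbook odds ratio and its Wald confidence interval from a single 2x2 case-control table. It is not the GFcom estimator, which additionally combines genome-wide and follow-up data; the function name `odds_ratio_ci` and the example counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: cases carrying the risk allele      b: cases not carrying it
    c: controls carrying the risk allele   d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)

    Illustrative only: this is the standard single-study estimate,
    not the combined estimator described in the abstract.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 20/80 carriers/non-carriers among cases,
# 10/90 among controls.
estimate, low, high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {estimate:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

A follow-up study analysed in isolation would report an interval of this kind; the abstract's argument is that such an estimate discards the information already present in the original genome-wide study.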


2014 ◽  
Vol 8 (1) ◽  
pp. 29-42 ◽  
Author(s):  
Jingxiao Jin ◽  
Chou Chou ◽  
Maria Lima ◽  
Danielle Zhou ◽  
Xiaodong Zhou

Systemic sclerosis (SSc) is a fibrotic and autoimmune disease characterized clinically by skin and internal organ fibrosis and vascular damage, and serologically by the presence of circulating autoantibodies. Although etiopathogenesis is not yet well understood, the results of numerous genetic association studies support genetic contributions as an important factor in SSc. In this paper, the major genes of SSc are reviewed. The most recent genome-wide association studies (GWAS) are taken into account along with robust candidate gene studies. The literature search was performed on genetic association studies of SSc in PubMed between January 2000 and March 2014; eligible studies generally had over 600 total participants with replication. A few genetic association studies with related functional changes in SSc patients were also included. A total of forty-seven genes or specific genetic regions were reported to be associated with SSc, although some are controversial. These genes include HLA genes, STAT4, CD247, TBX21, PTPN22, TNFSF4, IL23R, IL2RA, IL-21, SCHIP1/IL12A, CD226, BANK1, C8orf13-BLK, PLD4, TLR-2, NLRP1, ATG5, IRF5, IRF8, TNFAIP3, IRAK1, NFKB1, TNIP1, FAS, MIF, HGF, OPN, IL-6, CXCL8, CCR6, CTGF, ITGAM, CAV1, MECP2, SOX5, JAZF1, DNASE1L3, XRCC1, XRCC4, PXK, CSK, GRB10, NOTCH4, RHOB, KIAA0319, PSD3 and PSORS1C1. These genes encode proteins mainly involved in immune regulation and inflammation, and some of them function in transcription, kinase activity, DNA cleavage and repair. The discovery of various SSc-associated genes is important in understanding the genetics of SSc and the potential pathogenesis that contributes to the development of this disease.


2021 ◽  
Vol 11 (2) ◽  
pp. 145
Author(s):  
Ryan Walsh ◽  
Kirsten Voorhies ◽  
Merry-Lynn McDonald ◽  
Michael McGeachie ◽  
Joanne E. Sordillo ◽  
...  

Genome-wide association studies (GWAS) play a critical role in identifying many loci for common diseases and traits. There has been a rapid increase in the number of GWAS over the past decade. As additional GWAS are being conducted, it is unclear whether a novel signal associated with the trait of interest is independent of single nucleotide polymorphisms (SNPs) in the same region that have been previously associated with the trait of interest. The general approach to determining whether the novel association is independent of previous signals is to examine the association of the novel SNP with the trait of interest conditional on the previously identified SNP and/or calculate linkage disequilibrium (LD) between the two SNPs. However, the role of epistasis and SNP by SNP interactions is rarely considered. Through simulation studies, we examined the role of SNP by SNP interactions when determining the independence of two genetic association signals. We have created an R package on GitHub called gxgRC to generate these simulation studies based on user input. In genetic association studies of asthma, we considered the role of SNP by SNP interactions when determining independence of signals for SNPs in the ARG1 gene and bronchodilator response.
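The LD calculation mentioned above can be illustrated with the standard squared-correlation measure r² computed from haplotype frequencies. This is a minimal sketch, not code from the gxgRC package; the function name `ld_r2` and the example frequencies are hypothetical, and phased haplotype frequencies are assumed to be known.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between alleles at two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1
          and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2

    Assumes phased haplotype frequencies are available.
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    # D is the deviation of the observed haplotype frequency from the
    # value expected if the two alleles were independent.
    d = p_ab - p_a * p_b
    return d * d / denom


# Hypothetical example: alleles always travel together (complete LD).
print(ld_r2(0.5, 0.5, 0.5))   # 1.0
# Hypothetical example: alleles are independent (no LD).
print(ld_r2(0.25, 0.5, 0.5))  # 0.0
```

A low r² between a novel SNP and a known one is commonly read as evidence of independent signals; the abstract's point is that this reading can mislead when the two SNPs interact.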


2021 ◽  
Vol 12 ◽  
Author(s):  
Seung-Soo Kim ◽  
Adam D. Hudgins ◽  
Brenda Gonzalez ◽  
Sofiya Milman ◽  
Nir Barzilai ◽  
...  

The rich data from the genome-wide association studies (GWAS) and phenome-wide association studies (PheWAS) offer an unprecedented opportunity to identify the biological underpinnings of age-related disease (ARD) risk and multimorbidity. Surprisingly, however, a comprehensive list of ARDs remains unavailable due to the lack of a clear definition and selection criteria. We developed a method to identify ARDs and to provide a compendium of ARDs for genetic association studies. Querying 1,358 electronic medical record-derived traits, we first defined ARDs and age-related traits (ARTs) based on their prevalence profiles, requiring a unimodal distribution that shows an increasing prevalence after the age of 40 years, and which reaches a maximum peak at 60 years of age or later. As a result, we identified a list of 463 ARDs and ARTs in the GWAS and PheWAS catalogs. We next translated the ARDs and ARTs to their respective 276 Medical Subject Headings diseases and 45 anatomy terms. The most abundant disease categories are neoplasms (48 terms), cardiovascular diseases (44 terms), and nervous system diseases (27 terms). Employing data from a human symptoms-disease network, we found 6 symptom-shared disease groups, representing cancers, heart diseases, brain diseases, joint diseases, eye diseases, and mixed diseases. Lastly, by overlaying our ARD and ART list with genetic correlation data from the UK Biobank, we found 54 phenotypes in 2 clusters with high genetic correlations. Our compendium of ARD and ART is a highly useful resource, with broad applicability for studies of the genetics of aging, ARD, and multimorbidity.
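The prevalence-profile criterion described above can be sketched as a simple check on a trait's prevalence by age. This is one simplified reading of the abstract, not the authors' implementation: the function name `is_age_related`, the age bins, and the prevalence values are hypothetical, and "unimodal" is approximated as non-decreasing from age 40 up to the peak and non-increasing afterwards.

```python
def is_age_related(ages, prevalence, onset_age=40, peak_age=60):
    """Check a prevalence profile against a simplified ARD criterion.

    ages:        age bins in ascending order (assumption)
    prevalence:  prevalence of the trait in each age bin
    onset_age:   age after which prevalence must rise (40 in the abstract)
    peak_age:    earliest allowed age of the maximum (60 in the abstract)
    """
    if not ages or len(ages) != len(prevalence):
        raise ValueError("ages and prevalence must be non-empty and aligned")
    # Locate the bin with the maximum prevalence.
    peak_idx = max(range(len(prevalence)), key=lambda i: prevalence[i])
    if ages[peak_idx] < peak_age:
        return False
    # First bin at or after the onset age.
    start = next((i for i, a in enumerate(ages) if a >= onset_age), None)
    if start is None or start > peak_idx:
        return False
    # Prevalence must not fall between onset and peak, and must rise overall.
    rising = all(prevalence[i] <= prevalence[i + 1]
                 for i in range(start, peak_idx))
    rises_overall = prevalence[peak_idx] > prevalence[start]
    # Single peak: prevalence must not rise again after the maximum.
    falling = all(prevalence[i] >= prevalence[i + 1]
                  for i in range(peak_idx, len(prevalence) - 1))
    return rising and rises_overall and falling


# Hypothetical profiles over six age bins.
ages = [30, 40, 50, 60, 70, 80]
late_peak = [0.01, 0.02, 0.05, 0.09, 0.12, 0.10]   # peaks at 70
early_peak = [0.01, 0.05, 0.12, 0.08, 0.04, 0.02]  # peaks at 50
print(is_age_related(ages, late_peak))   # True
print(is_age_related(ages, early_peak))  # False
```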


eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
James A Watson ◽  
Carolyne M Ndila ◽  
Sophie Uyoga ◽  
Alexander Macharia ◽  
Gideon Nyutu ◽  
...  

Severe falciparum malaria has substantially affected human evolution. Genetic association studies of patients with clinically defined severe malaria and matched population controls have helped characterise human genetic susceptibility to severe malaria, but phenotypic imprecision compromises discovered associations. In areas of high malaria transmission the diagnosis of severe malaria in young children and, in particular, the distinction from bacterial sepsis, is imprecise. We developed a probabilistic diagnostic model of severe malaria using platelet and white count data. Under this model we re-analysed clinical and genetic data from 2,220 Kenyan children with clinically defined severe malaria and 3,940 population controls, adjusting for phenotype mis-labelling. Our model, validated by the distribution of sickle trait, estimated that approximately one third of cases did not have severe malaria. We propose a data-tilting approach for case-control studies with phenotype mis-labelling and show that this reduces false discovery rates and improves statistical power in genome-wide association studies.


2021 ◽  
Vol 12 ◽  
Author(s):  
Adam Abied ◽  
Abulgasim M. Ahbara ◽  
Haile Berihulay ◽  
Lingyang Xu ◽  
Rabiul Islam ◽  
...  

With climate change bound to affect food and feed production, emphasis will shift to resilient and adapted indigenous livestock to sustain animal production. However, indigenous livestock comprise several varieties, strains and ecotypes whose genomes are poorly characterized. Here, we investigated genomic variation in an African thin-tailed Desert Sheep sampled in Sudan, using 600K genotype data generated from 92 individuals representing five ecotypes. We included data from 18 fat-tailed and 45 thin-tailed sheep from China, to investigate shared ancestry and perform comparative genomic analysis. We observed a clear genomic differentiation between the African thin-tailed Desert Sheep and the Chinese thin-tailed and fat-tailed sheep, suggesting a broad genetic structure between the fat-tailed and thin-tailed sheep in general, and that at least two autosomal gene pools comprise the genome profile of the thin-tailed sheep. Further analysis detected two distinct genetic clusters in both the African thin-tailed Desert Sheep and the Chinese thin-tailed sheep, suggesting a fine-scale and complex genome architecture in thin-tailed sheep. Selection signature analysis suggested that differences in adaptation, production, reproduction and morphology likely underlie the fine-scale genetic structure in the African thin-tailed Desert Sheep. This may need to be considered in designing breeding programs and genome-wide association studies.


Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 2243-2243
Author(s):  
Emily Kawabata ◽  
Samuel Lessard ◽  
Dirk Paul ◽  
Paola G Bronson ◽  
Robert Peters ◽  
...  

Sickle cell disease (SCD) affects more than five million people worldwide, predominantly in sub-Saharan Africa. Hereditary persistence of fetal hemoglobin (HbF) is an uncommon genetic condition in which production of HbF in early life is not suppressed. SCD symptoms are reduced in patients carrying this condition, suggesting that increased HbF levels may be a promising therapeutic strategy to ameliorate the symptoms of SCD. This is exemplified by the HbF-raising compound hydroxyurea, currently the most commonly used US Food and Drug Administration approved drug treatment for SCD. However, hydroxyurea is only effective in ~70% of patients and carries a black box warning for carcinogenicity, hence there is a need for additional SCD treatments. Previous genome-wide association studies (GWAS) have identified four loci robustly associated with HbF levels, including variants in the BCL11A region. These initial genetic discoveries have led to promising ex vivo gene-editing approaches to silence or reduce levels of BCL11A, which are currently being tested in clinical trials. To identify novel potential therapeutic targets to raise HbF levels, we conducted the largest GWAS of HbF levels in ~11,000 healthy blood donors from the INTERVAL study among whom HbF was measured in whole blood using a mass spectrometry approach. We ran linear mixed models accounting for age, sex, blood group, technical effects and 10 principal components of ancestry for 14,910,742 variants either directly genotyped using the Affymetrix UK Biobank array or imputed from a combined 1000 Genomes/UK10K reference panel. In addition to confirming previously reported signals at the BCL11A, HBS1L-MYB and HBB loci, stepwise conditional analysis identified six novel genomic regions at genome-wide significance (p < 5×10⁻⁸). At some loci we identified multiple independent association signals, including two at BCL11A, three at HBS1L-MYB, and two at HBB.
Genetic fine-mapping resolved some loci to highly likely causal variants (posterior probability > 0.5), including a rare (frequency = 0.2%) variant near GRIK2. To identify likely causal genes and mechanisms, we integrated our results with relevant transcriptomic and epigenomic datasets. Preliminary results suggest that a common (frequency = 26%) 29 base-pair indel upstream of CHAC2 is highly likely (posterior probability = 0.77) to be the causal variant at this locus. The variant maps to a site of active chromatin and GATA1 transcription factor binding specifically in erythroblasts, and the insertion allele contains a GATA motif suggesting that this variant regulates HbF levels by the presence or absence of a second GATA1 binding site. Gene expression data across a range of hematopoietic cells revealed a restricted expression pattern of CHAC2 in erythroblasts, but not for other genes in the region, suggesting CHAC2 as the likely causal gene. CHAC2 encodes the glutathione-specific gamma-glutamylcyclotransferase 2, an enzyme that catalyzes the cleavage of glutathione into 5-oxo-L-proline and a Cys-Gly dipeptide. This finding adds support to the idea that altered erythrocyte glutathione levels play a role in SCD pathogenesis, potentially via HbF modulation. Ongoing work includes characterizing the likely causal mechanisms of the six novel loci, and providing target validation of CHAC2 using genome editing and deep phenotyping of erythroid cells. In summary, our expanded GWAS has identified new loci associated with HbF levels, providing novel potential therapeutic targets for SCD. Disclosures: Lessard: Sanofi: Employment. Peters: Sanofi: Employment. Krishnamoorthy: Sanofi: Employment.

