Massively expedited genome-wide heritability analysis (MEGHA)

2015 ◽  
Vol 112 (8) ◽  
pp. 2479-2484 ◽  
Author(s):  
Tian Ge ◽  
Thomas E. Nichols ◽  
Phil H. Lee ◽  
Avram J. Holmes ◽  
Joshua L. Roffman ◽  
...  

The discovery and prioritization of heritable phenotypes is a computational challenge in a variety of settings, including neuroimaging genetics and analyses of the vast phenotypic repositories in electronic health record systems and population-based biobanks. Classical estimates of heritability require twin or pedigree data, which can be costly and difficult to acquire. Genome-wide complex trait analysis is an alternative tool to compute heritability estimates from unrelated individuals, using genome-wide data that are increasingly ubiquitous, but is computationally demanding and becomes difficult to apply in evaluating very large numbers of phenotypes. Here we present a fast and accurate statistical method for high-dimensional heritability analysis using genome-wide SNP data from unrelated individuals, termed massively expedited genome-wide heritability analysis (MEGHA) and accompanying nonparametric sampling techniques that enable flexible inferences for arbitrary statistics of interest. MEGHA produces estimates and significance measures of heritability with several orders of magnitude less computational time than existing methods, making heritability-based prioritization of millions of phenotypes based on data from unrelated individuals tractable for the first time to our knowledge. As a demonstration of application, we conducted heritability analyses on global and local morphometric measurements derived from brain structural MRI scans, using genome-wide SNP data from 1,320 unrelated young healthy adults of non-Hispanic European ancestry. We also computed surface maps of heritability for cortical thickness measures and empirically localized cortical regions where thickness measures were significantly heritable. Our analyses demonstrate the unique capability of MEGHA for large-scale heritability-based screening and high-dimensional heritability profile construction.
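To make the idea of estimating heritability from unrelated individuals concrete, here is a minimal sketch of a moment-based estimator (Haseman-Elston regression) on simulated data. This is not the MEGHA procedure described in the abstract; it is a simpler, well-known technique swapped in for illustration. The function names, the simulation settings (1,000 individuals, 500 SNPs, true heritability 0.5) and the allele-frequency range are all assumptions.

```python
import numpy as np


def he_regression_h2(genotypes, phenotype):
    """Haseman-Elston regression estimate of SNP heritability.

    Regresses products of standardized phenotypes y_i * y_j on the
    genetic relationship A_ij over all pairs i < j (no intercept).
    """
    n, m = genotypes.shape
    # Standardize each SNP column to mean 0 and variance 1.
    z = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    grm = z @ z.T / m  # genetic relationship matrix (n x n)
    y = (phenotype - phenotype.mean()) / phenotype.std()
    upper = np.triu_indices(n, k=1)  # off-diagonal pairs only
    a = grm[upper]
    yy = np.outer(y, y)[upper]
    return float(a @ yy / (a @ a))


def simulate(n=1000, m=500, h2=0.5, seed=0):
    """Simulate genotypes and a phenotype with the given heritability (toy data)."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.5, size=m)  # hypothetical allele frequencies
    genotypes = rng.binomial(2, freqs, size=(n, m)).astype(float)
    z = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    beta = rng.normal(0.0, np.sqrt(h2 / m), size=m)
    noise = rng.normal(0.0, np.sqrt(1.0 - h2), size=n)
    return genotypes, z @ beta + noise


if __name__ == "__main__":
    G, y = simulate()
    print("estimated h2:", round(he_regression_h2(G, y), 3))
```

Because the estimator needs only one pass over the relationship matrix per phenotype, it hints at why moment-style and score-based methods scale to many phenotypes more easily than iterative likelihood fitting.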

2021 ◽  
Vol 12 (3) ◽  
pp. 212-231
Author(s):  
Issam El Hammouti ◽  
Azza Lajjam ◽  
Mohamed El Merouani

The berth allocation problem is one of the main concerns of port operators at a container terminal. In this paper, the authors study the berth allocation problem at the strategic level, commonly known as the strategic berth template problem (SBTP). This problem aims to find the best berth template for a set of calling ships accepted to be served at the port. At the strategic level, the port operator can decline to serve some ships in order to avoid congestion. Because of the computational complexity of the mathematical formulation proposed for the SBTP, the solution approaches presented so far are limited, especially for large-scale instances. In order to find high-quality solutions in a short computational time, this work proposes a population-based memetic algorithm that combines a first-come-first-served (FCFS) technique, two genetic operators, and a simulated annealing algorithm. Computational experiments and comparisons against the best-known solutions so far are presented to show the performance and effectiveness of the proposed method.
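The first-come-first-served component mentioned above can be illustrated with a toy scheduler. This is only a sketch under simplifying assumptions (identical berths, integer times, no ship rejection, no cyclic template); the `Ship` class and both function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ship:
    name: str
    arrival: int        # arrival time (arbitrary integer time units)
    handling_time: int  # time the ship occupies a berth


def fcfs_schedule(ships, num_berths):
    """Assign ships in arrival order to the berth that becomes free first.

    Returns a dict mapping ship name -> (berth index, start time).
    """
    berth_free = [0] * num_berths  # time at which each berth is next free
    schedule = {}
    for ship in sorted(ships, key=lambda s: s.arrival):
        berth = min(range(num_berths), key=lambda b: berth_free[b])
        start = max(ship.arrival, berth_free[berth])
        berth_free[berth] = start + ship.handling_time
        schedule[ship.name] = (berth, start)
    return schedule


def total_waiting(ships, schedule):
    """Sum of (start time - arrival time) over all ships."""
    return sum(schedule[s.name][1] - s.arrival for s in ships)


if __name__ == "__main__":
    fleet = [Ship("A", 0, 4), Ship("B", 1, 2), Ship("C", 2, 3)]
    plan = fcfs_schedule(fleet, num_berths=2)
    print(plan, total_waiting(fleet, plan))
```

A schedule like this could plausibly serve as a starting individual that genetic operators and simulated annealing then try to improve, though the abstract does not say exactly how the components are combined.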


2020 ◽  
Author(s):  
Youwen Qin ◽  
Aki S Havulinna ◽  
Yang Liu ◽  
Pekka Jousilahti ◽  
Scott C Ritchie ◽  
...  

Co-evolution between humans and the microbial communities colonizing them has resulted in an intimate assembly of thousands of microbial species mutualistically living on and in their body and impacting multiple aspects of host physiology and health. Several studies examining whether human genetic variation can affect gut microbiota suggest a complex combination of environmental and host factors. Here, we leverage a single large-scale population-based cohort of 5,959 genotyped individuals with matched gut microbial shotgun metagenomes, dietary information and health records up to 16 years post-sampling, to characterize human genetic variations associated with microbial abundances, and predict possible causal links with various diseases using Mendelian randomization (MR). Genome-wide association study (GWAS) identified 583 independent SNP-taxon associations at genome-wide significance (p<5.0×10⁻⁸), which included notable strong associations with LCT (p=5.02×10⁻³⁵), ABO (p=1.1×10⁻¹²), and MED13L (p=1.84×10⁻¹²). A combination of genetics and dietary habits was shown to strongly shape the abundances of certain key bacterial members of the gut microbiota, and explain their genetic association. Genetic effects from the LCT locus on Bifidobacterium and three other associated taxa significantly differed according to dairy intake. Variation in mucin-degrading Faecalicatena lactaris abundances was associated with ABO, highlighting a preferential utilization of secreted A/B/AB-antigens as an energy source in the gut, irrespective of fibre intake. Enterococcus faecalis levels showed a robust association with a variant in MED13L, with putative links to colorectal cancer. Finally, we identified putative causal relationships between gut microbes and complex diseases using MR, with a predicted effect of Morganella on major depressive disorder that was consistent with observational incident disease analysis. Overall, we present striking examples of the intricate relationship between humans and their gut microbial communities, and highlight important health implications.
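The abstract does not state which Mendelian randomization estimator was used. As one common choice, the fixed-effect inverse-variance weighted (IVW) estimator combines per-SNP effects on the exposure and on the outcome into a single causal-effect estimate. The sketch below assumes summary statistics are already available; the function name and the numbers in the usage example are made up for illustration.

```python
import numpy as np


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    beta_exposure: SNP effects on the exposure (e.g. a taxon abundance)
    beta_outcome:  SNP effects on the outcome (e.g. a disease)
    se_outcome:    standard errors of the outcome effects
    Returns (estimate, standard_error).
    """
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    se = np.asarray(se_outcome, dtype=float)
    weights = bx**2 / se**2  # inverse variance of each per-SNP Wald ratio
    estimate = np.sum(bx * by / se**2) / np.sum(weights)
    standard_error = np.sqrt(1.0 / np.sum(weights))
    return float(estimate), float(standard_error)


if __name__ == "__main__":
    # Hypothetical summary statistics for three instruments.
    est, se = ivw_estimate([0.1, 0.2, 0.3], [0.2, 0.4, 0.6], [0.05, 0.05, 0.05])
    print(est, se)
```

The estimate is equivalent to a weighted average of the per-SNP ratios `beta_outcome / beta_exposure`, so it is only meaningful when the instruments satisfy the usual MR assumptions.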


2019 ◽  
Vol 21 (Supplement_6) ◽  
pp. vi194-vi194
Author(s):  
Chenan Zhang ◽  
Quinn Ostrom ◽  
Helen Hansen ◽  
Adam de Smith ◽  
Cassie Kline ◽  
...  

Abstract BACKGROUND Ependymoma is a histologically-defined central nervous system tumor most commonly occurring in children. Incidence differs by race/ethnicity, with individuals of European ancestry at highest risk. No large-scale genomic analyses of ependymoma predisposition have been conducted to date. We aimed to determine whether extent of European genetic ancestry is associated with ependymoma risk. METHODS In a multi-ethnic study of Californian children (327 cases, 1970 controls), we estimated the proportions of European, African, and Native American ancestry among admixed Hispanic and African-American subjects and estimated European substructure among non-Hispanic white subjects using genome-wide data. We tested whether genome-wide ancestry differences were associated with ependymoma risk and performed admixture mapping to identify associations with local European ancestry. We also re-analyzed CBTRUS data to examine subtype-specific differences in ependymoma incidence across racial/ethnic groups. RESULTS Each 20% increase in European ancestry was associated with 1.31-fold greater odds of ependymoma among Hispanic and African-American subjects (95% CI: 1.08–1.59, Pmeta=6.7×10⁻³). Among non-Hispanic whites, European ancestral substructure was also significantly associated with ependymoma risk. Local admixture mapping revealed a peak at 20p13 associated with increased local European ancestry, and genotype association analysis in the region identified an association upstream of R-spondin 4 that survived Bonferroni correction (P=2.2×10⁻⁵) but was not validated in an independent set of posterior fossa type A (PF-EPN-A) patients. In complementary CBTRUS analyses, American Indian/Alaskan Natives were at reduced risk relative to non-Hispanic whites (RR=0.64, 95% CI: 0.46–0.87), as were African-Americans (RR=0.67, 95% CI: 0.60–0.74) and Asian/Pacific Islanders (RR=0.86, 95% CI: 0.73–1.00). Although overall ependymoma rates were similar in U.S. Hispanics (RR=0.96, 95% CI: 0.88–1.05), lower rates were observed for myxopapillary ependymoma and other spinal ependymoma. CONCLUSION Inter-ethnic differences in ependymoma risk vary by histopathologic and potentially molecular subgroup, and are recapitulated in the genomic ancestry of ependymoma patients.


2019 ◽  
Vol 15 (3) ◽  
pp. 64-78
Author(s):  
Chandrakala D ◽  
Sumathi S ◽  
Saran Kumar A ◽  
Sathish J

Detection and realization of new trends from a corpus are achieved through Emergent Trend Detection (ETD) methods, a principal application of text mining. This article discusses the influence of Particle Swarm Optimization (PSO) on Dynamic Adaptive Self Organizing Maps (DASOM) in the design of an efficient ETD scheme by optimizing the neural parameters of the network. This hybrid machine learning scheme is designed to achieve maximum accuracy with minimum computational time. The efficiency and scalability of the proposed scheme are analyzed and compared with standard algorithms such as SOM, DASOM and linear regression analysis. The system is trained and tested on the DBLP database (University of Trier, Germany). The superiority of the hybrid DASOM algorithm over the well-known algorithms in handling high-dimensional, large-scale data to detect emergent trends from the corpus is established in this article.
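For readers unfamiliar with PSO, the following is a minimal sketch of the generic algorithm minimizing a toy objective. It does not reproduce the article's PSO-DASOM hybrid: in that setting the objective would presumably be a network error measure over the DASOM parameters, which the abstract does not specify. The function name, the coefficient values (a commonly used constriction setting) and the sphere objective are assumptions.

```python
import numpy as np


def pso_minimize(objective, dim, bounds=(-5.0, 5.0), num_particles=20,
                 iterations=200, inertia=0.7298, cognitive=1.49618,
                 social=1.49618, seed=0):
    """Basic global-best particle swarm optimization.

    Returns (best_position, best_value) found by the swarm.
    """
    rng = np.random.default_rng(seed)
    low, high = bounds
    pos = rng.uniform(low, high, size=(num_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                   # personal best positions
    pbest_val = np.array([objective(p) for p in pos])    # personal best values
    gbest = pbest[np.argmin(pbest_val)].copy()           # global best position
    gbest_val = float(pbest_val.min())

    for _ in range(iterations):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity update: inertia + pull toward personal and global bests.
        vel = (inertia * vel
               + cognitive * r1 * (pbest - pos)
               + social * r2 * (gbest - pos))
        pos = np.clip(pos + vel, low, high)
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        best = int(np.argmin(pbest_val))
        if pbest_val[best] < gbest_val:
            gbest = pbest[best].copy()
            gbest_val = float(pbest_val[best])
    return gbest, gbest_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return float(np.sum(x**2))


if __name__ == "__main__":
    position, value = pso_minimize(sphere, dim=2)
    print(position, value)
```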


2019 ◽  
Vol 48 (3) ◽  
pp. 978-993 ◽  
Author(s):  
Tuulia Tynkkynen ◽  
Qin Wang ◽  
Jussi Ekholm ◽  
Olga Anufrieva ◽  
Pauli Ohukainen ◽  
...  

Abstract Background Quantitative molecular data from urine are rare in epidemiology and genetics. NMR spectroscopy could provide these data in high throughput, and it has already been applied in epidemiological settings to analyse urine samples. However, quantitative protocols for large-scale applications are not available. Methods We describe in detail how to prepare urine samples and perform NMR experiments to obtain quantitative metabolic information. Semi-automated quantitative line shape fitting analyses were set up for 43 metabolites and applied to data from various analytical test samples and from 1004 individuals from a population-based epidemiological cohort. Novel analyses on how urine metabolites associate with quantitative serum NMR metabolomics data (61 metabolic measures; n = 995) were performed. In addition, confirmatory genome-wide analyses of urine metabolites were conducted (n = 578). The fully automated quantitative regression-based spectral analysis is demonstrated for creatinine and glucose (n = 4548). Results Intra-assay metabolite variations were mostly <5%, indicating high robustness and accuracy of urine NMR spectroscopy methodology per se. Intra-individual metabolite variations were large, ranging from 6% to 194%. However, population-based inter-individual metabolite variations were even larger (from 14% to 1655%), providing a sound base for epidemiological applications. Metabolic associations between urine and serum were found to be clearly weaker than those within serum and within urine, indicating that urinary metabolomics data provide independent metabolic information. Two previous genome-wide hits for formate and 2-hydroxyisobutyrate were replicated at genome-wide significance. Conclusion Quantitative urine metabolomics data suggest broad novelty for systems epidemiology. A roadmap for an open access methodology is provided.


2017 ◽  
Author(s):  
Wei Zhou ◽  
Jonas B. Nielsen ◽  
Lars G. Fritsche ◽  
Rounak Dey ◽  
Maiken E. Gabrielsen ◽  
...  

Abstract In genome-wide association studies (GWAS) for thousands of phenotypes in large biobanks, most binary traits have substantially fewer cases than controls. Both of the widely used approaches, the linear mixed model and the recently proposed logistic mixed model, perform poorly in the analysis of phenotypes with unbalanced case-control ratios, producing large type I error rates. Here we propose a scalable and accurate generalized mixed model association test that uses the saddlepoint approximation (SPA) to calibrate the distribution of score test statistics. This method, SAIGE, provides accurate p-values even when case-control ratios are extremely unbalanced. It utilizes state-of-the-art optimization strategies to reduce the computational time and memory cost of the generalized mixed model. The computation cost depends linearly on sample size, and hence the method is applicable to GWAS for thousands of phenotypes in large biobanks. Through the analysis of UK Biobank data of 408,961 white British European-ancestry samples for >1400 binary phenotypes, we show that SAIGE can efficiently analyze large sample data, controlling for unbalanced case-control ratios and sample relatedness.


2019 ◽  
Author(s):  
Inken Wohlers ◽  
Axel Künstner ◽  
Matthias Munz ◽  
Michael Olbrich ◽  
Anke Fähnrich ◽  
...  

Abstract The human genome is composed of chromosomal DNA sequences consisting of bases A, C, G and T – the blueprint to implement the molecular functions that are the basis of every individual’s life. Deciphering the first human genome was a consortium effort that took more than a decade and considerable cost. With the latest technological advances, determining an individual’s entire personal genome with manageable cost and effort has come within reach. Although the benefits of the all-encompassing genetic information that entire genomes provide are manifold, only a small number of de novo assembled human genomes have been reported to date 1–3, and few have been complemented with population-based genetic variation 4, which is particularly important for North Africans who are not represented in current genome-wide data sets 5–7. Here, we combine long- and short-read whole-genome next-generation sequencing data with recent assembly approaches into the first de novo assembly of the genome of an Egyptian individual. The resulting assembly demonstrates well-balanced quality metrics and is complemented with high-quality variant phasing via linked reads into haploblocks, which we can associate with gene expression changes in blood. To construct an Egyptian genome reference, we further assayed genome-wide genetic variation occurring in the Egyptian population within a representative cohort of 110 Egyptian individuals. We show that differences in allele frequencies and linkage disequilibrium between Egyptians and Europeans may compromise the transferability of European ancestry-based genetic disease risk and polygenic scores, substantiating the need for multi-ethnic genetic studies and corresponding genome references. The Egyptian genome reference represents a comprehensive population data set based on a high-quality personal genome. It is a proof of concept to be considered by the many national and international genome initiatives underway. More importantly, we anticipate that the Egyptian genome reference will be a valuable resource for precision medicine targeting the Egyptian population and beyond.


2012 ◽  
Vol 30 (15_suppl) ◽  
pp. 10097-10097
Author(s):  
Kimberly E Barnholt ◽  
Chuong B Do ◽  
Amy K Kiefer ◽  
Marisa Nelson ◽  
Judy Ellen Garber ◽  
...  

10097 Background: Recruitment of an adequately sized cohort for genome-wide studies presents a serious challenge for rare diseases such as sarcoma. Traditional barriers to participation include proximity of clinical centers and motivation or ability to travel. 23andMe’s web-based platform provides increased accessibility to research participation, facilitating rapid recruitment of patients (pts) and enabling a large-scale genome-wide association study (GWAS) of sarcoma. Methods: Sarcoma pts were recruited through web and email campaigns, patient advocacy groups, physician offices, and events. Pts provide IRB-approved consent, complete surveys, and receive updates about research progress through an online account. In collaboration with an uncompensated panel of academic experts, an online survey was developed to collect patient-reported data on diagnosis, family history, symptoms and treatment. Results: This web-based approach has accrued the largest genotyped, recontactable sarcoma cohort to date. In 20 months, 772 sarcoma pts have enrolled, 683 have been genotyped and 611 have provided data online. The cohort is primarily of European ancestry (92%), disproportionately female (72%), with an average age of 51 (± 15 years). More than 88% of pts indicated a soft tissue sarcoma diagnosis, with leiomyosarcoma, liposarcoma and “malignant fibrous histiocytoma” being the most commonly reported subtypes. Over 36% of pts report undergoing active treatment of some type. Association scans were conducted across a set of 8,058,452 imputed SNPs, using 568 unrelated sarcoma cases of European ancestry and >70,000 unrelated population controls from the 23andMe database. Initial results have identified no significant genome-wide associations for general sarcoma risk, despite having >90% power to detect risk variants with >5% minor allele frequency and odds ratio >2.5, suggesting the absence of common variants with strong shared effects across sarcoma subtypes. 
Conclusions: This pilot study demonstrates feasibility of rapid recruitment and longitudinal engagement of pts through online technology. Such techniques may significantly accelerate, and in some cases fully enable, large-scale genomic studies of sarcoma and other rare diseases.


2019 ◽  
Author(s):  
Triin Laisk ◽  
Ana Luiza G Soares ◽  
Teresa Ferreira ◽  
Jodie N Painter ◽  
Samantha Laber ◽  
...  

Miscarriage is a common complex trait that affects 10-25% of clinically confirmed pregnancies1,2. Here we present the first large-scale genetic association analyses with 69,118 cases from five different ancestries for sporadic miscarriage and 750 cases of European ancestry for recurrent miscarriage, and up to 359,469 female controls. We identify one genome-wide significant association on chromosome 13 (rs146350366, minor allele frequency (MAF) 1.2%, Pmeta=3.2×10⁻⁸, [...] (CI) 1.2-1.6) for sporadic miscarriage in our European ancestry meta-analysis (50,060 cases and 174,109 controls), located near FGF9 involved in pregnancy maintenance3 and progesterone production4. Additionally, we identified three genome-wide significant associations for recurrent miscarriage, including a signal on chromosome 9 (rs7859844, MAF=6.4%, Pmeta=1.3×10⁻⁸) [...] in controlling extravillous trophoblast motility5. We further investigate the genetic architecture of miscarriage with biobank-scale Mendelian randomization, heritability, and genetic correlation analyses. Our results suggest that miscarriage etiopathogenesis is partly driven by genetic variation related to gonadotropin regulation, placental biology and progesterone production.


2020 ◽  
Author(s):  
Segun Fatumo ◽  
Tinashe Chikowore ◽  
Robert Kalyesubula ◽  
Rebecca N Nsubuga ◽  
Gershim Asiki ◽  
...  

Abstract Genome-wide association studies (GWAS) for kidney function have uncovered hundreds of risk loci, primarily in populations of European ancestry. We conducted the first GWAS of estimated glomerular filtration rate (eGFR) in Africa in 3288 Ugandans and replicated the findings in 8224 African Americans. We identified two loci associated with eGFR at genome-wide significance (p<5×10−8). The most significantly associated variant (rs2433603, p=2.4×10−9) in GATM was distinct from previously reported signals. A second association signal mapping near HBB (rs141845179, p=3.0×10−8) was not significant after conditioning on a previously reported SNP (rs334) for eGFR. However, fine-mapping analyses highlighted rs141845179 as the most likely causal variant at the HBB locus (posterior probability of 0.61). A trans-ethnic GRS of eGFR constructed from previously reported lead SNPs was not predictive in the Ugandan population, indicating that additional large-scale efforts in Africa are necessary to gain further insight into the genetic architecture of kidney disease.

