binary traits
Recently Published Documents


2021 ◽  
Author(s):  
Avni S Gupta ◽  
Victoria Chevee ◽  
Adam S. Kirosingh ◽  
Nicole M Davis ◽  
David S Schneider

We infected Diversity Outbred mice with Plasmodium chabaudi to better understand how the host response to infection can vary and to try to identify genetic loci responsible for this variation. We identified two loci correlating with binary traits: one on chromosome 2 linked to undetectable parasite loads, and another on chromosome 10 linked to death. Though we tested many variable traits, none reached statistical significance in the 489 mice we tested.


2021 ◽  
Author(s):  
Joelle Mbatchou ◽  
Leland Barnard ◽  
Joshua Backman ◽  
Anthony Marcketta ◽  
Jack A. Kosmicki ◽  
...  

2021 ◽  
Author(s):  
Jian Yang ◽  
Longda Jiang ◽  
Zhili Zheng

Abstract Compared to linear mixed model-based genome-wide association (GWA) methods, generalized linear mixed model (GLMM)-based methods have better statistical properties when applied to binary traits but are computationally much slower. Here, leveraging efficient sparse matrix-based algorithms, we developed a GLMM-based GWA tool (called fastGWA-GLMM) that is orders of magnitude faster than the state-of-the-art tool (e.g., ~37 times faster when n=400,000) with more scalable memory usage. We show by simulation that the fastGWA-GLMM test-statistics of both common and rare variants are well-calibrated under the null, even for traits with an extreme case-control ratio (e.g., 0.1%). We applied fastGWA-GLMM to the UK Biobank data of 456,348 individuals, 11,842,647 variants and 2,989 binary traits (full summary statistics available at http://fastgwa.info/ukbimpbin) and identified 259 rare variants associated with 75 traits, demonstrating the use of imputed genotype data in a large cohort to discover rare variants for binary complex traits.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ming-Huei Chen ◽  
Achilleas Pitsillides ◽  
Qiong Yang

Abstract Recognizing that family data provide a unique advantage in identifying rare risk variants in genetic association studies, many cohorts with related samples have undergone whole genome sequencing in large initiatives such as the NHLBI Trans-Omics for Precision Medicine (TOPMed) program. Analyzing rare variants poses challenges for binary traits in that some genotype categories may have few or no observed events, causing bias and inflation in commonly used methods. Several methods have recently been proposed to better handle rare variants while accounting for family relationships, but their performance has not been thoroughly evaluated together. Here we compare several existing approaches, including SAIGE but not limited to methods for related samples, using simulations based on the Framingham Heart Study samples and genotype data from the Illumina HumanExome BeadChip, where rare variants are the majority. We found that logistic regression with a likelihood ratio test applied to related samples was the only approach that did not have inflated type I error rates in both the single variant test (SVT) and gene-based tests. It was followed by Firth logistic regression, applied to either related or unrelated samples, which had inflation only in its direction-insensitive gene-based test at prevalence 0.01. In theory, neither logistic regression nor Firth logistic regression accounts for relatedness among samples. SAIGE had inflation in SVT at prevalence 0.1 or lower, and the inflation was eliminated with a minor allele count filter of 5. As for power, no approach consistently outperformed the others across all single variant tests and gene-based tests.
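
The minor allele count (MAC) filter mentioned above can be illustrated with a minimal sketch. It assumes genotypes are coded as 0/1/2 copies of the alternate allele, with `None` for missing calls, and that "a filter of 5" means keeping variants with MAC of at least 5; the abstract does not say whether the bound is inclusive. The function names and the input layout are hypothetical and do not come from SAIGE or any other tool.

```python
def minor_allele_count(genotypes):
    """Return the minor allele count for one variant.

    genotypes: list of alternate-allele counts (0, 1 or 2) per sample,
    with None marking a missing call (assumed coding).
    """
    called = [g for g in genotypes if g is not None]
    alt_alleles = sum(called)
    total_alleles = 2 * len(called)  # diploid samples assumed
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_alleles, total_alleles - alt_alleles)


def filter_by_mac(variants, min_mac=5):
    """Keep variants whose MAC is at least min_mac (inclusive bound assumed)."""
    return {
        name: genotypes
        for name, genotypes in variants.items()
        if minor_allele_count(genotypes) >= min_mac
    }


# Toy data: 100 samples per variant.
variants = {
    "rare_variant": [0] * 98 + [1, 1],      # MAC = 2, dropped
    "common_variant": [0] * 90 + [1] * 10,  # MAC = 10, kept
}
kept = filter_by_mac(variants)
print(sorted(kept))
```

Taking the minimum over both alleles means a variant whose alternate allele is nearly fixed is treated the same as one whose alternate allele is nearly absent.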


Author(s):  
Joelle Mbatchou ◽  
Leland Barnard ◽  
Joshua Backman ◽  
Anthony Marcketta ◽  
Jack A. Kosmicki ◽  
...  

Abstract Genome-wide association analysis of cohorts with thousands of phenotypes is computationally expensive, particularly when accounting for sample relatedness or population structure. Here we present a novel machine learning method called REGENIE for fitting a whole genome regression model that is orders of magnitude faster than alternatives, while maintaining statistical efficiency. The method naturally accommodates parallel analysis of multiple phenotypes, and only requires local segments of the genotype matrix to be loaded in memory, in contrast to existing alternatives, which must load genome-wide matrices into memory. This results in substantial savings in compute time and memory usage. The method is applicable to both quantitative and binary phenotypes, including rare variant analysis of binary traits with unbalanced case-control ratios, where we introduce a fast, approximate Firth logistic regression test. The method is ideally suited to take advantage of distributed computing frameworks. We demonstrate the accuracy and computational benefits of this approach compared to several existing methods using quantitative and binary traits from the UK Biobank dataset with up to 407,746 individuals.


2020 ◽  
Vol 98 (6) ◽  
Author(s):  
David Kenny ◽  
Michelle M Judge ◽  
Roy D Sleator ◽  
Craig P Murphy ◽  
Ross D Evans ◽  
...  

Abstract The objective of the present study was to estimate the genetic parameters associated with the achievement of desirable weight, conformation, and fat specifications, represented by a series of binary traits. The desired specifications were those stipulated by Irish beef processors, in accordance with the EUROP carcass grading system, and were represented by a carcass weight between 270 and 380 kg, a fat score between 2+ and 4= (between 6 and 11 on a 15-point scale), and a conformation score of O= or better (≥5 on a 15-point scale). Using data from 58,868 beef carcasses, variance components were estimated using linear mixed models for these binary traits, as well as their underlying continuous measures. Heritability estimates for the continuous traits ranged from 0.63 to 0.73; heritability estimates for the binary traits ranged from 0.05 to 0.19. An additional trait was defined to reflect if all desired carcass specifications were met. All genetic correlations between this trait and the individual contributing binary traits were positive (0.38 to 0.87), while all genetic correlations between this trait and the continuous carcass measures were negative (−0.87 to −0.07). The genetic parameters estimated in the present study signify that potential exists to breed cattle that more consistently achieve desirable carcass metrics at harvest.
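
As an illustration of how the binary traits described above relate to their underlying continuous measures, here is a minimal sketch that applies the thresholds quoted in the abstract. The class name, field names, and the assumption that all bounds are inclusive are hypothetical; the abstract says only "between" and "or better".

```python
from dataclasses import dataclass


@dataclass
class Carcass:
    weight_kg: float   # carcass weight in kg
    fat_score: int     # fat score on the 15-point scale
    conformation: int  # conformation score on the 15-point scale


def binary_traits(c: Carcass) -> dict:
    """Derive 0/1 specification traits from the continuous measures."""
    weight_ok = int(270 <= c.weight_kg <= 380)  # inclusive bounds assumed
    fat_ok = int(6 <= c.fat_score <= 11)        # 2+ to 4= on the EUROP scale
    conformation_ok = int(c.conformation >= 5)  # O= or better
    return {
        "weight_ok": weight_ok,
        "fat_ok": fat_ok,
        "conformation_ok": conformation_ok,
        # Additional trait: 1 only if every specification is met.
        "all_ok": int(weight_ok and fat_ok and conformation_ok),
    }


print(binary_traits(Carcass(weight_kg=320.0, fat_score=8, conformation=6)))
print(binary_traits(Carcass(weight_kg=400.0, fat_score=8, conformation=6)))
```

Dichotomizing discards the within-category variation of the continuous measure, which may be one reason the reported heritabilities of the binary traits are lower than those of the continuous traits.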


2019 ◽  
Vol 104 (2) ◽  
pp. 260-274 ◽  
Author(s):  
Han Chen ◽  
Jennifer E. Huffman ◽  
Jennifer A. Brody ◽  
Chaolong Wang ◽  
Seunggeun Lee ◽  
...  

PLoS ONE ◽  
2018 ◽  
Vol 13 (11) ◽  
pp. e0207752 ◽  
Author(s):  
Esperanza Shenstone ◽  
Julian Cooper ◽  
Brian Rice ◽  
Martin Bohn ◽  
Tiffany M. Jamann ◽  
...  

2018 ◽  
Author(s):  
Han Chen ◽  
Jennifer E. Huffman ◽  
Jennifer A. Brody ◽  
Chaolong Wang ◽  
Seunggeun Lee ◽  
...  

Abstract With advances in Whole Genome Sequencing (WGS) technology, more advanced statistical methods for testing genetic association with rare variants are being developed. Methods in which variants are grouped for analysis are also known as variant-set, gene-based, and aggregate unit tests. The burden test and Sequence Kernel Association Test (SKAT) are two widely used variant-set tests, which were originally developed for samples of unrelated individuals and were later extended to family data with known pedigree structures. However, computationally efficient and powerful variant-set tests are needed to make analyses tractable in large-scale WGS studies with complex study samples. In this paper, we propose the variant-Set Mixed Model Association Tests (SMMAT) for continuous and binary traits using the generalized linear mixed model framework. These tests can be applied to large-scale WGS studies involving samples with population structure and relatedness, such as in the National Heart, Lung, and Blood Institute’s Trans-Omics for Precision Medicine (TOPMed) program. SMMAT tests share the same null model for different variant sets, and a virtue of this null model, which includes covariates only, is that it needs to be fit only once for all tests in each genome-wide analysis. Simulation studies show that all the proposed SMMAT tests correctly control type I error rates for both continuous and binary traits in the presence of population structure and relatedness. We also illustrate our tests in a real data example of analysis of plasma fibrinogen levels in the TOPMed program (n = 23,763), using the Analysis Commons, a cloud-based computing platform.
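
To make the "shared null model" idea concrete, a logistic mixed model for a binary trait can be sketched as follows. This is a generic textbook-style formulation and not necessarily the exact SMMAT specification: the symbols (covariates $X_i$, variant-set genotypes $G_i$, random effects $b$, relatedness matrices $\Phi_k$, variance components $\tau_k$) are assumptions introduced here for illustration.

```latex
\begin{aligned}
\text{Full model:}\quad
  & \operatorname{logit}(\mu_i) = X_i \alpha + G_i \beta + b_i,
  \qquad \mu_i = P(y_i = 1 \mid X_i, G_i, b_i), \\
  & b \sim N\!\Big(0, \sum_{k} \tau_k \Phi_k\Big), \\[4pt]
\text{Null model } (H_0 : \beta = 0)\text{:}\quad
  & \operatorname{logit}(\mu_{0i}) = X_i \alpha + b_i, \\[4pt]
\text{Score vector for a variant set:}\quad
  & S = G^{\top} (y - \hat{\mu}_0).
\end{aligned}
```

Because the null model contains no genotype term, its fitted values $\hat{\mu}_0$ do not depend on which variant set is being tested, so under this sketch only the score vector $S$ has to be recomputed for each set.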


2018 ◽  
Vol 61 (2) ◽  
pp. 161-171 ◽  
Author(s):  
Laura Viviana Santos ◽  
Kerstin Brügemann ◽  
Asja Ebinghaus ◽  
Sven König

Abstract. Genetic (co)variance components were estimated for alternative functional traits generated by automatic milking systems (AMSs) and reflecting dairy cow behavior and health. Data recording spanned a period of 30 days and included 70 700 observations (visits to the AMS) from 922 Holstein cows kept on three German farms. The three selected farms used the same type of AMS and specific selection gates allowing “natural cow behavior on a voluntary basis”. AMS traits used as behavior indicator traits were AMS visits per cow and day as binary traits, with thresholds of at least three visits (VIS3) and at least four visits (VIS4); knocking off the milking device, with a threshold of at least one udder quarter, also as a binary trait (KO); milking duration of each AMS visit in minutes (DUR); average milk flow in kg min−1 (AMF); and the interval between two consecutive milkings (INT). Electrical conductivity (EC) of milk from each udder quarter and in total was used as a health indicator trait. For genetic analyses, in univariate and bivariate models, linear mixed models and generalized linear mixed models (GLMMs) with a logit link function were applied to Gaussian distributed and binary traits, respectively. The heritability was 0.08 ± 0.03 for VIS3, 0.05 ± 0.05 for VIS4, 0.03 ± 0.03 for KO, 0.19 ± 0.07 for DUR, 0.25 ± 0.07 for AMF, and 0.07 ± 0.03 for INT. Heritabilities for EC varied between 0.37 ± 0.08 and 0.46 ± 0.09, depending on the udder quarter. On the genetic scale, an increased number of AMS visits (VIS3 and VIS4) was associated with an increase of KO (rg = 0.24 and rg = 0.55, respectively). From a genetic perspective, high-milk-yielding cows visited the AMS more often (rg = 0.49 for VIS3 and rg = 0.80 for VIS4), had a faster AMF (rg = 0.40), and had a shorter INT (rg = −0.51). When considering these traits as behavior indicator traits, selection of cows with desired temperament simultaneously increases milk yield. An increase of automatically and objectively recorded AMS traits with moderate heritabilities justifies modifications of dairy cattle breeding goals towards higher emphasis on behavioral traits, especially when developing specific robot indices. Nevertheless, ongoing research in this regard with a larger dataset is suggested in order to validate the results from the present pilot study.

