Nearest-Neighbor Projected Distance Regression for Epistasis Detection in GWAS With Population Structure Correction

2020 ◽  
Vol 11 ◽  
Author(s):  
Marziyeh Arabnejad ◽  
Courtney G. Montgomery ◽  
Patrick M. Gaffney ◽  
Brett A. McKinney
2019 ◽  
Author(s):  
Trang T. Le ◽  
Bryan A. Dawkins ◽  
Brett A. McKinney

Abstract. Machine learning feature selection methods are needed to detect complex interaction-network effects in complicated modeling scenarios in high-dimensional data, such as GWAS, gene expression, eQTL, and structural/functional neuroimage studies for case-control or continuous outcomes. In addition, many machine learning methods have limited ability to address the issues of controlling false discoveries and adjusting for covariates. To address these challenges, we develop a new feature selection technique called Nearest-neighbor Projected-Distance Regression (NPDR) that calculates the importance of each predictor using generalized linear model (GLM) regression of distances between nearest-neighbor pairs projected onto the predictor dimension. NPDR captures the underlying interaction structure of data using nearest-neighbors in high dimensions, handles both dichotomous and continuous outcomes and predictor data types, statistically corrects for covariates, and permits statistical inference and penalized regression. We use realistic simulations with interactions and other effects to show that NPDR has better precision-recall than standard Relief-based feature selection and random forest importance, with the additional benefit of covariate adjustment and multiple testing correction. Using RNA-Seq data from a study of major depressive disorder (MDD), we show that NPDR with covariate adjustment removes spurious associations due to confounding. We apply NPDR to eQTL data to identify potentially interacting variants that regulate transcripts associated with MDD and demonstrate NPDR’s utility for GWAS and continuous outcomes.


2020 ◽  
Vol 36 (9) ◽  
pp. 2770-2777
Author(s):  
Trang T Le ◽  
Bryan A Dawkins ◽  
Brett A McKinney

Abstract. Summary: Machine learning feature selection methods are needed to detect complex interaction-network effects in complicated modeling scenarios in high-dimensional data, such as GWAS, gene expression, eQTL and structural/functional neuroimage studies for case–control or continuous outcomes. In addition, many machine learning methods have limited ability to address the issues of controlling false discoveries and adjusting for covariates. To address these challenges, we develop a new feature selection technique called Nearest-neighbor Projected-Distance Regression (NPDR) that calculates the importance of each predictor using generalized linear model regression of distances between nearest-neighbor pairs projected onto the predictor dimension. NPDR captures the underlying interaction structure of data using nearest-neighbors in high dimensions, handles both dichotomous and continuous outcomes and predictor data types, statistically corrects for covariates, and permits statistical inference and penalized regression. We use realistic simulations with interactions and other effects to show that NPDR has better precision-recall than standard Relief-based feature selection and random forest importance, with the additional benefit of covariate adjustment and multiple testing correction. Using RNA-Seq data from a study of major depressive disorder (MDD), we show that NPDR with covariate adjustment removes spurious associations due to confounding. We apply NPDR to eQTL data to identify potentially interacting variants that regulate transcripts associated with MDD and demonstrate NPDR’s utility for GWAS and continuous outcomes. Availability and implementation: Available at https://insilico.github.io/npdr/. Supplementary information: Supplementary data are available at Bioinformatics online.
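
The core idea in the abstract above (regress distances between nearest-neighbor pairs, projected onto one predictor at a time) can be illustrated with a minimal sketch. This is not the authors' implementation, which is the R package at the URL above. It assumes a continuous outcome, Manhattan distance for the neighbor search, and ordinary least squares (a GLM with identity link) of the absolute outcome difference on the absolute predictor difference. The function names, the toy data, and the choice k=5 are all hypothetical, and covariate adjustment and multiple testing correction are omitted.

```python
import random


def manhattan(a, b):
    """Manhattan distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_neighbor_pairs(X, k):
    """Ordered pairs (i, j) where j is one of the k nearest neighbors of i."""
    pairs = []
    for i, xi in enumerate(X):
        others = sorted((j for j in range(len(X)) if j != i),
                        key=lambda j: manhattan(xi, X[j]))
        pairs.extend((i, j) for j in others[:k])
    return pairs


def ols_slope_and_t(x, y):
    """Slope and t-statistic of the simple linear regression y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return slope, slope / se


def npdr_scores(X, y, k=5):
    """One (slope, t) per predictor: outcome difference regressed on the
    neighbor-pair distance projected onto that predictor (assumed setup)."""
    pairs = nearest_neighbor_pairs(X, k)
    dy = [abs(y[i] - y[j]) for i, j in pairs]
    scores = []
    for a in range(len(X[0])):
        dx = [abs(X[i][a] - X[j][a]) for i, j in pairs]  # projected distance
        scores.append(ols_slope_and_t(dx, dy))
    return scores


def make_toy_data(seed, n=80):
    """Hypothetical data: only predictor 0 drives the outcome."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(3)] for _ in range(n)]
    y = [3.0 * row[0] + rng.gauss(0.0, 0.05) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data(0)
    for a, (slope, t) in enumerate(npdr_scores(X, y)):
        print(f"predictor {a}: slope={slope:.2f}, t={t:.1f}")
```

In this toy setup the predictor that drives the outcome should receive the largest t-statistic. Because neighbor pairs share samples, the t-statistics here are only a rough ranking score, not a calibrated test.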


Author(s):  
J. M. Oblak ◽  
W. H. Rand

The energy of an a/2 <110> shear antiphase boundary in the L1₂ structure is expected to be at a minimum on {100} cube planes because here there is no violation of nearest-neighbor order. The latter, however, does involve the disruption of second nearest neighbors. It has been suggested that cross slip of paired a/2 <110> dislocations from octahedral onto cube planes is an important dislocation trapping mechanism in Ni3Al; furthermore, slip traces consistent with cube slip are observed above 920°K. Due to the high energy of the {111} antiphase boundary (> 200 mJ/m²), paired a/2 <110> dislocations are tightly constricted on the octahedral plane and cannot be individually resolved.


Author(s):  
S. R. Herd ◽  
P. Chaudhari

Electron diffraction and direct transmission have been used extensively to study the local atomic arrangement in amorphous solids, in particular Ge. Nearest-neighbor distances have been calculated from electron diffraction (E.D.) profiles, and the results have been interpreted in terms of the microcrystalline or the random network models. Direct transmission electron microscopy appears to be the most direct and accurate method to resolve this issue, since the spatial resolution of the better instruments is of the order of 3 Å. In particular, the tilted beam interference method is used regularly as a resolution test to show fringes corresponding to 1.5 to 3 Å lattice planes in crystals.


Author(s):  
M. Jeyanthi ◽  
C. Velayutham

Brain-computer interfaces (BCI) play a vital role in research in science and technology development. Classification is a data mining technique used to predict group membership for data instances. Analysis of BCI data is challenging because feature extraction and classification of these data are more difficult than for raw data. In this paper, we extract statistical Haralick features from the raw EEG data. The features are then normalized, and binning is used to improve the accuracy of the predictive models by reducing noise and eliminating some irrelevant attributes. Classification is then performed on a BCI dataset using different techniques: Naïve Bayes, the k-nearest neighbor classifier, and the SVM classifier. Finally, we propose the SVM classification algorithm for the BCI dataset.
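
The normalize, bin, then classify sequence described in this abstract can be sketched as follows. This is an illustrative assumption, not the authors' pipeline: it uses min-max normalization, equal-width binning, and a k-nearest neighbor vote (one of the classifiers the abstract lists, chosen here because it needs only the standard library, whereas the paper ultimately proposes SVM). The feature values, labels, and function names are hypothetical, Haralick feature extraction is not shown, and the query is normalized together with the training rows purely for brevity.

```python
from collections import Counter


def min_max_normalize(rows):
    """Rescale every column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)]
            for row in rows]


def equal_width_bin(rows, n_bins=4):
    """Map each normalized value to a bin index in 0..n_bins-1."""
    return [[min(int(v * n_bins), n_bins - 1) for v in row] for row in rows]


def knn_predict(train_X, train_y, query, k=3):
    """Majority label among the k training rows closest to the query."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], query)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature vectors standing in for extracted EEG features.
features = [
    [1.0, 200.0], [1.2, 220.0], [0.9, 210.0],   # labelled "rest"
    [3.0, 900.0], [3.2, 950.0], [2.9, 880.0],   # labelled "task"
]
labels = ["rest", "rest", "rest", "task", "task", "task"]
query = [3.1, 910.0]

# Normalize and bin the training rows and the query together (toy shortcut).
binned = equal_width_bin(min_max_normalize(features + [query]))
train_binned, query_binned = binned[:-1], binned[-1]

if __name__ == "__main__":
    print(knn_predict(train_binned, labels, query_binned, k=3))
```

In a real evaluation the normalization range and bin edges would be fitted on training data only and then applied to held-out data.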

