inclusion probability
Recently Published Documents


Total documents: 32 (five years: 9) · H-index: 4 (five years: 2)

2021 · Author(s): Michael G. Levin, Verena Zuber, Venexia M. Walker, Derek Klarin, Julie Lynch, ...

Abstract
Background: Circulating lipid and lipoprotein levels have consistently been identified as risk factors for atherosclerotic cardiovascular disease (ASCVD), largely on the basis of studies focused on coronary artery disease (CAD). The relative contributions of specific lipoproteins to risk of peripheral artery disease (PAD) have not been well-defined. Here, we leveraged large-scale genetic association data to identify genetic proxies for circulating lipoprotein-related traits, and employed Mendelian randomization analyses to investigate their effects on PAD risk.
Methods: Genome-wide association study summary statistics for PAD (Veterans Affairs Million Veteran Program, 31,307 cases) and CAD (CARDIoGRAMplusC4D, 60,801 cases) were used in the Mendelian randomization Bayesian model averaging (MR-BMA) framework to prioritize the most likely causal major lipoprotein and subfraction risk factors for PAD and CAD. Mendelian randomization was used to estimate the effect of apolipoprotein B lowering on PAD risk using gene regions that proxy potential lipid-lowering drug targets. Transcriptome-wide association studies were performed to identify genes relevant to circulating levels of prioritized lipoprotein subfractions.
Results: ApoB was identified as the most likely causal lipoprotein-related risk factor for both PAD (marginal inclusion probability 0.86, p = 0.003) and CAD (marginal inclusion probability 0.92, p = 0.005). Genetic proxies for ApoB-lowering medications were associated with reduced risk of both PAD (OR 0.87 per 1 standard deviation decrease in ApoB, 95% CI 0.84 to 0.91, p = 9 × 10⁻¹⁰) and CAD (OR 0.66, 95% CI 0.63 to 0.69, p = 4 × 10⁻⁷³), with a stronger predicted effect of ApoB-lowering on CAD (ratio of ORs 1.33, 95% CI 1.25 to 1.42, p = 9 × 10⁻¹⁹). Among ApoB-containing subfractions, extra-small VLDL particle concentration (XS.VLDL.P) was identified as the subfraction most likely associated with PAD risk (marginal inclusion probability 0.91, p = 2.3 × 10⁻⁴), while large LDL particle concentration (L.LDL.P) was the subfraction most likely associated with CAD risk (marginal inclusion probability 0.95, p = 0.011). Genes associated with XS.VLDL.P and L.LDL.P included canonical ApoB-pathway components, although gene-specific effects varied across the lipoprotein subfractions.
Conclusion: ApoB was prioritized as the major lipoprotein fraction causally responsible for both PAD and CAD risk. However, the diverse effects of ApoB-lowering drug targets and ApoB-containing lipoprotein subfractions on ASCVD, together with the distinct subfraction-associated genes, suggest possible biologic differences in the role of lipoproteins in the pathogenesis of PAD and CAD.
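The headline comparison above is the "ratio of ORs". As a minimal sketch, the figure can be approximately reproduced from the reported estimates by differencing the two log-ORs, treating them as independent (a simplifying assumption; the paper's actual calculation may account for correlation between the two outcomes):

```python
import math

def se_from_ci(lo, hi, z=1.96):
    """Back out the standard error of a log-OR from its 95% CI."""
    return (math.log(hi) - math.log(lo)) / (2 * z)

def ratio_of_ors(or1, ci1, or2, ci2, z=1.96):
    """Ratio of ORs (or1 relative to or2) with a 95% CI, assuming the
    two log-OR estimates are independent (a simplifying assumption)."""
    d = math.log(or1) - math.log(or2)
    se = math.hypot(se_from_ci(*ci1), se_from_ci(*ci2))
    return math.exp(d), (math.exp(d - z * se), math.exp(d + z * se))

# Estimates from the abstract: PAD OR 0.87 (0.84-0.91), CAD OR 0.66 (0.63-0.69).
ratio, ci = ratio_of_ors(0.87, (0.84, 0.91), 0.66, (0.63, 0.69))
print(f"ratio of ORs = {ratio:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}")
# -> about 1.32 (1.24, 1.40), close to the reported 1.33 (1.25, 1.42)
```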


2020 · Author(s): Fanny Mollandin, Andrea Rau, Pascal Croiseau

Abstract
Background: Technological advances and decreasing costs have led to the rise of increasingly dense genotyping data, making feasible the identification of potential causal or candidate markers. Custom genotyping chips, which represent a cost-effective strategy for combining medium-density genotypes with a custom genotype panel, can capitalize on these candidates to potentially yield improved accuracy and interpretability in genomic prediction. A particularly promising model to this end is BayesR, which divides markers into four effect size classes (null, small, medium, and large). The flexibility of BayesR has been shown to yield accurate predictions and promise for quantitative trait loci (QTL) mapping in real data applications, but an extensive benchmark on simulated data is currently lacking.
Results: Based on a set of real genotypes, we generated simulated data under a variety of genetic architectures, phenotype heritabilities, and polygenic variances, and we evaluated the impact of excluding (50k genotype data) or including (50k custom genotype data) causal markers among the genotypes. We define several statistical criteria for QTL mapping using BayesR output (maximum a posteriori rule, non-null maximum a posteriori rule, posterior variance, and weighted cumulative inclusion probability), including several based on sliding windows rather than individual markers to account for linkage disequilibrium. We compare and contrast these statistics and their ability to accurately prioritize known causal markers. Overall, we confirm the strong predictive performance of BayesR in moderately to highly heritable traits, particularly for 50k custom data; in cases of low heritability or weak linkage disequilibrium with the causal marker in 50k genotypes, QTL mapping is a challenge, regardless of the criterion used.
Conclusion: BayesR is a promising approach to simultaneously obtain accurate predictions and interpretable classifications of SNPs into effect size classes. Although QTL mapping is unsurprisingly easiest for highly heritable phenotypes and large QTLs, we illustrate the performance of BayesR in a variety of simulation scenarios and compare the advantages and limitations of each criterion. Among those considered, the weighted cumulative inclusion probability appears to provide the best mapping results, even under less favorable conditions. Finally, we quantify the advantage that can be gained by incorporating causal mutations on a custom genotyping chip.
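Of the criteria listed, the window-based weighted cumulative inclusion probability can be sketched roughly as below. The four-class variance weights and the simulated class-membership probabilities are illustrative assumptions, not BayesR's actual output format or the paper's exact weighting:

```python
import numpy as np

# Relative per-class effect variances for the (null, small, medium, large)
# classes; placeholder values in the spirit of BayesR's mixture.
CLASS_VAR = np.array([0.0, 1e-4, 1e-3, 1e-2])

def weighted_cumulative_inclusion(class_probs, window=10):
    """Variance-weighted inclusion probability per SNP, accumulated over a
    sliding window so that signal spread by linkage disequilibrium across
    neighbouring markers is pooled into one score."""
    per_snp = class_probs @ CLASS_VAR          # weighted PIP for each SNP
    return np.convolve(per_snp, np.ones(window), mode="same")

# Toy posterior class-membership probabilities for 500 SNPs (mostly null).
rng = np.random.default_rng(0)
class_probs = rng.dirichlet(alpha=[20.0, 1.0, 0.5, 0.1], size=500)
scores = weighted_cumulative_inclusion(class_probs, window=10)
print("top candidate window centred at SNP", int(scores.argmax()))
```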


Entropy · 2020 · Vol 22 (9) · pp. 948 · Author(s): Stefano Cabras

The variable selection problem in general, and for the ordinary linear regression model in particular, is considered in the setup in which the number of covariates is large enough to prevent the exploration of all possible models. In this context, Gibbs sampling is needed to perform stochastic model exploration and to estimate, for instance, the model inclusion probability. We show that under a Bayesian non-parametric prior model for analyzing the Gibbs-sampling output, the usual empirical estimator is just the asymptotic version of the expected posterior inclusion probability given the simulation output from Gibbs sampling. Other posterior conditional estimators of inclusion probabilities can also be considered; these relate to the latent probability distributions on the model space, which can be sampled given the observed Gibbs-sampling output. In this large model space setup, this paper also compares the conventional prior approach against the non-local prior approach used to define the Bayes factors for model selection. The approach is illustrated with simulated examples and with an application to modeling the Travel and Tourism factors worldwide.
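The "usual empirical estimator" referred to here is simply the visit frequency of each covariate across the models sampled by the Gibbs chain. A toy sketch (the sampler output below is simulated directly rather than produced by a real Gibbs sampler):

```python
import numpy as np

# Stand-in Gibbs output: each row is one visited model, encoded as a binary
# inclusion vector over p covariates.
rng = np.random.default_rng(1)
p, iters = 8, 5_000
true_pip = np.array([0.90, 0.70, 0.50, 0.10, 0.05, 0.05, 0.02, 0.02])
models = rng.random((iters, p)) < true_pip

# Empirical estimator: the fraction of sampled models including each covariate.
empirical_pip = models.mean(axis=0)

# A smoothed posterior-mean variant (illustrative): shrink each frequency
# toward 1/2 with `alpha` pseudo-observations, a finite-sample analogue of
# analysing the Gibbs output under a prior rather than taking raw frequencies.
alpha = 2.0
smoothed_pip = (models.sum(axis=0) + alpha / 2) / (iters + alpha)
print(np.round(empirical_pip, 3), np.round(smoothed_pip, 3), sep="\n")
```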


2019 · Vol 22 (4) · pp. 423-436 · Author(s): Solikin M. Juhro, Bernard Njindan Iyke

We examine the usefulness of large-scale inflation forecasting models in Indonesia within an inflation-targeting framework. Using a dynamic model averaging approach to address three issues the policymaker faces when forecasting inflation, namely parameter, predictor, and model uncertainties, we show that large-scale models have significant payoffs. Our in-sample forecasts suggest that 60% of 15 exogenous predictors significantly forecast inflation, given a posterior inclusion probability cut-off of approximately 50%. We show that nearly 87% of the predictors can forecast inflation if we lower the cut-off to approximately 40%. Our out-of-sample forecasts suggest that large-scale inflation forecasting models have substantial forecasting power relative to simple models of inflation persistence at longer horizons.
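The cut-off rule can be made concrete: under model averaging, a predictor's posterior inclusion probability is the summed weight of every model that contains it. A minimal sketch over an invented three-predictor model space (the predictor names and model weights are placeholders, not the paper's 15 predictors):

```python
import itertools
import numpy as np

predictors = ["output_gap", "exchange_rate", "money_growth"]  # placeholders
models = list(itertools.product([0, 1], repeat=len(predictors)))

rng = np.random.default_rng(2)
weights = rng.dirichlet(np.ones(len(models)))  # stand-in model probabilities

# PIP of a predictor = total posterior weight of the models that include it.
pip = {name: sum(w for m, w in zip(models, weights) if m[j])
       for j, name in enumerate(predictors)}

for cutoff in (0.5, 0.4):  # the two cut-offs discussed in the abstract
    kept = [name for name, value in pip.items() if value >= cutoff]
    print(f"cutoff {cutoff}: {kept}")
```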


2019 · Vol 8 (5) · pp. 932-964 · Author(s): Roderick J A Little, Brady T West, Philip S Boonstra, Jingwei Hu

Abstract
With the current focus of survey researchers on “big data” that are not selected by probability sampling, measures of the degree of potential sampling bias arising from this nonrandom selection are sorely needed. Existing indices of this degree of departure from probability sampling, like the R-indicator, are based on functions of the propensity of inclusion in the sample, estimated by modeling the inclusion probability as a function of auxiliary variables. These methods are agnostic about the relationship between the inclusion probability and survey outcomes, which is a crucial feature of the problem. We propose a simple index of the degree of departure from ignorable sample selection that corrects this deficiency, which we call the standardized measure of unadjusted bias (SMUB). The index is based on normal pattern-mixture models for nonresponse applied to this sample selection problem and is grounded in the model-based framework of nonignorable selection first proposed in the context of nonresponse by Don Rubin in 1976. The index depends on an inestimable parameter, ranging between zero and one, that measures the degree of deviation from selection at random. We propose using a central value of this parameter, 0.5, for computing a point index, and computing the values of SMUB at zero and one to provide a range of the index in a sensitivity analysis. We also provide a fully Bayesian approach for computing credible intervals for the SMUB, reflecting uncertainty in the values of all of the input parameters. The proposed methods have been implemented in R and are illustrated using real data from the National Survey of Family Growth.
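A sketch of how SMUB might be computed at the three suggested values of the sensitivity parameter (0, 0.5, and 1). The formula below is one reading of the pattern-mixture construction, with x an auxiliary-based proxy for the outcome y and r their correlation in the selected sample; consult the paper and its R implementation for the exact definition:

```python
import numpy as np

def smub(x_sel, y_sel, x_pop, phi):
    """SMUB sketch: phi = 0 corresponds to selection at random given the
    proxy x, phi = 1 to selection driven entirely by the outcome y."""
    r = np.corrcoef(x_sel, y_sel)[0, 1]
    std_diff = (x_sel.mean() - x_pop.mean()) / x_sel.std(ddof=1)
    return (phi + (1 - phi) * r) / (phi * r + (1 - phi)) * std_diff

# Synthetic population where selection depends on the proxy x.
rng = np.random.default_rng(3)
x_pop = rng.normal(size=100_000)
selected = rng.random(x_pop.size) < 1 / (1 + np.exp(-x_pop))
x_sel = x_pop[selected]
y_sel = 0.8 * x_sel + rng.normal(0, 0.6, x_sel.size)  # outcome proxied by x

for phi in (0.0, 0.5, 1.0):  # point index at 0.5, sensitivity range at 0 and 1
    print(f"SMUB({phi}) = {smub(x_sel, y_sel, x_pop, phi):.3f}")
```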


2019 · Author(s): Calwing Liao, Veikko Vuokila, Alexandre D Laporte, Dan Spiegelman, Patrick A. Dion, ...

Abstract
Miserableness is a behavioural trait characterized by strong negative feelings in an individual. Although environmental factors tend to invoke miserableness, it is common to feel miserable ‘for no reason’, suggesting an innate, potentially genetic component. Currently, little is known about the functional relevance of common variants associated with miserableness. To further characterize the trait, we conducted a transcriptome-wide association study (TWAS) on 373,733 individuals and identified 104 signals across brain tissue panels, comprising 37 unique genes. Subsequent probabilistic fine-mapping prioritized 95 genes into 90%-credible sets. Amongst these prioritized hits, C7orf50 had the highest posterior inclusion probability, 0.869, in the brain cortex. Furthermore, we demonstrate that many GWAS hits for miserableness are driven by expression. To conclude, we successfully identified several genes implicated in miserableness and highlighted the power of TWAS to prioritize genes associated with a trait.
Short summary: The first transcriptome-wide association study of miserableness identifies many genes, including C7orf50, implicated in the trait.
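The 90%-credible sets come from probabilistic fine-mapping; a common convention, sketched below, is to rank genes by posterior inclusion probability and accumulate until 90% of the posterior mass is covered. Only C7orf50's PIP of 0.869 is taken from the abstract; the other gene names and values are invented for illustration:

```python
def credible_set(pips, level=0.90):
    """Greedy credible set: add genes in decreasing order of posterior
    inclusion probability until the cumulative PIP reaches `level`."""
    total, chosen = 0.0, []
    for gene, pip in sorted(pips.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(gene)
        total += pip
        if total >= level:
            break
    return chosen

locus = {"C7orf50": 0.869, "GENE_B": 0.080, "GENE_C": 0.040, "GENE_D": 0.011}
print(credible_set(locus))  # -> ['C7orf50', 'GENE_B']
```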


2019 · Vol 49 (1) · pp. 41-52 · Author(s): P. Corona, R.M. Di Biase, L. Fattorini, M. D’Amati

Non-detection of trees is an important issue when using single-scan terrestrial laser scanning (TLS) in forest inventories. A hybrid inference approach is adopted: borrowing from distance sampling, a detection function is assumed, so that the inclusion probability of each tree within each plot can be determined. A simulation study is performed to compare the TLS-based estimators, corrected and uncorrected for non-detection, with the Horvitz–Thompson estimator based on conventional plot sampling, in which all the trees within plots are recorded. Results show that single-scan TLS provides more efficient estimators than conventional plot sampling in low-density forests, even when no distance-sampling correction is performed. In low-density forests, uncorrected estimators lead to a small bias (1%–6%) that increases with plot size; therefore, care must be taken not to enlarge the plot radius too much. The bias increases in forests with clustered spatial structures and in dense forests, where bias levels of 30%–50% degrade the performance of the uncorrected estimators. Even though the bias-corrected estimators prove effective in reducing the bias (to below 15%), these reductions are not sufficient to outperform conventional plot sampling; there is therefore no advantage in using TLS-based estimation in high-density forests.
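The detection-corrected estimator can be sketched as a Horvitz–Thompson sum in which each detected tree's value is divided by an inclusion probability that combines plot inclusion with a distance-sampling detection function. The half-normal detection curve and all numbers below are illustrative assumptions, not the paper's fitted values:

```python
import numpy as np

def half_normal_detection(r, sigma=8.0):
    """Standard half-normal detection function from distance sampling:
    the probability of detecting a tree at distance r from the scanner."""
    return np.exp(-r**2 / (2 * sigma**2))

def ht_total(values, distances, plot_area, region_area, sigma=8.0):
    """Horvitz-Thompson estimate of a forest total from one circular plot:
    each detected tree is weighted by 1 / (plot inclusion x detection)."""
    pi = (plot_area / region_area) * half_normal_detection(distances, sigma)
    return float(np.sum(values / pi))

rng = np.random.default_rng(4)
radius, region = 12.0, 1e4                 # plot radius (m), region area (m^2)
dist = rng.uniform(0, radius, size=25)     # distances of detected trees (m)
basal = rng.gamma(2.0, 0.02, size=25)      # per-tree basal area (m^2)
print(f"corrected total: {ht_total(basal, dist, np.pi * radius**2, region):.2f} m^2")
```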


2018 · Author(s): Daniel Trejo Banos, Daniel L. McCartney, Tom Battram, Gibran Hemani, Rosie M. Walker, ...

Abstract
Epigenetic DNA modification is partly under genetic control and occurs in response to a wide range of environmental exposures. Linking epigenetic marks to clinical outcomes may provide greater insight into the underlying molecular processes of disease, assist in the identification of therapeutic targets, and improve risk prediction. Here, we present a statistical approach, based on Bayesian inference, that estimates associations between disease risk and all measured epigenetic probes jointly, automatically controlling for both data structure (including cell-count effects, relatedness, and experimental batch effects) and correlations among probes. We benchmark our approach in a simulation study, finding improved estimation of probe associations across a wide range of scenarios over existing approaches. Our method estimates the total proportion of disease risk captured by epigenetic probe variation; when we applied it to measures of body mass index (BMI) and cigarette consumption behaviour in 5,101 individuals, we found that 66.7% (95% CI 60.0-72.8) of the variation in BMI and 67.7% (95% CI 58.4-76.9) of the variation in cigarette consumption can be captured by methylation array data from whole blood, independent of the variation explained by single nucleotide polymorphism markers. We find novel associations: smoking behaviour is associated, with >95% posterior inclusion probability, with a methylation probe at MNDA, a myeloid cell nuclear differentiation antigen gene previously implicated as a biomarker for inflammation and non-Hodgkin lymphoma risk. We conduct genome-wide enrichment analyses, identifying blood cholesterol, lipid transport, and sterol metabolism pathways for BMI, and response to xenobiotic stimulus and negative regulation of RNA polymerase II promoter transcription for smoking, all with >95% posterior inclusion probability of having methylation probes with associations >1.5 times larger than the average. Finally, we improve phenotypic prediction in two independent cohorts by 28.7% and 10.2% for BMI and smoking, respectively, over a LASSO model. These results imply that probe measures may capture large amounts of variance because they are likely a consequence of the phenotype rather than a cause. As a result, ‘omics’ data may enable accurate characterization of disease progression and identification of individuals who are on a path to disease. Our approach facilitates a better understanding of the underlying epigenetic architecture of complex common disease and is applicable to any kind of genomics data.
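The pathway enrichment probabilities quoted above can be read as posterior tail probabilities computed across MCMC draws. A synthetic sketch of that kind of calculation (both the data and the exact form of the paper's statistic are assumptions here):

```python
import numpy as np

rng = np.random.default_rng(5)
iters, n_probes = 2_000, 300
# Stand-in posterior draws of per-probe squared effect sizes.
effects = rng.exponential(1.0, size=(iters, n_probes))
pathway = np.arange(20)            # indices of the pathway's probes
effects[:, :3] *= 4.0              # plant an enriched signal in the pathway

# For each draw, does the pathway contain a probe whose effect exceeds
# 1.5x the average effect across all probes? The mean over draws is the
# posterior probability of enrichment.
threshold = 1.5 * effects.mean(axis=1, keepdims=True)
hit = (effects[:, pathway] > threshold).any(axis=1)
print(f"posterior enrichment probability: {hit.mean():.3f}")
```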

