39 Combining Different Marker Prioritization Methods in the Analysis of High-density and Sequence Data

2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 21-21
Author(s):  
Ashley S Ling ◽  
El Hamidi Hay ◽  
Samuel E Aggrey ◽  
Romdhane Rekaya

Abstract High-density and sequence genotypes were expected to increase accuracy of genomic predictions through inclusion of markers in high linkage disequilibrium with causal loci, yet the realized increase has been minimal. Marker preselection has been proposed as a strategy to prioritize the most relevant markers to reduce the dimensionality of the association model and potentially increase accuracy. Strength of association statistics (estimated effect, p-value) and population differentiation measurements (FST score) have both been explored as criteria for preselection, but sensitivity to identify relevant markers decreases as random noise exceeds true signal variation. Combining both criteria into an index would leverage the unique contributions of each criterion and potentially increase prediction accuracies. A simulation consisting of 200 QTL, 777k SNP, and 7 generations under selection was generated (10 replicates). Marker preselection was compared across three criteria: only estimated effect (EFF), only FST score (FST), or an index combining the two previous statistics (COMB). In the COMB scenario, markers from genomic regions with high correlation (>0.7) between estimated effect and FST score were selected along with markers whose estimated effect or FST score exceeded a certain threshold. Across replicates, COMB identified additional markers tagging between 1 and 7 QTL not tagged by EFF or FST that explain 0.2–5.4% of the genetic variance. The highest accuracy for EFF and FST was 0.76 and 0.73 when preselecting 2k and 10k markers, respectively. Under the best-case scenario (3,297 preselected markers), COMB improved accuracy by less than 1% and 4% compared to EFF and FST scenarios, respectively. Though an index combining multiple statistics may increase the number of QTL tagged by preselected markers and genetic variance explained relative to single-statistic preselection, this does not necessarily translate to a meaningful increase in accuracy. 
However, the results are dependent on the indexing method.
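
The COMB preselection rule described above can be sketched in a few lines. This is only an illustration, not the authors' implementation: the fixed-size marker windows standing in for "genomic regions", the use of absolute effects, and the `eff_thr` and `fst_thr` cutoffs are all assumptions; only the 0.7 correlation cutoff comes from the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def preselect_comb(effects, fst, window=4, eff_thr=0.5, fst_thr=0.3, corr_thr=0.7):
    """Return sorted indices of markers kept by a COMB-style rule.

    A marker is kept if |effect| or FST exceeds its (hypothetical) threshold,
    or if it lies in a window where |effect| and FST correlate above corr_thr.
    """
    abs_eff = [abs(e) for e in effects]
    selected = {
        i for i in range(len(effects))
        if abs_eff[i] > eff_thr or fst[i] > fst_thr
    }
    for start in range(0, len(effects), window):
        idx = list(range(start, min(start + window, len(effects))))
        if len(idx) < 3:  # too few markers for a meaningful correlation
            continue
        r = pearson([abs_eff[i] for i in idx], [fst[i] for i in idx])
        if r > corr_thr:
            selected.update(idx)
    return sorted(selected)


# Toy data (made up): the first window is perfectly correlated,
# and marker 4 passes the effect threshold on its own.
effects = [0.1, 0.2, 0.3, 0.4, 0.9, 0.05, 0.1, 0.02]
fst = [0.01, 0.02, 0.03, 0.04, 0.05, 0.2, 0.01, 0.1]
print(preselect_comb(effects, fst))
```

Taking the union of the two rules is what lets such an index pick up markers that neither single-statistic threshold would select.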

2019 ◽  
Vol 97 (7) ◽  
pp. 3027-3033 ◽  
Author(s):  
Thaise P Melo ◽  
Marina R S Fortes ◽  
Gerardo A Fernandes Junior ◽  
Lucia G Albuquerque ◽  
Roberto Carvalheiro

Abstract An efficient strategy to improve QTL detection power is performing across-breed validation studies. Variants segregating across breeds are expected to be in high linkage disequilibrium (LD) with causal mutations affecting economically important traits. The aim of this study was to validate, in a Tropical Composite cattle (TC) population, QTL associations identified for sexual precocity traits in a Nellore and Brahman meta-analysis genome-wide association study. In total, 2,816 TC, 8,001 Nellore, and 2,210 Brahman animals were available for the analysis. Genomic regions significantly associated with puberty traits in the meta-analysis study were validated for the following sexual precocity traits in TC: age at first corpus luteum (AGECL), first postpartum anestrus interval (PPAI), and scrotal circumference at 18 months of age (SC). We considered validated QTL those underpinned by significant markers from the Nellore and Brahman meta-analysis (P ≤ 10⁻⁴) that were also significant for a TC trait, i.e., presenting a P-value of ≤ 10⁻³ for AGECL, PPAI, or SC. We also considered as validated QTL those regions where significant markers in the reference population were within ±250 kb of significant markers in the validation population. Using these criteria, 49 SNP were validated for AGECL, 4 for PPAI, and 14 for SC, of which 5 were in common with AGECL, totaling 62 validated SNP for these traits and 30 candidate genes surrounding them. Considering just candidate genes closest to the top SNP of each chromosome, for AGECL 8 candidate genes were identified: COL8A1, PENK, ENSBTAG00000047425, BPNT1, ADAMTS17, CCHCR1, SUFU, and ENSBTAG00000046374. For PPAI, 3 genes emerged as candidates (PCBP3, KCNK10, and MRPS5), and for SC 8 candidate genes were identified (SNORA70, TRAC, ASS1, BPNT1, LRRK1, PKHD1, PTPRM, and ENSBTAG00000045690). Several candidate regions presented here were previously associated with puberty traits in cattle. 
The majority of emerging candidate genes are related to biological processes involved in reproductive events, such as maintenance of gestation, and some are known to be expressed in reproductive tissues. Our results suggested that some QTL controlling early puberty seem to be segregating across cattle breeds adapted to tropical conditions.
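
The two validation criteria in this abstract (same marker significant in both populations, or significant markers within ±250 kb of each other) can be expressed as a single distance check. The sketch below is a hypothetical illustration: the `(chromosome, position, p_value)` tuple layout and the function name are assumptions, while the P-value cutoffs and the 250 kb window are taken from the text.

```python
def validated_markers(reference, validation,
                      ref_p=1e-4, val_p=1e-3, window_bp=250_000):
    """Return reference markers supported by a nearby validation hit.

    Each marker is a (chromosome, position_bp, p_value) tuple. A reference
    marker counts as validated when it is significant (p <= ref_p) and a
    significant validation marker (p <= val_p) lies on the same chromosome
    within window_bp; a distance of 0 covers the "same marker" case.
    """
    val_hits = [(c, pos) for c, pos, p in validation if p <= val_p]
    validated = []
    for chrom, pos, p in reference:
        if p > ref_p:
            continue
        if any(c == chrom and abs(pos - vpos) <= window_bp
               for c, vpos in val_hits):
            validated.append((chrom, pos))
    return validated


# Made-up positions and P-values, for illustration only.
reference = [(1, 1_000_000, 5e-5), (1, 5_000_000, 1e-6), (2, 300_000, 1e-3)]
validation = [(1, 1_200_000, 5e-4), (1, 9_000_000, 1e-5), (2, 300_000, 1e-4)]
print(validated_markers(reference, validation))
```

The linear scan over validation hits is fine for a sketch; with genome-wide marker sets, sorting positions per chromosome and using binary search would scale better.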


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 25-25
Author(s):  
Muhammad Yasir Nawaz ◽  
Rodrigo Pelicioni Savegnago ◽  
Cedric Gondro

Abstract In this study, we detected genome-wide footprints of selection in Hanwoo and Angus beef cattle using different allele frequency and haplotype-based methods based on imputed whole genome sequence data. Our dataset included 13,202 Angus and 10,437 Hanwoo animals with 10,057,633 and 13,241,550 imputed SNPs, respectively. A subset of data with 6,873,624 common SNPs between the two populations was used to estimate signatures of selection parameters, both within (runs of homozygosity and extended haplotype homozygosity) and between (allele fixation index, extended haplotype homozygosity) the breeds in order to infer evidence of selection. We observed that correlations between various measures of selection ranged from 0.01 to 0.42. Assuming these parameters were complementary to each other, we combined them into a composite selection signal to identify regions under selection in both beef breeds. The composite signal was based on the average of fractional ranks of individual selection measures for every SNP. We identified some selection signatures that were common between the breeds while others were independent. We also observed that more genomic regions were selected in Angus as compared to Hanwoo. Candidate genes within significant genomic regions may help explain mechanisms of adaptation, domestication history and loci for important traits in Angus and Hanwoo cattle. In the future, we will use the top SNPs under selection for genomic prediction of carcass traits in both breeds.
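
As a rough illustration of the composite signal described above, the sketch below averages fractional ranks across selection measures for each SNP. Several details are assumptions rather than the authors' choices: the fractional rank is taken as rank divided by the number of SNPs, ties share their average rank, and larger values of every measure are treated as stronger evidence of selection.

```python
def fractional_ranks(values):
    """Rank each value (1 = smallest) and divide by n; ties share the mean rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / n
        i = j + 1
    return ranks


def composite_signal(measures):
    """Average the fractional ranks of several measures, SNP by SNP."""
    per_measure = [fractional_ranks(m) for m in measures]
    n_snps = len(measures[0])
    return [
        sum(r[i] for r in per_measure) / len(per_measure)
        for i in range(n_snps)
    ]


# Made-up values for three SNPs and two hypothetical measures.
fst_scores = [0.1, 0.5, 0.3]
haplotype_scores = [2.0, 1.0, 3.0]
print(composite_signal([fst_scores, haplotype_scores]))
```

Because ranks discard the scale of each statistic, measures with very different units can be combined without further normalization, which fits the abstract's assumption that the measures are complementary.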


10.2196/17803 ◽  
2020 ◽  
Vol 8 (8) ◽  
pp. e17803
Author(s):  
JeeEun Lee ◽  
Sun K Yoo

Background As the mobile environment has developed recently, there have been studies on continuous respiration monitoring. However, it is not easy for general users to access the sensors typically used to measure respiration. There is also random noise caused by various environmental variables when respiration is measured using noncontact methods in a mobile environment. Objective In this study, we aimed to estimate the respiration rate using an accelerometer sensor in a smartphone. Methods First, data were acquired from an accelerometer sensor by a smartphone, which can easily be accessed by the general public. Second, an independent component was extracted to calibrate the three-axis accelerometer. Lastly, the respiration rate was estimated using quefrency selection reflecting the harmonic component because respiration has regular patterns. Results From April 2018, we enrolled 30 male participants. When the independent component and quefrency selection were used to estimate the respiration rate, the correlation with respiration acquired from a chest belt was 0.7. The statistical results of the Wilcoxon signed-rank test were used to determine whether the differences in the respiration counts acquired from the chest belt and from the accelerometer sensor were significant. The P value of the difference in the respiration counts acquired from the two sensors was .27, which was not significant. This indicates that the number of respiration counts measured using the accelerometer sensor was not different from that measured using the chest belt. The Bland-Altman results indicated that the mean difference was 0.43, with less than one breath per minute, and that the respiration rate was at the 95% limits of agreement. Conclusions There was no relevant difference in the respiration rate measured using a chest belt and that measured using an accelerometer sensor. 
The accelerometer sensor approach could solve the problems related to the inconvenience of chest belt attachment and the settings. It could be used to detect sleep apnea through constant respiration rate estimation in an internet-of-things environment.
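
For readers unfamiliar with the Bland-Altman agreement analysis mentioned in the Results, a minimal sketch follows. It uses the conventional definition (mean difference plus or minus 1.96 standard deviations of the paired differences), which is assumed here because the abstract does not spell out the computation; the sample readings are invented.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences; the limits are the
    conventional 95% limits of agreement, bias +/- 1.96 * SD.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical breaths-per-minute readings from two sensors.
accelerometer = [16, 18, 15, 20]
chest_belt = [15, 18, 16, 19]
bias, lower, upper = bland_altman(accelerometer, chest_belt)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```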


2021 ◽  
Author(s):  
Richard F Oppong ◽  
Pau Navarro ◽  
Chris S Haley ◽  
Sara Knott

We describe a genome-wide analytical approach, SNP and Haplotype Regional Heritability Mapping (SNHap-RHM), that provides regional estimates of the heritability across locally defined regions in the genome. This approach utilises relationship matrices that are based on sharing of SNP and haplotype alleles at local haplotype blocks delimited by recombination boundaries in the genome. We implemented the approach on simulated data and show that the haplotype-based regional GRMs capture variation that is complementary to that captured by SNP-based regional GRMs, thus justifying the fitting of the two GRMs jointly in a single analysis (SNHap-RHM). SNHap-RHM captures regions in the genome contributing to the phenotypic variation that existing genome-wide analysis methods may fail to capture. We further demonstrate that there are real benefits to be gained from this approach by applying it to real data from about 20,000 individuals from the Generation Scotland: Scottish Family Health Study. We analysed height and major depressive disorder (MDD). We identified seven genomic regions that are genome-wide significant for height, and three regions significant at a suggestive threshold (p-value < 1 × 10⁻⁵) for MDD. These significant regions have genes mapped to within 400 kb of them. The genes mapped for height have been reported to be associated with height in humans, while those mapped for MDD have been reported to be associated with major depressive disorder and other psychiatric phenotypes. The results show that SNHap-RHM presents an exciting new opportunity to analyse complex traits by allowing the joint mapping of novel genomic regions tagged by either SNPs or haplotypes, potentially leading to the recovery of some of the "missing" heritability.


2020 ◽  
Author(s):  
David Curtis

Rare genetic variants in LDLR, APOB and PCSK9 are known causes of familial hypercholesterolaemia and it is expected that rare variants in other genes will also have effects on hyperlipidaemia risk although such genes remain to be identified. The UK Biobank consists of a sample of 500,000 volunteers and exome sequence data is available for 50,000 of them. 11,490 of these were classified as hyperlipidaemia cases on the basis of having a relevant diagnosis recorded and/or taking lipid-lowering medication while the remaining 38,463 were treated as controls. Variants in each gene were assigned weights according to rarity and predicted impact and overall weighted burden scores were compared between cases and controls, including population principal components as covariates. One biologically plausible gene, HUWE1, produced statistically significant evidence for association after correction for testing 22,028 genes with a signed log10 p value (SLP) of -6.15, suggesting a protective effect of variants in this gene. Other genes with uncorrected p<0.001 are arguably also of interest, including LDLR (SLP=3.67), RBP2 (SLP=3.14), NPFFR1 (SLP=3.02) and ACOT9 (SLP=-3.19). Gene set analysis indicated that rare variants in genes involved in metabolism and energy can influence hyperlipidaemia risk. Overall, the results provide some leads which might be followed up with functional studies and which could be tested in additional data sets as these become available. This research has been conducted using the UK Biobank Resource.
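
The signed log10 p value (SLP) and the multiple-testing correction mentioned above can be illustrated with a short sketch. Two assumptions are made here: the sign convention (positive when the burden is higher in cases, negative when higher in controls) is inferred from the abstract's remark that a negative SLP suggests a protective effect, and a Bonferroni correction at alpha = 0.05 is used, although the abstract does not name the correction method.

```python
from math import log10


def signed_log_p(p_value, higher_in_cases):
    """-log10(p), made negative when the burden is higher in controls."""
    magnitude = -log10(p_value)
    return magnitude if higher_in_cases else -magnitude


def bonferroni_slp_threshold(n_tests, alpha=0.05):
    """Absolute SLP needed to pass a Bonferroni correction (assumed method)."""
    return -log10(alpha / n_tests)


threshold = bonferroni_slp_threshold(22_028)
print(round(threshold, 2))      # about 5.64 for 22,028 genes
print(abs(-6.15) > threshold)   # the reported HUWE1 SLP exceeds this bar
```

Under these assumptions, an absolute SLP of 6.15 clears the corrected threshold while values near 3 (uncorrected p < 0.001) do not, which matches how the abstract distinguishes the two groups of genes.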


Circulation ◽  
2017 ◽  
Vol 135 (suppl_1) ◽  
Author(s):  
Samar R El Khoudary ◽  
Xirun Chen ◽  
Maria Brooks ◽  
Imke Janssen ◽  
Steve Hollenberg ◽  
...  

Objective: Growing evidence suggests that the cardioprotective effects of high density lipoproteins (HDL) may be diminished over the menopause transition. Estradiol (E2), the leading ovarian estrogen that declines substantially during the menopause transition, is considered a potent antioxidant with potential impact on lipid peroxidation and formation of reactive oxygen species that could affect HDL cardioprotective capacities. Whether the cardioprotective effects of HDL on atherosclerotic subclinical measures are apparent only in the presence of high levels of E2 in women at midlife is not clear. We hypothesized that the expected positive association between levels of high density lipoprotein cholesterol (HDL-C) and endothelial function as indexed by flow-mediated dilation (FMD) is weaker at lower levels of E2. Methods: Participants were from the baseline visit of the Study of Women’s Health Across the Nation (SWAN) Heart study. Women with hysterectomy and/or bilateral oophorectomy and those on hormone therapy were excluded. B-mode ultrasound images of the right brachial artery before and after deflation were obtained to estimate % change in FMD. Linear regression analyses were utilized to model the difference between log-baseline arterial diameter and log-post-deflation arterial diameter as a function of the interaction between log-E2 and HDL-C levels. To illustrate, the interaction between E2 tertiles and HDL-C was presented as well. The final model adjusted for log-baseline arterial diameter, race, study site, cycle day of blood draw, menopause status, body mass index, and diastolic blood pressure. Results were presented as % change in FMD (95% CI). Results: The study included 322 women (60% White and 40% Black) aged 50.7±2.8 years who were either in pre-/early perimenopause (63%) or late peri-/postmenopause (37%). 
In the final model, a significant interaction between HDL-C and log-E2 levels on %FMD was found, P value for interaction=0.01; such that a positive association between HDL-C and %FMD was only evident among women in the high E2 tertile (E2 ≥51.1 pg/ml) [%FMD (95%CI) per 1 SD increase in HDL-C: 0.93%(0.21%, 1.64%)] and that %FMD per 1 SD increase in HDL-C was significantly lower in women in the low E2 tertile (E2 <21.5 pg/ml) compared to women in the high E2 tertile [%FMD difference (95%CI): low E2 tertile vs. high E2 tertile: -0.98 (-1.88, -0.07), P-value=0.03]. Conclusions: The cardioprotective association of HDL-C on endothelial function depends on levels of E2 in women at midlife, such that HDL-C may not be protective to the vascular endothelium in the setting of low endogenous E2. Our analyses should be replicated in longitudinal settings. Future studies should investigate potential mechanistic pathways by which dynamic changes in E2 levels may impact HDL composition and functionality over the menopausal transition.


2019 ◽  
Vol 20 (6) ◽  
pp. 1260 ◽  
Author(s):  
Renate Horn ◽  
Aleksandra Radanovic ◽  
Lena Fuhrmann ◽  
Yves Sprycha ◽  
Sonia Hamrit ◽  
...  

Hybrid breeding in sunflowers based on CMS PET1 requires development of restorer lines carrying, in most cases, the restorer gene Rf1. Markers for marker-assisted selection have been developed, but there is still need for closer, more versatile, and co-dominant markers linked to Rf1. Homology searches against the reference sunflower genome using sequences of cloned markers, as well as Bacterial Artificial Chromosome (BAC)-end sequences of clones hybridizing to them, allowed the identification of two genomic regions of 30 and 3.9 Mb, respectively, as possible physical locations of the restorer gene Rf1 on linkage group 13. Nine potential candidate genes, encoding six pentatricopeptide repeat proteins, one tetratricopeptide-like helical domain, a probable aldehyde dehydrogenase 22A1, and a probable poly(A) polymerase 3 (PAPS3), were identified in these two genomic regions. Amplicon targeted next generation sequencing of these nine candidate genes for Rf1 was performed in an association panel consisting of 27 maintainer and 32 restorer lines and revealed the presence of 210 Single Nucleotide Polymorphisms (SNPs) and 67 Insertions/Deletions (INDELs). Association studies showed significant associations of 10 SNPs with fertility restoration (p-value < 10⁻⁴), narrowing Rf1 down to three candidate genes. Three new markers, one co-dominant marker 67N04_P and two dominant markers, PPR621.5R for restorer, and PPR621.5M for maintainer lines were developed and verified in the association panel of 59 sunflower lines. The versatility of the three newly developed markers, as well as of three existing markers for the restorer gene Rf1 (HRG01 and HRG02, Cleaved Amplified Polymorphic Sequence (CAPS)-marker H13), was analyzed in a large association panel consisting of 557 accessions.


2004 ◽  
Vol 85 (1) ◽  
pp. 45-48 ◽  
Author(s):  
Linda M. Kohn

Abstract Phylogenetic or genealogical interpretation of DNA sequence data from multiple genomic regions has become the gold standard for species delimitation and population genetics. Precise species concepts can inform quarantine decisions but are likely to reflect evolutionary events too far in the past to impact disease management. On the other hand, multilocus approaches at the population level can identify patterns of endemism or migration directly associated with episodes of disease, including host shifts and associated changes in determinants of pathogenicity and avirulence. We used the genome database of Magnaporthe grisea to frame a comparative, multilocus genomics approach from which we demonstrate a single origin for rice-infecting genotypes with concomitant loss of sex in pandemic clonal lineages, and patterns of gain and loss of avirulence genes. In the Sclerotinia sclerotiorum pathosystem, we identified significant associations of multilocus haplotypes with specific pathogen populations in North America. Following the introduction of a new crop, endemic pathogen genotypes and newly evolved migrant genotypes caused novel, early-season symptoms.

