pairwise linkage disequilibrium
Recently Published Documents



Blood ◽  
2021 ◽  
Vol 138 (Supplement 1) ◽  
pp. 2027-2027
Author(s):  
Parul Rai ◽  
Sara Rashkin ◽  
Victoria I Okhomina ◽  
Vijaya Joshi ◽  
Kenneth I. Ataga ◽  
...  

Abstract Background: Pulmonary hypertension (PH) is associated with premature mortality in adults with sickle cell anemia (SCA). Decreased nitric oxide bioavailability has been implicated in the pathogenesis of SCA-associated PH. Thrombospondin-1 (TSP1) protein, encoded by the THBS1 gene, inhibits nitric oxide/cGMP signaling and has been hypothesized to promote PH in SCA (Novelli et al. Am J Physiol Lung Cell Mol Physiol 2019). Recently, in a cross-sectional targeted gene analysis in adults with SCA, the THBS1 gene single nucleotide polymorphisms (SNPs) rs1478605-G and rs1478604-T were found to be associated with PH, as estimated by its surrogate echocardiogram marker, tricuspid regurgitant velocity (TRV). However, neither of these SNPs was validated in an external cohort (Jacob et al. Am J Hematol. 2017). Objective: To test the hypothesis that THBS1 gene SNPs are associated with TRV, we first validated the association of the previously reported THBS1 gene SNPs (rs1478605-G and rs1478604-T) with TRV as quantitative and binary variables (elevated TRV: < or ≥ 2.5 m/sec) in a longitudinal, pediatric SCA patient cohort. We then assessed the associations of 8 other common THBS1 SNPs, with the false discovery rate controlled at 0.1. Finally, we sought to generate a polygenic score (PGS) incorporating genomic data from this region and to determine its association with TRV. Methods: Our cohort comprised children aged 5-18 years with HbSS or HbS/β0-thalassemia enrolled in either the Long-Term Effects of Erythrocyte Lysis Trial (ELYSIS, NCT00842621) (Yates et al. Pediatr Blood and Cancer. 2019) or the Sickle Cell Clinical and Intervention Program (SCCRIP, NCT02098863). All participants underwent prospective measurement of TRV by 2D-echocardiogram at baseline steady state (≥ 4 weeks from transfusion, hospitalization), repeated two years later (Rai et al. Blood Adv. 2021).
Generalized linear mixed models were used to assess the association between quantitative or elevated TRV and genetic markers, after adjusting for age at baseline, sex, time point, NT-pro-BNP, creatinine and 5 principal components (PC). Three SNPs (rs1478605-G, rs753599-A and rs1051442-T) that were statistically significant at the levels given above for either quantitative or elevated TRV, and whose pairwise linkage disequilibrium was r² ≤ 0.1, were selected to build a PGS. The PGS was defined as the summed number of TRV-increasing unfavorable alleles carried by each individual across all selected SNPs. A two-stage iterative resampling (TSIR) approach was used to internally discover and validate the associations of the PGS with TRV at an overall type I error rate of 0.05, with a randomly sampled half of the samples as the discovery cohort and the remainder as the validation cohort (Kang et al. J Hum Genet. 2015). Results: Our cohort included 138 children (276 data points) with SCA, with a mean age of 9.9 years (SD 3.25 years). We validated the association of quantitative TRV with the two previously reported SNPs, rs1478605 (estimate = -0.11, standard error [SE] = 0.04, p = 0.003) and rs1478604 (estimate = -0.09, SE = 0.04, p = 0.014). We further identified new associations of elevated TRV with the THBS1 SNPs rs10551442 (odds ratio [OR] 0.26; 95% CI 0.09, 0.77; p = 0.017), rs17633107 (OR 0.26; 95% CI 0.09, 0.77; p = 0.017), rs3743125 (OR 4.4; 95% CI 1.2, 1.7; p = 0.03), and rs753599 (OR 5.3; 95% CI 1.4, 2.1; p = 0.018). Two of these SNPs, rs10551442 (estimate = -0.13, SE = 0.05, p = 0.013) and rs17633107 (estimate = -0.13, SE = 0.05, p = 0.013), were also associated with quantitative TRV. Each additional unfavorable allele in the PGS was associated with a 0.1 m/sec increase in TRV (Figure 1A) and a 2.1 increase in the log-odds of having an elevated TRV (Figure 1B). The associations of the PGS with quantitative and elevated TRV were discovered and validated in 36 and 21 out of 100 repetitions of the TSIR analysis, respectively.
Conclusion: We validated two previously published variants in the THBS1 gene, rs1478605 and rs1478604, in an external and independent pediatric cohort with quantitative TRV. We also identified associations of the common SNPs rs10551442, rs17633107, rs3743125, and rs753599 with TRV ≥ 2.5 m/sec, and of rs10551442 and rs17633107 with quantitative TRV, in a pediatric population and created a PGS incorporating these data. The PGS is significantly associated with TRV, which was internally validated using the TSIR approach. Our genomic findings further support the role of TSP1 in the pathophysiology of the pulmonary vasculopathy seen in SCA. Figure 1. Disclosures: Ataga: Forma Therapeutics: Membership on an entity's Board of Directors or advisory committees; F. Hoffmann-La Roche Ltd: Consultancy; Novo Nordisk: Membership on an entity's Board of Directors or advisory committees; Global Blood Therapeutics: Membership on an entity's Board of Directors or advisory committees; Agios Pharmaceuticals: Consultancy; Novartis: Membership on an entity's Board of Directors or advisory committees. Hankins: Global Blood Therapeutics: Consultancy; Vindico Medical Education: Consultancy; Bluebird Bio: Consultancy; UpToDate: Consultancy. Estepp: Global Blood Therapeutics: Consultancy, Research Funding.
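
The score definition in the Methods above (the summed number of unfavorable alleles per individual) can be illustrated with a short Python sketch. The function name `polygenic_score` and the genotype values are hypothetical; this is a minimal unweighted count under the assumption of diploid genotypes coded as 0, 1 or 2 risk alleles per SNP, not the authors' analysis code.

```python
def polygenic_score(risk_allele_counts):
    """Return one unweighted score per individual.

    risk_allele_counts: list of rows, one row per individual, each row holding
    the number of unfavorable alleles (0, 1 or 2) at every selected SNP.
    """
    scores = []
    for person in risk_allele_counts:
        if any(count not in (0, 1, 2) for count in person):
            raise ValueError("diploid allele counts must be 0, 1 or 2")
        scores.append(sum(person))
    return scores


# Hypothetical data: 4 individuals genotyped at 3 selected SNPs.
genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 2, 2],
    [0, 0, 0],
]
print(polygenic_score(genotypes))  # [3, 2, 6, 0]
```

With three SNPs the score ranges from 0 to 6; restricting the score to SNPs in low pairwise LD, as the abstract describes, avoids counting nearly the same signal twice.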


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Eric B. Nonnecke ◽  
Patricia A. Castillo ◽  
Amanda E. Dugan ◽  
Faisal Almalki ◽  
Mark A. Underwood ◽  
...  

Abstract Intelectins are ancient carbohydrate-binding proteins, spanning chordate evolution and implicated in multiple human diseases. Previous GWAS have linked SNPs in ITLN1 (also known as omentin) with susceptibility to Crohn's disease (CD); however, analysis of the possible functional significance of SNPs at this locus is lacking. Using the Ensembl database, pairwise linkage disequilibrium (LD) analyses indicated that several disease-associated SNPs at the ITLN1 locus, including SNPs in CD244 and Ly9, were in LD. The alleles comprising the risk haplotype are the major alleles in European (67%), but minor alleles in African superpopulations. Neither ITLN1 mRNA nor protein abundance in intestinal tissue, which we confirm as goblet-cell derived, was altered in the CD samples, either overall or when samples were analyzed according to genotype. Moreover, the missense variant V109D does not influence ITLN1 binding to the glycan β-D-galactofuranose or protein–protein oligomerization. Taken together, our data are an important step in defining the role(s) of the CD-risk haplotype by determining that risk is unlikely to be due to changes in ITLN1 carbohydrate recognition, protein oligomerization, or expression levels in intestinal mucosa. Our findings suggest that the relationship between the genomic data and disease arises from changes in CD244 or Ly9 biology, differences in ITLN1 expression in other tissues, or an alteration in ITLN1 interaction with other proteins.


2020 ◽  
Author(s):  
David Gerard

Abstract Many tasks in statistical genetics involve pairwise estimation of linkage disequilibrium (LD). The study of LD in diploids is mature. However, in polyploids, the field lacks a comprehensive characterization of LD. Polyploids also exhibit greater levels of genotype uncertainty than diploids, and yet no methods currently exist to estimate LD in polyploids in the presence of such genotype uncertainty. Furthermore, most LD estimation methods do not quantify the level of uncertainty in their LD estimates. Our paper contains three major contributions. (i) We characterize haplotypic and composite measures of LD in polyploids. These composite measures of LD turn out to be functions of common statistical measures of association. (ii) We derive procedures to estimate haplotypic and composite LD in polyploids in the presence of genotype uncertainty. We do this by estimating LD directly from genotype likelihoods, which may be obtained from many genotyping platforms. (iii) We derive standard errors of all LD estimators that we discuss. We validate our methods on both real and simulated data. Our methods are implemented in the R package ldsep, available on the Comprehensive R Archive Network at https://cran.r-project.org/package=ldsep.
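
To make the remark about composite LD and "common statistical measures of association" concrete, here is a minimal Python sketch that treats the composite correlation between two loci as the Pearson correlation of allele dosages. This is an assumption made for illustration: the function name `composite_r` and the tetraploid dosages are hypothetical, the sketch uses known dosages and so ignores genotype uncertainty, and it does not reproduce the ldsep estimators.

```python
from math import sqrt


def composite_r(dosage_a, dosage_b):
    """Pearson correlation between allele dosages at two loci.

    Each list holds one dosage per individual (0..K for ploidy K).
    """
    n = len(dosage_a)
    if n != len(dosage_b) or n < 2:
        raise ValueError("need two equally long dosage vectors with n >= 2")
    mean_a = sum(dosage_a) / n
    mean_b = sum(dosage_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosage_a, dosage_b)) / n
    var_a = sum((a - mean_a) ** 2 for a in dosage_a) / n
    var_b = sum((b - mean_b) ** 2 for b in dosage_b) / n
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a monomorphic locus")
    return cov / sqrt(var_a * var_b)


# Hypothetical tetraploid dosages (0..4) for five individuals.
locus_1 = [0, 1, 2, 3, 4]
locus_2 = [4, 3, 2, 1, 0]
print(composite_r(locus_1, locus_1))  # perfectly positive association
print(composite_r(locus_1, locus_2))  # perfectly negative association
```

Squaring the returned value gives a composite r²-style summary. With uncertain genotypes, plugging in called dosages can bias such an estimate, which is the problem the abstract says the likelihood-based procedures address.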


2018 ◽  
Author(s):  
Shuzhen Sun ◽  
Zhuqi Miao ◽  
Blaise Ratcliffe ◽  
Polly Campbell ◽  
Bret Pasch ◽  
...  

Abstract High-throughput sequencing technology has revolutionized both medical and biological research by generating exceedingly large numbers of genetic variants. The resulting datasets share a number of common characteristics that might lead to poor generalization capacity. Concerns include noise accumulated due to the large number of predictors, sparse information regarding the p ≫ n problem, and overfitting and model mis-identification resulting from spurious collinearity. Additionally, complex correlation patterns are present among variables. As a consequence, reliable variable selection techniques play a pivotal role in predictive analysis, generalization capability, and robustness in clustering, as well as interpretability of the derived models. K-dominating set, a parameterized graph-theoretic generalization model, was used to model SNP (single nucleotide polymorphism) data as a similarity network and to search for representative SNP variables. In particular, each SNP was represented as a vertex in the graph, and (dis)similarity measures such as correlation coefficients or pairwise linkage disequilibrium were estimated to describe the relationship between each pair of SNPs; a pair of vertices is adjacent, i.e. joined by an edge, if the pairwise similarity measure exceeds a user-specified threshold. A minimum k-dominating set in the SNP graph was then defined as the smallest subset such that every SNP excluded from the subset has at least k neighbors among the selected ones. The strength of k-dominating set selection in identifying independent variables, and in culling representative variables that are highly correlated with others, was demonstrated on a simulated dataset. The advantages of k-dominating set variable selection were also illustrated in two applications: pedigree reconstruction using SNP profiles of 1,372 Douglas-fir trees, and species delineation for 226 grasshopper mouse samples.
C++ source code that implements SNP-SELECT and uses the Gurobi™ optimization solver for the k-dominating set variable selection is available (https://github.com/transgenomicsosu/SNP-SELECT).
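
The graph construction and selection rule described above can be sketched in Python. The sketch below swaps in a simple greedy heuristic, so it returns a valid k-dominating set but not necessarily a minimum one; it is not the SNP-SELECT implementation, which the abstract says relies on an optimization solver. The function names and the similarity matrix are hypothetical.

```python
def build_snp_graph(similarity, threshold):
    """Join SNP i and SNP j by an edge when their similarity exceeds threshold."""
    n = len(similarity)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] > threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def greedy_k_dominating_set(neighbors, k):
    """Greedy heuristic: every unselected vertex ends with >= k selected neighbors."""
    selected = set()
    # need[v] = how many more selected neighbors v requires (0 once v is selected).
    need = {v: k for v in neighbors}
    while any(need.values()):
        def gain(u):
            # Picking u clears u's own need and helps each still-needy neighbor by one.
            return need[u] + sum(1 for v in neighbors[u] if need[v] > 0)

        candidates = [u for u in neighbors if u not in selected]
        best = max(candidates, key=lambda u: (gain(u), -u))  # ties: lowest index
        selected.add(best)
        need[best] = 0
        for v in neighbors[best]:
            if need[v] > 0:
                need[v] -= 1
    return selected


# Hypothetical pairwise r^2 values: SNPs 0-2 are tightly linked, SNP 3 is independent.
r2 = [
    [1.0, 0.9, 0.9, 0.0],
    [0.9, 1.0, 0.9, 0.1],
    [0.9, 0.9, 1.0, 0.0],
    [0.0, 0.1, 0.0, 1.0],
]
graph = build_snp_graph(r2, threshold=0.8)
print(sorted(greedy_k_dominating_set(graph, k=1)))  # [0, 3]
```

In this toy case one SNP represents the linked block and the isolated SNP must be kept because nothing else can dominate it, which mirrors the two roles the abstract attributes to the method.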


2017 ◽  
Author(s):  
Timothy P. Bilton ◽  
John C. McEwan ◽  
Shannon M. Clarke ◽  
Rudiger Brauning ◽  
Tracey C. van Stijn ◽  
...  

Abstract High-throughput sequencing methods that multiplex a large number of individuals have provided a cost-effective approach for discovering genome-wide genetic variation in large populations. These sequencing methods are increasingly being utilized in population genetic studies across a diverse range of species. One side-effect of these methods, however, is that one or more alleles at a particular locus may not be sequenced, particularly when the sequencing depth is low, resulting in some heterozygous genotypes being called as homozygous. Under-called heterozygous genotypes have a profound effect on the estimation of linkage disequilibrium and, if not taken into account, lead to inaccurate estimates. We developed a new likelihood method, GUS-LD, to estimate pairwise linkage disequilibrium using low coverage sequencing data that accounts for under-called heterozygous genotypes. Our findings show that accurate estimates were obtained using GUS-LD on low coverage sequencing data, whereas underestimation of linkage disequilibrium results if no adjustment is made for under-called heterozygotes.
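
The scale of the under-calling problem at low depth can be shown with a small Python sketch. It assumes a simplified read-sampling model that is not taken from the abstract: each read at a heterozygous site independently samples one of the two alleles with probability 1/2, and sequencing errors are ignored. The function name is hypothetical, and this is not the GUS-LD likelihood itself.

```python
def prob_het_called_homozygous(depth):
    """Probability that all reads at a heterozygous site show the same allele.

    Simplifying assumptions: reads are independent, each allele is sampled
    with probability 1/2, and there are no sequencing errors.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1 read")
    # Either all reads carry allele A or all carry allele B.
    return 2 * 0.5 ** depth


for depth in (1, 2, 5, 10):
    print(depth, prob_het_called_homozygous(depth))
```

Under these assumptions a heterozygote is always miscalled at depth 1 and is still miscalled half the time at depth 2, while at depth 10 the probability falls below 0.2%.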


2017 ◽  
Author(s):  
Max Shpak ◽  
Yang Ni ◽  
Jie Lu ◽  
Peter Müller

Abstract The mean pairwise genetic distance among haplotypes is an estimator of the population mutation rate θ and a standard measure of variation in a population. With the advent of next-generation sequencing (NGS) methods, this and other population parameters can be estimated under different modes of sampling. One approach is to sequence individual genomes with high coverage, and to calculate genetic distance over all sample pairs. The second approach, typically used for microbial samples or for tumor cells, is sequencing a large number of pooled genomes with very low individual coverage. With low coverage, pairwise genetic distances are calculated across independently sampled sites rather than across individual genomes. In this study, we show that the variance in genetic distance estimates is reduced with low coverage sampling if the mean pairwise linkage disequilibrium weighted by allele frequencies is positive. Practically, this means that if on average the most frequent alleles over pairs of loci are in positive linkage disequilibrium, low coverage sequencing results in improved estimates of θ, assuming similar per-site read depths. We show that this result holds under the expected distribution of allele frequencies and linkage disequilibria for an infinite sites model at mutation-drift equilibrium. From simulations, we find that the conditions for reduced variance only fail to hold in cases where variant alleles are few and at very low frequency. These results are applied to haplotype frequencies from a lung cancer tumor to compute the weighted linkage disequilibria and the expected error in estimated genetic distance using high versus low coverage.
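
For the high-coverage sampling mode described above, the mean pairwise genetic distance can be computed directly from sampled haplotypes. The Python sketch below is a minimal illustration with a hypothetical function name and made-up haplotypes coded as strings of 0/1 alleles; it does not implement the low-coverage, site-by-site estimator or the variance comparison from the study.

```python
from itertools import combinations


def mean_pairwise_distance(haplotypes):
    """Average number of differing sites over all pairs of haplotypes."""
    if len(haplotypes) < 2:
        raise ValueError("need at least two haplotypes")
    length = len(haplotypes[0])
    if any(len(h) != length for h in haplotypes):
        raise ValueError("all haplotypes must cover the same sites")
    pairs = list(combinations(haplotypes, 2))
    total = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return total / len(pairs)


# Hypothetical sample: three haplotypes over four biallelic sites.
sample = ["0000", "0011", "0101"]
print(mean_pairwise_distance(sample))  # 2.0
```

Each of the three pairs differs at two sites, so the mean pairwise distance is 2.0.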


Blood ◽  
2016 ◽  
Vol 128 (22) ◽  
pp. 4457-4457
Author(s):  
Leisa Lopes-Aguiar ◽  
Marcia Torresan Delamain ◽  
Angelo Borsarelli Carvalho Brito ◽  
Gustavo Jacob Lourenço ◽  
Ericka Francislaine Dias Costa ◽  
...  

Abstract Introduction: Angiogenesis (AG) abnormalities are crucial in the pathogenesis of multiple myeloma (MM), and give support to treating patients with antiangiogenic agents. However, patients with similar clinicopathological aspects may present distinct outcomes under treatment with AG inhibitors. Single nucleotide polymorphisms (SNPs) in genes involved in blood vessel formation may constitute a plausible explanation for this finding. The wild-type alleles of VEGF c.-2595C>A (rs699947), c.-1154G>A (rs1570360), c.-634G>C (rs2010963), c.*237C>T (rs3025039), VEGFR2 c.-906T>C (rs2071559) and c.889G>A (rs2305948) SNPs, and GSTM1 and GSTT1 genes determine higher production, transcriptional activity, binding efficiency or best-characterized regulator of VEGF. This study aimed to investigate the roles of VEGF c.-2595C>A, c.-1154G>A, c.-634G>C, c.*237C>T, VEGFR2 c.-906T>C, c.889G>A SNPs, and GSTM1 and GSTT1 genes in the outcome of MM patients treated with thalidomide-based regimens. Patients and methods: The study comprised 102 MM patients diagnosed at the Haematology and Haemotherapy Centre of University of Campinas between June 2005 and June 2013. The tumor was diagnosed by standard criteria and staged by the International Staging System. Therapeutic regimens consisted of thalidomide combined with steroids and chemotherapy, followed or not by autologous stem cell transplantation (ASCT). Response was evaluated at the end of treatment using the International Myeloma Working Group guidelines. The follow-up of patients was performed with hematological, biochemical, and immunological exams. The end of the study was February 2016. Genotypes of VEGF, VEGFR2 SNPs, and GSTM1 and GSTT1 genes were analyzed in genomic DNA by polymerase chain reaction based methods. Pairwise linkage disequilibrium (LD) analysis was performed to ensure that markers were appropriate for inclusion in the VEGF and VEGFR2 haplotype estimates.
The chi-square test and logistic regression model were used to identify variables influencing response to treatment. The Kaplan-Meier method, log-rank test and Cox hazards models served to assess the associations between event-free survival (EFS) and overall survival (OS). Results: Nearly half of the patients enrolled in this study were male, and most of them were Caucasian and had tumors at stages II or III. ASCT was performed after chemotherapy in nearly 40% of patients. LD between VEGF and VEGFR2 SNPs was seen in the study, and common haplotypes (frequency >1%) of the genes were included in further analyses. Patients with the wild-type allele of VEGF c.-2595C>A alone or plus the wild-type allele of VEGFR2 c.-906T>C SNPs, and the CGGC haplotype of all respective VEGF SNPs had 3.55, 9.91, and 3.86 more chances of achieving better response to therapy than others. The median follow-up time of the 102 MM patients enrolled in the study was 43 months (range: 1-88). The estimated probabilities of 60-month EFS and OS were 24.5% and 48.1%, respectively. At 60 months of follow-up, patients with VEGFR2 c.889GG, VEGF c.-634GG plus VEGFR2 c.889GG, and VEGFR2 c.889GG plus GSTM1 present genotypes had 2.62, 2.64, and 2.80 more chances of presenting disease relapse or progression, and 2.21, 4.88, and 4.23 more chances of evolving to death in multivariate analysis, respectively. Conclusion: Our data present, for the first time, preliminary evidence that VEGF c.-2595C>A, c.-1154G>A, c.-634G>C, c.*237C>T, VEGFR2 c.-906T>C, c.889G>A SNPs, and GSTM1 gene alter the outcome of MM patients treated with thalidomide-based regimens. If these findings are confirmed in multi-center studies with larger sample sizes, they might hold promise for assisting optimal patient choice for treatment with angiogenic agents. Financial support: São Paulo Research Foundation (FAPESP). Disclosures: No relevant conflicts of interest to declare.


2013 ◽  
Vol 1 (1) ◽  
pp. 25-29 ◽  
Author(s):  
Venkatesh Babu Gurramkonda ◽  
Jyotsna Murthy ◽  
Altaf Hussain Syed ◽  
Bhaskar VKS Lakkakula

Objective: This study aimed to investigate the association between interferon regulatory factor 6 (IRF6) single nucleotide polymorphisms (SNPs) and nonsyndromic cleft lip with or without cleft palate (NSCLP) in the South Indian population. Subject and Methods: For this study, 190 unrelated NSCLP patients and 189 controls without clefts were genotyped for the rs2235371 (V274I) and rs642961 SNPs using PCR-RFLP. The associations between NSCLP groups and IRF6 gene polymorphisms, as well as haplotypes, were analyzed using the chi-squared test, and 95% confidence intervals (95% CI) of the odds ratios were calculated with the control group as reference. Results: For controls, the minor allele frequencies of the two variants, V274I and rs642961, were 7.1% and 21.1%, respectively. Genotype data for both variants in the control and cleft groups follow Hardy-Weinberg equilibrium. Between cases with NSCLP and controls, the two SNPs showed no differences in frequencies of the genotypes or alleles. The pairwise linkage disequilibrium (LD) values (D′=1 and r²=0.027) between V274I and rs642961 revealed that these two SNPs are not in strong LD. Haplotype G-T showed a significantly reduced risk for oral clefts (p<0.001) and haplotype A-T increased the risk for oral clefts (p=0.043). Gene-gene interaction analysis showed that the higher risk group contains more GG-CC combinations among cases than among controls, but this model was not significantly associated with cleft status (p=0.136). Conclusion: While IRF6 is strongly associated in other populations, this study demonstrated that variants in IRF6 may play a role in NSCLP in a South Indian population, but other genes are expected to play a role in this population as well.
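
The combination of D′ = 1 with a very small r² reported above is expected when two loci have unequal allele frequencies and one haplotype is absent. The Python sketch below computes D, |D′| and r² from haplotype and allele frequencies. The function name is hypothetical, and the worked example rests on an assumption that is not stated in the abstract: it takes the control minor allele frequencies (7.1% and 21.1%) and supposes the two minor alleles never occur on the same haplotype. It illustrates the arithmetic and is not a reproduction of the study's estimate.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Assumption for illustration: the two minor alleles are never on the same haplotype.
d, d_prime, r2 = ld_measures(p_ab=0.0, p_a=0.071, p_b=0.211)
print(d_prime)       # 1.0, because one of the four haplotypes is absent
print(round(r2, 3))  # about 0.02
```

Under that assumption |D′| equals 1 while r² is only about 0.02, the same order of magnitude as the reported 0.027. D′ reaches 1 whenever one haplotype is missing, whereas r² stays small unless the allele frequencies at the two loci are also similar.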


PLoS ONE ◽  
2012 ◽  
Vol 7 (11) ◽  
pp. e47491 ◽  
Author(s):  
Cynthia Vierra-Green ◽  
David Roe ◽  
Lihua Hou ◽  
Carolyn Katovich Hurley ◽  
Raja Rajalingam ◽  
...  
