SynTracker: a synteny based tool for tracking microbial strains

2021 ◽  
Author(s):  
Hagay Enav ◽  
Ruth E. Ley

In the human gut microbiome, specific strains emerge due to within-host evolution and can occasionally be transferred to or from other hosts. Phenotypic variance among such strains can have implications for strain transmission and interaction with the host. Surveilling strains of the same species, within and between individuals, can further our knowledge about the way in which microbial diversity is generated and maintained in host populations. Existing methods to estimate the biological relatedness of similar strains usually rely on either detection of single nucleotide polymorphisms (SNPs), which may include sequencing errors, or on the analysis of pangenomes, which can be limited by the requirement for extensive gene databases. To complement existing methods, we developed SynTracker. This strain-comparison tool is based on synteny comparisons between strains, that is, the comparison of the arrangement of sequence blocks in two homologous genomic regions in pairs of metagenomic assemblies or genomes. Our method is executed in a species-specific manner, has a low sensitivity to SNPs, does not require a pre-existing database, and can correctly resolve strains using complete or draft genomes and metagenomic samples using <5% of the genome length. When applied to metagenomic datasets, we detected person-specific strains with an average sensitivity of 97% and specificity of 99%, and strain-sharing events in mother-infant pairs. SynTracker can be used to study the population structure of specific microbial species between and within environments, to identify evolutionary trajectories in longitudinal datasets, and to further understanding of strain sharing networks.

Author(s):  
Gloria Pérez-Rubio ◽  
Luis Alberto López-Flores ◽  
Ana Paula Cupertino ◽  
Francisco Cartujano-Barrera ◽  
Luz Myriam Reynales-Shigematsu ◽  
...  

Previous studies have identified variants in genes encoding proteins associated with the degree of addiction, smoking onset, and cessation. We aimed to describe thirty-one single nucleotide polymorphisms (SNPs) in seven candidate genomic regions spanning six genes associated with tobacco smoking in a cross-sectional study from two different interventions for quitting smoking: (1) thirty-eight smokers were recruited via multimedia to participate in the e-Decídete! program (e-Dec) and (2) ninety-four attended an institutional smoking cessation program on-site. SNP genotyping was done by real-time PCR using TaqMan probes. The analysis of alleles and genotypes was carried out using EpiInfo v7. On-site subjects had more years of smoking and a higher tobacco index than e-Dec smokers (p < 0.05, both). In CYP2A6 we found differences in rs28399433 (p < 0.01): the e-Dec group had a higher frequency of the TT genotype (0.78 vs. 0.35), and the TG genotype frequency was higher in the on-site group (0.63 vs. 0.18), same as the GG genotype (0.03 vs. 0.02). Moreover, three SNPs in NRXN1, two in CHRNA3, and two in CHRNA5 had differences in genotype frequencies (p < 0.01). Cigarettes per day were different (p < 0.05) in the metabolizer classification by CYP2A6 alleles. In conclusion, subjects attending a mobile smoking cessation intervention smoked fewer cigarettes per day, for fewer years, and had fewer cumulative pack-years. There were differences in the genotype frequencies of SNPs in genes related to nicotine metabolism and nicotine dependence. Slow metabolizers smoked more cigarettes per day than intermediate and normal metabolizers.


2006 ◽  
Vol 04 (03) ◽  
pp. 639-647 ◽  
Author(s):  
ELEAZAR ESKIN ◽  
RODED SHARAN ◽  
ERAN HALPERIN

The common approaches for haplotype inference from genotype data are targeted toward phasing short genomic regions. Longer regions are often tackled in a heuristic manner, due to the high computational cost. Here, we describe a novel approach for phasing genotypes over long regions, which is based on combining information from local predictions on short, overlapping regions. The phasing is done in a way that maximizes a natural maximum likelihood criterion. Among other things, this criterion takes into account the physical distance between neighboring single nucleotide polymorphisms. The approach is very efficient, has been applied to several large-scale datasets, and is shown to be successful in two recent benchmarking studies (Zaitlen et al., in press; Marchini et al., in preparation). Our method is publicly available via a webserver at .


Animals ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 170 ◽  
Author(s):  
Zengkui Lu ◽  
Yaojing Yue ◽  
Chao Yuan ◽  
Jianbin Liu ◽  
Zhiqiang Chen ◽  
...  

Body weight is an important economic trait for sheep and it is vital for their successful production and breeding. Therefore, identifying the genomic regions and biological pathways that contribute to understanding variability in body weight traits is significant for selection purposes. In this study, the genome-wide associations of birth, weaning, yearling, and adult weights of 460 fine-wool sheep were determined using resequencing technology. The results showed that 113 single nucleotide polymorphisms (SNPs) reached the genome-wide significance levels for the four body weight traits and 30 genes were annotated effectively, including AADACL3, VGF, NPC1, and SERPINA12. The genes annotated by these SNPs significantly enriched 78 gene ontology terms and 25 signaling pathways, and were found to mainly participate in skeletal muscle development and lipid metabolism. These genes can be used as candidate genes for body weight in sheep, and provide useful information for the production and genomic selection of Chinese fine-wool sheep.


2009 ◽  
Vol 49 (7) ◽  
pp. 558 ◽  
Author(s):  
William Barendse ◽  
Rowan J. Bunch ◽  
Blair E. Harrison

An important step in the localisation of quantitative trait loci is the confirmation of trait-marker associations in independent studies. In this report, we test three single nucleotide polymorphisms (SNP) of two genes for associations with intramuscular fat (IMF) measurements in cattle. We genotyped SNP of carboxypeptidase E (CPE) and CCAAT/enhancer binding protein, α (CEBPA) in a sample of a total of 813 cattle of taurine, composite and indicine breeds. All three polymorphisms showed significant differences between breeds, with the widest range found in CEBPA:g.271A > C where the A allele frequency ranged from P = 0.07 in Brahman to 0.88 in Shorthorn. The taurine breeds showed high linkage disequilibrium between the pair of CPE SNP, with all four breeds showing r² = 1.0. The Brahman and Santa Gertrudis showed r² ≤ 0.17. Both CPE:g.445C > T and CPE:g.601C > T SNP showed significant allele substitution effects on IMF in animals of taurine ancestry, with an allele substitution effect of α = 0.22, P = 0.020 for CPE:g.445C > T, explaining 0.4% of the phenotypic variance.
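As an illustration of the r² statistic quoted above, the following is a minimal sketch of how linkage disequilibrium between two biallelic SNPs can be computed. It assumes phased two-locus haplotypes are available, which the abstract does not state; the function name `ld_r2` and the toy data are hypothetical and do not reproduce the authors' analysis.

```python
from collections import Counter


def ld_r2(haplotypes, allele_a, allele_b):
    """Return r^2 between two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) tuples, one per
    phased chromosome (an assumption made for this sketch).
    allele_a / allele_b: the reference allele tracked at SNP 1 / SNP 2.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    counts = Counter(haplotypes)
    # Marginal allele frequencies at each SNP.
    p_a = sum(c for (a, _), c in counts.items() if a == allele_a) / n
    p_b = sum(c for (_, b), c in counts.items() if b == allele_b) / n
    # Frequency of the haplotype carrying both tracked alleles.
    p_ab = counts[(allele_a, allele_b)] / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Toy example: the two SNPs are perfectly correlated.
toy = [("C", "C")] * 5 + [("T", "T")] * 5
print(ld_r2(toy, "C", "C"))
```

With genotype data that are not phased, haplotype frequencies would first have to be estimated, a step this sketch leaves out.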


2021 ◽  
Author(s):  
Jyun-Hong Lin ◽  
Liang-Chi Chen ◽  
Shu-Qi Yu ◽  
Yao-Ting Huang

Long-read phasing has been used for reconstructing diploid genomes, improving variant calling, and resolving microbial strains in metagenomics. However, the phasing blocks of existing methods are broken by large Structural Variations (SVs), and the efficiency is unsatisfactory for population-scale phasing. This paper presents an ultra-fast algorithm, LongPhase, which can simultaneously phase single nucleotide polymorphisms (SNPs) and SVs of a human genome in ∼10-20 minutes, 10x faster than the state-of-the-art WhatsHap and Margin. In particular, LongPhase produces much larger phased blocks at almost chromosome level with only long reads (N50=26Mbp). We demonstrate that LongPhase combined with Nanopore is a cost-effective approach for providing chromosome-scale phasing without the need for additional trios, chromosome-conformation, and single-cell strand-seq data.
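The N50 figure quoted above summarises the distribution of phased block lengths. A minimal sketch of the usual N50 definition follows: the length L such that blocks of length at least L cover at least half of the total phased length. The function name `n50` and the toy lengths are hypothetical, and this is not LongPhase's own code.

```python
def n50(block_lengths):
    """Return the N50 of a collection of block lengths.

    Blocks are sorted from longest to shortest and accumulated until
    they cover at least half of the total length; the length of the
    block that crosses that threshold is the N50.
    """
    lengths = sorted(block_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one block length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy example with five phased blocks (arbitrary units).
print(n50([2, 3, 4, 5, 6]))
```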


2021 ◽  
Author(s):  
Yu-Ming Hsu ◽  
Matthieu Falque ◽  
Olivier Martin

In essentially all species where meiotic crossovers have been studied, they occur preferentially in open chromatin, typically near gene promoters and to a lesser extent at the end of genes. Here, in the case of Arabidopsis thaliana, we unveil further trends arising when one considers contextual information, namely summarized epigenetic status, size of underlying genomic regions and degree of divergence between homologs. For instance, we find that the intergenic recombination rate is reduced if those regions are less than 1.5 kb in size. Furthermore, we propose that the presence of single nucleotide polymorphisms is a factor driving an enhanced crossover rate compared to when homologous sequences are identical, in agreement with previous works comparing rates in homozygous and heterozygous blocks. Lastly, by integrating these different factors, we produce a quantitative and predictive model of the recombination landscape that reproduces much of the experimental variation.


Author(s):  
Eric Robert Page ◽  
Robert E. Nurse ◽  
Sydney Meloche ◽  
Kerry Bosveld ◽  
Christopher Grainger ◽  
...  

Palmer amaranth is one of the most economically important and widespread weeds of arable land in the United States. Although no populations are currently known to exist in Canada, its distribution has expanded northward such that it is present in many of the states bordering Canada, and multiple pathways exist for its introduction. In this short communication we report on the transport of viable Palmer amaranth seed on imported sweet potato slips. A reproductive pair of Palmer amaranth seedlings was identified from soil accompanying imported sweet potato slips in 2018. Identification was confirmed using species-specific single nucleotide polymorphisms.


2021 ◽  
Vol 12 ◽  
Author(s):  
Marie-Christine Bartens ◽  
Amanda J. Gibson ◽  
Graham J. Etherington ◽  
Federica Di Palma ◽  
Angela Holder ◽  
...  

Recent evidence suggests that several cattle breeds may be more resistant to infection with the zoonotic pathogen Mycobacterium bovis. The data presented here suggest that the response to mycobacterial antigens varies in macrophages generated from Brown Swiss (BS) and Holstein Friesian (HF) cattle, two breeds belonging to the Bos taurus family. Whole genome sequencing of the Brown Swiss genome identified several potential candidate genes, in particular Toll-like Receptor-2 (TLR2), a pattern recognition receptor (PRR) that has previously been described to be involved in mycobacterial recognition. Further investigation revealed single nucleotide polymorphisms (SNPs) in TLR2 that differed between DNA isolated from cells of BS and HF cows. Interestingly, one specific SNP, H326Q, showed a different genotype frequency in two cattle subspecies, Bos (B.) taurus and Bos indicus. Cloning of the TLR2 gene and subsequent gene-reporter and chemokine assays revealed that this SNP, present in BS and Bos indicus breeds, resulted in a significantly higher response to mycobacterial antigens as well as tri-acylated lipopeptide ligands in general. When wild-type and H326Q-containing TLR2 responses were compared, wild-type bovine TLR2 showed clearly diminished responses to mycobacterial antigens compared to human TLR2; however, responses of bovine TLR2 containing H326Q were partially recovered compared to human TLR2. The creation of human:bovine TLR2 chimeras increased the response to mycobacterial antigens compared to the full-length bovine TLR2, but significantly reduced the response compared to the full-length human TLR2. Thus, our data not only present evidence that TLR2 is a major PRR in the mammalian species-specific response to mycobacterial antigens, but also show that there are clear differences between the responses seen in different cattle breeds, which may contribute to their enhanced or reduced susceptibility to mycobacterial infection.


2017 ◽  
Author(s):  
Débora Y. C. Brandt ◽  
Jônatas César ◽  
Jérôme Goudet ◽  
Diogo Meyer

Balancing selection is defined as a class of selective regimes that maintain polymorphism above what is expected under neutrality. Theory predicts that balancing selection reduces population differentiation, as measured by FST. However, balancing selection regimes in which different sets of alleles are maintained in different populations could increase population differentiation. To tackle this issue, we investigated population differentiation at the HLA genes, which constitute the most striking example of balancing selection in humans. We found that population differentiation of single nucleotide polymorphisms (SNPs) at the HLA genes is on average lower than that of SNPs in other genomic regions. However, this result depends on accounting for the differences in allele frequency between selected and putatively neutral sites. Our finding of reduced differentiation at SNPs within HLA genes suggests a predominant role of shared selective pressures among populations at a global scale. However, in pairs of closely related populations, where genome-wide differentiation is low, differentiation at HLA is higher than in other genomic regions. This pattern was reproduced in simulations of overdominant selection. We conclude that population differentiation at the HLA genes is generally lower than genome-wide, but it may be higher for recently diverged population pairs, and that this pattern can be explained by a simple overdominance regime.
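To make the FST measure mentioned above concrete, here is a minimal sketch of a textbook heterozygosity-based FST for one biallelic SNP in two populations. It assumes equal population weights and known allele frequencies with no sample-size correction; it is not necessarily the estimator used in the study, and the function name `fst_two_pops` is hypothetical.

```python
def fst_two_pops(p1, p2):
    """Return FST = (H_T - H_S) / H_T for one biallelic SNP.

    p1, p2: frequency of the same allele in population 1 and 2.
    Assumes equal weighting of the two populations (a simplification
    made for this sketch).
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity of the pooled population.
    h_t = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_t == 0:
        return 0.0  # SNP is monomorphic overall; no differentiation
    return (h_t - h_s) / h_t


# Identical frequencies give no differentiation; fixed differences give 1.
print(fst_two_pops(0.5, 0.5), fst_two_pops(0.0, 1.0))
```

Because this form of FST depends on the overall allele frequency, comparisons between sets of SNPs are sensitive to their frequency spectra, which is consistent with the abstract's note about accounting for allele frequency differences.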

