Mining underutilized whole-genome sequencing projects to improve 16S rRNA databases

2021 ◽  
Author(s):  
Ben Nolan ◽  
Florence Abram ◽  
Fiona Brennan ◽  
Ashleigh Holmes ◽  
Vincent O’Flaherty ◽  
...  

Abstract Current approaches to interpreting 16S rDNA amplicon data are hampered by several factors. Among these are database inaccuracy or incompleteness, sequencing error, and biased DNA/RNA extraction. Existing 16S rRNA databases source the majority of sequences from deposited amplicon sequences, draft genomes, and complete genomes. Most of the draft genomes available are assembled from short reads. However, repeated ribosomal regions are notoriously difficult to assemble well from short reads, and as a consequence the short-read-assembled 16S rDNA region may be an amalgamation of different loci within the genome. This complicates high-resolution community analysis, as a draft genome’s 16S rDNA sequence may be a chimera of multiple loci; in such cases, the draft-derived sequences in a database may not represent a 16S rRNA sequence as it occurs in biology. We present Focus16, a pipeline for improving 16S rRNA databases by mining NCBI’s Sequence Read Archive for whole-genome sequencing runs that could be reassembled to yield additional 16S rRNA sequences. Using riboSeed (a genome assembly tool for correcting rDNA misassembly), Focus16 provides a way to augment 16S rRNA databases with high-quality re-assembled sequences. In this study, we augmented the widely used SILVA 16S rRNA database with the novel sequences disclosed by Focus16 and re-processed amplicon sequences from several benchmarking datasets with DADA2. Using this augmented SILVA database increased the number of amplicon sequence variants that could be assigned taxonomic annotations. Further, fine-scale classification was improved by revealing ambiguities. We observed, for example, that amplicon sequence variants (ASVs) may be assigned to a specific genus where Focus16-correction would indicate that the ASV is represented in two or more genera. Thus, we demonstrate that improvements can be made to taxonomic classification by incorporating these carefully re-assembled 16S rRNA sequences, and we invite the community to expand our work to augment existing 16S rRNA reference databases such as SILVA, GreenGenes, and RDP.
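
The ambiguity effect described in this abstract can be illustrated with a toy exact-match lookup. This is only a minimal sketch, not the DADA2 or Focus16 implementation: the sequences, the genus names, and the `build_index` and `assign_genera` helpers are hypothetical, and real classifiers do not rely on exact string lookups of whole sequences.

```python
from collections import defaultdict

def build_index(references):
    """Map each reference sequence to the set of genera that carry it."""
    index = defaultdict(set)
    for genus, sequence in references:
        index[sequence].add(genus)
    return index

def assign_genera(asv, index):
    """Return the sorted genera whose reference sequence matches the ASV exactly."""
    return sorted(index.get(asv, set()))

# Hypothetical toy data: short strings stand in for 16S rRNA amplicon regions.
original_db = [("GenusA", "ACGTACGTAA"), ("GenusB", "ACGTTCGTAA")]
# Hypothetical re-assembled sequence showing GenusB also carries the GenusA variant.
reassembled = [("GenusB", "ACGTACGTAA")]

asv = "ACGTACGTAA"
before = assign_genera(asv, build_index(original_db))
after = assign_genera(asv, build_index(original_db + reassembled))
print(before)  # genera matched using the original database only
print(after)   # genera matched once re-assembled sequences are added
```

With the original toy database the ASV maps to a single genus; after augmentation the same ASV maps to two genera, which is the kind of ambiguity the abstract reports.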

2020 ◽  
Vol 21 (11) ◽  
Author(s):  
Kenny Lischer ◽  
ANANDA BAGUS RICHKY DIGDAYA PUTRA ◽  
Brian Wirawan Guslianto ◽  
Forbes Avila ◽  
Sarah Grace Sitorus ◽  
...  

Abstract. Lischer K, Putra ABRD, Guslianto BW, Avilla F, Sitorus SG, Nugraha Y, Sarmoko. 2020. Short Communication: The emergence and rise of indigenous thermophilic bacteria exploration from hot springs in Indonesia. Biodiversitas 21: 5474-5481. Indonesia is an archipelagic country located on the Pacific Ring of Fire, which is thought to account for the numerous hot springs spread across the country. Microbes living in these locations have been explored since 1985. These microbes can survive at high temperatures (more than 40 °C, up to 90 °C) and are therefore termed thermophiles. Extensive explorations have been conducted on Java and in other, previously unexplored areas from Sumatra to Papua on New Guinea. In total, 71 hot springs characterized by the presence of thermophilic bacteria have been explored in Indonesia. These investigations have used various approaches, including conventional microbiological, 16S rRNA, and whole-genome sequencing methods. In addition to species exploration, the application of thermophiles has been a topic of interest since 1999, especially with respect to thermostable enzymes that maintain activity under high-temperature conditions. The most commonly isolated enzymes are amylase, protease, lipase, xylanase, esterase, and cellulase, which indicates significant potential for extraction. Hence, further research is needed for both exploration and application purposes.


2019 ◽  
Vol 9 (10) ◽  
pp. 3213-3223 ◽  
Author(s):  
Giovanna Cáceres ◽  
María E. López ◽  
María I. Cádiz ◽  
Grazyella M. Yoshida ◽  
Ana Jedlicki ◽  
...  

Nile tilapia (Oreochromis niloticus) is one of the most cultivated and economically important species in world aquaculture. Intensive production promotes the use of monosex animals, due to an important dimorphism that favors male growth. Currently, the main mechanism to obtain all-male populations is the use of hormones in feeding during larval and fry phases. Identifying genomic regions associated with sex determination in Nile tilapia is a research topic of great interest. The objective of this study was to identify genomic variants associated with sex determination in three commercial populations of Nile tilapia. Whole-genome sequencing of 326 individuals was performed, and a total of 2.4 million high-quality bi-allelic single nucleotide polymorphisms (SNPs) were identified after quality control. A genome-wide association study (GWAS) was conducted to identify markers associated with the binary sex trait (males = 1; females = 0). A mixed logistic regression GWAS model was fitted, and a genome-wide significant signal comprising 36 SNPs, spanning a genomic region of 536 kb on chromosome 23, was identified. Ten of these 36 genetic variants intersect the anti-Müllerian hormone (Amh) gene. Other significant SNPs were located in the neighboring Amh gene region. This gene has been strongly associated with sex determination in several vertebrate species, playing an essential role in the differentiation of male and female reproductive tissue in early stages of development. This finding provides useful information to better understand the genetic mechanisms underlying sex determination in Nile tilapia.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Gerson A. Oliveira Júnior ◽  
Daniel J. A. Santos ◽  
Aline S. M. Cesar ◽  
Solomon A. Boison ◽  
Ricardo V. Ventura ◽  
...  

Abstract Background Impaired fertility in cattle limits the efficiency of livestock production systems. Unraveling the genetic architecture of fertility traits would facilitate their improvement by selection. In this study, we characterized SNP chip haplotypes at QTL blocks and then used whole-genome sequencing to fine map genomic regions associated with reproduction in a population of Nellore (Bos indicus) heifers. Methods The dataset comprised 1337 heifers genotyped using a GeneSeek® Genomic Profiler panel (74677 SNPs), representing the daughters from 78 sires. After performing marker quality control, 64800 SNPs were retained. Haplotypes carried by each sire at six previously identified QTL on BTAs 5, 14 and 18 for heifer pregnancy and BTAs 8, 11 and 22 for antral follicle count were constructed using findhap software. The significance of the contrasts between the effects of every two paternally-inherited haplotype alleles was used to identify sires that were heterozygous at each QTL. Whole-genome sequencing data localized to the haplotypes from six sires and 20 other ancestors were used to identify sequence variants that were concordant with the haplotype contrasts. Enrichment analyses were applied to these variants using KEGG and MeSH libraries. Results A total of six (BTA 5), six (BTA 14) and five (BTA 18) sires were heterozygous for heifer pregnancy QTL, whereas six (BTA 8), fourteen (BTA 11), and five (BTA 22) sires were heterozygous for antral follicle count QTL. Due to inadequate representation of many haplotype alleles in the sequenced animals, fine mapping analysis could only be reliably performed for the QTL on BTA 5 and 14, which had 641 and 3733 concordant candidate sequence variants, respectively. The KEGG “Circadian rhythm” and “Neurotrophin signaling pathway” were significantly associated with the genes in the QTL on BTA 5, whereas 32 MeSH terms were associated with the QTL on BTA 14. Among the concordant sequence variants, 0.2% and 0.3% were classified as missense variants for BTAs 5 and 14, respectively, highlighting the genes MTERF2, RTMB, ENSBTAG00000037306 (miRNA), ENSBTAG00000040351, PRKDC, and RGS20. The potential causal mutations found in the present study were associated with biological processes such as oocyte maturation, embryo development, placenta development and response to reproductive hormones. Conclusions The identification of heterozygous sires by positionally phasing SNP chip data and contrasting haplotype effects for previously detected QTL can be used for fine mapping to identify potential causal mutations and candidate genes. Genomic variants on genes MTERF2, RTBC, miRNA ENSBTAG00000037306, ENSBTAG00000040351, PRKDC, and RGS20, which are known to have influence on reproductive biological processes, were detected.


2020 ◽  
Author(s):  
Alexander Smetanin ◽  
Nikita Moshkov ◽  
Tatiana V. Tatarinova

Abstract Summary We developed PyLAE, a new tool for determining local ancestry along a genome using whole-genome sequencing data or high-density genotyping experiments. PyLAE can process an arbitrarily large number of ancestral populations (with or without an informative prior). Since PyLAE does not involve estimation of many parameters, it can process thousands of genomes within a day. Computational efficiency, straightforward presentation of results, and ease of installation make PyLAE a useful tool to study admixed populations. Availability and implementation The source code and installation manual are available at https://github.com/smetam/pylae.


2021 ◽  
Author(s):  
Severin Einspanier ◽  
Tamara Susanto ◽  
Nicole Metz ◽  
Pieter J. Wolters ◽  
Vivianne G.A.A. Vleeshouwers ◽  
...  

Early blight of potato is caused by the fungal pathogen Alternaria solani and is an increasing problem worldwide. The primary strategy to control the disease is applying fungicides such as succinate dehydrogenase inhibitors (SDHI). SDHI-resistant strains, showing reduced sensitivity to treatments, appeared in Germany in 2013, five years after introduction of SDHIs. Two primary mutations in the Sdh complex (SdhB-H278Y and SdhC-H134R) have been frequently found throughout Europe. How these resistances arose and spread, and whether they are linked to other genomic features, remains unknown. We performed whole-genome sequencing for A. solani isolates from potato fields across Europe (Germany, Sweden, Belgium, and Serbia) to better understand the pathogen's genetic diversity in general and understand the development and spread of the genetic mutations that lead to SDHI resistance. We used ancestry analysis and phylogenetics to determine the genetic background of 48 isolates. The isolates can be grouped into 7 genotypes. These genotypes do not show a geographical pattern but appear spread throughout Europe. The Sdh mutations appear in different genetic backgrounds, suggesting they arose independently, and the observed admixtures might indicate a higher adaptive potential in the fungus than previously thought. Our research gives insights into the genetic diversity of A. solani on a genome level. The mixed occurrence of different genotypes and apparent admixture in the populations indicate higher genomic complexity than anticipated. The conclusion that SDHI tolerance arose multiple times independently has important implications for future fungicide resistance management strategies. These should not solely focus on preventing the spread of isolates between locations but also on limiting population size and the selective pressure posed by fungicides in a given field to avoid the rise of new mutations in other genetic backgrounds.


2021 ◽  
Vol 12 ◽  
Author(s):  
Annika Brinkmann ◽  
Sophie-Luisa Ulm ◽  
Steven Uddin ◽  
Sophie Förster ◽  
Dominique Seifert ◽  
...  

Since the emergence of the Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) in December 2019, the scientific community has been sharing data on epidemiology, diagnostic methods, and whole-genomic sequences almost in real time. The latter have already facilitated phylogenetic analyses, transmission chain tracking, protein modeling, the identification of possible therapeutic targets, timely risk assessment, and identification of novel variants. We have established and evaluated an amplification-based approach for whole-genome sequencing of SARS-CoV-2. It can be used on the miniature-sized and field-deployable sequencing device Oxford Nanopore MinION, with sequencing library preparation time of 10 min. We show that the generation of 50,000 total reads per sample is sufficient for a near complete coverage (>90%) of the SARS-CoV-2 genome directly from patient samples even if virus concentration is low (Ct 35, corresponding to approximately 5 genome copies per reaction). For patient samples with high viral load (Ct 18–24), generation of 50,000 reads in 1–2 h was shown to be sufficient for a genome coverage of >90%. Comparison to Illumina data reveals an accuracy that suffices to identify virus mutants. AmpliCoV can be applied whenever sequence information on SARS-CoV-2 is required rapidly, for instance for the identification of circulating virus mutants.
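
Genome coverage figures such as ">90%" are commonly computed as the fraction of reference positions covered by at least some minimum read depth. The sketch below shows that calculation on a toy depth vector. The `coverage_breadth` helper, the depth threshold, and the numbers are illustrative assumptions, not the authors' pipeline, and the abstract does not state which depth cutoff was used.

```python
def coverage_breadth(depths, min_depth=1):
    """Fraction of reference positions with read depth >= min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)

# Toy per-base depths for a 10-position reference (hypothetical values).
depths = [12, 30, 0, 25, 18, 40, 22, 19, 33, 27]
breadth = coverage_breadth(depths, min_depth=10)
print(f"{breadth:.0%} of positions covered at a depth of 10 or more")
```

In practice the per-base depths would come from aligning reads to the reference genome; the choice of `min_depth` changes the reported breadth, so it should be stated alongside the percentage.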


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Rueben G. Das ◽  
Doreen Becker ◽  
Vidhya Jagannathan ◽  
Orly Goldstein ◽  
Evelyn Santana ◽  
...  

Abstract Congenital stationary night blindness (CSNB), in the complete form, is caused by dysfunctions in ON-bipolar cells (ON-BCs) which are secondary neurons of the retina. We describe the first disease causative variant associated with CSNB in the dog. A genome-wide association study using 12 cases and 11 controls from a research colony determined a 4.6 Mb locus on canine chromosome 32. Subsequent whole-genome sequencing identified a 1 bp deletion in LRIT3 segregating with CSNB. The canine mutant LRIT3 gives rise to a truncated protein with unaltered subcellular expression in vitro. Genetic variants in LRIT3 have been associated with CSNB in patients although there is limited evidence regarding its apparently critical function in the mGluR6 pathway in ON-BCs. We determine that in the canine CSNB retina, the mutant LRIT3 is correctly localized to the region correlating with the ON-BC dendritic tips, albeit with reduced immunolabelling. The LRIT3-CSNB canine model has direct translational potential enabling studies to help understand the CSNB pathogenesis as well as to develop new therapies targeting the secondary neurons of the retina.


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
E. A. Hisey ◽  
H. Hermans ◽  
Z. T. Lounsberry ◽  
F. Avila ◽  
R. A. Grahn ◽  
...  

Abstract Background Distichiasis, an ocular disorder in which aberrant cilia (eyelashes) grow from the opening of the Meibomian glands of the eyelid, has been reported in Friesian horses. These misplaced cilia can cause discomfort, chronic keratitis, and corneal ulceration, potentially impacting vision due to corneal fibrosis, or, if secondary infection occurs, may lead to loss of the eye. Friesian horses represent the vast majority of reported cases of equine distichiasis, and as the breed is known to be affected with inherited monogenic disorders, this condition was hypothesized to be a simply inherited Mendelian trait. Results A genome-wide association study (GWAS) was performed using the Axiom 670 k Equine Genotyping array (MNEc670k) utilizing 14 cases and 38 controls phenotyped for distichiasis. An additive single locus mixed linear model (EMMAX) approach identified a 1.83 Mb locus on ECA5 and a 1.34 Mb locus on ECA13 that reached genome-wide significance (corrected p = 0.016 and 0.032, respectively). Only the locus on ECA13 withstood replication testing (p = 1.6 × 10⁻⁵, cases: n = 5 and controls: n = 37). A 371 kb run of homozygosity (ROH) on ECA13 was found in 13 of the 14 cases, providing evidence for a recessive mode of inheritance. Haplotype analysis (hapQTL) narrowed the region of association on ECA13 to 163 kb. Whole-genome sequencing data from 3 cases and 2 controls identified a 16 kb deletion within the ECA13 associated haplotype (ECA13:g.178714_195130del). Functional annotation data support a tissue-specific regulatory role of this locus. This deletion was associated with distichiasis, as 18 of the 19 cases were homozygous (p = 4.8 × 10⁻¹³). Genotyping the deletion in 955 horses from 54 different breeds identified the deletion in only 11 non-Friesians, all of which were carriers, suggesting that this could be causal for this Friesian disorder. Conclusions This study identified a 16 kb deletion on ECA13 in an intergenic region that was associated with distichiasis in Friesian horses. Further functional analysis in relevant tissues from cases and controls will help to clarify the precise role of this deletion in normal and abnormal eyelash development and investigate the hypothesis of incomplete penetrance.


2017 ◽  
Vol 94 (2) ◽  
pp. 151-157 ◽  
Author(s):  
Jason C Kwong ◽  
Eric P F Chow ◽  
Kerrie Stevens ◽  
Timothy P Stinear ◽  
Torsten Seemann ◽  
...  

Objectives Drug-resistant Neisseria gonorrhoeae are now a global public health threat. Direct transmission of antibiotic-resistant gonococci between individuals has been proposed as a driver for the increased transmission of resistance, but direct evidence of such transmission is limited. Whole-genome sequencing (WGS) has superior resolution to investigate outbreaks and disease transmission compared with traditional molecular typing methods such as multilocus sequence typing (MLST) and N. gonorrhoeae multiantigen sequence (NG-MAST). We therefore aimed to systematically investigate the transmission of N. gonorrhoeae between men in sexual partnerships using WGS to compare isolates and their resistance to antibiotics at a genome level. Methods 458 couples from a large prospective cohort of men who have sex with men (MSM) tested for gonorrhoea together between 2005 and 2014 were included, and WGS was conducted on all isolates from couples where both men were culture-positive for N. gonorrhoeae. Resistance-determining sequences were identified from genome assemblies, and comparison of isolates between and within individuals was performed by pairwise single nucleotide polymorphism and pangenome comparisons, and in silico predictions of NG-MAST and MLST. Results For 33 of 34 (97%; 95% CI 85% to 100%) couples where both partners were positive for gonorrhoea, the resistance-determining genes and mutations were identical in isolates from each partner (94 isolates in total). Resistance determinants in isolates from 23 of 23 (100%; 95% CI 86% to 100%) men with multisite infections were also identical within an individual. These partner and within-host isolates were indistinguishable by NG-MAST, MLST and whole genomic comparisons. Conclusions These data support the transmission of antibiotic-resistant strains between sexual partners as a key driver of resistance rates in gonorrhoea among MSM. This improved understanding of the transmission dynamics of N. gonorrhoeae between sexual partners will inform treatment and prevention guidelines.
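
Proportions such as "33 of 34" are reported above with 95% confidence intervals. The abstract does not say which interval method was used, so the sketch below uses the Wilson score interval purely as an illustration; the `wilson_interval` helper is a name chosen here, and its bounds may differ slightly from the reported ones if the authors used another method (for example an exact binomial interval).

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)

# Counts taken from the abstract: 33 of 34 couples, 23 of 23 men.
for k, n in [(33, 34), (23, 23)]:
    lo, hi = wilson_interval(k, n)
    print(f"{k}/{n}: {k / n:.0%} (95% CI {lo:.1%} to {hi:.1%})")
```

The Wilson interval is used here because it behaves sensibly when the observed proportion is at or near 100%, where the simple normal approximation collapses to a zero-width interval.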


2021 ◽  
Author(s):  
Amaia Carrion-Castillo ◽  
Sara B. Estruch ◽  
Ben Maassen ◽  
Barbara Franke ◽  
Clyde Francks ◽  
...  

Abstract Dyslexia is a common heritable developmental disorder involving impaired reading abilities. Its genetic underpinnings are thought to be complex and heterogeneous, involving common and rare genetic variation. Multigenerational families segregating apparent monogenic forms of language-related disorders can provide useful entry points into biological pathways. In the present study, we performed a genome-wide linkage scan in a three-generational family in which dyslexia affects 14 of its 30 members and seems to be transmitted with an autosomal dominant pattern of inheritance. We identified a locus on chromosome 7q21.11 which cosegregated with dyslexia status, with the exception of two cases of phenocopy (LOD = 2.83). Whole-genome sequencing of key individuals enabled the assessment of coding and noncoding variation in the family. Two rare single-nucleotide variants (rs144517871 and rs143835534) within the first intron of the SEMA3C gene cosegregated with the 7q21.11 risk haplotype. In silico characterization of these two variants predicted effects on gene regulation, which we functionally validated for rs144517871 in human cell lines using luciferase reporter assays. SEMA3C encodes a secreted protein that acts as a guidance cue in several processes, including cortical neuronal migration and cellular polarization. We hypothesize that these intronic variants could have a cis-regulatory effect on SEMA3C expression, making a contribution to dyslexia susceptibility in this family.
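
A LOD score, as quoted above, is the base-10 logarithm of the likelihood ratio comparing linkage at a recombination fraction θ against no linkage (θ = 0.5). The sketch below computes it for the simplest textbook case of phase-known meioses. It is an illustration only: the `lod_score` helper and the example counts are hypothetical, and the abstract does not describe the linkage model or software the authors used for the pedigree.

```python
import math

def lod_score(recombinants, meioses, theta):
    """LOD score for phase-known meioses at recombination fraction theta."""
    if not 0 < theta < 0.5:
        raise ValueError("theta must lie strictly between 0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("need 0 <= recombinants <= meioses")
    non_recombinants = meioses - recombinants
    # log10 likelihood under linkage: theta^r * (1 - theta)^(n - r)
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1 - theta))
    # log10 likelihood under no linkage: 0.5^n
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null

# Hypothetical example: 1 recombinant in 10 informative meioses.
score = lod_score(recombinants=1, meioses=10, theta=0.1)
print(round(score, 2))
```

Working in log space avoids underflow when the number of meioses is large, since the raw likelihoods shrink geometrically.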

