IDENTIFICATION OF GENOMIC REGIONS CARRYING A CAUSAL MUTATION IN UNORDERED GENOMES

2015 ◽  
Author(s):  
Pilar Corredor-Moreno ◽  
Ed Chalstrey ◽  
Carlos A Lugo ◽  
Dan MacLean

Whole genome sequencing using high-throughput sequencing (HTS) technologies offers powerful opportunities to study genetic variation. Mapping the mutations responsible for different phenotypes is generally an involved and time-consuming process, so researchers have developed user-friendly tools for mapping-by-sequencing, yet they are not applicable to organisms with non-sequenced genomes. We introduce SDM (SNP Distribution Method), a reference independent method for rapid discovery of mutagen-induced mutations in typical forward genetic screens. SDM aims to order a disordered collection of HTS reads or contigs such that the fragment carrying the causative mutation can be identified. SDM uses typical distributions of homozygous SNPs that are linked to a phenotype-altering SNP in a non-recombinant region as a model to order the fragments. To implement and test SDM, we created model genomes with an idealised SNP density based on Arabidopsis thaliana chromosome 1 and analysed fragments with size distribution similar to reads or contigs assembled from HTS sequencing experiments. SDM groups the contigs by their normalised SNP density and arranges them to maximise the fit to the expected SNP distribution. We tested the procedure in existing datasets by examining SNP distributions in recent out-cross and back-cross experiments in Arabidopsis thaliana backgrounds. In all the examples we analysed, homozygous SNPs were normally distributed around the causal mutation. We used the real SNP densities obtained from these experiments to prove the efficiency and accuracy of SDM. The algorithm was able to successfully identify small sized (10-100 kb) genomic regions containing the causative mutation.
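The ordering idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' SDM implementation: the tuple layout, the function names, and the "densest contig in the middle, alternate outwards" rule are assumptions chosen only to show how normalising SNP counts by contig length and arranging contigs around a single density peak could work.

```python
from collections import deque


def snp_density(snp_count, length_bp):
    """Homozygous SNPs per kilobase for one contig (length-normalised)."""
    return 1000.0 * snp_count / length_bp


def order_contigs(contigs):
    """Arrange contigs so SNP density rises to one peak and then falls.

    contigs: list of (name, length_bp, homozygous_snp_count) tuples.
    This tuple layout is a hypothetical input format for illustration.
    """
    ranked = sorted(contigs, key=lambda c: snp_density(c[2], c[1]))
    arranged = deque()
    # Walk from the densest contig to the sparsest, placing contigs
    # alternately on the right and left so the densest ends up in the middle.
    for i, contig in enumerate(reversed(ranked)):
        if i % 2 == 0:
            arranged.append(contig)
        else:
            arranged.appendleft(contig)
    return list(arranged)


def candidate_contig(contigs):
    """Return the contig with the highest normalised SNP density."""
    return max(contigs, key=lambda c: snp_density(c[2], c[1]))


# Toy data, invented for this example only.
contigs = [
    ("c1", 10000, 2),
    ("c2", 20000, 40),
    ("c3", 5000, 3),
    ("c4", 10000, 10),
    ("c5", 10000, 0),
]
ordered_names = [name for name, _, _ in order_contigs(contigs)]
peak_name = candidate_contig(contigs)[0]
```

In this toy arrangement the contig at the density peak is the candidate for carrying the causative mutation; a fuller treatment would score the arrangement against the expected distribution rather than rely on a fixed alternation rule.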

2019 ◽  
Vol 47 (21) ◽  
pp. e140-e140
Author(s):  
David Wilson-Sánchez ◽  
Samuel Daniel Lup ◽  
Raquel Sarmiento-Mañús ◽  
María Rosa Ponce ◽  
José Luis Micol

Forward genetic screens have successfully identified many genes and continue to be powerful tools for dissecting biological processes in Arabidopsis and other model species. Next-generation sequencing technologies have revolutionized the time-consuming process of identifying the mutations that cause a phenotype of interest. However, due to the cost of such mapping-by-sequencing experiments, special attention should be paid to experimental design and technical decisions so that the read data allow the desired mutation to be mapped. Here, we simulated different mapping-by-sequencing scenarios. We first evaluated which short-read technology was best suited for analyzing gene-rich genomic regions in Arabidopsis and determined the minimum sequencing depth required to confidently call single nucleotide variants. We also designed ways to discriminate mutagenesis-induced mutations from background single nucleotide polymorphisms in mutants isolated in Arabidopsis non-reference lines. In addition, we simulated bulked segregant mapping populations for identifying point mutations and monitored how the size of the mapping population and the sequencing depth affect mapping precision. Finally, we provide the computational basis of a protocol that we already used to map T-DNA insertions with paired-end Illumina-like reads, using very low sequencing depths and pooling several mutants together; this approach can also be used with single-end reads as well as to map any other insertional mutagen. All these simulations proved useful for designing experiments that allowed us to map several mutations in Arabidopsis.


2021 ◽  
Vol 22 (S2) ◽  
Author(s):  
Daniele D’Agostino ◽  
Pietro Liò ◽  
Marco Aldinucci ◽  
Ivan Merelli

Background: High-throughput sequencing Chromosome Conformation Capture (Hi-C) allows the study of DNA interactions and 3D chromosome folding at the genome-wide scale. Usually, these data are represented as matrices describing the binary contacts among the different chromosome regions. On the other hand, a graph-based representation can be advantageous to describe the complex topology achieved by the DNA in the nucleus of eukaryotic cells. Methods: Here we discuss the use of a graph database for storing and analysing data obtained from Hi-C experiments. The main issue is the size of the produced data and, working with a graph-based representation, the consequent necessity of adequately managing a large number of edges (contacts) connecting nodes (genes), which represent the sources of information. For this, currently available graph visualisation tools and libraries fall short with Hi-C data. The use of graph databases, instead, supports both the analysis and the visualisation of the spatial pattern present in Hi-C data, in particular for comparing different experiments or for re-mapping omics data in a space-aware context efficiently. In particular, the possibility of describing graphs through statistical indicators and, even more, the capability of correlating them through statistical distributions allows highlighting similarities and differences among different Hi-C experiments, in different cell conditions or different cell types. Results: These concepts have been implemented in NeoHiC, an open-source and user-friendly web application for the progressive visualisation and analysis of Hi-C networks based on the use of the Neo4j graph database (version 3.5). Conclusion: With the accumulation of more experiments, the tool will provide invaluable support to compare neighbours of genes across experiments and conditions, helping in highlighting changes in functional domains and identifying new co-organised genomic compartments.
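The step from a contact matrix to a graph of nodes (genes) and edges (contacts) can be illustrated with a small Python sketch. It does not use Neo4j or any NeoHiC code: the function names, the gene labels, the threshold parameter, and the toy matrix are all assumptions made for illustration, and node degree stands in for the simplest kind of statistical indicator that could be compared across experiments.

```python
def contacts_to_graph(labels, matrix, threshold=1):
    """Turn a symmetric contact matrix into an undirected adjacency map.

    labels: node names (here hypothetical gene identifiers).
    matrix: square list of lists with contact values; an edge is created
            when the value is at least `threshold` (an assumed cut-off).
    """
    graph = {label: set() for label in labels}
    n = len(labels)
    for i in range(n):
        # Only the upper triangle is scanned because contacts are symmetric.
        for j in range(i + 1, n):
            if matrix[i][j] >= threshold:
                graph[labels[i]].add(labels[j])
                graph[labels[j]].add(labels[i])
    return graph


def degree(graph):
    """Number of contact partners per node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


# Toy binary contact matrix, invented for this example only.
labels = ["geneA", "geneB", "geneC"]
matrix = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
graph = contacts_to_graph(labels, matrix)
degrees = degree(graph)
```

Comparing such per-node summaries between two experiments is one simple way to see which genes gain or lose neighbours; a graph database becomes useful once the number of edges is too large to handle comfortably in memory.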


2019 ◽  
Author(s):  
Elizabeth R. Cebul ◽  
Ian G. McLachlan ◽  
Maxwell G. Heiman

Dendrites develop elaborate morphologies in concert with surrounding glia, but the molecules that coordinate dendrite and glial morphogenesis are mostly unknown. C. elegans offers a powerful model for identifying such factors. Previous work in this system examined dendrites and glia that develop within epithelia, similar to mammalian sense organs. Here, we focus on the neurons BAG and URX, which are not part of an epithelium but instead form membranous attachments to a single glial cell at the nose, reminiscent of dendrite-glia contacts in the mammalian brain. We show that these dendrites develop by retrograde extension, in which the nascent dendrite endings anchor to the presumptive nose and then extend by stretch during embryo elongation. Using forward genetic screens, we find that dendrite development requires the adhesion protein SAX-7/L1CAM and the cytoplasmic protein GRDN-1/CCDC88C to anchor dendrite endings at the nose. SAX-7 acts in neurons and glia, while GRDN-1 acts in glia to non-autonomously promote dendrite extension. Thus, this work shows how glial factors can help to shape dendrites, and identifies a novel molecular mechanism for dendrite growth by retrograde extension.


PLoS ONE ◽  
2012 ◽  
Vol 7 (6) ◽  
pp. e39651 ◽  
Author(s):  
Lidia M. Duncan ◽  
Richard T. Timms ◽  
Eszter Zavodszky ◽  
Florencia Cano ◽  
Gordon Dougan ◽  
...  

2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Marius Welzel ◽  
Anja Lange ◽  
Dominik Heider ◽  
Michael Schwarz ◽  
Bernd Freisleben ◽  
...  

Background: Sequencing of marker genes amplified from environmental samples, known as amplicon sequencing, allows us to resolve some of the hidden diversity and elucidate evolutionary relationships and ecological processes among complex microbial communities. The analysis of large numbers of samples at high sequencing depths generated by high throughput sequencing technologies requires efficient, flexible, and reproducible bioinformatics pipelines. Only a few existing workflows can be run in a user-friendly, scalable, and reproducible manner on different computing devices using an efficient workflow management system. Results: We present Natrix, an open-source bioinformatics workflow for preprocessing raw amplicon sequencing data. The workflow contains all analysis steps from quality assessment, read assembly, dereplication, chimera detection, split-sample merging, sequence representative assignment (OTUs or ASVs) to the taxonomic assignment of sequence representatives. The workflow is written using Snakemake, a workflow management engine for developing data analysis workflows. In addition, Conda is used for version control. Thus, Snakemake ensures reproducibility and Conda offers version control of the utilized programs. The encapsulation of rules and their dependencies supports hassle-free sharing of rules between workflows and easy adaptation and extension of existing workflows. Natrix is freely available on GitHub (https://github.com/MW55/Natrix) or as a Docker container on DockerHub (https://hub.docker.com/r/mw55/natrix). Conclusion: Natrix is a user-friendly and highly extensible workflow for processing Illumina amplicon data.

