Gene Content Evolution in the Arthropods

2018 ◽  
Author(s):  
Gregg W.C. Thomas ◽  
Elias Dohmen ◽  
Daniel S.T. Hughes ◽  
Shwetha C. Murali ◽  
Monica Poelchau ◽  
...  

Abstract Background: Arthropods comprise the largest and most diverse phylum on Earth and play vital roles in nearly every ecosystem. Their diversity stems in part from variations on a conserved body plan, resulting from and recorded in adaptive changes in the genome. Dissection of the genomic record of sequence change enables broad questions regarding genome evolution to be addressed, even across hyper-diverse taxa within arthropods. Results: Using 76 whole genome sequences representing 21 orders spanning more than 500 million years of arthropod evolution, we document changes in gene and protein domain content and provide temporal and phylogenetic context for interpreting these innovations. We identify many novel gene families that arose early in the evolution of arthropods and during the diversification of insects into modern orders. We reveal unexpected variation in patterns of DNA methylation across arthropods and examples of gene family and protein domain evolution coincident with the appearance of notable phenotypic and physiological adaptations such as flight, metamorphosis, sociality and chemoperception. Conclusions: These analyses demonstrate how large-scale comparative genomics can provide broad new insights into the genotype to phenotype map and generate testable hypotheses about the evolution of animal diversity.

2017 ◽  
Vol 73 (8) ◽  
pp. 628-640 ◽  
Author(s):  
Su Datt Lam ◽  
Sayoni Das ◽  
Ian Sillitoe ◽  
Christine Orengo

Computational modelling of proteins has been a major catalyst in structural biology. Bioinformatics groups have exploited the repositories of known structures to predict high-quality structural models with high efficiency at low cost. This article provides an overview of comparative modelling, reviews recent developments and describes resources dedicated to large-scale comparative modelling of genome sequences. The value of subclustering protein domain superfamilies to guide the template-selection process is investigated. Some recent cases in which structural modelling has aided experimental work to determine very large macromolecular complexes are also cited.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Habiba S. AlSafar ◽  
Mariam Al-Ali ◽  
Gihan Daw Elbait ◽  
Mustafa H. Al-Maini ◽  
Dymitr Ruta ◽  
...  

Abstract Whole Genome Sequencing (WGS) provides an in-depth description of genome variation. In the era of large-scale population genome projects, the assembly of ethnic-specific genomes combined with mapping human reference genomes of underrepresented populations has improved the understanding of human diversity and disease associations. In this study, for the first time, whole genome sequences of two nationals of the United Arab Emirates (UAE) at >27X coverage are reported. The two Emirati individuals were predominantly of Central/South Asian ancestry. An in-house customized pipeline using BWA and Picard, followed by the GATK tools, was used to map the raw data from the whole genome sequences of both individuals. A total of 3,994,521 variants (3,350,574 Single Nucleotide Polymorphisms (SNPs) and 643,947 indels) were identified for the first individual, the UAE S001 sample. A similar number of variants, 4,031,580 (3,373,501 SNPs and 658,079 indels), were identified for UAE S002. Variants that are associated with diabetes, hypertension, increased cholesterol levels, and obesity were also identified in these individuals. These whole genome sequences have provided a starting point for constructing a UAE reference panel, which will lead to improvements in the delivery of precision medicine, quality of life for affected individuals and a reduction in healthcare costs. The information compiled will likely lead to the identification of target genes that could potentially lead to the development of novel therapeutic modalities.


2016 ◽  
Vol 4 (5) ◽  
Author(s):  
Jennifer Ronholm ◽  
Nicholas Petronella ◽  
Sandeep Tamber

The diversity of the genus Salmonella is reflected in the physiological adaptations used by its members in response to stressors such as high pressure. Here we report the draft whole genome sequences of 11 Salmonella enterica strains, five sensitive strains and six demonstrating high levels of pressure resistance.


2018 ◽  
Vol 6 (3) ◽  
Author(s):  
Sara Lomonaco ◽  
Silvia Gallina ◽  
Virginia Filipello ◽  
Maria Sanchez Leon ◽  
George John Kastanis ◽  
...  

ABSTRACT Listeriosis outbreaks are frequently multistate/multicountry outbreaks, underlining the importance of molecular typing data for several diverse and well-characterized isolates. Large-scale whole-genome sequencing studies on Listeria monocytogenes isolates from non-U.S. locations have been limited. Herein, we describe the draft genome sequences of 510 L. monocytogenes isolates from northern Italy from different sources.


2022 ◽  
Author(s):  
Patrick Hüther ◽  
Jörg Hagmann ◽  
Adam Nunn ◽  
Ioanna Kakoulidou ◽  
Rahul Pisupati ◽  
...  

Whole-genome bisulfite sequencing (WGBS) is the standard method for profiling DNA methylation at single-nucleotide resolution. Many WGBS-based studies aim to identify biologically relevant loci that display differential methylation between genotypes, treatment groups, tissues, or developmental stages. Over the years, different tools have been developed to extract differentially methylated regions (DMRs) from whole-genome data. Often, such tools are built upon assumptions from mammalian data and do not consider the substantially more complex and variable nature of plant DNA methylation. Here, we present MethylScore, a pipeline to analyze WGBS data and to account for plant-specific DNA methylation properties. MethylScore processes data from genomic alignments to DMR output and is designed to be usable by novice and expert users alike. It uses an unsupervised machine learning approach to segment the genome by classification into states of high and low methylation, substantially reducing the number of necessary statistical tests while increasing the signal-to-noise ratio and the statistical power. We show how MethylScore can identify DMRs from hundreds of samples and how its data-driven approach can stratify associated samples without prior information. We identify DMRs in the A. thaliana 1001 Genomes dataset to unveil known and unknown genotype-epigenotype associations. MethylScore is an accessible pipeline for plant WGBS data, with unprecedented features for DMR calling in small- and large-scale datasets; it is built as a Nextflow pipeline and its source code is available at https://github.com/Computomics/MethylScore.


2016 ◽  
Vol 94 (suppl_5) ◽  
pp. 146-146
Author(s):  
D. M. Bickhart ◽  
L. Xu ◽  
J. L. Hutchison ◽  
J. B. Cole ◽  
D. J. Null ◽  
...  
