Construction and Characterization of Two Novel Transcriptome Assemblies in the Congeneric Porcelain Crabs Petrolisthes cinctipes and P. manimaculis

2016 ◽  
Vol 56 (6) ◽  
pp. 1092-1102 ◽  
Author(s):  
Eric J. Armstrong ◽  
Jonathon H. Stillman

Crustaceans have commonly been used as non-model systems in basic biological research, especially in studies of physiological regulation. With the recent and rapid adoption of functional genomic tools, crustaceans are increasingly becoming model systems for ecological investigations of development and evolution and for mechanistic examinations of genotype–phenotype interactions and molecular pathways of response to environmental stressors. Comparative transcriptomic approaches, however, remain constrained by a lack of sequence data in closely related crustacean taxa. We identify challenges in the use of functional genomics tools in comparative analyses among decapod crustaceans in light of recent advances. We present RNA-seq data from two congeneric species of porcelain crabs (Petrolisthes cinctipes and P. manimaculis) used to construct two de novo transcriptome assemblies with ∼194K and ∼278K contigs, respectively. We characterize and contrast these assemblies and compare them to a previously generated EST sequence library for P. cinctipes. We also discuss the potential use of these data as a case-study system in the broader context of crustacean comparative transcriptomics.

2014 ◽  
Author(s):  
Kristi E Kim ◽  
Paul Peluso ◽  
Primo Baybayan ◽  
Patricia Jane Yeadon ◽  
Charles Yu ◽  
...  

Single molecule, real-time (SMRT) sequencing from Pacific Biosciences is increasingly used in many areas of biological research including de novo genome assembly, structural-variant identification, haplotype phasing, mRNA isoform discovery, and base-modification analyses. High-quality, public datasets of SMRT sequences can spur development of analytic tools that can accommodate unique characteristics of SMRT data (long read lengths, lack of GC or amplification bias, and a random error profile leading to high consensus accuracy). In this paper, we describe eight high-coverage SMRT sequence datasets from five organisms (Escherichia coli, Saccharomyces cerevisiae, Neurospora crassa, Arabidopsis thaliana, and Drosophila melanogaster) that have been publicly released to the general scientific community (NCBI Sequence Read Archive ID SRP040522). Data were generated using two sequencing chemistries (P4-C2 and P5-C3) on the PacBio RS II instrument. The datasets reported here can be used without restriction by the research community to generate whole-genome assemblies, test new algorithms, investigate genome structure and evolution, and identify base modifications in some of the most widely studied model systems in biological research.


Author(s):  
Guangtu Gao ◽  
Susana Magadan ◽  
Geoffrey C Waldbieser ◽  
Ramey C Youngblood ◽  
Paul A Wheeler ◽  
...  

There is still a need to improve the contiguity of the rainbow trout reference genome and to use multiple genetic backgrounds that will represent the genetic diversity of this species. The Arlee doubled haploid line originated from a domesticated hatchery strain that was originally collected from the northern California coast. The Canu pipeline was used to generate a de novo genome assembly of the Arlee line from high-coverage PacBio long-read sequence data. The assembly was further improved with Bionano optical maps and Hi-C proximity ligation sequence data to generate 32 major scaffolds corresponding to the karyotype of the Arlee line (2N = 64). The assembly is composed of 938 scaffolds with an N50 of 39.16 Mb and a total length of 2.33 Gb, of which ∼95% was in 32 chromosome sequences with only 438 gaps between contigs and scaffolds. In rainbow trout the haploid chromosome number can vary from 29 to 32. In the Arlee karyotype the haploid chromosome number is 32 because chromosomes Omy04, 14 and 25 are divided into six acrocentric chromosomes. Additional structural variations that were identified in the Arlee genome included the major inversions on chromosomes Omy05 and Omy20 and 15 additional smaller inversions that will require further validation. This is also the first rainbow trout genome assembly that includes a scaffold with the sex-determination gene (sdY) in the chromosome Y sequence. The utility of this genome assembly is demonstrated through the improved annotation of the duplicated genome loci that harbor the IGH genes on chromosomes Omy12 and Omy13.
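The scaffold N50 quoted above is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of that calculation follows; the function name and the toy lengths are illustrative assumptions and are not taken from the Arlee assembly or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Lengths are visited from longest to shortest; the N50 is the length
    at which the running total first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids rounding issues with total / 2.
        if 2 * running >= total:
            return length


# Hypothetical toy scaffold lengths (arbitrary units), for illustration only.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

Because N50 depends only on the length distribution, it says nothing about assembly correctness, which is why reports like the one above pair it with gap counts and total length.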


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Víctor Faundes ◽  
Martin D. Jennings ◽  
Siobhan Crilly ◽  
Sarah Legraie ◽  
Sarah E. Withers ◽  
...  

The structure of proline prevents it from adopting an optimal position for rapid protein synthesis. Poly-proline-tract (PPT) associated ribosomal stalling is resolved by highly conserved eIF5A, the only protein to contain the amino acid hypusine. We show that de novo heterozygous EIF5A variants cause a disorder characterized by variable combinations of developmental delay, microcephaly, micrognathia and dysmorphism. Yeast growth assays, polysome profiling, total/hypusinated eIF5A levels and PPT-reporter studies reveal that the variants impair eIF5A function, reduce eIF5A-ribosome interactions and impair the synthesis of PPT-containing proteins. Supplementation with 1 mM spermidine partially corrects the yeast growth defects, improves the polysome profiles and restores expression of PPT reporters. In zebrafish, knockdown of eif5a partly recapitulates the human phenotype, which can be rescued with 1 µM spermidine supplementation. In summary, we uncover the role of eIF5A in human development and disease, demonstrate the mechanistic complexity of EIF5A-related disorder and raise possibilities for its treatment.


2021 ◽  
Vol 23 (Supplement_6) ◽  
pp. vi206-vi206
Author(s):  
Tomohiro Yamasaki ◽  
Lumin Zhang ◽  
Tyrone Dowdy ◽  
Adrian Lita ◽  
Mark Gilbert ◽  
...  

BACKGROUND Increased de novo lipogenesis is a hallmark of cancer metabolism. In this study, we interrogated the role of de novo lipogenesis in the growth of IDH1-mutated glioma and identified the key enzyme, stearoyl-CoA desaturase 1 (SCD1), that provides this growth advantage. MATERIALS AND METHODS We prepared genetically engineered glioma cell lines (U251 wild-type: U251WT and U251 IDHR132H mutant: U251RH) and normal human astrocytes (empty vector induced-NHA: NHAEV and IDHR132H mutant: NHARH). Lipid metabolic analysis was conducted using LC-MS and Raman imaging microscopy. SCD1 expression was investigated by The Cancer Genome Atlas (TCGA) data analysis and Western blotting. Knock-out of SCD1 was conducted using CRISPR/Cas9 and shRNA. RESULTS Previously, we showed that IDH1 mut glioma cells have increased monounsaturated fatty acids (MUFAs). TCGA data revealed that IDH mut glioma shows significantly higher SCD1 mRNA expression than wild-type glioma. Our model systems of IDH1 mut (U251RH, NHARH) showed increased expression of this enzyme compared with their wild-type counterparts. Moreover, addition of D-2HG to U251WT increased SCD1 expression. Herein, we showed that inhibition of SCD1 with CAY10566 decreased relative cell number and sphere-forming capacity in a dose-dependent manner. Furthermore, addition of MUFAs was able to rescue the SCD1 inhibitor-induced cell death and sphere-forming capacity. Knock-out of SCD1 decreased cell proliferation and sphere-forming ability. Decreasing lipid content in the media did not alter the growth of these cells, suggesting that glioma cells rely on de novo lipid synthesis rather than scavenging lipids from the microenvironment. CONCLUSION Overexpression of the IDH mutant gene altered lipid composition in U251 cells to enrich MUFA levels, and we confirmed that D-2HG caused SCD1 upregulation in U251WT. We demonstrated that glioma cell growth requires SCD1 expression, and the results of the present study may provide novel insights into the role of SCD1 in the growth of IDH mut gliomas.


2006 ◽  
Vol 361 (1475) ◽  
pp. 2045-2053 ◽  
Author(s):  
Daniel Falush ◽  
Mia Torpdahl ◽  
Xavier Didelot ◽  
Donald F Conrad ◽  
Daniel J Wilson ◽  
...  

In bacteria, DNA sequence mismatches act as a barrier to recombination between distantly related organisms and can potentially promote the cohesion of species. We have performed computer simulations which show that the homology dependence of recombination can cause de novo speciation in a neutrally evolving population once a critical population size has been exceeded. Our model can explain the patterns of divergence and genetic exchange observed in the genus Salmonella, without invoking either natural selection or geographical population subdivision. If this model were validated, based on extensive sequence data, it would imply that the named subspecies of Salmonella enterica correspond to good biological species, making species boundaries objective. However, multilocus sequence typing data, analysed using several conventional tools, provide a misleading impression of relationships within S. enterica subspecies enterica and do not provide the resolution to establish whether new species are presently being formed.


2021 ◽  
Author(s):  
Tuomo Hartonen ◽  
Teemu Kivioja ◽  
Jussi Taipale

Deep learning models have in recent years gained success in various tasks related to understanding information coded in the DNA sequence. Rapidly developing genome-wide measurement technologies provide large quantities of data ideally suited for modeling using deep learning or other powerful machine learning approaches. Although they offer state-of-the-art predictive performance, the predictions made by deep learning models can be difficult to understand. In virtually all biological research, understanding how a predictive model works is as important as the raw predictive performance. Thus, interpretation of deep learning models is an emerging hot topic, especially in the context of biological research. Here we describe plotMI, a mutual information based model interpretation strategy that can intuitively visualize positional preferences and pairwise interactions learned by any machine learning model trained on sequence data with a defined alphabet as input. PlotMI is freely available at https://github.com/hartonen/plotMI.
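To illustrate the quantity that a mutual information based interpretation rests on, the following is a minimal sketch of pairwise mutual information between two sequence positions, estimated from a set of equal-length sequences. It is not plotMI's implementation: the function names, the single-character (rather than k-mer) treatment of positions, and the toy sequences are all assumptions made for illustration.

```python
import math
from collections import Counter


def pairwise_mi(seqs, i, j):
    """Mutual information in bits between positions i and j.

    Probabilities are plain empirical frequencies over the given
    equal-length sequences, with no pseudocounts.
    """
    n = len(seqs)
    joint = Counter((s[i], s[j]) for s in seqs)
    marg_i = Counter(s[i] for s in seqs)
    marg_j = Counter(s[j] for s in seqs)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        p_a = marg_i[a] / n
        p_b = marg_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def mi_matrix(seqs):
    """Symmetric matrix of pairwise MI values; the diagonal is set to 0."""
    length = len(seqs[0])
    return [
        [pairwise_mi(seqs, i, j) if i != j else 0.0 for j in range(length)]
        for i in range(length)
    ]


# Hypothetical toy input: positions 0 and 1 always change together.
coupled = ["AC", "AC", "GT", "GT"]
print(pairwise_mi(coupled, 0, 1))
```

In a model-interpretation setting, the input sequences would presumably be ones the trained model scores highly, so that high MI between two positions points to an interaction the model has learned; that selection step is outside this sketch.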


2021 ◽  
Author(s):  
Julia M. Kreiner ◽  
Amalia Caballero ◽  
Stephen I. Wright ◽  
John R. Stinchcombe

The relative role of hybridization, de novo evolution, and standing variation in weed adaptation to agricultural environments is largely unknown. In Amaranthus tuberculatus, a widespread North American agricultural weed, adaptation is likely influenced by recent secondary contact and admixture of two previously isolated subspecies. We characterized the extent of adaptation and phenotypic differentiation accompanying the spread of A. tuberculatus into agricultural environments and the contribution of subspecies divergence. We generated phenotypic and whole-genome sequence data from a manipulative common garden experiment, using paired samples from natural and agricultural populations. We found strong latitudinal, longitudinal, and sex differentiation in phenotypes, and subtle differences among agricultural and natural environments that were further resolved with ancestry-based inference. The transition into agricultural environments has favoured southwestern var. rudis ancestry that leads to higher biomass and environment-specific phenotypes: increased biomass and earlier flowering under reduced water availability, and reduced plasticity in fitness-related traits. We also detected de novo adaptation to agricultural habitats independent of ancestry effects, including marginally higher biomass and later flowering in agricultural populations, and a time to germination home advantage. Therefore, the invasion of A. tuberculatus into agricultural environments has drawn on adaptive variation across multiple timescales—through both preadaptation via the preferential sorting of var. rudis ancestry and de novo local adaptation.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e12129
Author(s):  
Paul E. Oluniyi ◽  
Fehintola Ajogbasile ◽  
Judith Oguzie ◽  
Jessica Uwanibe ◽  
Adeyemi Kayode ◽  
...  

Next generation sequencing (NGS)-based studies have vastly increased our understanding of viral diversity. Viral sequence data obtained from NGS experiments are a rich source of information; these data can be used to study viral epidemiology, evolution and transmission patterns, and can also inform drug and vaccine design. Viral genomes, however, represent a great challenge to bioinformatics due to their high mutation rates and their formation of quasispecies within the same infected host, bringing about the need to implement advanced bioinformatics tools to assemble consensus genomes that are well-representative of the viral population circulating in individual patients. Many tools have been developed to preprocess sequencing reads, carry out de novo or reference-assisted assembly of viral genomes and assess the quality of the genomes obtained. Most of these tools, however, exist as standalone workflows and usually require huge computational resources. Here we present VGEA (Viral Genomes Easily Analyzed), a Snakemake workflow for analyzing RNA viral genomes. VGEA enables users to map sequencing reads to the human genome to remove human contaminants, split bam files into forward and reverse reads, carry out de novo assembly of forward and reverse reads to generate contigs, pre-process reads for quality and contamination, map reads to a reference tailored to the sample using corrected contigs supplemented by the user’s choice of reference sequences and evaluate/compare genome assemblies. We designed a project with the aim of creating a flexible, easy-to-use and all-in-one pipeline from existing/stand-alone bioinformatics tools for viral genome analysis that can be deployed on a personal computer. 
VGEA was built on the Snakemake workflow management system and utilizes existing tools for each step: fastp (Chen et al., 2018) for read trimming and read-level quality control, BWA (Li & Durbin, 2009) for mapping sequencing reads to the human reference genome, SAMtools (Li et al., 2009) for extracting unmapped reads and also for splitting bam files into fastq files, IVA (Hunt et al., 2015) for de novo assembly to generate contigs, shiver (Wymant et al., 2018) to pre-process reads for quality and contamination, then map to a reference tailored to the sample using corrected contigs supplemented with the user’s choice of existing reference sequences, SeqKit (Shen et al., 2016) for cleaning shiver assembly for QUAST, QUAST (Gurevich et al., 2013) to evaluate/assess the quality of genome assemblies and MultiQC (Ewels et al., 2016) for aggregation of the results from fastp, BWA and QUAST. Our pipeline was successfully tested and validated with SARS-CoV-2 (n = 20), HIV-1 (n = 20) and Lassa Virus (n = 20) datasets all of which have been made publicly available. VGEA is freely available on GitHub at: https://github.com/pauloluniyi/VGEA under the GNU General Public License.


2019 ◽  
Author(s):  
Kenta Shirasawa ◽  
Akifumi Azuma ◽  
Fumiya Taniguchi ◽  
Toshiya Yamamoto ◽  
Akihiko Sato ◽  
...  

This study presents the first genome sequence of an interspecific grape hybrid, ‘Shine Muscat’ (Vitis labruscana × V. vinifera), an elite table grape cultivar bred in Japan. The complexity of the genome structure, arising from the interspecific hybridization, necessitated the use of a sophisticated genome assembly pipeline with short-read genome sequence data. The resultant genome assemblies consisted of two types of sequences: a haplotype-phased sequence of the highly heterozygous genomes and an unphased sequence representing a “haploid” genome. The unphased sequences spanned 490.1 Mb in length, 99.4% of the estimated genome size, with 8,696 scaffold sequences with an N50 length of 13.2 Mb. The phased sequences had 15,650 scaffolds spanning 1.0 Gb with N50 of 4.2 Mb. The two sequences comprised 94.7% and 96.3% of the core eukaryotic genes, indicating that the entire genome of ‘Shine Muscat’ was represented. Examination of genome structures revealed possible genome rearrangements between the genomes of ‘Shine Muscat’ and a V. vinifera line. Furthermore, full-length transcriptome sequencing analysis revealed 13,947 gene loci on the ‘Shine Muscat’ genome, from which 26,199 transcript isoforms were transcribed. These genome resources provide new insights that could help cultivation and breeding strategies produce more high-quality table grapes such as ‘Shine Muscat’.


2021 ◽  
Author(s):  
Kathryn Campbell ◽  
Robert J Gifford ◽  
Joshua Singer ◽  
Verity Hill ◽  
Aine O'Toole ◽  
...  

The availability of pathogen sequence data and the use of genomic surveillance are rapidly increasing. Genomic tools and classification systems need updating to reflect this. Here, rabies virus is used as an example to showcase the potential value of updated genomic tools to enhance surveillance to better understand epidemiological dynamics and improve disease control. Previous studies have described the evolutionary history of rabies virus; however, the resulting taxonomy lacks the definition necessary to identify incursions, lineage turnover and transmission routes at high resolution. Here we propose a lineage classification system based on the dynamic nomenclature used for SARS-CoV-2, defining a lineage by phylogenetic methods for tracking virus spread and comparing sequences across geographic areas. We demonstrate this system through application to the globally distributed Cosmopolitan clade of rabies virus, defining 73 total lineages within the clade, beyond the 22 previously reported. We further show how integration of this tool with a new rabies virus sequence data resource (RABV-GLUE) enables rapid application, for example, highlighting lineage dynamics relevant to control and elimination programmes, such as identifying importations and their sources, and areas of persistence and transmission, including transboundary incursions. This system and the tools developed should be useful for coordinating and targeting control programmes and monitoring progress as we work towards eliminating dog-mediated rabies, as well as having potential for broad application to the surveillance of other viruses.

