SNiPloid: A Utility to Exploit High-Throughput SNP Data Derived from RNA-Seq in Allopolyploid Species

2013 ◽  
Vol 2013 ◽  
pp. 1-6 ◽  
Author(s):  
Marine Peralta ◽  
Marie-Christine Combes ◽  
Alberto Cenci ◽  
Philippe Lashermes ◽  
Alexis Dereeper

High-throughput sequencing is a common approach to discover SNP variants, especially in plant species. However, methods to analyze predicted SNPs are often optimized for diploid plant species whereas many crop species are allopolyploids and combine related but divergent subgenomes (homoeologous chromosome sets). We created a software tool, SNiPloid, that exploits and interprets putative SNPs in the context of allopolyploidy by comparing SNPs from an allopolyploid with those obtained in its modern-day diploid progenitors. SNiPloid can compare SNPs obtained from a sample to estimate the subgenome contribution to the transcriptome or SNPs obtained from two polyploid accessions to search for SNP divergence.

2014 ◽  
Author(s):  
Simon Anders ◽  
Paul Theodor Pyl ◽  
Wolfgang Huber

Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard work flows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data such as genomic coordinates, sequences, sequencing reads, alignments, gene model information, variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. Availability: HTSeq is released as open-source software under the GNU General Public Licence and available from http://www-huber.embl.de/HTSeq or from the Python Package Index, https://pypi.python.org/pypi/HTSeq
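The read-counting step described for htseq-count (counting the overlap of reads with genes) can be illustrated with a minimal, library-free sketch. This is not the HTSeq API or the htseq-count implementation: the function name `count_reads_per_gene`, the half-open interval convention, the `__no_feature` and `__ambiguous` labels, and the toy coordinates are all assumptions made for illustration, and strand, splicing, and alignment quality are ignored.

```python
from collections import Counter


def count_reads_per_gene(genes, reads):
    """Count reads that overlap exactly one gene.

    genes: dict mapping gene name -> (chrom, start, end), half-open intervals
    reads: list of (chrom, start, end) aligned read intervals, half-open
    Reads overlapping no gene or several genes go into separate bins
    (hypothetical labels, chosen for this sketch only).
    """
    counts = Counter({name: 0 for name in genes})
    for chrom, r_start, r_end in reads:
        # Two half-open intervals overlap when each starts before the other ends.
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and r_start < g_end and g_start < r_end
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif not hits:
            counts["__no_feature"] += 1
        else:
            counts["__ambiguous"] += 1
    return dict(counts)


# Toy data (invented): two overlapping genes and four reads on chr1.
genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 180, 300)}
reads = [
    ("chr1", 110, 150),  # inside geneA only
    ("chr1", 185, 195),  # inside the geneA/geneB overlap
    ("chr1", 400, 450),  # outside both genes
    ("chr1", 250, 290),  # inside geneB only
]
counts = count_reads_per_gene(genes, reads)
print(counts)
```

The linear scan over all genes keeps the sketch short; a real tool would use an interval index so that each read is checked only against nearby features.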


mBio ◽  
2011 ◽  
Vol 2 (2) ◽  
Author(s):  
Lindsey Bomar ◽  
Michele Maltz ◽  
Sophie Colston ◽  
Joerg Graf

ABSTRACT: The vast majority of bacterial species remain uncultured, and this severely limits the investigation of their physiology, metabolic capabilities, and role in the environment. High-throughput sequencing of RNA transcripts (RNA-seq) allows the investigation of the diverse physiologies from uncultured microorganisms in their natural habitat. Here, we report the use of RNA-seq for characterizing the metatranscriptome of the simple gut microbiome from the medicinal leech Hirudo verbana and for utilizing this information to design a medium for cultivating members of the microbiome. Expression data suggested that a Rikenella-like bacterium, the most abundant but uncultured symbiont, forages on sulfated- and sialated-mucin glycans that are fermented, leading to the secretion of acetate. Histological stains were consistent with the presence of sulfated and sialated mucins along the crop epithelium. The second dominant symbiont, Aeromonas veronii, grows in two different microenvironments and is predicted to utilize either acetate or carbohydrates. Based on the metatranscriptome, a medium containing mucin was designed, which enabled the cultivation of the Rikenella-like bacterium. Metatranscriptomes shed light on microbial metabolism in situ and provide critical clues for directing the culturing of uncultured microorganisms. By choosing a condition under which the desired organism is rapidly proliferating and focusing on highly expressed genes encoding hydrolytic enzymes, binding proteins, and transporters, one can identify an organism’s nutritional preferences and design a culture medium.
IMPORTANCE: The number of prokaryotes on the planet has been estimated to exceed 10^30 cells, and the overwhelming majority of them have evaded cultivation, making it difficult to investigate their ecological, medical, and industrial relevance. The application of transcriptomics based on high-throughput sequencing of RNA transcripts (RNA-seq) to microorganisms in their natural environment can provide investigators with insight into their physiologies under optimal growth conditions. We utilized RNA-seq to learn more about the uncultured and cultured symbionts that comprise the relatively simple digestive-tract microbiome of the medicinal leech. The expression data revealed highly expressed hydrolytic enzymes and transporters that provided critical clues for the design of a culture medium enabling the isolation of the previously uncultured Rikenella-like symbiont. This directed culturing method will greatly aid efforts aimed at understanding uncultured microorganisms, including beneficial symbionts, pathogens, and ecologically relevant microorganisms, by facilitating genome sequencing, physiological characterization, and genetic manipulation of the previously uncultured microbes.


2021 ◽  
Author(s):  
Yu Hamaguchi ◽  
Chao Zeng ◽  
Michiaki Hamada

Background: Differential expression (DE) analysis of RNA-seq data typically depends on gene annotations. Different sets of gene annotations are available for the human genome and are continually updated, a process complicated by the development and application of high-throughput sequencing technologies. However, the impact of the complexity of gene annotations on DE analysis remains unclear. Results: Using “mappability”, a metric of the complexity of gene annotation, we compared three distinct human gene annotations, GENCODE, RefSeq, and NONCODE, and evaluated how mappability affected DE analysis. We found that mappability was significantly different among the human gene annotations. We also found that increasing mappability improved the performance of DE analysis, and that the impact of mappability was mainly evident in the quantification step and propagated systematically to the downstream steps of DE analysis. Conclusions: We assessed how the complexity of gene annotations affects DE analysis using mappability. Our findings indicate that the growth and complexity of gene annotations negatively impact the performance of DE analysis, suggesting that an approach that excludes unnecessary gene models from gene annotations improves the performance of DE analysis.


2019 ◽  
Author(s):  
Ayman Yousif ◽  
Nizar Drou ◽  
Jillian Rowe ◽  
Mohammed Khalfan ◽  
Kristin C Gunsalus

Background: As high-throughput sequencing applications continue to evolve, the rapid growth in quantity and variety of sequence-based data calls for the development of new software libraries and tools for data analysis and visualization. Often, effective use of these tools requires computational skills beyond those of many researchers. To ease this computational barrier, we have created a dynamic web-based platform, NASQAR (Nucleic Acid SeQuence Analysis Resource). Results: NASQAR offers a collection of custom and publicly available open-source web applications that make extensive use of a variety of R packages to provide interactive data analysis and visualization. The platform is publicly accessible at http://nasqar.abudhabi.nyu.edu/. Open-source code is on GitHub at https://github.com/nasqar/NASQAR, and the system is also available as a Docker image at https://hub.docker.com/r/aymanm/nasqarall. NASQAR is a collaboration between the core bioinformatics teams of the NYU Abu Dhabi and NYU New York Centers for Genomics and Systems Biology. Conclusions: NASQAR empowers non-programming experts with a versatile and intuitive toolbox to easily and efficiently explore, analyze, and visualize their Transcriptomics data interactively. Popular tools for a variety of applications are currently available, including Transcriptome Data Preprocessing, RNA-seq Analysis (including Single-cell RNA-seq), Metagenomics, and Gene Enrichment.


Author(s):  
Michael G. Schimek ◽  
Eva Budinská ◽  
Karl G. Kugler ◽  
Vendula Švendová ◽  
Jie Ding ◽  
...  

Abstract: High-throughput sequencing techniques are increasingly affordable and produce massive amounts of data. Together with other high-throughput technologies, such as microarrays, there is an enormous amount of resources in databases. The collection of these valuable data has been routine for more than a decade. Despite different technologies, many experiments share the same goal. For instance, the aims of RNA-seq studies often coincide with those of differential gene expression experiments based on microarrays. As such, it would be logical to utilize all available data. However, there is a lack of biostatistical tools for the integration of results obtained from different technologies. Although diverse technological platforms produce different raw data, one commonality for experiments with the same goal is that all the outcomes can be transformed into a platform-independent data format, rankings, for the same set of items. Here we present the


RMD Open ◽  
2021 ◽  
Vol 7 (1) ◽  
pp. e001324
Author(s):  
Sebastian Boegel ◽  
John C Castle ◽  
Andreas Schwarting

Objective: Here, we assess the usage of high-throughput sequencing (HTS) in rheumatic research and the availability of public HTS data of rheumatic samples. Methods: We performed a semiautomated literature review on PubMed, consisting of an R-script and manual curation, as well as a manual search of the Sequence Read Archive for publicly available HTS data. Results: Of the 699 identified articles, rheumatoid arthritis (n=182 publications, 26%), systemic lupus erythematosus (n=161, 23%) and osteoarthritis (n=152, 22%) are among the rheumatic diseases with the most reported use of HTS assays. The most represented assay is RNA-Seq (n=457, 65%) for the identification of biomarkers in blood or synovial tissue. We also find that the quality of the accompanying clinical characterisation of the sequenced patients differs dramatically, and we propose a minimal set of clinical data necessary to accompany rheumatologically relevant HTS data. Conclusion: HTS allows the analysis of a broad spectrum of molecular features in many samples at the same time. It offers enormous potential in novel personalised diagnosis and treatment strategies for patients with rheumatic diseases. With HTS established in cancer research and in the field of Mendelian diseases, rheumatic diseases are about to become the third disease domain for HTS, especially for the RNA-Seq assay. However, we need to start a discussion about the reporting of clinical characterisation accompanying rheumatologically relevant HTS data to make clinically meaningful use of these data.


2015 ◽  
Author(s):  
Ben Busby ◽  
Allissa Dillman ◽  
Claire L. Simpson ◽  
Ian Fingerman ◽  
Sijung Yun ◽  
...  

We assembled teams of genomics professionals to assess whether we could rapidly develop pipelines to answer biological questions commonly asked by biologists and others new to bioinformatics by facilitating analysis of high-throughput sequencing data. In January 2015, teams were assembled on the National Institutes of Health (NIH) campus to address questions in the DNA-seq, epigenomics, metagenomics and RNA-seq subfields of genomics. The only two rules for this hackathon were that either the data used were housed at the National Center for Biotechnology Information (NCBI) or would be submitted there by a participant in the next six months, and that all software going into the pipeline was open-source or open-use. Questions proposed by organizers, as well as suggested tools and approaches, were distributed to participants a few days before the event and were refined during the event. Pipelines were published on GitHub, a web service providing publicly available, free-usage tiers for collaborative software development (https://github.com/features/). The code was published at https://github.com/DCGenomics/ with separate repositories for each team, starting with hackathon_v001.


2015 ◽  
Vol 47 (9) ◽  
pp. 420-431 ◽  
Author(s):  
Tamsyn M. Uren Webster ◽  
Janice A. Shears ◽  
Karen Moore ◽  
Eduarda M. Santos

Estrogenic chemicals are major contaminants of surface waters and can threaten the sustainability of natural fish populations. Characterization of the global molecular mechanisms of toxicity of environmental contaminants has been conducted primarily in model species rather than species with limited existing transcriptomic or genomic sequence information. We aimed to investigate the global mechanisms of toxicity of an endocrine disrupting chemical of environmental concern [17β-estradiol (E2)] using high-throughput RNA sequencing (RNA-Seq) in an environmentally relevant species, brown trout (Salmo trutta). We exposed mature males to measured concentrations of 1.94, 18.06, and 34.38 ng E2/L for 4 days and sequenced three individual liver samples per treatment using an Illumina HiSeq 2500 platform. Exposure to 34.4 ng E2/L resulted in 2,113 differentially regulated transcripts (FDR < 0.05). Functional analysis revealed upregulation of processes associated with vitellogenesis, including lipid metabolism, cellular proliferation, and ribosome biogenesis, together with a downregulation of carbohydrate metabolism. Using real-time quantitative PCR, we validated the expression of eight target genes and identified significant differences in the regulation of several known estrogen-responsive transcripts in fish exposed to the lower treatment concentrations (including esr1 and zp2.5). We successfully used RNA-Seq to identify highly conserved responses to estrogen and also identified some estrogen-responsive transcripts that have been less well characterized, including nots and tgm2l. These results demonstrate the potential application of RNA-Seq as a valuable tool for assessing mechanistic effects of pollutants in ecologically relevant species for which little genomic information is available.


2021 ◽  
Vol 2 (3) ◽  
pp. 100651
Author(s):  
Katherine C. Palozola ◽  
Greg Donahue ◽  
Kenneth S. Zaret
