RNA-Seq of potato plants reveals a complex of new latent bacterial plant pathogens

Abstract The throughput and single-base resolution of RNA sequencing (RNA-Seq) have contributed to a dramatic change in the diagnostics of viruses and other plant pathogens. A transcriptome represents all RNA molecules, including the coding mRNAs as well as noncoding RNAs such as rRNA and tRNA. A distinct advantage of RNA-Seq is that cDNA fragments are sequenced directly and the reads can be compared to available reference genome sequences. This approach allows the simultaneous and hypothesis-free identification of all pathogens in the plant. We surveyed 56 original and GenBank RNA-Seq data sets for potato (Solanum tuberosum L.) breeding material for potato-associated phytopathogenic bacteria. Bacteria of the genera Pseudomonas, Burkholderia, Ralstonia, Xanthomonas, and Agrobacterium, and species of the family Enterobacteriaceae, were most frequently detected in RNA sets from the studied plants. RNA-Seq reads identified as Xanthomonas spp. were assigned to X. vesicatoria and some other species. Xanthomonas spp. accounted for up to 9.1% of all reads and included the major clades of these bacteria known as pathogens of solanaceous crops other than potato. Bacteria of the genus Xanthomonas infect different plant species under artificial inoculation, suggesting that they are shared among wild plants and crops. Our studies indicated that a larger number of solanaceous plants can be occupied by specific Xanthomonas pathovars as endophytes or latent pathogens. Revealing the distribution of bacteria in plant breeding material using RNA-Seq data improves our knowledge of the ecology of plant pathogens.


Development of genic KASP SNP markers from RNA-Seq data for map-based cloning and marker-assisted selection in maize

BMC Plant Biology ◽

10.1186/s12870-021-02932-8 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Zhengjie Chen ◽

Dengguo Tang ◽

Jixing Ni ◽

Peng Li ◽

Le Wang ◽

...

Keyword(s):

Marker Assisted Selection ◽

Inbred Lines ◽

Average Density ◽

Snp Markers ◽

Data Sets ◽

Rna Seq ◽

Specific Pcr ◽

Maize Inbred Lines ◽

Allele Specific ◽

Allele Specific Pcr

Abstract Background Maize is one of the most important field crops in the world. Most of the key agronomic traits, including yield traits and plant architecture traits, are quantitative. Fine mapping of genes/quantitative trait loci (QTL) influencing a key trait is essential for marker-assisted selection (MAS) in maize breeding. However, SNP markers with high density and high polymorphism are lacking, especially Kompetitive allele-specific PCR (KASP) SNP markers that can be used for automatic genotyping. To date, a large volume of sequencing data has been produced by next-generation sequencing technology, which provides a good pool of SNP loci for the development of SNP markers. In this study, we carried out a multi-step screening method to identify KASP SNP markers based on the RNA-Seq data sets of 368 maize inbred lines. Results A total of 2,948,985 SNPs were identified in the high-throughput RNA-Seq data sets, with an average density of 1.4 SNP/kb. Of these, 71,311 KASP SNP markers (an average density of 34 KASP SNP/Mb) were developed based on strict criteria (unique genomic region, bi-allelic, polymorphism information content (PIC) value ≥0.4, and conserved primer sequences) and were mapped on 16,161 genes. These 16,161 genes were annotated to 52 gene ontology (GO) terms, including most primary and secondary metabolic pathways. Subsequently, 50 KASP SNP markers with PIC values ranging from 0.14 to 0.5 in the 368 RNA-Seq data sets and with polymorphism between the maize inbred lines 1212 and B73 in in silico analysis were selected to experimentally validate the accuracy and polymorphism of the SNPs; 46 SNPs (92.00%) showed polymorphism between the maize inbred lines 1212 and B73. Moreover, these 46 polymorphic SNPs were used to genotype the other 20 maize inbred lines, with all 46 SNPs showing polymorphism in the 20 maize inbred lines; the PIC value of each SNP ranged from 0.11 to 0.50, with an average of 0.35. The results suggested that the KASP SNP markers developed in this study were accurate and polymorphic. Conclusions These high-density polymorphic KASP SNP markers will be a valuable resource for map-based cloning of QTL/genes and marker-assisted selection in maize. Furthermore, the method used to develop SNP markers in maize can also be applied in other species.
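The PIC filter mentioned in this abstract can be illustrated with a short sketch. The abstract does not state which PIC formula was used, so both common definitions are shown here as an assumption: the simplified form 1 − Σp² (also called gene diversity) and the Botstein form, which subtracts an extra cross term. For a bi-allelic SNP the Botstein form cannot exceed 0.375, so reported values of up to 0.50 appear consistent with the simplified form. The function names and the one-allele-per-inbred-line input are hypothetical choices made for illustration.

```python
def allele_frequencies(genotype_calls):
    """Relative frequency of each allele among non-missing calls.

    Assumes one allele call per inbred line (lines treated as homozygous);
    None marks a missing call.
    """
    calls = [a for a in genotype_calls if a is not None]
    return {allele: calls.count(allele) / len(calls) for allele in set(calls)}


def gene_diversity(freqs):
    """Simplified PIC: 1 - sum(p_i^2). Maximum is 0.5 for two alleles."""
    return 1.0 - sum(p * p for p in freqs)


def pic_botstein(freqs):
    """Botstein-style PIC: 1 - sum(p_i^2) - sum_{i<j} 2 p_i^2 p_j^2."""
    freqs = list(freqs)
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return gene_diversity(freqs) - cross


if __name__ == "__main__":
    # Toy example: 6 lines carry allele "A", 4 carry "G", 1 call is missing.
    calls = ["A"] * 6 + ["G"] * 4 + [None]
    freqs = allele_frequencies(calls)
    print(freqs)
    print("1 - sum(p^2):", gene_diversity(freqs.values()))
    print("Botstein PIC:", pic_botstein(freqs.values()))
```

A marker screen in this style would keep a SNP only when the chosen PIC measure meets the threshold, for example `gene_diversity(freqs.values()) >= 0.4`.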


MUREN: a robust and multi-reference approach of RNA-seq transcript normalization

BMC Bioinformatics ◽

10.1186/s12859-021-04288-0 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Yance Feng ◽

Lei M. Li

Keyword(s):

Biological Significance ◽

Housekeeping Genes ◽

R Package ◽

Data Sets ◽

Statistical Regression ◽

Rna Seq ◽

Least Trimmed Squares ◽

Standard Data ◽

Wide Range ◽

Multiple References

Abstract Background Normalization of RNA-seq data aims at identifying biological expression differentiation between samples by removing the effects of unwanted confounding factors. Explicitly or implicitly, the justification of normalization requires a set of housekeeping genes. However, the existence of housekeeping genes common to a very large collection of samples, especially under a wide range of conditions, is questionable. Results We propose to carry out pairwise normalization with respect to multiple references, selected from representative samples. The pairwise intermediates are then integrated based on a linear model that adjusts for the reference effects. Motivated by the notion of housekeeping genes and their statistical counterparts, we adopt robust least trimmed squares regression in pairwise normalization. The proposed method (MUREN) is compared with other existing tools on some standard data sets. The goodness of normalization is judged with emphasis on preserving possible asymmetric differentiation, whose biological significance is exemplified by a single-cell data set of the cell cycle. MUREN is implemented as an R package. The code, under the GPL-3 license, is available on GitHub: github.com/hippo-yf/MUREN and on conda: anaconda.org/hippo-yf/r-muren. Conclusions MUREN performs RNA-seq normalization using a two-step statistical regression induced from a general principle. We propose that the densities of pairwise differentiations be used to evaluate the goodness of normalization. MUREN adjusts the mode of differentiation toward zero while preserving the skewness due to biological asymmetric differentiation. Moreover, by robustly integrating pre-normalized counts with respect to multiple references, MUREN is immune to individual outlier samples.
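The idea of robust pairwise normalization can be sketched in a much reduced form. The example below is not the MUREN implementation: it assumes the simplest possible model, a single log2 shift between a sample and one reference, and estimates that shift with a least trimmed squares (LTS) location estimator, so that genes with strongly asymmetric differentiation do not pull the estimate. The function names, the pseudocount, and the default subset size are hypothetical.

```python
import math


def lts_location(values, h=None):
    """Least trimmed squares location estimate.

    For a one-dimensional location model, the LTS solution is the mean of
    the h consecutive sorted values with the smallest sum of squared
    deviations from their own mean.
    """
    xs = sorted(values)
    n = len(xs)
    if h is None:
        h = n // 2 + 1  # common default: just over half of the data
    best_ss, best_mean = None, None
    for start in range(n - h + 1):
        window = xs[start:start + h]
        mean = sum(window) / h
        ss = sum((x - mean) ** 2 for x in window)
        if best_ss is None or ss < best_ss:
            best_ss, best_mean = ss, mean
    return best_mean


def pairwise_log_shift(sample_counts, reference_counts, pseudocount=1.0):
    """Robust log2 shift of a sample relative to one reference sample."""
    diffs = [
        math.log2(s + pseudocount) - math.log2(r + pseudocount)
        for s, r in zip(sample_counts, reference_counts)
    ]
    return lts_location(diffs)


if __name__ == "__main__":
    reference = [100, 200, 300, 400, 500, 600, 700, 800]
    # Most genes are sequenced twice as deeply; the last two are strongly
    # up-regulated and should not influence the scaling estimate.
    sample = [200, 400, 600, 800, 1000, 1200, 11200, 25600]
    shift = pairwise_log_shift(sample, reference)
    print("robust log2 shift:", round(shift, 3))
    normalized = [s / 2 ** shift for s in sample]
    print([round(x) for x in normalized])
```

In a multi-reference setting, one such pairwise estimate would be computed against each reference and the results combined in a second step; that integration is outside the scope of this sketch.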


Small RNA-Sequencing: Approaches and Considerations for miRNA Analysis

Diagnostics ◽

10.3390/diagnostics11060964 ◽

2021 ◽

Vol 11 (6) ◽

pp. 964

Author(s):

Sarka Benesova ◽

Mikael Kubista ◽

Lukas Valihrach

Keyword(s):

Rna Sequencing ◽

Small Rna ◽

High Sensitivity ◽

Small Rna Sequencing ◽

Rna Seq ◽

Liquid Biopsies ◽

Comprehensive Overview ◽

Rna Molecules ◽

Novel Mirna ◽

The Many

MicroRNAs (miRNAs) are a class of small RNA molecules that have an important regulatory role in multiple physiological and pathological processes. Their disease-specific profiles and presence in biofluids are properties that enable miRNAs to be employed as non-invasive biomarkers. In the past decades, several methods have been developed for miRNA analysis, including small RNA sequencing (RNA-seq). Small RNA-seq enables genome-wide profiling and analysis of known, as well as novel, miRNA variants. Moreover, its high sensitivity allows for profiling of low-input samples such as liquid biopsies, which have now found applications in diagnostics and prognostics. Still, due to technical bias and the limited ability to capture the true miRNA representation, its potential remains unfulfilled. The introduction of many new small RNA-seq approaches that tried to minimize this bias has led to the many small RNA-seq protocols seen today. Here, we review all current approaches to cDNA library construction used during the small RNA-seq workflow, with particular focus on their implementation in commercially available protocols. We provide an overview of each protocol and discuss their applicability. We also review recent benchmarking studies comparing each protocol’s performance and summarize the major conclusions that can be drawn from their usage. The results document variable performance of the protocols and highlight their different applications in miRNA research. Taken together, our review provides a comprehensive overview of all the current small RNA-seq approaches, summarizes their strengths and weaknesses, and provides guidelines for their applications in miRNA research.


The cytochrome P450 genes of channel catfish: Their involvement in disease defense responses as revealed by meta-analysis of RNA-Seq data sets

Biochimica et Biophysica Acta (BBA) - General Subjects ◽

10.1016/j.bbagen.2014.04.016 ◽

2014 ◽

Vol 1840 (9) ◽

pp. 2813-2828 ◽

Cited By ~ 22

Author(s):

Jiaren Zhang ◽

Jun Yao ◽

Ruijia Wang ◽

Yu Zhang ◽

Shikai Liu ◽

...

Keyword(s):

Cytochrome P450 ◽

Channel Catfish ◽

Meta Analysis ◽

Defense Responses ◽

Data Sets ◽

Rna Seq ◽

Cytochrome P450 Genes ◽

Disease Defense


Cancer-Specific Immune Prognostic Signature in Solid Tumors and Its Relation to Immune Checkpoint Therapies

Cancers ◽

10.3390/cancers12092476 ◽

2020 ◽

Vol 12 (9) ◽

pp. 2476 ◽

Cited By ~ 1

Author(s):

Shaoli Das ◽

Kevin Camphausen ◽

Uma Shankavaram

Keyword(s):

Immune Function ◽

Solid Tumors ◽

Immune Checkpoint ◽

Data Sets ◽

Cancer Type ◽

Rna Seq ◽

Prognostic Signature ◽

Cell Functions ◽

Cancer Types ◽

Disease Free

To elucidate the role of immune cell infiltration as a prognostic signature in solid tumors, we analyzed immune-function-related genes from four publicly available single-cell RNA-Seq data sets and twenty bulk tumor RNA-Seq data sets from The Cancer Genome Atlas (TCGA). Unsupervised clustering of pan-cancer transcriptomic signature showed two major immune function types: one related to NK-, T-, and B-cell functions and the other related to monocyte, macrophage, dendritic cell, and Toll-like receptor functions. Kaplan–Meier analysis showed differential prognosis of these two groups, dependent on the cancer type. Our analysis of TCGA solid tumors with an elastic net model identified 155 genes associated with disease-free survival in different tumor types with varied influence across different cancer types. With this gene set, we computed cancer-specific prognostic immune score models for individual cancer types that predicted disease-free and overall survival. Validation of our model on available published data of immune checkpoint blockade therapies on melanoma, kidney renal cell carcinoma, non-small cell lung cancer, gastric cancer and bladder cancer confirmed that cancer-specific higher immune scores are associated with response to immunotherapy. Our analysis provides a comprehensive map of cancer-specific immune-related prognostic gene sets that are associated with immunotherapy response.
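The Kaplan–Meier analysis mentioned in this abstract rests on a simple product-limit estimator, which can be sketched as follows. This is a generic illustration, not the authors' pipeline; the function name and the toy follow-up data are hypothetical, and an event flag of 1 is assumed to mean an observed event and 0 a censored observation.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


if __name__ == "__main__":
    # Toy data: five patients, two of them censored.
    times = [1, 2, 2, 3, 4]
    events = [1, 1, 0, 1, 0]
    for t, s in kaplan_meier(times, events):
        print(t, round(s, 3))
```

Comparing two immune-function groups would amount to computing one such curve per group; a formal comparison would additionally need a test such as the log-rank test, which is not shown here.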


Enhancing droplet-based single-nucleus RNA-seq resolution using the semi-supervised machine learning classifier DIEM

10.1101/786285 ◽

2019 ◽

Cited By ~ 4

Author(s):

Marcus Alvarez ◽

Elior Rahmani ◽

Brandon Jew ◽

Kristina M. Garske ◽

Zong Miao ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Cell Types ◽

Supervised Machine Learning ◽

Data Sets ◽

Rna Seq ◽

Novel Approach ◽

Single Nucleus ◽

Downstream Analysis

Abstract Single-nucleus RNA sequencing (snRNA-seq) measures gene expression in individual nuclei instead of cells, allowing for unbiased cell type characterization in solid tissues. In contrast to single-cell RNA-seq (scRNA-seq), we observe that snRNA-seq is commonly subject to contamination by high amounts of extranuclear background RNA, which can lead to the identification of spurious cell types in downstream clustering analyses if overlooked. We present a novel approach to remove debris-contaminated droplets in snRNA-seq experiments, called Debris Identification using Expectation Maximization (DIEM). Our likelihood-based approach models the gene expression distributions of debris and cell types, which are estimated using EM. We evaluated DIEM using three snRNA-seq data sets: 1) human differentiating preadipocytes in vitro, 2) fresh mouse brain tissue, and 3) human frozen adipose tissue (AT) from six individuals. All three data sets showed various degrees of extranuclear RNA contamination. We observed that existing methods failed to account for contaminated droplets and led to spurious cell types. When compared to filtering using these state-of-the-art methods, DIEM better removed droplets containing high levels of extranuclear RNA and led to higher-quality clusters. Although DIEM was designed for snRNA-seq data, we also successfully applied DIEM to single-cell data. To conclude, our novel method DIEM removes debris-contaminated droplets from single-cell-based data quickly and effectively, leading to cleaner downstream analysis. Our code is freely available for use at https://github.com/marcalva/diem.
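The likelihood-based idea described in this abstract can be illustrated with a reduced sketch. The code below is not the DIEM implementation: it assumes only two components (debris and nuclei) with a multinomial gene expression profile each, and it assumes droplets are initialized by a hypothetical total-count threshold. All names, the threshold, and the smoothing constant are illustrative assumptions.

```python
import math


def em_debris_posterior(counts, init_debris_prob, n_iter=50, smooth=0.01):
    """EM for a two-component multinomial mixture (debris vs. nucleus).

    counts           : list of droplets, each a list of per-gene counts
    init_debris_prob : initial P(debris) for each droplet
    Returns the posterior P(debris) for each droplet.
    """
    n_genes = len(counts[0])
    resp = [[p, 1.0 - p] for p in init_debris_prob]
    for _ in range(n_iter):
        # M-step: mixing weights and smoothed per-component gene profiles.
        weights, profiles = [], []
        for k in range(2):
            weights.append(sum(r[k] for r in resp) / len(counts))
            totals = [
                smooth + sum(r[k] * c[g] for r, c in zip(resp, counts))
                for g in range(n_genes)
            ]
            norm = sum(totals)
            profiles.append([t / norm for t in totals])
        # E-step: posterior membership from multinomial log-likelihoods.
        # The multinomial coefficient is the same for both components
        # and cancels, so it is omitted.
        new_resp = []
        for c in counts:
            logp = [
                math.log(max(weights[k], 1e-300))
                + sum(x * math.log(profiles[k][g]) for g, x in enumerate(c))
                for k in range(2)
            ]
            top = max(logp)
            unnorm = [math.exp(lp - top) for lp in logp]
            z = sum(unnorm)
            new_resp.append([u / z for u in unnorm])
        resp = new_resp
    return [r[0] for r in resp]


if __name__ == "__main__":
    # Three toy genes; gene 0 dominates in debris, gene 2 in nuclei.
    droplets = [
        [9, 1, 0], [8, 2, 0], [10, 1, 1], [12, 1, 1],   # low-count debris
        [1, 2, 30], [0, 3, 40], [2, 1, 35],             # nuclei
        [40, 5, 1],                                     # high-count debris
    ]
    # Hypothetical initialization: low total counts start as likely debris.
    init = [0.9 if sum(d) < 20 else 0.1 for d in droplets]
    for d, p in zip(droplets, em_debris_posterior(droplets, init)):
        print(d, round(p, 3))
```

The last toy droplet shows why an expression-based model can help where a plain count threshold does not: it has many reads and would pass a count filter, yet its profile matches the debris component.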


Uncovering the mesendoderm gene regulatory network through multi-omic data integration

10.1101/2020.11.01.362053 ◽

2020 ◽

Author(s):

Camden Jansen ◽

Kitt D. Paraiso ◽

Jeff J. Zhou ◽

Ira L. Blitz ◽

Margaret B. Fish ◽

...

Keyword(s):

Cell Differentiation ◽

Cell Fate ◽

Data Sets ◽

List Type ◽

Rna Seq ◽

Data Types ◽

Cell Fate Decisions ◽

Gene Regulatory ◽

Genome Scale ◽

Omic Data

Summary Mesendodermal specification is one of the earliest events in embryogenesis, where cells first acquire distinct identities. Cell differentiation is a highly regulated process that involves the function of numerous transcription factors (TFs) and signaling molecules, which can be described with gene regulatory networks (GRNs). Cell differentiation GRNs are difficult to build because existing mechanistic methods are low-throughput, and high-throughput methods tend to be non-mechanistic. Additionally, integrating high-dimensional data comprising more than two data types is challenging. Here, we use linked self-organizing maps to combine ChIP-seq/ATAC-seq with temporal, spatial and perturbation RNA-seq data from Xenopus tropicalis mesendoderm development to build a high-resolution, genome-scale mechanistic GRN. We recovered both known and previously unsuspected TF-DNA/TF-TF interactions and validated them through reporter assays. Our analysis provides new insights into transcriptional regulation of early cell fate decisions and provides a general approach to building GRNs using high-dimensional multi-omic data sets. Highlights Built a generally applicable pipeline for creating GRNs using high-dimensional multi-omic data sets. Predicted new TF-DNA/TF-TF interactions during mesendoderm development. Generated the first genome-scale GRN for vertebrate mesendoderm and expanded the core mesendodermal developmental network with high fidelity. Developed a resource to visualize hundreds of RNA-seq and ChIP-seq data sets using 2D SOM metaclusters.


Host-induced gene silencing and spray-induced gene silencing for crop protection against viruses.

RNAi for plant improvement and protection ◽

10.1079/9781789248890.0008 ◽

2021 ◽

pp. 72-85

Author(s):

Angela Ricci ◽

Silvia Sabbadini ◽

Laura Miozzi ◽

Bruno Mezzetti ◽

Emanuela Noris

Keyword(s):

Gene Silencing ◽

Plant Pathogens ◽

Crop Protection ◽

Plant Defence ◽

Transcriptional Gene Silencing ◽

Target Sequence ◽

Sequence Specificity ◽

Rna Molecules ◽

Political Concern ◽

Post Transcriptional Gene Silencing

Abstract Since the beginning of agriculture, plant virus diseases have been a strong challenge for farming. Following its discovery at the very beginning of the 1990s, the RNA interference (RNAi) mechanism has been widely studied and exploited as an integrative tool to obtain resistance to viruses in several plant species, with high target-sequence specificity. In this chapter, we describe and review the major aspects of host-induced gene silencing (HIGS), as one of the possible plant defence methods, using genetic engineering techniques. In particular, we focus our attention on the use of RNAi-based gene constructs to introduce stable resistance in host plants against viral diseases, by triggering post-transcriptional gene silencing (PTGS). Recently, spray-induced gene silencing (SIGS), consisting of the topical application of small RNA molecules to plants, has been explored as an alternative tool to the stable integration of RNAi-based gene constructs in plants. SIGS has great and innovative potential for crop defence against different plant pathogens and pests and is expected to raise less public and political concern, as it does not alter the genetic structure of the plant.


Gene Expression Does Not Support the Developmental Hourglass Model in Three Animals with Spiralian Development

Molecular Biology and Evolution ◽

10.1093/molbev/msz065 ◽

2019 ◽

Vol 36 (7) ◽

pp. 1373-1383 ◽

Cited By ~ 1

Author(s):

Longjun Wu ◽

Kailey E Ferger ◽

J David Lambert

Keyword(s):

Gene Expression ◽

Large Fraction ◽

Molecular Data ◽

Development Stage ◽

Data Sets ◽

Rna Seq ◽

Developmental Evolution ◽

Phylotypic Stage ◽

Hourglass Model ◽

Almost All

Abstract It has been proposed that animals have a pattern of developmental evolution resembling an hourglass because the most conserved development stage—often called the phylotypic stage—is always in midembryonic development. Although the topic has been debated for decades, recent studies using molecular data such as RNA-seq gene expression data sets have largely supported the existence of periods of relative evolutionary conservation in middevelopment, consistent with the phylotypic stage and the hourglass concepts. However, so far this approach has only been applied to a limited number of taxa across the tree of life. Here, using established phylotranscriptomic approaches, we found a surprising reverse hourglass pattern in two molluscs and a polychaete annelid, representatives of the Spiralia, an understudied group that contains a large fraction of metazoan body plan diversity. These results suggest that spiralians have a divergent midembryonic stage, with more conserved early and late development, which is the inverse of the pattern seen in almost all other organisms where these phylotranscriptomic approaches have been reported. We discuss our findings in light of proposed reasons for the phylotypic stage and hourglass model in other systems.


Bioinformatics Approaches to Studying Plant Long Noncoding RNAs (lncRNAs): Identification and Functional Interpretation of lncRNAs from RNA-Seq Data Sets

Methods in Molecular Biology - Plant Long Non-Coding RNAs ◽

10.1007/978-1-4939-9045-0_11 ◽

2019 ◽

pp. 197-205 ◽

Cited By ~ 5

Author(s):

Hai-Xi Sun ◽

Nam-Hai Chua

Keyword(s):

Noncoding Rnas ◽

Long Noncoding Rnas ◽

Data Sets ◽

Rna Seq ◽

Functional Interpretation ◽

Approaches To Studying
