sRNATargetDigger: A bioinformatics software for bidirectional identification of sRNA-target pairs with co-regulatory sRNAs information

PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0244480
Author(s):  
Xinghuo Ye ◽  
Zhihong Yang ◽  
Yeqin Jiang ◽  
Lan Yu ◽  
Rongkai Guo ◽  
...  

Identification of the target genes of microRNAs (miRNAs), trans-acting small interfering RNAs (ta-siRNAs), and small interfering RNAs (siRNAs) is an important step for understanding their regulatory roles in plants. In recent years, many bioinformatics software packages based on small RNA (sRNA) high-throughput sequencing (HTS) and degradome sequencing data analysis have provided strong technical support for large-scale mining of sRNA-target pairs. However, sRNA-target regulation is achieved using a complex network of interactions since one transcript might be co-regulated by multiple sRNAs and one sRNA may also affect multiple targets. Currently used mining software can realize the mining of multiple unknown targets using known sRNA, but it cannot rule out the possibility of co-regulation of the same target by other unknown sRNAs. Hence, the obtained regulatory network may be incomplete. We have developed a new mining software, sRNATargetDigger, that includes two function modules, “Forward Digger” and “Reverse Digger”, which can identify regulatory sRNA-target pairs bidirectionally. Moreover, it has the ability to identify unknown sRNAs co-regulating the same target, in order to obtain a more authentic and reliable sRNA-target regulatory network. Upon re-examination of the published sRNA-target pairs in Arabidopsis thaliana, sRNATargetDigger found 170 novel co-regulatory sRNA-target pairs. This software can be downloaded from http://www.bioinfolab.cn/sRNATD.html.
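
The many-to-many relationship described in this abstract (one sRNA affecting several targets, one target co-regulated by several sRNAs) can be pictured as a pair of lookup tables queried in both directions. The following is a minimal sketch under stated assumptions, not sRNATargetDigger's implementation: the identifiers and function names are hypothetical, and no degradome or sequencing evidence is modeled.

```python
from collections import defaultdict


def build_indexes(pairs):
    """Index (sRNA, target) pairs in both directions."""
    forward = defaultdict(set)  # sRNA -> targets it regulates
    reverse = defaultdict(set)  # target -> sRNAs that regulate it
    for srna, target in pairs:
        forward[srna].add(target)
        reverse[target].add(srna)
    return forward, reverse


def co_regulators(forward, reverse, srna):
    """For each target of `srna`, list the other sRNAs hitting the same target."""
    result = {}
    for target in forward.get(srna, ()):
        others = reverse[target] - {srna}
        if others:
            result[target] = others
    return result


# Hypothetical toy pairs, for illustration only.
pairs = [
    ("sRNA_a", "gene_1"),
    ("sRNA_a", "gene_2"),
    ("sRNA_b", "gene_1"),
    ("sRNA_c", "gene_3"),
]
forward, reverse = build_indexes(pairs)
partners = co_regulators(forward, reverse, "sRNA_a")
```

Here a forward query starts from a known sRNA, and the reverse index is what exposes additional sRNAs acting on the same transcript.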

2021 ◽  
Vol 99 (2) ◽  
Author(s):  
Yuhua Fu ◽  
Pengyu Fan ◽  
Lu Wang ◽  
Ziqiang Shu ◽  
Shilin Zhu ◽  
...  

Abstract Despite the broad variety of available microRNA (miRNA) research tools and methods, their application to the identification, annotation, and target prediction of miRNAs in nonmodel organisms is still limited. In this study, we collected nearly all public sRNA-seq data to improve the annotation for known miRNAs and identify novel miRNAs that have not been annotated in pigs (Sus scrofa). We newly annotated 210 mature sequences in known miRNAs and found that 43 of the known miRNA precursors were problematic due to redundant/missing annotations or incorrect sequences. We also predicted 811 novel miRNAs with high confidence, which was twice the current number of known miRNAs for pigs in miRBase. In addition, we proposed a correlation-based strategy to predict target genes for miRNAs by using a large amount of sRNA-seq and RNA-seq data. We found that the correlation-based strategy provided additional evidence of expression compared with traditional target prediction methods. The correlation-based strategy also identified the regulatory pairs that were controlled by nonbinding sites with a particular pattern, which provided abundant complementarity for studying the mechanism of miRNAs that regulate gene expression. In summary, our study improved the annotation of known miRNAs, identified a large number of novel miRNAs, and predicted target genes for all pig miRNAs by using massive public data. This large data-based strategy is also applicable for other nonmodel organisms with incomplete annotation information.
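
One way to picture a correlation-based target screen is to correlate a miRNA's expression profile with each gene's profile across matched samples and keep strongly anti-correlated genes. This is a minimal sketch under assumptions: the abstract does not state which statistic or cutoff the authors used, so the Pearson coefficient, the -0.8 threshold, and all names and values below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def candidate_targets(mirna_expr, gene_expr, threshold=-0.8):
    """Return (gene, r) pairs with r <= threshold, most negative first."""
    hits = []
    for gene, values in gene_expr.items():
        r = pearson(mirna_expr, values)
        if r <= threshold:
            hits.append((gene, r))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical expression values across five matched samples.
mirna_expr = [1, 2, 3, 4, 5]
gene_expr = {
    "gene_A": [10, 8, 6, 4, 2],  # falls as the miRNA rises
    "gene_B": [1, 2, 3, 4, 5],   # rises with the miRNA
    "gene_C": [3, 3, 3, 3, 3],   # constant
}
hits = candidate_targets(mirna_expr, gene_expr)
```

A screen of this kind gives expression-level evidence only; it says nothing about binding sites, which is why the abstract presents it as a complement to traditional prediction.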


Viruses ◽  
2021 ◽  
Vol 13 (10) ◽  
pp. 2006
Author(s):  
Anna Y Budkina ◽  
Elena V Korneenko ◽  
Ivan A Kotov ◽  
Daniil A Kiselev ◽  
Ilya V Artyushin ◽  
...  

According to various estimates, only a small percentage of existing viruses have been discovered, and even fewer are represented in genomic databases. High-throughput sequencing technologies are developing rapidly, enabling large-scale screening of various biological samples for the presence of pathogen-associated nucleotide sequences, but many organisms have yet to be assigned specific loci for identification. This problem particularly impedes viral screening because of the vast heterogeneity of viral genomes. In this paper, we present a new bioinformatic pipeline, VirIdAl, for detecting and identifying viral pathogens in sequencing data. We also demonstrate the utility of the new software by applying it to viral screening of the feces of bats collected in the Moscow region, which revealed a significant variety of viruses associated with bats, insects, plants, and protozoa. The presence of alpha and beta coronavirus reads, including the MERS-like bat virus, deserves a special mention, as it once again indicates that bats are indeed reservoirs for many viral pathogens. In addition, it was shown that alignment-based methods were unable to identify the taxon for a large proportion of reads, and we additionally applied other approaches, showing that they can further reveal the presence of viral agents in sequencing data. However, the incompleteness of viral databases remains a significant problem in the studies of viral diversity, and therefore necessitates the use of combined approaches, including those based on machine learning methods.


2020 ◽  
Vol 48 (W1) ◽  
pp. W200-W207
Author(s):  
Simone Puccio ◽  
Giorgio Grillo ◽  
Arianna Consiglio ◽  
Maria Felicia Soluri ◽  
Daniele Sblattero ◽  
...  

Abstract High-Throughput Sequencing technologies are transforming many research fields, including the analysis of phage display libraries. The phage display technology coupled with deep sequencing was introduced more than a decade ago and holds the potential to circumvent the traditional laborious picking and testing of individual phage rescued clones. However, from a bioinformatics point of view, the analysis of this kind of data was always performed by adapting tools designed for other purposes, thus not considering the noise background typical of the ‘interactome sequencing’ approach and the heterogeneity of the data. InteractomeSeq is a web server allowing data analysis of protein domains (‘domainome’) or epitopes (‘epitome’) from either Eukaryotic or Prokaryotic genomic phage libraries generated and selected by following an Interactome sequencing approach. InteractomeSeq allows users to upload raw sequencing data and to obtain an accurate characterization of domainome/epitome profiles after setting the parameters required to tune the analysis. The release of this tool is relevant for the scientific and clinical community, because InteractomeSeq will fill an existing gap in the field of large-scale biomarkers profiling, reverse vaccinology, and structural/functional studies, thus contributing essential information for gene annotation or antigen identification. InteractomeSeq is freely available at https://InteractomeSeq.ba.itb.cnr.it/


2020 ◽  
Vol 36 (12) ◽  
pp. 3632-3636 ◽  
Author(s):  
Weibo Zheng ◽  
Jing Chen ◽  
Thomas G Doak ◽  
Weibo Song ◽  
Ying Yan

Abstract Motivation: Programmed DNA elimination (PDE) plays a crucial role in the transitions between germline and somatic genomes in diverse organisms ranging from unicellular ciliates to multicellular nematodes. However, software specific for the detection of DNA splicing events is scarce. In this paper, we describe Accurate Deletion Finder (ADFinder), an efficient detector of PDEs using high-throughput sequencing data. ADFinder can predict PDEs with relatively low sequencing coverage, detect multiple alternative splicing forms in the same genomic location and calculate the frequency for each splicing event. This software will facilitate research of PDEs and all down-stream analyses. Results: By analyzing genome-wide DNA splicing events in two micronuclear genomes of Oxytricha trifallax and Tetrahymena thermophila, we prove that ADFinder is effective in predicting large scale PDEs. Availability and implementation: The source codes and manual of ADFinder are available in our GitHub website: https://github.com/weibozheng/ADFinder. Supplementary information: Supplementary data are available at Bioinformatics online.


2017 ◽  
Vol 2017 ◽  
pp. 1-8 ◽  
Author(s):  
Yuming Zhao ◽  
Fang Wang ◽  
Su Chen ◽  
Jun Wan ◽  
Guohua Wang

MicroRNAs (miRNAs) are short (~22 nucleotides) noncoding RNAs distributed throughout the genome, either in intergenic regions or in the intronic sequences of protein-coding genes. MiRNAs have been shown to play important roles in regulating gene expression. Hence, understanding the transcriptional mechanism of miRNA genes is a critical step toward uncovering the whole regulatory network. A number of miRNA promoter prediction models have been proposed in the past decade. This review summarizes several of the most popular miRNA promoter prediction models, which use genome sequence features or other features, for example, histone markers, RNA Pol II binding sites, and nucleosome-free regions, obtained from high-throughput sequencing data. Some databases are described as resources for miRNA promoter information. We then provide a comprehensive discussion of the prediction and identification of transcription factor-mediated microRNA regulatory networks.


Genes ◽  
2019 ◽  
Vol 10 (7) ◽  
pp. 536 ◽  
Author(s):  
Xiaobo Zhao ◽  
Liming Gan ◽  
Caixia Yan ◽  
Chunjuan Li ◽  
Quanxi Sun ◽  
...  

Long non-coding RNAs (lncRNAs) are involved in various regulatory processes although they do not encode protein. Presently, there is little information regarding the identification of lncRNAs in peanut (Arachis hypogaea Linn.). In this study, 50,873 lncRNAs of peanut were identified from large-scale published RNA sequencing data that belonged to 124 samples involving 15 different tissues. The average lengths of lncRNA and mRNA were 4335 bp and 954 bp, respectively. Compared to the mRNAs, the lncRNAs were shorter, with fewer exons and lower expression levels. The 4713 co-expression lncRNAs (expressed in all samples) were used to construct co-expression networks by using the weighted correlation network analysis (WGCNA). LncRNAs correlating with the growth and development of different peanut tissues were obtained, and target genes for 386 hub lncRNAs of all lncRNAs co-expressions were predicted. Taken together, these findings can provide a comprehensive identification of lncRNAs in peanut.


2011 ◽  
Vol 7 (11) ◽  
pp. e1002190 ◽  
Author(s):  
Chao Cheng ◽  
Koon-Kiu Yan ◽  
Woochang Hwang ◽  
Jiang Qian ◽  
Nitin Bhardwaj ◽  
...  

2021 ◽  
Vol 12 ◽  
Author(s):  
Waqas Ahmed ◽  
Yanshi Xia ◽  
Ronghua Li ◽  
Hua Zhang ◽  
Kadambot H.M Siddique ◽  
...  

Endogenous small interfering RNAs (siRNAs) are important gene regulators in eukaryotes and play key roles in plant development and stress tolerance. Among environmental factors, heat is a serious abiotic stress that severely affects the productivity and quality of flowering Chinese cabbage (Brassica campestris L. ssp. chinensis var. utilis Tsen et Lee). However, how siRNAs are involved in regulating gene expression during heat stress is not fully understood in flowering Chinese cabbage. Combining bioinformatic and next-generation sequencing approaches, we identified heat-responsive siRNAs in four small RNA libraries of flowering Chinese cabbage using leaves collected at 0, 1, 6, and 12 h after a 38°C heat-stress treatment; 536, 816, and 829 siRNAs exhibited substantial differential expression at 1, 6, and 12 h, respectively. Seventy-five upregulated and 69 downregulated differentially expressed siRNAs (DE-siRNAs) were common to the three time points of heat stress. We identified 795 target genes of DE-siRNAs, including serine/threonine-protein kinase SRK2I, CTR1-like, disease resistance protein RML1A-like, and RPP1, which may play a role in regulating heat tolerance. Gene ontology analysis showed that predicted targets of DE-siRNAs may have key roles in the positive regulation of biological processes, organismal processes, responses to temperature stimulus, signaling, and growth and development. These novel results contribute to further understanding how siRNAs modulate the expression of their target genes to control heat tolerance in flowering Chinese cabbage.
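
Finding the DE-siRNAs shared by all three time points amounts to a set intersection over the per-time-point lists. The sketch below is illustrative only: the function name and siRNA identifiers are hypothetical, and the differential-expression call that would produce each list is not shown.

```python
def shared_across_timepoints(de_by_timepoint):
    """Return the identifiers present in every time point's DE list."""
    sets = [set(ids) for ids in de_by_timepoint.values()]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical upregulated siRNA identifiers per time point.
upregulated = {
    "1h": {"siR_1", "siR_2", "siR_3"},
    "6h": {"siR_2", "siR_3", "siR_4"},
    "12h": {"siR_2", "siR_3", "siR_5"},
}
common_up = shared_across_timepoints(upregulated)
```

Running the same function separately on the upregulated and downregulated lists keeps the two directions of change apart, matching how the abstract reports them.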


2017 ◽  
Author(s):  
Mark J.P. Chaisson ◽  
Ashley D. Sanders ◽  
Xuefang Zhao ◽  
Ankit Malhotra ◽  
David Porubsky ◽  
...  

Abstract The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, and strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three human parent–child trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per human genome. We also discover 156 inversions per genome, most of which previously escaped detection. Fifty-eight of the inversions we discovered intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The method and the dataset serve as a gold standard for the scientific community and we make specific recommendations for maximizing structural variation sensitivity for future large-scale genome sequencing studies.


2020 ◽  
Author(s):  
Yang Zhao ◽  
Qiye Wei ◽  
Tianci Chen ◽  
Lijuan Xu ◽  
Jing Liu ◽  
...  

Abstract Background: MicroRNAs (miRNAs) are a class of small non-coding RNAs that have been demonstrated to play essential roles in plant growth and development and in responses to abiotic stress. Heat stress is one of the most serious stresses affecting crop yield and quality; however, the related regulatory mechanisms of miRNAs remain poorly understood in maize. Results: In this study, a total of 340 miRNAs, including 215 known and 125 novel members, were identified from maize seedlings under heat stress (MH) and control conditions (MC) using a high-throughput sequencing approach. The 215 known miRNAs can be further divided into 40 different families, and 21-nt miRNAs were found to be the most abundant among the known miRNAs. Thirty-five miRNAs, including 26 known and 9 novel members, were significantly differentially expressed between the MC and MH libraries. Furthermore, 174 target genes were predicted to be cleaved by 115 miRNAs using degradome sequencing. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed for these targets to explore the biological functions and pathways involved. Based on the relationships among miRNAs, target genes, and the enrichment results, a regulatory network was constructed for the miRNAs and their respective target genes, and 16 significantly differentially expressed miRNAs (DEMs) were involved in the network. Conclusions: The results revealed novel insights into the roles of miRNAs in heat stress response and provided a useful foundation for understanding the regulatory mechanisms of heat-responsive miRNAs in maize.

