Integrative genome, transcriptome, microRNA, and degradome analysis of water dropwort (Oenanthe javanica) in response to water stress

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Jie-Xia Liu ◽  
Qian Jiang ◽  
Jian-Ping Tao ◽  
Kai Feng ◽  
Tong Li ◽  
...  

Abstract Water dropwort (Liyang Baiqin, Oenanthe javanica (Bl.) DC.) is an aquatic perennial plant from the Apiaceae family with abundant protein, dietary fiber, vitamins, and minerals. It usually grows in wet soils and can even grow in water. Here, whole-genome sequencing of O. javanica via HiSeq 2000 sequencing technology was reported for the first time. The genome size was 1.28 Gb, including 42,270 genes, of which 93.92% could be functionally annotated. An online database of the whole-genome sequences of water dropwort, Water dropwortDB, was established to share the results and facilitate further research on O. javanica (database homepage: http://apiaceae.njau.edu.cn/waterdropwortdb). Water dropwortDB offers whole-genome and transcriptome sequences and a Basic Local Alignment Search Tool. Comparative analysis with other species showed that the evolutionary relationship between O. javanica and Daucus carota was the closest. Twenty-five gene families of O. javanica were found to be expanded, and some genetic factors (such as genes and miRNAs) related to phenotypic and anatomic differentiation in O. javanica under different water conditions were further investigated. Two miRNA and target gene pairs (miR408 and Oja15472, miR171 and Oja47040) were remarkably regulated by water stress. The obtained reference genome of O. javanica provides important information for future work, thus making in-depth genetic breeding and gene editing possible. The present study also provides a foundation for the understanding of the O. javanica response to water stress, including morphological, anatomical, and genetic differentiation.

Author(s):  
Kailash Chandra Samal ◽  
Jyoti Prakash Sahoo ◽  
Laxmipreeya Behera ◽  
Trupti Dash

Bioinformatics is a branch of science that deals with the acquisition, storage, analysis and dissemination of biological data with the help of computer science and information technology. It can analyze vast quantities of biological data quickly and cost-effectively. In the past decades, an enormous amount of sequence information has been generated due to advances in DNA and protein sequencing techniques. Estimating similarities between biological sequences is becoming necessary to obtain hidden information present within the sequences and to trace the evolutionary relationships that exist among them. This sequence comparison can be achieved with the basic local alignment search tool (BLAST), which has therefore become a fundamental tool of life science research. Hence it is essential to know how to do sequence comparison using BLAST and how to accurately interpret the BLAST output data. The present article aims to familiarize biologists and researchers with the different BLAST programs and their use in research programs.
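To make the idea of local sequence comparison concrete, the following is a minimal sketch of a Smith-Waterman local alignment score. This is not the BLAST algorithm itself: BLAST uses a faster seed-and-extend heuristic, whereas Smith-Waterman is the exact dynamic-programming method for scoring local alignments. The function name and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for illustration only.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Illustrative scoring scheme (an assumption, not BLAST's defaults):
    +match for identical residues, +mismatch otherwise, +gap per gap position.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # The shared core "ACGT" is flanked by unrelated residues on both sides.
    print(smith_waterman_score("GGACGTCC", "TTACGTAA"))
```

A higher score indicates a longer or better-conserved shared region; tools such as BLAST additionally report statistical significance, which this sketch does not attempt.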


2019 ◽  
Author(s):  
Jose Manuel Martí ◽  
Carlos P. Garay

Abstract Since its introduction in 1990 and with over 50k citations, the NCBI BLAST family has been an essential tool of in silico molecular biology. The BLAST nt database, based on the traditional divisions of GenBank, has been the default and most comprehensive database for nucleotide BLAST searches and for taxonomic classification software in metagenomics. Here we argue that this is no longer the case. Currently, the NCBI WGS database contains one billion reads (almost five times more than GenBank), and with 4.4 trillion nucleotides, WGS has about 14 times more nucleotides than GenBank. This ratio is growing with time. We advocate a change in the database paradigm in taxonomic classification by systematically combining the nt and WGS databases in order to boost the sensitivity of taxonomic classifiers. We present here a case in which, by adding WGS data, we obtained over five times more classified reads and with a higher confidence score. To facilitate the adoption of this approach, we provide the draftGenomes script.

Author summary Culture-independent methods are revolutionizing biology. The NIH/NCBI Basic Local Alignment Search Tool (BLAST) is one of the most widely used methods in computational biology. The BLAST nt database has become a de facto standard for taxonomic classifiers in metagenomics. We believe that it is time for a change in the database paradigm for such a classification. We advocate the systematic combination of the BLAST nt database with genomes of the massive NCBI Whole-Genome Shotgun (WGS) database. We make draftGenomes available, a script that eases the adoption of this approach. Current developments and technologies make it feasible now. Our recent results in several metagenomic projects indicate that this strategy boosts the sensitivity in taxonomic classifications.


2019 ◽  
Vol 14 (2) ◽  
pp. 157-163
Author(s):  
Majid Hajibaba ◽  
Mohsen Sharifi ◽  
Saeid Gorgin

Background: One of the pivotal challenges in today's genomic research is the fast processing of voluminous data such as those generated by high-throughput Next-Generation Sequencing technologies. On the other hand, BLAST (Basic Local Alignment Search Tool), a long-established and renowned tool in Bioinformatics, has been shown to be incredibly slow in this regard. Objective: To improve the performance of BLAST in the processing of voluminous data, we have applied a novel memory-aware technique to BLAST for faster parallel processing. Method: We have used a master-worker model alongside a memory-aware technique in which the master partitions the whole data into equal chunks, one chunk for each worker, and each worker then further splits and formats its allocated data chunk according to the size of its memory. Each worker searches each split piece one by one against a list of queries. Results: We have chosen a list of queries with different lengths to run insensitive searches in a huge database called UniProtKB/TrEMBL. Our experiments show a 20 percent improvement in performance when workers used our proposed memory-aware technique compared to when they were not memory aware. Comparatively, experiments show an even higher performance improvement, approximately 50 percent, when we applied our memory-aware technique to mpiBLAST. Conclusion: We have shown that memory-awareness in formatting bulky databases, when running BLAST, can improve performance significantly, while preventing unexpected crashes in low-memory environments. Even though distributed computing attempts to mitigate search time by partitioning and distributing database portions, our memory-aware technique alleviates the negative effects of page faults on performance.
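The partitioning scheme described in the Method above can be sketched roughly as follows. This is a simplified, single-process illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, memory cost is approximated by sequence length in characters, workers are simulated sequentially, and an exact substring match stands in for the actual BLAST search.

```python
from typing import List, Sequence, Tuple


def partition_equal(records: Sequence[str], n_workers: int) -> List[List[str]]:
    """Master step: split records into n_workers nearly equal contiguous chunks."""
    if n_workers <= 0:
        raise ValueError("n_workers must be positive")
    base, extra = divmod(len(records), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)
        chunks.append(list(records[start:start + size]))
        start += size
    return chunks


def split_by_memory(chunk: Sequence[str], memory_budget: int) -> List[List[str]]:
    """Worker step: split a chunk into pieces whose total length fits the budget.

    A single record larger than the budget still gets a piece of its own.
    """
    pieces, current, used = [], [], 0
    for rec in chunk:
        cost = len(rec)  # assumed memory cost: one unit per character
        if current and used + cost > memory_budget:
            pieces.append(current)
            current, used = [], 0
        current.append(rec)
        used += cost
    if current:
        pieces.append(current)
    return pieces


def search_piece(piece: Sequence[str], queries: Sequence[str]) -> List[Tuple[str, str]]:
    """Placeholder search: exact substring match standing in for BLAST."""
    return [(q, rec) for rec in piece for q in queries if q in rec]


def memory_aware_search(database: Sequence[str], queries: Sequence[str],
                        n_workers: int, memory_budget: int) -> List[Tuple[str, str]]:
    """Run every query against every memory-sized piece of every worker's chunk."""
    hits = []
    for chunk in partition_equal(database, n_workers):       # one chunk per worker
        for piece in split_by_memory(chunk, memory_budget):  # pieces searched one by one
            hits.extend(search_piece(piece, queries))
    return hits


if __name__ == "__main__":
    db = ["AAAA", "CCGG", "GGGT", "TTAA"]
    print(memory_aware_search(db, ["GG", "AA"], n_workers=2, memory_budget=4))
```

The point of the second split is that each piece a worker loads stays within its memory budget, which is how the abstract explains the reduction in page faults; a real deployment would size the budget from the worker's available RAM rather than a fixed number.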


Crop Science ◽  
2011 ◽  
Vol 51 (1) ◽  
pp. 157-172 ◽  
Author(s):  
Kristen A. Leach ◽  
Lindsey G. Hejlek ◽  
Leonard B. Hearne ◽  
Henry T. Nguyen ◽  
Robert E. Sharp ◽  
...  

Genetics ◽  
2000 ◽  
Vol 156 (3) ◽  
pp. 1249-1257
Author(s):  
Ilya Ruvinsky ◽  
Lee M Silver ◽  
Jeremy J Gibson-Brown

Abstract The duplication of preexisting genes has played a major role in evolution. To understand the evolution of genetic complexity it is important to reconstruct the phylogenetic history of the genome. A widely held view suggests that the vertebrate genome evolved via two successive rounds of whole-genome duplication. To test this model we have isolated seven new T-box genes from the primitive chordate amphioxus. We find that each amphioxus gene generally corresponds to two or three vertebrate counterparts. A phylogenetic analysis of these genes supports the idea that a single whole-genome duplication took place early in vertebrate evolution, but cannot exclude the possibility that a second duplication later took place. The origin of additional paralogs evident in this and other gene families could be the result of subsequent, smaller-scale chromosomal duplications. Our findings highlight the importance of amphioxus as a key organism for understanding evolution of the vertebrate genome.


2021 ◽  
Vol 7 (6) ◽  
pp. 453
Author(s):  
Annie Lebreton ◽  
François Bonnardel ◽  
Yu-Cheng Dai ◽  
Anne Imberty ◽  
Francis M. Martin ◽  
...  

Fungal lectins are a large family of carbohydrate-binding proteins with no enzymatic activity. They play fundamental biological roles in the interactions of fungi with their environment and are found in many different species across the fungal kingdom. In particular, their contribution to defense against feeders has been emphasized, and when secreted, lectins may be involved in the recognition of bacteria, fungal competitors and specific host plants. Carbohydrate specificities and quaternary structures vary widely, but evidence for an evolutionary relationship within the different classes of fungal lectins is supported by a high degree of amino acid sequence identity. The UniLectin3D database contains 194 fungal lectin 3D structures, of which 129 are characterized with a carbohydrate ligand. Using the UniLectin3D lectin classification system, 109 lectin sequence motifs were defined to screen 1223 species deposited in the genomic portal MycoCosm of the Joint Genome Institute. The resulting 33,485 putative lectin sequences are organized in MycoLec, a publicly available and searchable database. These results shed light on the evolution of the lectin gene families in fungi.


Agriculture ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 244
Author(s):  
Seung Hee Eom ◽  
Tae Kyung Hyun

Histone deacetylases (HDACs) are known as erasers that remove acetyl groups from lysine residues in histones. Although plant HDACs play essential roles in physiological processes, including various stress responses, our knowledge concerning HDAC gene families and their evolutionary relationship remains limited. In the Brassica rapa genome, we identified 20 HDAC genes, which are divided into three major groups: RPD3/HDA1, HD2, and SIR2 families. In addition, seven pairs of segmentally duplicated paralogs and one pair of tandemly duplicated paralogs were identified in the B. rapa HDAC (BraHDAC) family, indicating that segmental duplication is predominant for the expansion of the BraHDAC genes. The expression patterns of paralogous gene pairs suggest a divergence in the function of BraHDACs under various stress conditions. Furthermore, we suggest that BraHDA3 (a homolog of Arabidopsis HDA14) encodes a functional HDAC enzyme, which can be inhibited by the Class I/II HDAC inhibitor SAHA. As a first step toward understanding the epigenetic responses to environmental stresses in Chinese cabbage, our results provide a solid foundation for functional analysis of the BraHDAC family.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Xiaoming Song ◽  
Qihang Yang ◽  
Yun Bai ◽  
Ke Gong ◽  
Tong Wu ◽  
...  

Abstract Simple sequence repeats (SSRs) are one of the most important genetic markers and widely exist in most species. Here, we identified 249,822 SSRs from 3,951,919 genes in 112 plants. Then, we conducted a comprehensive analysis of these SSRs and constructed a plant SSR database (PSSRD). Interestingly, more SSRs were found in lower plants than in higher plants, showing that lower plants needed to adapt to early extreme environments. Four specific enriched functional terms in the lower plant Chlamydomonas reinhardtii were detected when it was compared with seven other higher plants. In addition, Guanylate_cyc existed in more genes of lower plants than of higher plants. In our PSSRD, we constructed an interactive plotting function in the chart interface, and users can easily view the detailed information of SSRs. All SSR information, including sequences, primers, and annotations, can be downloaded from our database. Moreover, we developed Web SSR Finder and Batch SSR Finder tools, which can be easily used for identifying SSRs. Our database was developed using PHP, HTML, JavaScript, and MySQL, and is freely available at http://www.pssrd.info/. We conducted an analysis of the Myb gene families and flowering genes as two applications of the PSSRD. Further analysis indicated that whole-genome duplication and whole-genome triplication played a major role in the expansion of the Myb gene families. These SSR markers in our database will greatly facilitate comparative genomics and functional genomics studies in the future.
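As a rough illustration of what identifying SSRs involves, the sketch below finds perfect tandem repeats with a regular expression. It is not the PSSRD Web SSR Finder or Batch SSR Finder; the function name, the motif length limit of 6 bp, and the single `min_repeats` threshold are assumptions made for this example (SSR tools commonly apply different repeat thresholds per motif length).

```python
import re
from typing import List, Tuple


def find_ssrs(sequence: str, min_repeats: int = 3,
              max_motif_len: int = 6) -> List[Tuple[int, str, int]]:
    """Find perfect tandem repeats in a DNA sequence.

    Returns (start, motif, repeat_count) tuples with 0-based start positions.
    Thresholds are illustrative assumptions, not the PSSRD settings.
    """
    if min_repeats < 2:
        raise ValueError("min_repeats must be at least 2")
    # The lazy quantifier tries the shortest motif first, so "ATATAT" is
    # reported as motif "AT" repeated three times rather than a longer unit.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif_len, min_repeats - 1)
    )
    results = []
    for m in pattern.finditer(sequence.upper()):
        motif = m.group(1)
        results.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return results


if __name__ == "__main__":
    print(find_ssrs("GGATATATATCC"))
```

Only perfect repeats are detected here; imperfect or compound SSRs, as well as primer design, are outside the scope of this sketch.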


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Dimitri Boeckaerts ◽  
Michiel Stock ◽  
Bjorn Criel ◽  
Hans Gerstmans ◽  
Bernard De Baets ◽  
...  

Abstract Nowadays, bacteriophages are increasingly considered as an alternative treatment for a variety of bacterial infections in cases where classical antibiotics have become ineffective. However, characterizing the host specificity of phages remains a labor- and time-intensive process. In order to alleviate this burden, we have developed a new machine-learning-based pipeline to predict bacteriophage hosts based on annotated receptor-binding protein (RBP) sequence data. We focus on predicting bacterial hosts from the ESKAPE group, Escherichia coli, Salmonella enterica and Clostridium difficile. We compare the performance of our predictive model with that of the widely used Basic Local Alignment Search Tool (BLAST). Our best-performing predictive model reaches Precision-Recall Area Under the Curve (PR-AUC) scores between 73.6 and 93.8% for different levels of sequence similarity in the collected data. Our model reaches a performance comparable to that of BLASTp when sequence similarity in the data is high and starts outperforming BLASTp when sequence similarity drops below 75%. Therefore, our machine learning methods can be especially useful in settings in which sequence similarity to other known sequences is low. Predicting the hosts of novel metagenomic RBP sequences could extend our toolbox to tune the host spectrum of phages or phage tail-like bacteriocins by swapping RBPs.

