MGDb: An analyzed database and a genomic resource of mango (Mangifera indica L.) cultivars for mango research

2018 ◽  
Author(s):  
Tayyaba Qamar-ul-Islam ◽  
M. Ahmed Khan ◽  
Rabia Faizan ◽  
Uzma Mahmood

Abstract Mango is one of the best-known subtropical/tropical fruit crops and the fifth most important worldwide, with production centered in India and South-East Asia. Recently, there has been worldwide interest in mango genomics to produce tools for marker-assisted selection and trait association. No web-based analyzed genomic resource has been available specifically for mango. Hence, a complete mango genomic resource was required to improve research on, and management of, mango germplasm. In this project, we performed comparative transcriptome analysis of four mango cultivars, i.e. cv. Langra, cv. Zill, cv. Shelly and cv. Kent, from Pakistan, China, Israel, and Mexico respectively. The raw data were obtained through de novo sequence assembly, which generated 30,953-85,036 unigenes from RNA-Seq datasets of the mango cultivars. The project aims to provide the scientific community and the general public with a mango genomic resource and to allow users to examine their data against our analyzed mango genome databases of the four cultivars (cv. Langra, cv. Zill, cv. Shelly and cv. Kent). The mango web genomic resource, MGDb, is based on a 3-tier architecture and was developed using Python, a flat-file database, and JavaScript. It contains information on the predicted genes of the whole genome, the unigenes annotated by homologous genes in other species, and GO (Gene Ontology) terms, which provide a glimpse of the traits in which they are involved. This web genomic resource can be of immense use in the assessment of research, the development of medicines, and the understanding of genetics, and it provides a useful bioinformatics solution for the analysis of nucleotide sequence data. We report here the world's first web-based genomic resource specifically for mango, intended for genetic improvement and management of the mango genome.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Ian S. E. Bally ◽  
Aureliano Bombarely ◽  
Alan H. Chambers ◽  
Yuval Cohen ◽  
...  

Abstract Background Mango, Mangifera indica L., an important tropical fruit crop, is grown for its sweet and aromatic fruits. Past improvement of this species has predominantly relied on chance seedlings derived from over 1000 cultivars in the Indian sub-continent with a large variation for fruit size, yield, biotic and abiotic stress resistance, and fruit quality among other traits. Historically, mango has been an orphan crop with very limited molecular information. Only recently have molecular and genomics-based analyses enabled the creation of linkage maps, transcriptomes, and diversity analysis of large collections. Additionally, the combined analysis of genomic and phenotypic information is poised to improve mango breeding efficiency. Results This study sequenced, de novo assembled, analyzed, and annotated the genome of the monoembryonic mango cultivar ‘Tommy Atkins’. The draft genome sequence was generated using NRGene de-novo Magic on high molecular weight DNA of ‘Tommy Atkins’, supplemented by 10X Genomics long read sequencing to improve the initial assembly. A hybrid population between ‘Tommy Atkins’ x ‘Kensington Pride’ was used to generate phased haplotype chromosomes and a highly resolved phased SNP map. The final ‘Tommy Atkins’ genome assembly was a consensus sequence that included 20 pseudomolecules representing the 20 chromosomes of mango and included ~ 86% of the ~ 439 Mb haploid mango genome. Skim sequencing identified ~ 3.3 M SNPs using the ‘Tommy Atkins’ x ‘Kensington Pride’ mapping population. Repeat masking identified 26,616 genes with a median length of 3348 bp. A whole genome duplication analysis revealed an ancestral 65 MYA polyploidization event shared with Anacardium occidentale. Two regions, one on LG4 and one on LG7 containing 28 candidate genes, were associated with the commercially important fruit size characteristic in the mapping population. 
Conclusions The availability of the complete ‘Tommy Atkins’ mango genome will aid global initiatives to study mango genetics.



Author(s):  
Torbjørn Rognes ◽  
Tomáš Flouri ◽  
Ben Nichols ◽  
Christopher Quince ◽  
Frédéric Mahé

Background. VSEARCH is an open source and free of charge multithreaded 64-bit tool for processing metagenomics nucleotide sequence data. It is designed as an alternative to the widely used USEARCH tool (Edgar 2010) for which the source code is not publicly available, algorithm details are only rudimentarily described, and only a memory-confined 32-bit version is freely available for academic use. Methods. When searching nucleotide sequences, VSEARCH uses a fast heuristic based on words shared by the query and target sequences in order to quickly identify similar sequences, a similar strategy is probably used in USEARCH. VSEARCH then performs optimal global sequence alignment of the query against potential target sequences, using full dynamic programming instead of the seed-and-extend heuristic used by USEARCH. Pairwise alignments are computed in parallel using vectorisation and multiple threads. Results. VSEARCH includes most commands for analysing nucleotide sequences available in USEARCH version 7 and several of those available in USEARCH version 8, including searching (exact or based on global alignment), clustering by similarity (using length pre-sorting, abundance pre-sorting or a user-defined order), chimera detection (reference-based or de novo), dereplication (full length or prefix), pairwise alignment, reverse complementation, sorting, and subsampling. VSEARCH also includes commands for FASTQ file processing, i.e. format detection, filtering, read quality statistics, and merging of paired reads. Furthermore, VSEARCH extends functionality with several new commands and improvements, including shuffling, rereplication, masking of low-complexity sequences with the well-known DUST algorithm, a choice among different similarity definitions, and FASTQ file format conversion. VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-ends read merging. 
VSEARCH is slower than USEARCH when performing clustering and chimera detection, but significantly faster when performing paired-end reads merging and dereplication. VSEARCH is available at https://github.com/torognes/vsearch under either the BSD 2-clause license or the GNU General Public License version 3.0. Discussion. VSEARCH has been shown to be a fast, accurate and full-fledged alternative to USEARCH. A free and open-source versatile tool for sequence analysis is now available to the metagenomics community.



2017 ◽  
Author(s):  
Waqasuddin Khan ◽  
Safina Abdul Razzak ◽  
M. Kamran Azim

Abstract Mango is an economically important fruit crop of many tropical and subtropical countries. Recently, leaf and fruit transcriptomes of mango cultivars grown in different geographical regions have been characterized. Here, we present a comparative transcriptome analysis of four mango cultivars, i.e. cv. Langra, cv. Zill, cv. Shelly and cv. Kent, from Pakistan, China, Israel and Mexico respectively. De novo sequence assembly generated 30,953-85,036 unigenes from RNA-Seq datasets of the mango cultivars. KEGG pathway mapping of mango unigenes identified terpenoid, flavonoid and carotenoid biosynthetic pathways involved in flavor and color. The analysis revealed linalool as the major monoterpenoid found in all cultivars studied, whereas the monoterpene α-terpineol was found specifically in cv. Shelly. The diterpene gibberellin biosynthesis pathway was found in all cultivars, whereas the homoterpene synthase involved in biosynthesis of 4,8,12-trimethyltrideca-1,3,7,11-tetraene (TMTT; an insect-induced diterpene) was found in cv. Kent. Among sesquiterpenes and triterpenes, the biosynthetic pathway of Germacrene-D, an antibacterial and anti-insecticidal metabolite, was found in cv. Zill and cv. Shelly. Two bioactive triterpenes, lupeol and β-amyrin, were found in cv. Langra and cv. Zill. Unigenes involved in biosynthesis of the carotenoids β-carotene and lycopene were found in the cultivars studied. Many unigenes involved in flavonoid biosynthesis were also found. Comparative transcriptomics revealed naringenin (an anti-inflammatory and antioxidant metabolite) as the ‘central’ flavanone responsible for biosynthesis of an array of flavonoids. The present study provides insights into the genetic resources responsible for the flavor and color of mango fruit.



PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e2584 ◽  
Author(s):  
Torbjørn Rognes ◽  
Tomáš Flouri ◽  
Ben Nichols ◽  
Christopher Quince ◽  
Frédéric Mahé

Background VSEARCH is an open source and free of charge multithreaded 64-bit tool for processing and preparing metagenomics, genomics and population genomics nucleotide sequence data. It is designed as an alternative to the widely used USEARCH tool (Edgar, 2010) for which the source code is not publicly available, algorithm details are only rudimentarily described, and only a memory-confined 32-bit version is freely available for academic use. Methods When searching nucleotide sequences, VSEARCH uses a fast heuristic based on words shared by the query and target sequences in order to quickly identify similar sequences, a similar strategy is probably used in USEARCH. VSEARCH then performs optimal global sequence alignment of the query against potential target sequences, using full dynamic programming instead of the seed-and-extend heuristic used by USEARCH. Pairwise alignments are computed in parallel using vectorisation and multiple threads. Results VSEARCH includes most commands for analysing nucleotide sequences available in USEARCH version 7 and several of those available in USEARCH version 8, including searching (exact or based on global alignment), clustering by similarity (using length pre-sorting, abundance pre-sorting or a user-defined order), chimera detection (reference-based or de novo), dereplication (full length or prefix), pairwise alignment, reverse complementation, sorting, and subsampling. VSEARCH also includes commands for FASTQ file processing, i.e., format detection, filtering, read quality statistics, and merging of paired reads. Furthermore, VSEARCH extends functionality with several new commands and improvements, including shuffling, rereplication, masking of low-complexity sequences with the well-known DUST algorithm, a choice among different similarity definitions, and FASTQ file format conversion. 
VSEARCH is here shown to be more accurate than USEARCH when performing searching, clustering, chimera detection and subsampling, while on a par with USEARCH for paired-ends read merging. VSEARCH is slower than USEARCH when performing clustering and chimera detection, but significantly faster when performing paired-end reads merging and dereplication. VSEARCH is available at https://github.com/torognes/vsearch under either the BSD 2-clause license or the GNU General Public License version 3.0. Discussion VSEARCH has been shown to be a fast, accurate and full-fledged alternative to USEARCH. A free and open-source versatile tool for sequence analysis is now available to the metagenomics community.
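
The two-stage search described in the Methods above (a shared-word prefilter followed by full dynamic-programming global alignment) can be illustrated with a minimal Python sketch. This is not VSEARCH's implementation: the function names, the word length, the candidate limit, and the linear gap scoring are all assumptions chosen for illustration, and the vectorisation and multithreading mentioned in the abstract are omitted.

```python
def kmers(seq, k):
    """Return the set of distinct words of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_targets(query, targets, k=8, max_candidates=2):
    """Prefilter: rank targets by the number of distinct words shared with the query.

    `targets` maps a (hypothetical) sequence name to its nucleotide string.
    Targets sharing no word with the query are dropped.
    """
    query_words = kmers(query, k)
    scored = [(len(query_words & kmers(seq, k)), name) for name, seq in targets.items()]
    scored = [(shared, name) for shared, name in scored if shared > 0]
    scored.sort(key=lambda item: (-item[0], item[1]))  # most shared words first
    return [name for _, name in scored[:max_candidates]]


def global_align_score(a, b, match=2, mismatch=-4, gap=-2):
    """Optimal global alignment score (Needleman-Wunsch, linear gap penalty).

    The scoring values are illustrative assumptions, not VSEARCH defaults.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diagonal, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def search(query, targets, k=8):
    """Return the name of the best-aligning prefiltered target, or None."""
    candidates = rank_targets(query, targets, k)
    if not candidates:
        return None
    return max(candidates, key=lambda name: global_align_score(query, targets[name]))
```

Only the few targets that survive the cheap word-sharing filter are passed to the quadratic-time alignment, which is what makes exact global alignment affordable in a sketch like this one.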



Plant Disease ◽  
2021 ◽  
Author(s):  
Feifei Sun ◽  
Suli Sun ◽  
Wenwu Ye ◽  
Canxing Duan ◽  
Benjin Li ◽  
...  

Phytophthora vignae is an important oomycete pathogen causing Phytophthora stem rot on some Vigna species. Three P. vignae isolates, obtained from mung bean, adzuki bean and cowpea, respectively, exhibited high similarities in morphology and physiology but are specialized to infect different hosts. Here we report the first de novo assemblies of the draft genomes of three P. vignae isolates, generated using the PacBio SMRT Sequel platform. This study will extend the genomic resources available for the Phytophthora genus and provide a good foundation for further research on the comparative genomics of Phytophthora species and on the interaction mechanisms between hosts and pathogens.






2020 ◽  
Author(s):  
R.J.S Orr ◽  
M. M. Sannum ◽  
S. Boessenkool ◽  
E. Di Martino ◽  
D.P. Gordon ◽  
...  

Abstract Resolution of relationships at lower taxonomic levels is crucial for answering many evolutionary questions, and as such, sufficiently varied species representation is vital. This latter goal is not always achievable with relatively fresh samples. To alleviate the difficulties in procuring rarer taxa, we have seen increasing utilization of historical specimens in building molecular phylogenies using high throughput sequencing. This effort, however, has mainly focused on large-bodied or well-studied groups, with small-bodied and under-studied taxa under-prioritized. Here, we present a pipeline that utilizes both historical and contemporary specimens, to increase the resolution of phylogenetic relationships among understudied and small-bodied metazoans, namely, cheilostome bryozoans. In this study, we pioneer sequencing of air-dried bryozoans, utilizing a recent library preparation method for low DNA input. We use the de novo mitogenome assembly from the target specimen itself as reference for iterative mapping, and the comparison thereof. In doing so, we present mitochondrial and ribosomal RNA sequences of 43 cheilostomes representing 37 species, including 14 from historical samples ranging from 50 to 149 years old. The inferred phylogenetic relationships of these samples, analyzed together with publicly available sequence data, are shown in a statistically well-supported 65 taxa and 17 genes cheilostome tree. Finally, the methodological success is emphasized by circularizing a total of 27 mitogenomes, seven from historical cheilostome samples. Our study highlights the potential of utilizing DNA from micro-invertebrate specimens stored in natural history collections for resolving phylogenetic relationships between species.



Author(s):  
Guangtu Gao ◽  
Susana Magadan ◽  
Geoffrey C Waldbieser ◽  
Ramey C Youngblood ◽  
Paul A Wheeler ◽  
...  

Abstract Currently, there is still a need to improve the contiguity of the rainbow trout reference genome and to use multiple genetic backgrounds that will represent the genetic diversity of this species. The Arlee doubled haploid line originated from a domesticated hatchery strain that was originally collected from the northern California coast. The Canu pipeline was used to generate the Arlee line genome de novo assembly from high-coverage PacBio long-read sequence data. The assembly was further improved with Bionano optical maps and Hi-C proximity ligation sequence data to generate 32 major scaffolds corresponding to the karyotype of the Arlee line (2N = 64). It is composed of 938 scaffolds with an N50 of 39.16 Mb and a total length of 2.33 Gb, of which ∼95% was in 32 chromosome sequences with only 438 gaps between contigs and scaffolds. In rainbow trout the haploid chromosome number can vary from 29 to 32. In the Arlee karyotype the haploid chromosome number is 32 because chromosomes Omy04, 14 and 25 are divided into six acrocentric chromosomes. Additional structural variations that were identified in the Arlee genome included the major inversions on chromosomes Omy05 and Omy20 and 15 additional smaller inversions that will require further validation. This is also the first rainbow trout genome assembly that includes a scaffold with the sex-determination gene (sdY) in the chromosome Y sequence. The utility of this genome assembly is demonstrated through the improved annotation of the duplicated genome loci that harbor the IGH genes on chromosomes Omy12 and Omy13.
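
The scaffold N50 quoted in this abstract is a standard contiguity statistic: the length L such that scaffolds of length L or longer together cover at least half of the total assembly length. A minimal Python sketch of that definition follows; the function name and the toy lengths in the usage line are assumptions for illustration and are not values from the Arlee assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length of
    the sequence at which the running total first reaches half of the overall
    total length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe test for "at least half"
            return length


# Toy example (hypothetical lengths): total is 20, and 6 + 5 = 11 covers half.
print(n50([2, 3, 4, 5, 6]))
```

A higher N50 at a fixed total length means the assembly is concentrated in fewer, longer scaffolds, which is why the statistic is used to report contiguity.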


