Synchronous dissection of chloroplast and mitochondrial genomes remodels the intra- and inter-genus phylogeny for the agriculturally important genus Brassica

2019
Author(s):  
Jiangwei Qiao ◽  
Xiaojun Zhang ◽  
Biyun Chen ◽  
Fei Huang ◽  
Kun Xu ◽  
...  

Abstract Background: The genus Brassica mainly comprises three diploid and three recently derived allotetraploid species, most of which are highly important vegetable, oil or ornamental crops cultivated worldwide. Despite extensive study, the origin of the allotetraploid crops and the overall phylogeny of the genus Brassica are still far from completely resolved, which has greatly hindered the development of novel Brassica crops. Here, we target and integrate chloroplast DNA and mitochondrial DNA to investigate genetic diversity and relationships in large plant populations centered on the genus Brassica. Results: Phylogenetic analyses based on a data set including 72 de novo assembled whole chloroplast genomes delineated a comprehensive evolutionary atlas inside and around the genus Brassica. The maternal origins of B. juncea and B. carinata are both monophyletic, from cam-type B. rapa and B. nigra, respectively. In contrast, current B. napus contains three major cytoplasmic haplotypes: the cam-type, which was directly inherited from B. rapa; the polima-type, which is sister to the cam-type; and the predominant nap-type. Intriguingly, the nap-type appears phylogenetically integrated with certain sparse C-genome wild species, implying that these species may have primarily contributed the cytoplasm and the corresponding C subgenome to B. napus. The creation of B. napus cytoplasmic male sterile lines (e.g., mori and nsa) through human breeding dramatically disturbed the concurrent inheritance of mtDNA and cpDNA. Strong parallel evolution among the genera Raphanus, Sinapis, Eruca and Moricandia with Brassica indicates their incomplete divergence from each other. Conclusions: The overall variation data and elaborated phylogenetic relationships obtained herein can substantially facilitate the development of novel Brassica crops, e.g. allotetraploid rapeseed with new cytonuclear integrations and allohexaploid rapeseed.

2020
Author(s):  
Jiangwei Qiao ◽  
Xiaojun Zhang ◽  
Biyun Chen ◽  
Fei Huang ◽  
Kun Xu ◽  
...  

Abstract Background: The genus Brassica mainly comprises three diploid and three recently derived allotetraploid species, which are highly important vegetable, oil or ornamental crops cultivated worldwide. Despite extensive study, the origin of B. napus and the detailed interspecific relationships within the genus Brassica remain unresolved and somewhat confused. By synchronous sequencing of both chloroplast DNA and mitochondrial DNA, the whole Brassica phylogeny and the origin of the predominant nap-type B. napus have been clarified based on a large plant population that maximally integrates the known Brassica species. Results: Phylogenetic analyses based on a data set including 72 de novo assembled whole chloroplast genomes delineated a comprehensive evolutionary atlas inside and around the genus Brassica. Unlike B. juncea and B. carinata, whose maternal origins are monophyletic from cam-type B. rapa and B. nigra, respectively, natural B. napus has multiple maternal origins. It contains three major cytoplasmic haplotypes: the cam-type, which was directly inherited from B. rapa; the polima-type, which is sister to the cam-type; and the predominant nap-type. Intriguingly, the nap-type appears phylogenetically integrated with certain sparse C-genome wild species, implying that these species may have primarily contributed the cytoplasm and the corresponding C subgenome to B. napus. The creation of B. napus cytoplasmic male sterile lines (e.g., mori and nsa) through human breeding has dramatically disturbed the concurrent inheritance of mtDNA and cpDNA. Strong parallel evolution among the genera Raphanus, Sinapis, Eruca and Moricandia with Brassica indicates their incomplete divergence from each other. Conclusions: The elaborated phylogenetic relationships and overall variation data obtained herein can substantially facilitate the development of novel Brassica germplasm and the improvement of Brassica crops.


2019
Author(s):  
Agnes Scheunert ◽  
Marco Dorfner ◽  
Thomas Lingl ◽  
Christoph Oberprieler

Abstract The chloroplast genome harbors plenty of valuable information for phylogenetic research. Illumina short-read data are generally used for de novo assembly of whole plastomes. PacBio or Oxford Nanopore long reads are additionally employed in hybrid approaches to enable assembly across the highly similar inverted repeats of a chloroplast genome. Unlike for PacBio, plastome assemblies based solely on Nanopore reads are rarely found, due to their high error rate and non-random error profile. However, the actual quality decline connected to their use has never been quantified. Furthermore, no study has employed reference-based assembly using Nanopore reads, which is common with Illumina data. Using Leucanthemum Mill. as an example, we compared the sequence quality of seven plastome assemblies of the same species, using combinations of two sequencing platforms and three analysis pipelines. In addition, we assessed the factors which might influence Nanopore assembly quality during sequence generation and bioinformatic processing. The consensus sequence derived from de novo assembly of Nanopore data had a sequence identity of 99.59% compared to the Illumina short-read de novo assembly. Most of the errors found are indels (81.5%), and a large majority of them lie in homopolymer regions. The quality of reference-based assembly is heavily dependent upon the choice of a close-enough reference. Using a reference with 0.83% sequence divergence from the studied species, mapping of Nanopore reads results in a consensus comparable to that from Nanopore de novo assembly, and of only slightly inferior quality compared to a reference-based assembly with Illumina data (0.49% and 0.26% divergence from Illumina de novo). For optimal assembly of Nanopore data, appropriate filtering of contaminants and chimeric sequences, as well as employing moderate read coverage, is essential. Based on these results, we conclude that Nanopore long reads are a suitable alternative to Illumina short reads in plastome phylogenomics. Only a few errors remain in the finalized assembly, and these can be easily masked in phylogenetic analyses without loss of analytical accuracy. This easily applicable and cost-effective technology might warrant more attention from researchers dealing with plant chloroplast genomes.
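The identity and indel figures quoted in this abstract can be illustrated with a minimal Python sketch. It assumes two sequences that have already been pairwise aligned with `-` as the gap character; the function name `compare_aligned` is hypothetical, and it counts indels per alignment column rather than per event, which may differ from how the authors scored errors.

```python
def compare_aligned(ref, query):
    """Summarise differences between two pre-aligned sequences.

    Hypothetical helper: gaps are '-', and each gapped column counts as
    one indel position (an assumption, not the authors' stated method).
    """
    if len(ref) != len(query):
        raise ValueError("aligned sequences must have equal length")

    matches = substitutions = indels = 0
    for r, q in zip(ref.upper(), query.upper()):
        if r == "-" and q == "-":
            continue  # column with no information in either sequence
        if r == "-" or q == "-":
            indels += 1
        elif r == q:
            matches += 1
        else:
            substitutions += 1

    compared = matches + substitutions + indels
    errors = substitutions + indels
    return {
        "matches": matches,
        "substitutions": substitutions,
        "indels": indels,
        # Percentage of compared columns that are identical.
        "identity_pct": 100.0 * matches / compared if compared else 0.0,
        # Share of all differences that are indel columns.
        "indel_share_pct": 100.0 * indels / errors if errors else 0.0,
    }


if __name__ == "__main__":
    # Toy alignment, not data from the study.
    print(compare_aligned("ACGTACGTAA", "ACGTTCG-AA"))
```

On real plastome alignments, the same counts would usually be taken from the output of an aligner rather than from hand-aligned strings.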


PeerJ
2017
Vol 5
pp. e3919
Author(s):  
Hui Cheng ◽  
Jinfeng Li ◽  
Hong Zhang ◽  
Binhua Cai ◽  
Zhihong Gao ◽  
...  

Compared with other members of the family Rosaceae, the chloroplast genomes of Fragaria species exhibit low variation, and this situation has limited phylogenetic analyses; thus, complete chloroplast genome sequencing of Fragaria species is needed. In this study, we sequenced the complete chloroplast genome of F. × ananassa ‘Benihoppe’ using the Illumina HiSeq 2500-PE150 platform and then performed a combination of de novo assembly and reference-guided mapping of contigs to generate complete chloroplast genome sequences. The chloroplast genome exhibits a typical quadripartite structure with a pair of inverted repeats (IRs, 25,936 bp) separated by large (LSC, 85,531 bp) and small (SSC, 18,146 bp) single-copy (SC) regions. The length of the F. × ananassa ‘Benihoppe’ chloroplast genome is 155,549 bp, representing the smallest Fragaria chloroplast genome observed to date. The genome encodes 112 unique genes, comprising 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Comparative analysis of the overall nucleotide sequence identity among ten complete chloroplast genomes confirmed that for both coding and non-coding regions in Rosaceae, SC regions exhibit higher sequence variation than IRs. The Ka/Ks ratio of most genes was less than 1, suggesting that most genes are under purifying selection. Moreover, the mVISTA results also showed a high degree of conservation in genome structure, gene order and gene content in Fragaria, particularly among three octoploid strawberries, namely F. × ananassa ‘Benihoppe’, F. chiloensis (GP33) and F. virginiana (O477). However, when the sequences of the coding and non-coding regions of F. × ananassa ‘Benihoppe’ were compared in detail with those of F. chiloensis (GP33) and F. virginiana (O477), a number of SNPs and InDels were revealed by MEGA 7.
Six non-coding regions (trnK-matK, trnS-trnG, atpF-atpH, trnC-petN, trnT-psbD and trnP-psaJ) with a percentage of variable sites greater than 1% and no fewer than five parsimony-informative sites were identified and may be useful for phylogenetic analysis of the genus Fragaria.
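The selection criterion for these regions (more than 1% variable sites and at least five parsimony-informative sites) can be sketched in Python as follows. This is an illustration only: the function names `site_stats` and `is_candidate_marker` are hypothetical, the input is assumed to be a list of equal-length aligned sequences, and gaps and ambiguous bases are ignored when counting states, whereas tools such as MEGA offer several ways of handling them.

```python
from collections import Counter


def site_stats(alignment):
    """Return (variable_sites, parsimony_informative_sites, length).

    A site is variable if it shows at least two different bases. It is
    parsimony-informative if at least two different bases each occur in
    at least two sequences.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")

    variable = informative = 0
    for column in zip(*alignment):
        # Assumption: only unambiguous bases are counted; gaps are skipped.
        counts = Counter(b.upper() for b in column if b.upper() in "ACGT")
        if len(counts) >= 2:
            variable += 1
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative, length


def is_candidate_marker(alignment, min_variable_pct=1.0, min_informative=5):
    """Apply the thresholds described in the abstract to one region."""
    variable, informative, length = site_stats(alignment)
    variable_pct = 100.0 * variable / length
    return variable_pct > min_variable_pct and informative >= min_informative


if __name__ == "__main__":
    # Toy alignment, not data from the study.
    toy = ["AAAA", "AAAT", "ACAT", "ACAA"]
    print(site_stats(toy))
    print(is_candidate_marker(toy))
```

The default thresholds mirror the values given in the text; both are parameters so that other cut-offs can be tried on a different data set.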


Plants
2020
Vol 9 (6)
pp. 737
Author(s):  
Abdullah ◽  
Claudia L. Henriquez ◽  
Furrukh Mehmood ◽  
Iram Shahzadi ◽  
Zain Ali ◽  
...  

The chloroplast genome provides insight into the evolution of plant species. We de novo assembled and annotated chloroplast genomes of four genera representing three subfamilies of Araceae: Lasia spinosa (Lasioideae), Stylochaeton bogneri, Zamioculcas zamiifolia (Zamioculcadoideae), and Orontium aquaticum (Orontioideae), and performed comparative genomics using these chloroplast genomes. The sizes of the chloroplast genomes ranged from 163,770 bp to 169,982 bp. These genomes comprise 113 unique genes, including 79 protein-coding, 4 rRNA, and 30 tRNA genes. Among these genes, 17–18 genes are duplicated in the inverted repeat (IR) regions, comprising 6–7 protein-coding (including trans-splicing gene rps12), 4 rRNA, and 7 tRNA genes. The total number of genes ranged between 130 and 131. The infA gene was found to be a pseudogene in all four genomes reported here. These genomes exhibited high similarities in codon usage, amino acid frequency, RNA editing sites, and microsatellites. The oligonucleotide repeats and junctions JSB (IRb/SSC) and JSA (SSC/IRa) were highly variable among the genomes. The patterns of IR contraction and expansion were shown to be homoplasious, and therefore unsuitable for phylogenetic analyses. Signatures of positive selection were seen in three genes in S. bogneri, including ycf2, clpP, and rpl36. This study is a valuable addition to the evolutionary history of chloroplast genome structure in Araceae.


2019
Vol 19 (1)
Author(s):  
Juan Wang ◽  
Yuan Li ◽  
Chunjuan Li ◽  
Caixia Yan ◽  
Xiaobo Zhao ◽  
...  

Abstract Background The cultivated peanut (Arachis hypogaea) is one of the most important oilseed crops worldwide; however, its improvement is restricted by its narrow genetic base. The highly variable wild peanut species, especially within Sect. Arachis, may serve as a rich genetic source of favorable alleles for peanut improvement; Sect. Arachis is the largest taxonomic section within the genus Arachis, and its members include the cultivated peanut. To make good use of these wild resources, the genetic bases and relationships of the Arachis species first need to be better understood. Results In this study, we sequenced and/or assembled twelve complete Arachis chloroplast (cp) genomes (eleven from Sect. Arachis). These cp genome sequences enrich the published Arachis cp genome data. From the twelve cp genomes, substantial genetic variation (1368 SNDs, 311 indels) was identified, which, together with 69 SSR loci identified from the same data set, will provide powerful tools for future explorations. Phylogenetic analyses grouped the Sect. Arachis species into two major lineages (I & II). This result, together with reports from many earlier studies, shows that lineage II is dominated by AA genome species that are mostly perennial, while lineage I includes species that have more diverse genome types and are mostly annual/biennial. Moreover, the cultivated peanut and A. monticola, the only tetraploid (AABB) species within Arachis, are nested within the AA genome species-dominated lineage; this result, together with the maternal inheritance of the chloroplast, indicates a maternal origin of the two tetraploid species from an AA genome species. Conclusion In summary, we have acquired sequences of twelve complete Arachis cp genomes, which have not only helped us better understand how the cultivated peanut and its close wild relatives are related, but also provided rich genetic resources that may hold great potential for future peanut breeding.


Plants
2020
Vol 9 (11)
pp. 1443
Author(s):  
Rachele Tamburino ◽  
Lorenza Sannino ◽  
Donata Cafasso ◽  
Concita Cantarella ◽  
Luigi Orrù ◽  
...  

In various crops, genetic bottlenecks occurring through domestication can limit crop resilience to biotic and abiotic stresses. In the present study, we investigated nucleotide diversity in the tomato chloroplast genome by sequencing seven plastomes of cultivated accessions from the Campania region (Southern Italy) and two wild species, one among the closest (Solanum pimpinellifolium) and one among the most distantly related (S. neorickii) species to cultivated tomatoes. Comparative analyses of the chloroplast genomes sequenced in this work and those available in GenBank allowed us to evaluate plastome variability and define phylogenetic relationships. A dramatic reduction in genetic diversity was detected in cultivated tomatoes; nonetheless, a few de novo mutations that still differentiate the cultivated tomatoes from the closest wild relative, S. pimpinellifolium, were detected and are potentially usable as diagnostic markers. Phylogenetic analyses confirmed that S. pimpinellifolium is the closest ancestor of all cultivated tomatoes. Local accessions all clustered together and were closely related to other cultivated tomatoes (S. lycopersicum group). Notably, S. lycopersicum var. cerasiforme appeared as a mixture of cultivated and wild tomato genotypes, since one of the two analyzed accessions clustered with cultivated tomato and the other with S. pimpinellifolium. Overall, our results revealed very reduced cytoplasmic variability in cultivated tomatoes and suggest the occurrence of a cytoplasmic bottleneck during their domestication.


2013
Author(s):  
Deren A. R. Eaton

Restriction-site associated genomic markers are a powerful tool for investigating evolutionary questions at the population level, but are limited in their utility at deeper phylogenetic scales where fewer orthologous loci are typically recovered across disparate taxa. While this limitation stems in part from mutations to restriction recognition sites that disrupt data generation, an alternative source of data loss comes from the failure to identify homology during bioinformatic analyses. Clustering methods that allow for lower similarity thresholds and the inclusion of indel variation will perform better at assembling RADseq loci at the phylogenetic scale. PyRAD is a pipeline to assemble de novo RADseq loci with the aim of optimizing coverage across phylogenetic data sets. It utilizes a wrapper around an alignment-clustering algorithm which allows for indel variation within and between samples, as well as for incomplete overlap among reads (e.g., paired-end). Here I compare PyRAD with the program Stacks in their performance analyzing a simulated RADseq data set that includes indel variation. Indels disrupt clustering of homologous loci in Stacks but not in PyRAD, such that the latter recovers more shared loci across disparate taxa. I show through re-analysis of an empirical RADseq data set that indels are a common feature of such data, even at shallow phylogenetic scales. PyRAD utilizes parallel processing as well as an optional hierarchical clustering method which allow it to rapidly assemble phylogenetic data sets with hundreds of sampled individuals.


2020
Author(s):  
Jiangwei Qiao ◽  
Xiaojun Zhang ◽  
Biyun Chen ◽  
Fei Huang ◽  
Kun Xu ◽  
...  

Abstract Background: The genus Brassica mainly comprises three diploid and three recently derived allotetraploid species, most of which are highly important vegetable, oil or ornamental crops cultivated worldwide. Despite extensive study, the origin of B. napus and certain detailed interspecific relationships within the genus Brassica remain undetermined and somewhat confused. In the current high-throughput sequencing era, a systematic comparative genomic study based on a large population is necessary and would be crucial to resolve these questions. Results: Chloroplast DNA and mitochondrial DNA were synchronously resequenced in a selected set of Brassica materials comprising 72 accessions, which maximally integrates the known Brassica species. Genome-wide cpDNA and mtDNA variations in Brassica have been identified. Detailed phylogenetic relationships inside and around the genus Brassica have been delineated by phylogenies derived from the cpDNA and mtDNA variations. Unlike B. juncea and B. carinata, natural B. napus contains three major cytoplasmic haplotypes: the cam-type, which was directly inherited from B. rapa; the polima-type, which is sister to the cam-type; and the mysterious but predominant nap-type. Certain sparse C-genome wild species might have primarily contributed the nap-type cytoplasm and the corresponding C subgenome to B. napus, as implied by their co-clustering in both phylogenies. The strictly concurrent inheritance of mtDNA and cpDNA was dramatically disturbed in the B. napus cytoplasmic male sterile lines (e.g., mori and nsa). The genera Raphanus, Sinapis, Eruca and Moricandia show strong parallel evolutionary relationships with Brassica. Conclusions: The overall variation data and elaborated phylogenetic relationships provide further insights into the genetic understanding of Brassica, which can substantially facilitate the development of novel Brassica germplasm.


2020
Author(s):  
Abdullah ◽  
Claudia L. Henriquez ◽  
Furrukh Mehmood ◽  
Iram Shahzadi ◽  
Zain Ali ◽  
...  

Abstract The chloroplast genome provides insight into the evolution of plant species. We de novo assembled and annotated chloroplast genomes of the first representatives of four genera representing three subfamilies: Lasia spinosa (Lasioideae), Stylochaeton bogneri, Zamioculcas zamiifolia (Zamioculcadoideae), and Orontium aquaticum (Orontioideae), and performed comparative genomics using the plastomes. The sizes of the chloroplast genomes ranged from 163,770 bp to 169,982 bp. These genomes comprise 114 unique genes, including 80 protein-coding, 4 rRNA, and 30 tRNA genes. These genomes exhibited high similarities in codon usage, amino acid frequency, RNA editing sites, and microsatellites. The junctions JSB (IRb/SSC) and JSA (SSC/IRa) are highly variable among the genomes, as is the oligonucleotide repeat content. The patterns of inverted repeat contraction and expansion were shown to be homoplasious and therefore unsuitable for phylogenetic analyses. Signatures of positive selection were shown for several genes in S. bogneri. This study is a valuable addition to the evolutionary history of chloroplast genome structure in Araceae.


2021
Vol 18 (1)
Author(s):  
César Augusto Diniz Xavier ◽  
Margaret Louise Allen ◽  
Anna Elizabeth Whitfield

Abstract Background Advances in sequencing and analysis tools have facilitated the discovery of many new viruses from invertebrates, including ants. Solenopsis invicta is an invasive ant that has quickly spread worldwide, causing significant ecological and economic impacts. Its virome has begun to be characterized with a view to the potential use of viruses as natural enemies. Although the S. invicta virome is the best characterized among ants, most studies have been performed in its native range, with less information from invaded areas. Methods Using a metatranscriptome approach, we further identified and molecularly characterized virus sequences associated with S. invicta in two introduced areas, the U.S. and Taiwan. The data set used here was obtained from different stages (larvae, pupae, and adults) of the S. invicta life cycle. Publicly available RNA sequences from GenBank’s Sequence Read Archive were downloaded and de novo assembled using CLC Genomics Workbench 20.0.1. Contigs were compared against the non-redundant protein sequences, and those showing similarity to viral sequences were further analyzed. Results We characterized five putative new viruses associated with S. invicta transcriptomes. Sequence comparisons revealed extensive divergence across ORFs and genomic regions, with most of them sharing less than 40% amino acid identity with the closest homologous sequences previously characterized. The first negative-sense single-stranded RNA virus genomic sequences included in the orders Bunyavirales and Mononegavirales are reported. In addition, two positive-sense single-stranded RNA virus genome sequences and one single-stranded DNA virus genome sequence were also identified. While the presence of a putative tenuivirus associated with S. invicta was previously suggested to be a contamination, here we characterize Solenopsis invicta virus 14 (SINV-14) and present strong evidence that it is a tenui-like virus that has a long-term association with the ant. Furthermore, based on virus sequence abundance compared to housekeeping genes, phylogenetic relationships, and completeness of viral coding sequences, our results suggest that four of the five virus sequences reported, namely SINV-14, SINV-15, SINV-16 and SINV-17, may be associated with viruses actively replicating in the ant S. invicta. Conclusions The present study expands our knowledge of the viral diversity associated with S. invicta in introduced areas; these viruses have the potential to be used as biological control agents, which will require further biological characterization.

