Complete chloroplast genome of Stephania tetrandra (Menispermaceae) from Zhejiang Province: insights into molecular structures, comparative genome analysis, mutational hotspots and phylogenetic relationships

Abstract Background: Stephania tetrandra S. Moore (S. tetrandra) is a medicinal plant of the family Menispermaceae with high medicinal value that merits further exploration. Wild S. tetrandra is widely distributed in tropical and subtropical regions of China, which may have generated genetic diversity and distinct population structures. Geographical origin is an important factor influencing the quality and market price of S. tetrandra. In addition, species relationships within the genus Stephania remain uncertain because of high morphological similarity and low support values in molecular analyses. Complete chloroplast (cp) genome data have become a promising means of determining geographical origin and understanding the evolution of closely related plant species. Herein, we sequenced the complete cp genome of S. tetrandra from Zhejiang Province and conducted a comparative analysis within Stephania to reveal structural variations, informative markers and phylogenetic relationships among Stephania species. Results: The cp genome of S. tetrandra voucher ZJ was 157,725 bp, consisting of a large single copy region (89,468 bp), a small single copy region (19,685 bp) and a pair of inverted repeat regions (24,286 bp each). A total of 134 genes were identified in the cp genome of S. tetrandra, including 87 protein-coding genes, 8 rRNA genes, 37 tRNA genes and 2 pseudogene copies (ycf1 and rps19). Comparative analysis showed that gene order and GC content were highly consistent among the Stephania species; in S. tetrandra, the highest RSCU value was for arginine (1.79) and the lowest was for serine. A total of 90 SSRs were identified in the cp genome of S. tetrandra, and repeats consisting of A or T bases were far more numerous than those consisting of G or C bases. In addition, 92 potential RNA editing sites were identified in 25 protein-coding genes, with the most predicted RNA editing sites in the ndhB gene. Variations in the length of the ycf1 gene and in the extent of its expansion across the junction were observed between S. tetrandra vouchers from different regions, indicating potential markers for geographical origin discrimination. Moreover, the transition to transversion ratios (Ts/Tv) in the Stephania species were significantly higher than 1 when Pericampylus glaucus was used as the reference. Comparative analysis of the Stephania cp genomes revealed 5 highly variable regions, including 3 intergenic regions (trnH-psbA, trnD-trnY, trnP) and two protein coding genes (rps16 and ndhA). The identified mutational hotspots of Stephania plants exhibited multiple SNP sites and gaps, as well as different Ka/Ks ratios. In addition, five pairs of specific primers targeting the divergent regions were designed, which could be used as potential molecular markers for species identification, population genetics and phylogenetic analysis in Stephania species. Phylogenetic analysis based on the conserved chloroplast protein coding genes indicated a sister relationship between S. tetrandra and the monophyletic group of S. japonica and S. kwangsiensis with high support values, suggesting a close genetic relationship within Stephania. However, two S. tetrandra vouchers from different regions failed to cluster into one clade, confirming the presence of intraspecific genetic diversity and calling for further investigation of geographical tracing strategies. Conclusions: Overall, we provide comprehensive and detailed information on the complete chloroplast genome and identify nucleotide diversity hotspots of Stephania species. The genetic resource obtained for S. tetrandra from Zhejiang Province will facilitate future studies of DNA barcoding, species discrimination, intraspecific and interspecific variability, and the phylogenetic relationships of Stephania plants.
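The Ts/Tv comparison mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes two sequences that are already aligned to the same length; the function name `ts_tv_ratio` and the toy sequences are hypothetical and are not taken from the study, whose actual alignment and tooling are not described in the abstract.

```python
# Purines and pyrimidines: a transition is a substitution within one class,
# a transversion is a substitution between the two classes.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_ratio(ref: str, query: str) -> float:
    """Return transitions / transversions for two pre-aligned sequences."""
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for a, b in zip(ref.upper(), query.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # skip matches, gaps and ambiguous bases
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ZeroDivisionError("no transversions observed")
    return ts / tv


# Toy alignment (invented): three transitions and one transversion.
ratio = ts_tv_ratio("AAGCTTAC", "AGGTTCAA")  # 3.0
```

A ratio above 1 means transitions outnumber transversions, which is the pattern the abstract reports for the Stephania species.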


Complete Chloroplast Genome Sequence of Justicia flava: Genome Comparative Analysis and Phylogenetic Relationships among Acanthaceae

BioMed Research International ◽

10.1155/2019/4370258 ◽

2019 ◽

Vol 2019 ◽

pp. 1-17 ◽

Cited By ~ 4

Author(s):

Samaila S. Yaradua ◽

Dhafer A. Alzahrani ◽

Enas J. Albokhary ◽

Abidina Abba ◽

Abubakar Bello

Keyword(s):

Comparative Analysis ◽

Chloroplast Genome ◽

Phylogenetic Relationships ◽

Inverted Repeat ◽

Gc Content ◽

Single Copy ◽

Protein Coding ◽

Complete Chloroplast Genome ◽

Protein Coding Genes ◽

Cp Genome

The complete chloroplast genome of J. flava, an endangered medicinal plant in Saudi Arabia, was sequenced and compared with the cp genomes of three Acanthaceae species to characterize the cp genome, identify SSRs, and detect variation among the cp genomes of the sampled Acanthaceae. NOVOPlasty was used to assemble the complete chloroplast genome from the whole genome data. The cp genome of J. flava was 150,888 bp in length with a GC content of 38.2% and has a quadripartite structure: one pair of inverted repeats (IRa and IRb, 25,500 bp each) separated by a large single copy (LSC, 82,995 bp) and a small single copy (SSC, 16,893 bp) region. There are 132 genes in the genome, including 80 protein coding genes, 30 tRNA, and 4 rRNA; 113 are unique while the remaining 19 are duplicated in the IR regions. The repeat analysis indicates that the genome contains all types of repeats, with palindromic repeats occurring most frequently; the analysis also identified a total of 98 simple sequence repeats (SSRs), the majority of which are A/T mononucleotides located in intergenic spacers. The comparative analysis with the other sampled cp genomes indicated that the inverted repeat regions are more conserved than the single copy regions and that the noncoding regions show a higher rate of variation than the coding regions. All the genomes have the ndhF and ycf1 genes at the border junction of IRb and SSC. Sequence divergence analysis of the protein coding genes showed that seven genes (petB, atpF, psaI, rpl32, rpl16, ycf1, and clpP) are under positive selection. The phylogenetic analysis revealed that Justiceae is sister to Ruellieae. This study reports the first cp genome of the largest genus in Acanthaceae and provides resources for studying the genetic diversity of J. flava as well as for resolving phylogenetic relationships within the core Acanthaceae.
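The quadripartite structure implies a simple consistency check: the LSC, the SSC and the two IR copies should add up to the reported genome length. The sketch below applies that check to the J. flava figures quoted in the abstract; the function name `quadripartite_total` is a hypothetical helper, not part of any published pipeline.

```python
def quadripartite_total(lsc: int, ssc: int, ir: int) -> int:
    """Genome length implied by one LSC, one SSC and two IR copies (all in bp)."""
    return lsc + ssc + 2 * ir


# Region sizes as reported in the J. flava abstract.
total = quadripartite_total(lsc=82_995, ssc=16_893, ir=25_500)
matches_reported_length = total == 150_888  # True
```

The same check can be run on any of the genomes listed on this page when all three region sizes are given.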


Complete Chloroplast Genome of Argania spinosa: Structural Organization and Phylogenetic Relationships in Sapotaceae

Plants ◽

10.3390/plants9101354 ◽

2020 ◽

Vol 9 (10) ◽

pp. 1354

Author(s):

Slimane Khayi ◽

Fatima Gaboun ◽

Stacy Pirro ◽

Tatiana Tatusova ◽

Abdelhamid El Mousadik ◽

...

Keyword(s):

Chloroplast Genome ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Protein Coding ◽

Important Species ◽

Complete Chloroplast Genome ◽

Argania Spinosa ◽

Protein Coding Genes ◽

Cp Genome

Argania spinosa (Sapotaceae), an important endemic Moroccan oil tree, is a primary source of argan oil, which has numerous dietary and medicinal properties. The species occupies the mid-western part of Morocco and provides great environmental and socioeconomic benefits. The complete chloroplast (cp) genome of A. spinosa was sequenced, assembled, and analyzed in comparison with those of two Sapotaceae members. The A. spinosa cp genome is 158,848 bp long, with an average GC content of 36.8%. The cp genome exhibits a typical quadripartite and circular structure consisting of a pair of inverted regions (IR) of 25,945 bp in length separating small single-copy (SSC) and large single-copy (LSC) regions of 18,591 and 88,367 bp, respectively. The annotation of the A. spinosa cp genome predicted 130 genes, including 85 protein-coding genes (CDS), 8 ribosomal RNA (rRNA) genes, and 37 transfer RNA (tRNA) genes. A total of 44 long repeats and 88 simple sequence repeats (SSRs) divided into mononucleotides (76), dinucleotides (7), trinucleotides (3), tetranucleotides (1), and hexanucleotides (1) were identified in the A. spinosa cp genome. Phylogenetic analyses using the maximum likelihood (ML) method were performed based on 69 protein-coding genes from 11 species of Ericales. The results confirmed the close position of A. spinosa to the genus Sideroxylon, supporting the revisiting of its taxonomic status. The complete chloroplast genome sequence will be valuable for further studies on the conservation and breeding of this medicinally and culinarily important species and will also contribute to clarifying the phylogenetic position of the species within Sapotaceae.
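The breakdown of SSRs by motif length can be illustrated with a small regex-based scan. This is only a sketch: the minimum repeat counts in `MIN_REPEATS` are assumed thresholds (the abstract does not state which were used), the function names are hypothetical, and the scan finds perfect repeats only, so it is not a substitute for a dedicated SSR tool.

```python
import re
from collections import Counter

# Minimum number of repeat units per motif length (assumed, not from the abstract).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    k = len(motif)
    return not any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_ssrs(seq: str):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-base motif followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def count_by_motif_length(hits) -> Counter:
    """Tally SSRs as mono- (1), di- (2), tri- (3) ... nucleotide repeats."""
    return Counter(len(motif) for _, motif, _ in hits)


# Toy sequence (invented): one A run, one AT repeat and one TTC repeat.
toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG" + "TTC" * 4 + "G"
ssrs = find_ssrs(toy)
```

The primitive-motif filter keeps a run such as twelve A bases from being counted again as a dinucleotide or trinucleotide repeat.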


The complete chloroplast genome of Saxifraga sinomontana (Saxifragaceae) and comparative analysis with other Saxifragaceae species

Revista Brasileira de Botânica ◽

10.1007/s40415-019-00561-y ◽

2019 ◽

Vol 42 (4) ◽

pp. 601-611 ◽

Cited By ~ 1

Author(s):

Yan Li ◽

Liukun Jia ◽

Zhihua Wang ◽

Rui Xing ◽

Xiaofeng Chi ◽

...

Keyword(s):

Comparative Analysis ◽

Chloroplast Genome ◽

Phylogenetic Relationships ◽

De Novo ◽

Single Copy ◽

Bootstrap Support ◽

Protein Coding ◽

Complete Chloroplast Genome ◽

Protein Coding Genes ◽

Chloroplast Genomes

Abstract Saxifraga sinomontana J.-T. Pan & Gornall belongs to Saxifraga sect. Ciliatae subsect. Hirculoideae, a lineage containing ca. 110 species whose phylogenetic relationships are largely unresolved due to recent rapid radiations. Analyses of complete chloroplast genomes have the potential to significantly improve the resolution of phylogenetic relationships in this young plant lineage. The complete chloroplast genome of S. sinomontana was sequenced and assembled de novo and then compared with those of six other Saxifragaceae species. The S. sinomontana chloroplast genome is 147,240 bp in length with a typical quadripartite structure, including a large single-copy region of 79,310 bp and a small single-copy region of 16,874 bp separated by a pair of inverted repeats (IRs) of 25,528 bp each. The chloroplast genome contains 113 unique genes, including 79 protein-coding genes, four rRNAs and 30 tRNAs, with 18 duplicates in the IRs. The gene content and organization are similar to those of other Saxifragaceae chloroplast genomes. Sixty-one simple sequence repeats were identified in the S. sinomontana chloroplast genome, mostly mononucleotide repeats of polyadenine or polythymine. Comparative analysis revealed 12 highly divergent regions in the intergenic spacers, as well as in the coding genes matK, ndhK, accD, cemA, rpoA, rps19, ndhF, ccsA, ndhD and ycf1. Phylogenetic reconstruction of seven Saxifragaceae species based on 66 protein-coding genes received high bootstrap support values for nearly all identified nodes, suggesting a promising opportunity to resolve infrasectional relationships of Ciliatae, the most species-rich section of Saxifraga.


Deciphering tea tree chloroplast and mitochondrial genomes of Camellia sinensis var. assamica

10.1101/532853 ◽

2019 ◽

Author(s):

Fen Zhang ◽

Wei Li ◽

Cheng-wen Gao ◽

Li-zhi Gao

Keyword(s):

Rna Editing ◽

Camellia Sinensis ◽

De Novo ◽

Single Copy ◽

Rrna Genes ◽

Protein Coding ◽

Tea Tree ◽

Protein Coding Genes ◽

Cp Genome ◽

Mt Genome

Abstract Tea is the oldest and most popular non-alcoholic caffeine-containing beverage in the world. Despite its enormous industrial, cultural and medicinal value, the chloroplast (cp) and mitochondrial (mt) genomes are not available for Camellia sinensis var. assamica. In this study, we de novo assembled the cp genome sequence of C. sinensis var. assamica into a circular contig of 157,100 bp in length with an overall GC content of 37.29%, comprising a large single-copy region (LSC, 86,649 bp) and a small single-copy region (SSC, 18,285 bp) separated by a pair of inverted repeats (IRs, 26,083 bp). We annotated a total of 141 cp genes, of which 87 are protein-coding genes, 46 are tRNA genes, and eight are rRNA genes. We also de novo assembled the mt genome of C. sinensis var. assamica into two complete circular scaffolds (702,253 bp and 178,082 bp) with overall GC contents of 45.63% and 45.81%, respectively. We annotated a total of 71 mt genes, including 44 protein-coding genes, 24 tRNAs, and 3 rRNAs. Comparative analysis suggests that the mt genome is repeat-rich compared to the cp genome, with, for example, 37,878 bp and 149 bp of long repeat sequences and 665 and 214 SSRs, respectively. We also detected 478 RNA-editing sites in 42 protein-coding mt genes, which are ∼4.4-fold more than 54 RNA-editing sites detected in 21 protein-coding cp genes. The high-quality cp and mt genomes of C. sinensis var. assamica presented in this study will become an invaluable resource for a range of genetic, functional, evolutionary and comparative genomic studies in tea tree and other Camellia species of the Theaceae family.
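The RNA-editing comparison can be worked through numerically. The raw site counts (478 versus 54) differ by a factor of roughly 8.9, while the average number of sites per edited gene differs by roughly 4.4, so one plausible reading is that the ∼4.4-fold figure refers to per-gene density. That reading is an assumption rather than something the abstract states, and the variable names below are illustrative only.

```python
# Counts quoted in the abstract.
mt_sites, mt_genes = 478, 42  # mitochondrial RNA-editing sites / edited genes
cp_sites, cp_genes = 54, 21   # chloroplast RNA-editing sites / edited genes

# Ratio of raw site counts.
raw_fold = mt_sites / cp_sites

# Ratio of average sites per edited gene (assumed interpretation of "4.4-fold").
mt_per_gene = mt_sites / mt_genes
cp_per_gene = cp_sites / cp_genes
per_gene_fold = mt_per_gene / cp_per_gene

print(f"raw fold: {raw_fold:.2f}, per-gene fold: {per_gene_fold:.2f}")
```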


Complete Chloroplast Genome Sequence of Erigeron breviscapus and Characterization of Chloroplast Regulatory Elements

Frontiers in Plant Science ◽

10.3389/fpls.2021.758290 ◽

2021 ◽

Vol 12 ◽

Author(s):

Yifan Yu ◽

Zhen Ouyang ◽

Juan Guo ◽

Wen Zeng ◽

Yujun Zhao ◽

...

Keyword(s):

Chloroplast Genome ◽

Single Copy ◽

Regulatory Elements ◽

Rrna Genes ◽

Expression Vectors ◽

Protein Coding ◽

Protein Coding Genes ◽

Flanking Sequences ◽

Erigeron Breviscapus ◽

Cp Genome

Erigeron breviscapus is a well-known medicinal plant. However, limited chloroplast genome information for E. breviscapus, especially a lack of chloroplast DNA sequence resources, has hindered the study of E. breviscapus chloroplast genome transformation. Here, the complete chloroplast (cp) genome of E. breviscapus is reported. This genome was 152,164 bp in length, had a GC content of 37.2% and was structurally arranged into two 24,699 bp inverted repeats (IRs) and two single-copy regions. The sizes of the large single-copy region and the small single-copy region were 84,657 and 18,109 bp, respectively. The E. breviscapus cp genome consisted of 127 coding genes, including 83 protein coding genes, 36 transfer RNA (tRNA) genes, and eight ribosomal RNA (rRNA) genes. Of these, 95 were single-copy genes and 16 were duplicated in the two inverted repeats (seven tRNAs, four rRNAs, and five protein coding genes). Genomic DNA of E. breviscapus was then used as a template, and the endogenous 5' and 3' flanking sequences of the trnI gene and trnA gene were selected as homologous recombinant fragments for vector construction and cloned by PCR. The endogenous 5' flanking sequences of the psbA gene and rrn16S gene, the endogenous 3' flanking sequences of the psbA gene, rbcL gene, and rps16 gene, and one sequence element from the psbN-psbH chloroplast operon were cloned, and certain chloroplast regulatory elements were identified. Two homologous recombination fragments and all of these elements were constructed into the cloning vector pBluescript SK (+) to yield a series of chloroplast expression vectors, which harbored the reporter gene EGFP and the selectable marker gene aadA. After identification, the chloroplast expression vectors were transformed into Escherichia coli, and the function of the predicted regulatory elements was confirmed by a spectinomycin resistance test and fluorescence intensity measurement. The results indicated that the aadA and EGFP genes were efficiently expressed under the regulation of the predicted regulatory elements and that the chloroplast expression vector had been successfully constructed, thereby providing a solid foundation for establishing an E. breviscapus chloroplast transformation system and for the genetic improvement of E. breviscapus.


Characterization of the complete chloroplast genome sequence and phylogenetic analysis of B. oleracea var. italica

10.21203/rs.2.20976/v1 ◽

2020 ◽

Author(s):

Zhenchao Zhang ◽

Zhongliang Dai ◽

Yuemei Yao ◽

Yongfei Pan ◽

Guosheng Sun ◽

...

Keyword(s):

Chloroplast Genome ◽

Genome Sequence ◽

Genomic Structure ◽

Gc Content ◽

Single Copy ◽

Biological Research ◽

Protein Coding ◽

Protein Coding Genes ◽

Cp Genome ◽

Functional Components

Abstract Backgrounds: Broccoli (Brassica oleracea var. italica L.) is known as one of the most nutritionally rich vegetables and is rich in functional components that benefit health. The main purposes of this research were to sequence, assemble and annotate the chloroplast genome of broccoli using the Illumina HiSeq2500 sequencing platform. Results: The size of the broccoli cp genome is 153,364 bp, including two inverted repeat (IR) regions of 26,197 bp each, separated by a small single copy (SSC) region of 17,834 bp and a large single copy (LSC) region of 83,136 bp. The GC content of the complete genome is 36.36%, while those of SSC, LSC, and IR are 29.1%, 34.15% and 42.35%, respectively. It harbors 134 functional genes, including 87 protein-coding genes, 39 tRNAs and 8 rRNAs, with 31 duplicates in the IRs. The most abundant amino acid in the protein-coding genes is leucine, while the least abundant is cysteine. Codon usage frequency showed a bias for A/T-ending codons in the cp genome. In the repeat structure analysis, a total of 34 repeat sequences and 291 simple sequence repeats (SSRs) were detected. Although cp genomic structure and size are highly conserved, the SC-IR boundary regions are variable between the 7 cp genomes. The phylogenetic relationships based on complete cp genomes from 9 species suggest that B. oleracea var. italica is closely related to Brassica juncea. Conclusions: The complete cp genome sequence was obtained and annotated for broccoli for the first time. The information acquired from this research will be useful for further species identification, population genetics and biological research of broccoli.
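The regional GC values and the whole-genome GC value are linked by a length-weighted average, which can be checked directly from the numbers in the abstract. The sketch below assumes the reported IR value of 42.35% applies to each of the two IR copies; the function name `weighted_gc` is hypothetical.

```python
def weighted_gc(regions):
    """regions: iterable of (length_bp, gc_percent); returns genome-wide GC percent."""
    regions = list(regions)
    total_len = sum(length for length, _ in regions)
    gc_bases = sum(length * gc / 100 for length, gc in regions)
    return 100 * gc_bases / total_len


# Region lengths and GC percentages as reported in the broccoli abstract.
broccoli = [
    (83_136, 34.15),  # LSC
    (17_834, 29.10),  # SSC
    (26_197, 42.35),  # IRa (assumed to share the reported IR value)
    (26_197, 42.35),  # IRb (assumed to share the reported IR value)
]

genome_gc = weighted_gc(broccoli)
print(f"{genome_gc:.2f}%")  # prints 36.36%
```

The result agrees with the reported whole-genome GC content of 36.36%, which also confirms that the four regions sum to 153,364 bp.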


Analyzing and characterization of the chloroplast genome of Salix suchowensis

10.7287/peerj.preprints.2388 ◽

2016 ◽

Author(s):

Congrui Sun ◽

Jie Li ◽

Xiaogang Dai ◽

Yingnan Chen

Keyword(s):

Tandem Repeats ◽

Gene Annotation ◽

Repetitive Sequences ◽

Single Copy ◽

Phylogenetic Position ◽

Shrub Willow ◽

Protein Coding ◽

Protein Coding Genes ◽

Cp Genome ◽

Rna Genes

By screening sequence reads from the chloroplast (cp) genome of S. suchowensis generated by next generation sequencing platforms, we built the complete circular pseudomolecule for its cp genome. This pseudomolecule is 155,508 bp in length and has a typical quadripartite structure containing two single copy regions, a large single copy region (LSC, 84,385 bp) and a small single copy region (SSC, 16,209 bp), separated by inverted repeat regions (IRs, 27,457 bp). Gene annotation revealed that the cp genome of S. suchowensis encodes 119 unique genes, including 4 ribosomal RNA genes, 30 transfer RNA genes, 82 protein-coding genes and 3 pseudogenes. Analysis of the repetitive sequences detected 15 tandem repeats, 16 forward repeats and 5 palindromic repeats. In addition, a total of 188 perfect microsatellites were detected, with A/T predominating in nucleotide composition. Significant shifting of the IR/SSC boundaries was revealed by comparing this cp genome with those of other rosid plants. We also built phylogenetic trees to demonstrate the phylogenetic position of S. suchowensis in Rosidae, using 66 orthologous protein-coding genes present in the cp genomes of 32 species. By sequencing 30 amplicons based on the pseudomolecule, experimental verification showed an accuracy of up to 99.84% for the cp genome assembly of S. suchowensis. In conclusion, this study built a high quality pseudomolecule for the cp genome of S. suchowensis, which is a useful resource for facilitating the development of this shrub willow into a more productive bioenergy crop.


Complete Chloroplast Genome of Michelia shiluensis and a Comparative Analysis with Four Magnoliaceae Species

Forests ◽

10.3390/f11030267 ◽

2020 ◽

Vol 11 (3) ◽

pp. 267 ◽

Cited By ~ 2

Author(s):

Yanwen Deng ◽

Yiyang Luo ◽

Yu He ◽

Xinsheng Qin ◽

Chonggao Li ◽

...

Keyword(s):

Comparative Analysis ◽

Chloroplast Genome ◽

Single Copy ◽

Closely Related Species ◽

Protein Coding ◽

Complete Chloroplast Genome ◽

Phylogenetic Studies ◽

Natural Reproduction ◽

Rare And Endangered ◽

Landscape Gardening

Michelia shiluensis is a rare and endangered magnolia species found in South China. This species produces beautiful flowers and is thus widely used in landscape gardening. Its timber is also used for furniture production. As a result of low rates of natural reproduction and increasing levels of human impact, wild M. shiluensis populations have become fragmented. This species is now classified as endangered by the IUCN. In the present study, we characterized the complete chloroplast genome of M. shiluensis and found it to be 160,075 bp in length with two inverted repeat regions (26,587 bp each), a large single-copy region (88,105 bp), and a small single-copy region (18,796 bp). The genome contained 131 genes, including 86 protein-coding genes, 37 tRNAs, and 8 rRNAs. The guanine-cytosine content represented 39.26% of the overall genome. Comparative analysis revealed high similarity between the M. shiluensis chloroplast genome and those of four closely related species: Michelia odora, Magnolia laevifolia, Magnolia insignis, and Magnolia cathcartii. Phylogenetic analysis shows that M. shiluensis is most closely related to M. odora. The genomic information presented in this study is valuable for further classification and phylogenetic studies and for supporting ongoing conservation efforts.


The Complete Chloroplast Genome Sequence of the Speirantha gardenii: Comparative and Adaptive Evolutionary Analysis

Agronomy ◽

10.3390/agronomy10091405 ◽

2020 ◽

Vol 10 (9) ◽

pp. 1405

Author(s):

Gurusamy Raman ◽

SeonJoo Park

Keyword(s):

Chloroplast Genome ◽

Genome Sequence ◽

Gene Evolution ◽

Single Copy ◽

Specific Marker ◽

Evolutionary Analysis ◽

Protein Coding ◽

Complete Chloroplast Genome ◽

Protein Coding Genes ◽

Chloroplast Genome Sequence

The plant “False Lily of the Valley”, Speirantha gardenii, is restricted to south-east China and is considered endemic. Owing to its limited availability, the plant has been little studied. This study therefore focuses on its molecular characterization: we sequenced the complete chloroplast genome of S. gardenii, which is the first report of a chloroplast genome sequence for Speirantha. The complete S. gardenii chloroplast genome is 156,869 bp in length with 37.6% GC and includes a pair of inverted repeats (IRs), each of 26,437 bp, that separate a large single-copy (LSC) region of 85,368 bp and a small single-copy (SSC) region of 18,627 bp. The chloroplast genome comprises 81 protein-coding, 30 tRNA and four rRNA unique genes. Furthermore, a total of 699 repeats and 805 simple-sequence repeat (SSR) markers were identified in the genome. Additionally, KA/KS nucleotide substitution analysis showed that seven protein-coding genes are highly diverged and identified nine amino acid sites under potentially positive selection in these genes. Phylogenetic analyses suggest that S. gardenii has a closer genetic relationship to the genera Reineckea, Rohdea and Convallaria. The present study will provide insights into developing lineage-specific markers for genetic diversity and gene evolution studies in the Nolinoideae taxa.


Characterization of the complete chloroplast genome sequence of Vitis vinifera ‘Guifeimeigui’

Scientific Journal of Genetics and Gene Therapy ◽

10.17352/sjggt.000019 ◽

2021 ◽

pp. 001-003

Author(s):

Liu Li ◽

Yang Yang ◽

Li Xiujie ◽

Li Bo

Keyword(s):

Vitis Vinifera ◽

Gc Content ◽

Single Copy ◽

Protein Coding ◽

Complete Chloroplast Genome ◽

Chloroplast Genome Sequence ◽

Cp Genome ◽

Eurasian Species ◽

Rna Genes

Vitis vinifera ‘Guifeimeigui’ is a diploid table grape of a Eurasian species. This study reports the complete chloroplast (cp) genome of Vitis vinifera ‘Guifeimeigui’ for the first time. The complete cp genome is 160,928 bp in size with a GC content of 37.38% and includes a pair of inverted repeats (26,353 bp each) separated by large (89,150 bp) and small (19,072 bp) single-copy regions. It encodes 85 genes, including 40 protein coding genes, 37 transfer RNA (tRNA) genes, and 8 ribosomal RNA (rRNA) genes. The Maximum Likelihood (ML) phylogenetic tree demonstrated that Vitis vinifera ‘Guifeimeigui’ is close to Vitis vinifera.
