Comparison and Phylogenetic Analysis of Chloroplast Genomes of Three Medicinal and Edible Amomum Species

Amomum villosum is an important medicinal and edible plant with several pharmacologically active volatile oils. However, identifying A. villosum from A. villosum var. xanthioides and A. longiligulare which exhibit similar morphological characteristics to A. villosum, is difficult. The main goal of this study, therefore, is to mine genetic resources and improve molecular methods that could be used to distinguish these species. A total of eight complete chloroplasts (cp) genomes of these Amomum species which were collected from the main producing areas in China were determined to be 163,608–164,069 bp in size. All genomes displayed a typical quadripartite structure with a pair of inverted repeat (IR) regions (29,820–29,959 bp) that separated a large single copy (LSC) region (88,680–88,857 bp) from a small single copy (SSC) region (15,288–15,369 bp). Each genome encodes 113 different genes with 79 protein-coding genes, 30 tRNA genes, and four rRNA genes. More than 150 SSRs were identified in the entire cp genomes of these three species. The Sanger sequencing results based on 32 Amomum samples indicated that five highly divergent regions screened from cp genomes could not be used to distinguish Amomum species. Phylogenetic analysis showed that the cp genomes could not only accurately identify Amomum species, but also provide a solid foundation for the establishment of phylogenetic relationships of Amomum species. The availability of cp genome resources and the comparative analysis is beneficial for species authentication and phylogenetic analysis in Amomum.

Download Full-text

Complete Chloroplast Genomes from Sanguisorba: Identity and Variation Among Four Species

Molecules ◽

10.3390/molecules23092137 ◽

2018 ◽

Vol 23 (9) ◽

pp. 2137 ◽

Cited By ~ 6

Author(s):

Xiang-Xiao Meng ◽

Yan-Fang Xian ◽

Li Xiang ◽

Dong Zhang ◽

Yu-Hua Shi ◽

...

Keyword(s):

Gc Content ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Protein Coding ◽

Future Studies ◽

Chloroplast Genomes ◽

Close Relationship ◽

Cp Genome ◽

Sanguisorba Officinalis

The genus Sanguisorba, which contains about 30 species around the world and seven species in China, is the source of the medicinal plant Sanguisorba officinalis, which is commonly used as a hemostatic agent as well as to treat burns and scalds. Here we report the complete chloroplast (cp) genome sequences of four Sanguisorba species (S. officinalis, S. filiformis, S. stipulata, and S. tenuifolia var. alba). These four Sanguisorba cp genomes exhibit typical quadripartite and circular structures, and are 154,282 to 155,479 bp in length, consisting of large single-copy regions (LSC; 84,405–85,557 bp), small single-copy regions (SSC; 18,550–18,768 bp), and a pair of inverted repeats (IRs; 25,576–25,615 bp). The average GC content was ~37.24%. The four Sanguisorba cp genomes harbored 112 different genes arranged in the same order; these identical sections include 78 protein-coding genes, 30 tRNA genes, and four rRNA genes, if duplicated genes in IR regions are counted only once. A total of 39–53 long repeats and 79–91 simple sequence repeats (SSRs) were identified in the four Sanguisorba cp genomes, which provides opportunities for future studies of the population genetics of Sanguisorba medicinal plants. A phylogenetic analysis using the maximum parsimony (MP) method strongly supports a close relationship between S. officinalis and S. tenuifolia var. alba, followed by S. stipulata, and finally S. filiformis. The availability of these cp genomes provides valuable genetic information for future studies of Sanguisorba identification and provides insights into the evolution of the genus Sanguisorba.

Download Full-text

Complete Chloroplast Genomes of Chlorophytum comosum and Chlorophytum gallabatense: Genome Structures, Comparative and Phylogenetic Analysis

Plants ◽

10.3390/plants9030296 ◽

2020 ◽

Vol 9 (3) ◽

pp. 296 ◽

Cited By ~ 3

Author(s):

Jacinta N. Munyao ◽

Xiang Dong ◽

Jia-Xin Yang ◽

Elijah M. Mbandi ◽

Vincent O. Wanga ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Important Species ◽

Genomic Resources ◽

Phylogenetic Studies ◽

Chloroplast Genomes ◽

Cp Genome ◽

High Level

The genus Chlorophytum includes many economically important species well-known for medicinal, ornamental, and horticultural values. However, to date, few molecular genomic resources have been reported for this genus. Therefore, there is limited knowledge of phylogenetic studies, and the available chloroplast (cp) genome of Chlorophytum (C. rhizopendulum) does not provide enough information on this genus. In this study, we present genomic resources for C. comosum and C. gallabatense, which had lengths of 154,248 and 154,154 base pairs (bp), respectively. They had a pair of inverted repeats (IRa and IRb) of 26,114 and 26,254 bp each in size, separating the large single-copy (LSC) region of 84,004 and 83,686 bp from the small single-copy (SSC) region of 18,016 and 17,960 bp in C. comosum and C. gallabatense, respectively. There were 112 distinct genes in each cp genome, which were comprised of 78 protein-coding genes, 30 tRNA genes, and four rRNA genes. The comparative analysis with five other selected species displayed a generally high level of sequence resemblance in structural organization, gene content, and arrangement. Additionally, the phylogenetic analysis confirmed the previous phylogeny and produced a phylogenetic tree with similar topology. It showed that the Chlorophytum species (C. comosum, C. gallabatense and C. rhizopendulum) were clustered together in the same clade with a closer relationship than other plants to the Anthericum ramosum. This research, therefore, presents valuable records for further molecular evolutionary and phylogenetic studies which help to fill the gap in genomic resources and resolve the taxonomic complexes of the genus.

Download Full-text

Initial Complete Chloroplast Genomes of Alchemilla (Rosaceae): Comparative Analysis and Phylogenetic Relationships

Frontiers in Genetics ◽

10.3389/fgene.2020.560368 ◽

2020 ◽

Vol 11 ◽

Author(s):

Peninah Cheptoo Rono ◽

Xiang Dong ◽

Jia-Xin Yang ◽

Fredrick Munyao Mutie ◽

Millicent A. Oulo ◽

...

Keyword(s):

Tandem Repeats ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Coding Region ◽

Base Pairs ◽

Protein Coding ◽

Chloroplast Genomes ◽

The Family ◽

Cp Genome

The genus Alchemilla L., known for its medicinal and ornamental value, is widely distributed in the Holarctic regions with a few species found in Asia and Africa. Delimitation of species within Alchemilla is difficult due to hybridization, autonomous apomixes, and polyploidy, necessitating efficient molecular-based characterization. Herein, we report the initial complete chloroplast (cp) genomes of Alchemilla. The cp genomes of two African (Afromilla) species Alchemilla pedata and Alchemilla argyrophylla were sequenced, and phylogenetic and comparative analyses were conducted in the family Rosaceae. The cp genomes mapped a typical circular quadripartite structure of lengths 152,438 and 152,427 base pairs (bp) in A. pedata and A. argyrophylla, respectively. Alchemilla cp genomes were composed of a pair of inverted repeat regions (IRa/IRb) of length 25,923 and 25,915 bp, separating the small single copy (SSC) region of 17,980 and 17,981 bp and a large single copy (LSC) region of 82,612 and 82,616 bp in A. pedata and A. argyrophylla, respectively. The cp genomes encoded 114 unique genes including 88 protein-coding genes, 37 transfer RNA (tRNA) genes, and 4 ribosomal RNA (rRNA) genes. Additionally, 88 and 95 simple sequence repeats (SSRs) and 37 and 40 tandem repeats were identified in A. pedata and A. argyrophylla, respectively. Significantly, the loss of group II intron in atpF gene in Alchemilla species was detected. Phylogenetic analysis based on 26 whole cp genome sequences and 78 protein-coding gene sequences of 27 Rosaceae species revealed a monophyletic clustering of Alchemilla nested within subfamily Rosoideae. Based on a protein-coding region, negative selective pressure (Ka/Ks < 1) was detected with an average Ka/Ks value of 0.1322 in A. argyrophylla and 0.1418 in A. pedata. The availability of complete cp genome in the genus Alchemilla will contribute to species delineation and further phylogenetic and evolutionary studies in the family Rosaceae.

Download Full-text

Comparative and Phylogenetic Analysis of Complete Chloroplast Genomes in Eragrostideae (Chloridoideae, Poaceae)

Plants ◽

10.3390/plants10010109 ◽

2021 ◽

Vol 10 (1) ◽

pp. 109

Author(s):

Kuan Liu ◽

Rong Wang ◽

Xiu-Xiu Guo ◽

Xue-Jie Zhang ◽

Xiao-Jian Qu ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Genome Structure ◽

Single Copy ◽

Rrna Genes ◽

Molecular Evidence ◽

Trna Genes ◽

Protein Coding ◽

Hypervariable Regions ◽

Chloroplast Genomes ◽

Small Single Copy

Eragrostideae Stapf, the second-largest tribe in Chloridoideae (Poaceae), is a taxonomically complex tribe. In this study, chloroplast genomes of 13 Eragrostideae species were newly sequenced and used to resolve the phylogenetic relationships within Eragrostideae. Including seven reported chloroplast genomes from Eragrostideae, the genome structure, number and type of genes, codon usage, and repeat sequences of 20 Eragrostideae species were analyzed. The length of these chloroplast genomes varied from 130,773 bp to 135,322 bp. These chloroplast genomes showed a typical quadripartite structure, including a large single-copy region (77,993–80,643 bp), a small single-copy region (12,410–12,668 bp), and a pair of inverted repeats region (19,394–21,074 bp). There were, in total, 129–133 genes annotated in the genome, including 83–87 protein-coding genes, eight rRNA genes, and 38 tRNA genes. Forward and palindromic repeats were the most common repeat types. In total, 10 hypervariable regions (rpl22, rpoA, ndhF, matK, trnG–UCC-trnT–GGU, ndhF–rpl32, ycf4–cemA, rpl32–trnL–UAG, trnG–GCC–trnfM–CAU, and ccsA–ndhD) were found, which can be used as candidate molecular markers for Eragrostideae. Phylogenomic studies concluded that Enneapogon diverged first, and Eragrostis including Harpachne is the sister to Uniola. Furthermore, Harpachne harpachnoides is considered as a species of Eragrostis based on morphological and molecular evidence. In addition, the interspecies relationships within Eragrostis are resolved based on complete chloroplast genomes. This study provides useful chloroplast genomic information for further phylogenetic analysis of Eragrostideae.

Download Full-text

Comparative Analyses of Euonymus Chloroplast Genomes: Genetic Structure, Screening for Loci With Suitable Polymorphism, Positive Selection Genes, and Phylogenetic Relationships Within Celastrineae

Frontiers in Plant Science ◽

10.3389/fpls.2020.593984 ◽

2021 ◽

Vol 11 ◽

Author(s):

Yongtan Li ◽

Yan Dong ◽

Yichao Liu ◽

Xiaoyue Yu ◽

Minsheng Yang ◽

...

Keyword(s):

Positive Selection ◽

Chloroplast Genome ◽

Gc Content ◽

Single Copy ◽

Rrna Genes ◽

Evolutionary Relationships ◽

Trna Genes ◽

Protein Coding ◽

Chloroplast Genomes ◽

Cp Genome

In this study, we assembled and annotated the chloroplast (cp) genome of the Euonymus species Euonymus fortunei, Euonymus phellomanus, and Euonymus maackii, and performed a series of analyses to investigate gene structure, GC content, sequence alignment, and nucleic acid diversity, with the objectives of identifying positive selection genes and understanding evolutionary relationships. The results indicated that the Euonymus cp genome was 156,860–157,611bp in length and exhibited a typical circular tetrad structure. Similar to the majority of angiosperm chloroplast genomes, the results yielded a large single-copy region (LSC) (85,826–86,299bp) and a small single-copy region (SSC) (18,319–18,536bp), separated by a pair of sequences (IRA and IRB; 26,341–26,700bp) with the same encoding but in opposite directions. The chloroplast genome was annotated to 130–131 genes, including 85–86 protein coding genes, 37 tRNA genes, and eight rRNA genes, with GC contents of 37.26–37.31%. The GC content was variable among regions and was highest in the inverted repeat (IR) region. The IR boundary of Euonymus happened expanding resulting that the rps19 entered into IR region and doubled completely. Such fluctuations at the border positions might be helpful in determining evolutionary relationships among Euonymus. The simple-sequence repeats (SSRs) of Euonymus species were composed primarily of single nucleotides (A)n and (T)n, and were mostly 10–12bp in length, with an obvious A/T bias. We identified several loci with suitable polymorphism with the potential use as molecular markers for inferring the phylogeny within the genus Euonymus. Signatures of positive selection were seen in rpoB protein encoding genes. Based on data from the whole chloroplast genome, common single copy genes, and the LSC, SSC, and IR regions, we constructed an evolutionary tree of Euonymus and related species, the results of which were consistent with traditional taxonomic classifications. It showed that E. fortunei sister to the Euonymus japonicus, whereby E. maackii appeared as sister to Euonymus hamiltonianus. Our study provides important genetic information to support further investigations into the phylogenetic development and adaptive evolution of Euonymus species.

Download Full-text

Complete chloroplast genome features and phylogenetic analysis of Eruca sativa (Brassicaceae)

PLoS ONE ◽

10.1371/journal.pone.0248556 ◽

2021 ◽

Vol 16 (3) ◽

pp. e0248556

Author(s):

Bin Zhu ◽

Fang Qian ◽

Yunfeng Hou ◽

Weicheng Yang ◽

Mengxian Cai ◽

...

Keyword(s):

Phylogenetic Analysis ◽

De Novo ◽

Repetitive Sequences ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Eruca Sativa ◽

Protein Coding ◽

Protein Coding Genes ◽

Cp Genome

Eruca sativa Mill. (Brassicaceae) is an important edible vegetable and a potential medicinal plant due to the antibacterial activity of its seed oil. Here, the complete chloroplast (cp) genome of E. sativa was de novo assembled with a combination of long PacBio reads and short Illumina reads. The E. sativa cp genome had a quadripartite structure that was 153,522 bp in size, consisting of one large single-copy region of 83,320 bp and one small single-copy region of 17,786 bp which were separated by two inverted repeat (IRa and IRb) regions of 26,208 bp. This complete cp genome harbored 113 unique genes: 79 protein-coding genes, 30 tRNA genes, and four rRNA genes. Forty-nine long repetitive sequences and 69 simple sequence repeats were identified in the E. sativa cp genome. A codon usage analysis of the E. sativa cp genome showed a bias toward codons ending in A/T. The E. sativa cp genome was similar in size, gene composition, and linearity of the structural region when compared with other Brassicaceae cp genomes. Moreover, the analysis of the synonymous (Ks) and non-synonymous (Ka) substitution rates demonstrated that protein-coding genes generally underwent purifying selection pressure, expect ycf1, ycf2, and rps12. A phylogenetic analysis determined that E. sativa is evolutionarily close to important Brassica species, indicating that it may be possible to transfer favorable E. sativa alleles into other Brassica species. Our results will be helpful to advance genetic improvement and breeding of E. sativa, and will provide valuable information for utilizing E. sativa as an important resource to improve other Brassica species.

Download Full-text

The Complete Chloroplast Genome of the Vulnerable Oreocharis esquirolii (Gesneriaceae): Structural Features, Comparative and Phylogenetic Analysis

Plants ◽

10.3390/plants9121692 ◽

2020 ◽

Vol 9 (12) ◽

pp. 1692

Author(s):

Li Gu ◽

Ting Su ◽

Ming-Tai An ◽

Guo-Xiong Hu

Keyword(s):

Phylogenetic Analysis ◽

Sequence Similarity ◽

Single Copy ◽

Structural Features ◽

Rrna Genes ◽

Trna Genes ◽

Sequencing Data ◽

High Sequence Similarity ◽

Plastid Genomes ◽

Cp Genome

Oreocharis esquirolii, a member of Gesneriaceae, is known as Thamnocharis esquirolii, which has been regarded a synonym of the former. The species is endemic to Guizhou, southwestern China, and is evaluated as vulnerable (VU) under the International Union for Conservation of Nature (IUCN) criteria. Until now, the sequence and genome information of O. esquirolii remains unknown. In this study, we assembled and characterized the complete chloroplast (cp) genome of O. esquirolii using Illumina sequencing data for the first time. The total length of the cp genome was 154,069 bp with a typical quadripartite structure consisting of a pair of inverted repeats (IRs) of 25,392 bp separated by a large single copy region (LSC) of 85,156 bp and a small single copy region (SSC) of18,129 bp. The genome comprised 114 unique genes with 80 protein-coding genes, 30 tRNA genes, and four rRNA genes. Thirty-one repeat sequences and 74 simple sequence repeats (SSRs) were identified. Genome alignment across five plastid genomes of Gesneriaceae indicated a high sequence similarity. Four highly variable sites (rps16-trnQ, trnS-trnG, ndhF-rpl32, and ycf 1) were identified. Phylogenetic analysis indicated that O. esquirolii grouped together with O. mileensis, supporting resurrection of the name Oreocharis esquirolii from Thamnocharisesquirolii. The complete cp genome sequence will contribute to further studies in molecular identification, genetic diversity, and phylogeny.

Download Full-text

Comparative analysis of chloroplast genomes for five Dicliptera species (Acanthaceae): molecular structure, phylogenetic relationships, and adaptive evolution

PeerJ ◽

10.7717/peerj.8450 ◽

2020 ◽

Vol 8 ◽

pp. e8450 ◽

Cited By ~ 2

Author(s):

Sunan Huang ◽

Xuejun Ge ◽

Asunción Cano ◽

Betty Gaby Millán Salazar ◽

Yunfei Deng

Keyword(s):

Adaptive Evolution ◽

Phylogenetic Relationships ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Evolutionary Analysis ◽

Protein Coding ◽

Variable Regions ◽

Protein Coding Genes ◽

Chloroplast Genomes

The genus Dicliptera (Justicieae, Acanthaceae) consists of approximately 150 species distributed throughout the tropical and subtropical regions of the world. Newly obtained chloroplast genomes (cp genomes) are reported for five species of Dilciptera (D. acuminata, D. peruviana, D. montana, D. ruiziana and D. mucronata) in this study. These cp genomes have circular structures of 150,689–150,811 bp and exhibit quadripartite organizations made up of a large single copy region (LSC, 82,796–82,919 bp), a small single copy region (SSC, 17,084–17,092 bp), and a pair of inverted repeat regions (IRs, 25,401–25,408 bp). Guanine-Cytosine (GC) content makes up 37.9%–38.0% of the total content. The complete cp genomes contain 114 unique genes, including 80 protein-coding genes, 30 transfer RNA (tRNA) genes, and four ribosomal RNA (rRNA) genes. Comparative analyses of nucleotide variability (Pi) reveal the five most variable regions (trnY-GUA-trnE-UUC, trnG-GCC, psbZ-trnG-GCC, petN-psbM, and rps4-trnL-UUA), which may be used as molecular markers in future taxonomic identification and phylogenetic analyses of Dicliptera. A total of 55-58 simple sequence repeats (SSRs) and 229 long repeats were identified in the cp genomes of the five Dicliptera species. Phylogenetic analysis identified a close relationship between D. ruiziana and D. montana, followed by D. acuminata, D. peruviana, and D. mucronata. Evolutionary analysis of orthologous protein-coding genes within the family Acanthaceae revealed only one gene, ycf15, to be under positive selection, which may contribute to future studies of its adaptive evolution. The completed genomes are useful for future research on species identification, phylogenetic relationships, and the adaptive evolution of the Dicliptera species.

Download Full-text

Complete Chloroplast Genome of Argania spinosa: Structural Organization and Phylogenetic Relationships in Sapotaceae

Plants ◽

10.3390/plants9101354 ◽

2020 ◽

Vol 9 (10) ◽

pp. 1354

Author(s):

Slimane Khayi ◽

Fatima Gaboun ◽

Stacy Pirro ◽

Tatiana Tatusova ◽

Abdelhamid El Mousadik ◽

...

Keyword(s):

Chloroplast Genome ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Protein Coding ◽

Important Species ◽

Complete Chloroplast Genome ◽

Argania Spinosa ◽

Protein Coding Genes ◽

Cp Genome

Argania spinosa (Sapotaceae), an important endemic Moroccan oil tree, is a primary source of argan oil, which has numerous dietary and medicinal proprieties. The plant species occupies the mid-western part of Morocco and provides great environmental and socioeconomic benefits. The complete chloroplast (cp) genome of A. spinosa was sequenced, assembled, and analyzed in comparison with those of two Sapotaceae members. The A. spinosa cp genome is 158,848 bp long, with an average GC content of 36.8%. The cp genome exhibits a typical quadripartite and circular structure consisting of a pair of inverted regions (IR) of 25,945 bp in length separating small single-copy (SSC) and large single-copy (LSC) regions of 18,591 and 88,367 bp, respectively. The annotation of A. spinosa cp genome predicted 130 genes, including 85 protein-coding genes (CDS), 8 ribosomal RNA (rRNA) genes, and 37 transfer RNA (tRNA) genes. A total of 44 long repeats and 88 simple sequence repeats (SSR) divided into mononucleotides (76), dinucleotides (7), trinucleotides (3), tetranucleotides (1), and hexanucleotides (1) were identified in the A. spinosa cp genome. Phylogenetic analyses using the maximum likelihood (ML) method were performed based on 69 protein-coding genes from 11 species of Ericales. The results confirmed the close position of A. spinosa to the Sideroxylon genus, supporting the revisiting of its taxonomic status. The complete chloroplast genome sequence will be valuable for further studies on the conservation and breeding of this medicinally and culinary important species and also contribute to clarifying the phylogenetic position of the species within Sapotaceae.

Download Full-text

Complete Chloroplast Genomes of Ampelopsis humulifolia and Ampelopsis japonica: Molecular Structure, Comparative Analysis, and Phylogenetic Analysis

Plants ◽

10.3390/plants8100410 ◽

2019 ◽

Vol 8 (10) ◽

pp. 410 ◽

Cited By ~ 9

Author(s):

Xiaolei Yu ◽

Wei Tan ◽

Huanyu Zhang ◽

Han Gao ◽

Wenxiu Wang ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Phylogenetic Analyses ◽

Gc Content ◽

Rrna Genes ◽

Trna Genes ◽

Protein Z ◽

Marker Selection ◽

Chloroplast Genomes ◽

Potential Marker ◽

Cp Genome

Ampelopsis humulifolia (A. humulifolia) and Ampelopsis japonica (A. japonica), which belong to the family Vitaceae, are valuably used as medicinal plants. The chloroplast (cp) genomes have been recognized as a convincing data for marker selection and phylogenetic studies. Therefore, in this study we reported the complete cp genome sequences of two Ampelopsis species. Results showed that the cp genomes of A. humulifolia and A. japonica were 161,724 and 161,430 bp in length, respectively, with 37.3% guanine-cytosine (GC) content. A total of 114 unique genes were identified in each cp genome, comprising 80 protein-coding genes, 30 tRNA genes, and 4 rRNA genes. We determined 95 and 99 small sequence repeats (SSRs) in A. humulifolia and A. japonica, respectively. The location and distribution of long repeats in the two cp genomes were identified. A highly divergent region of psbZ (Photosystem II reaction center protein Z) -trnG (tRNA-Glycine) was found and could be treated as a potential marker for Vitaceae, and then the corresponding primers were designed. Additionally, phylogenetic analysis showed that Vitis was closer to Tetrastigma than Ampelopsis. In general, this study provides valuable genetic resources for DNA barcoding marker identification and phylogenetic analyses of Ampelopsis.

Download Full-text