Comparative Analyses of Euonymus Chloroplast Genomes: Genetic Structure, Screening for Loci With Suitable Polymorphism, Positive Selection Genes, and Phylogenetic Relationships Within Celastrineae

In this study, we assembled and annotated the chloroplast (cp) genomes of three Euonymus species, Euonymus fortunei, Euonymus phellomanus, and Euonymus maackii, and analyzed gene structure, GC content, sequence alignment, and nucleotide diversity, with the objectives of identifying genes under positive selection and understanding evolutionary relationships. The Euonymus cp genomes were 156,860–157,611 bp in length and exhibited a typical circular quadripartite structure. As in the majority of angiosperm chloroplast genomes, they comprised a large single-copy (LSC) region (85,826–86,299 bp) and a small single-copy (SSC) region (18,319–18,536 bp), separated by a pair of inverted repeats (IRA and IRB; 26,341–26,700 bp) that carry the same sequence in opposite orientations. A total of 130–131 genes were annotated in each genome, including 85–86 protein-coding genes, 37 tRNA genes, and eight rRNA genes, with an overall GC content of 37.26–37.31%. The GC content varied among regions and was highest in the inverted repeat (IR) region. The IR boundary in Euonymus has expanded, so that rps19 lies entirely within the IR region and is fully duplicated. Such shifts at the border positions might be helpful in determining evolutionary relationships within Euonymus. The simple sequence repeats (SSRs) of Euonymus species were composed primarily of the mononucleotide repeats (A)n and (T)n, were mostly 10–12 bp in length, and showed an obvious A/T bias. We identified several suitably polymorphic loci with potential use as molecular markers for inferring the phylogeny within the genus Euonymus. Signatures of positive selection were seen in the protein-coding gene rpoB. Based on data from the whole chloroplast genome, common single-copy genes, and the LSC, SSC, and IR regions, we constructed an evolutionary tree of Euonymus and related species, the results of which were consistent with traditional taxonomic classifications. The tree placed E. fortunei as sister to Euonymus japonicus, whereas E. maackii appeared as sister to Euonymus hamiltonianus. Our study provides important genetic information to support further investigations into the phylogenetic development and adaptive evolution of Euonymus species.
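
The SSR description in this abstract (mononucleotide A/T runs, mostly 10–12 bp) can be illustrated with a minimal Python sketch. The abstract does not say which SSR tool the authors used, so the function name `find_mono_ssrs`, the 10 bp threshold, and the toy sequence below are all assumptions chosen only for illustration.

```python
import re

def find_mono_ssrs(seq, min_len=10):
    """Return (start, base, length) for every mononucleotide run of at least min_len bases.

    The min_len default of 10 is an assumed threshold, not one taken from the study.
    """
    seq = seq.upper()
    # One base followed by at least (min_len - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0))) for m in pattern.finditer(seq)]

# Toy sequence (hypothetical, not Euonymus data): an 11 bp A run and a 10 bp T run.
toy = "GGC" + "A" * 11 + "CGC" + "T" * 10 + "GCA" + "A" * 5
ssrs = find_mono_ssrs(toy)

# Share of detected SSRs made of A or T, a crude measure of A/T bias.
at_share = sum(base in "AT" for _, base, _ in ssrs) / len(ssrs)
```

On a real plastome the same function could be run over the full sequence, and the length distribution of the hits compared with the 10–12 bp range reported above.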

Complete Chloroplast Genomes from Sanguisorba: Identity and Variation Among Four Species

Molecules ◽

10.3390/molecules23092137 ◽

2018 ◽

Vol 23 (9) ◽

pp. 2137 ◽

Cited By ~ 6

Author(s):

Xiang-Xiao Meng ◽

Yan-Fang Xian ◽

Li Xiang ◽

Dong Zhang ◽

Yu-Hua Shi ◽

...

Keyword(s):

GC Content ◽

Single Copy ◽

rRNA Genes ◽

tRNA Genes ◽

Protein Coding ◽

Future Studies ◽

Chloroplast Genomes ◽

Close Relationship ◽

cp Genome ◽

Sanguisorba officinalis

The genus Sanguisorba, which contains about 30 species around the world and seven species in China, is the source of the medicinal plant Sanguisorba officinalis, which is commonly used as a hemostatic agent as well as to treat burns and scalds. Here we report the complete chloroplast (cp) genome sequences of four Sanguisorba species (S. officinalis, S. filiformis, S. stipulata, and S. tenuifolia var. alba). These four Sanguisorba cp genomes exhibit typical quadripartite and circular structures, and are 154,282 to 155,479 bp in length, consisting of large single-copy regions (LSC; 84,405–85,557 bp), small single-copy regions (SSC; 18,550–18,768 bp), and a pair of inverted repeats (IRs; 25,576–25,615 bp). The average GC content was ~37.24%. The four Sanguisorba cp genomes harbored 112 different genes arranged in the same order; these identical sections include 78 protein-coding genes, 30 tRNA genes, and four rRNA genes, if duplicated genes in IR regions are counted only once. A total of 39–53 long repeats and 79–91 simple sequence repeats (SSRs) were identified in the four Sanguisorba cp genomes, which provides opportunities for future studies of the population genetics of Sanguisorba medicinal plants. A phylogenetic analysis using the maximum parsimony (MP) method strongly supports a close relationship between S. officinalis and S. tenuifolia var. alba, followed by S. stipulata, and finally S. filiformis. The availability of these cp genomes provides valuable genetic information for future studies of Sanguisorba identification and provides insights into the evolution of the genus Sanguisorba.

Complete Chloroplast Genome of Argania spinosa: Structural Organization and Phylogenetic Relationships in Sapotaceae

Plants ◽

10.3390/plants9101354 ◽

2020 ◽

Vol 9 (10) ◽

pp. 1354

Author(s):

Slimane Khayi ◽

Fatima Gaboun ◽

Stacy Pirro ◽

Tatiana Tatusova ◽

Abdelhamid El Mousadik ◽

...

Keyword(s):

Chloroplast Genome ◽

Single Copy ◽

rRNA Genes ◽

tRNA Genes ◽

Protein Coding ◽

Important Species ◽

Complete Chloroplast Genome ◽

Argania spinosa ◽

Protein Coding Genes ◽

cp Genome

Argania spinosa (Sapotaceae), an important endemic Moroccan oil tree, is a primary source of argan oil, which has numerous dietary and medicinal properties. The species occupies the mid-western part of Morocco and provides great environmental and socioeconomic benefits. The complete chloroplast (cp) genome of A. spinosa was sequenced, assembled, and analyzed in comparison with those of two other Sapotaceae members. The A. spinosa cp genome is 158,848 bp long, with an average GC content of 36.8%. The cp genome exhibits a typical quadripartite and circular structure consisting of a pair of inverted repeat (IR) regions of 25,945 bp each, separating small single-copy (SSC) and large single-copy (LSC) regions of 18,591 and 88,367 bp, respectively. Annotation of the A. spinosa cp genome predicted 130 genes, including 85 protein-coding genes (CDS), 8 ribosomal RNA (rRNA) genes, and 37 transfer RNA (tRNA) genes. A total of 44 long repeats and 88 simple sequence repeats (SSRs), divided into mononucleotides (76), dinucleotides (7), trinucleotides (3), tetranucleotides (1), and hexanucleotides (1), were identified in the A. spinosa cp genome. Phylogenetic analyses using the maximum likelihood (ML) method were performed based on 69 protein-coding genes from 11 species of Ericales. The results confirmed the close position of A. spinosa to the genus Sideroxylon, supporting the revisiting of its taxonomic status. The complete chloroplast genome sequence will be valuable for further studies on the conservation and breeding of this medicinally and culinarily important species and will also contribute to clarifying the phylogenetic position of the species within Sapotaceae.
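
The figures in this abstract can be cross-checked with simple arithmetic: in a quadripartite plastome the total length equals LSC + SSC + two IR copies, and the SSR classes should sum to the stated total. The sketch below uses only the numbers quoted above; the function name `quadripartite_length` is a hypothetical helper introduced for illustration.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a plastome with one LSC, one SSC and two IR copies of length ir."""
    return lsc + ssc + 2 * ir

# Region lengths (bp) as quoted in the abstract for Argania spinosa.
total = quadripartite_length(lsc=88_367, ssc=18_591, ir=25_945)

# SSR counts per motif class as quoted in the abstract.
ssr_classes = {"mono": 76, "di": 7, "tri": 3, "tetra": 1, "hexa": 1}
ssr_total = sum(ssr_classes.values())

# Annotated gene categories as quoted in the abstract.
gene_total = 85 + 8 + 37  # protein-coding + rRNA + tRNA
```

All three sums agree with the reported values (158,848 bp, 88 SSRs, and 130 genes), which makes the same check a quick sanity test for other entries in this listing.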

Complete Chloroplast Genome of Paphiopedilum delenatii and Phylogenetic Relationships among Orchidaceae

Plants ◽

10.3390/plants9010061 ◽

2020 ◽

Vol 9 (1) ◽

pp. 61 ◽

Cited By ~ 5

Author(s):

Huyen-Trang Vu ◽

Ngan Tran ◽

Thanh-Diem Nguyen ◽

Quoc-Luan Vu ◽

My-Huyen Bui ◽

...

Keyword(s):

Chloroplast Genome ◽

Inverted Repeat ◽

GC Content ◽

Single Copy ◽

rRNA Genes ◽

tRNA Genes ◽

Complete Chloroplast Genome ◽

Critically Endangered Species ◽

Plastid Genomes ◽

Chloroplast Genomes

Paphiopedilum delenatii is an orchid native to Vietnam with highly attractive floral traits. Unfortunately, it is now listed as a critically endangered species, with only a few hundred individuals remaining in nature. In this study, we performed next-generation sequencing of P. delenatii and assembled its complete chloroplast genome. The chloroplast genome of P. delenatii was 160,955 bp in size, with a GC content of 35.6%, and exhibited the typical quadripartite structure of plastid genomes with four distinct regions: the large and small single-copy regions and a pair of inverted repeat regions. In total, 130 genes were annotated in the genome: 77 coding genes, 39 tRNA genes, 8 rRNA genes, and 6 pseudogenes. The loss of ndh genes and variation in inverted repeat (IR) boundaries, together with data on simple sequence repeats (SSRs) and divergent hotspots, provided useful information for identification applications and phylogenetic studies of Paphiopedilum species. Whole chloroplast genomes could be used as an effective super barcode for species identification or for developing other identification markers, which would in turn serve the conservation of Paphiopedilum species.
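
GC content, reported here as a percentage of the whole genome, is the share of G and C among the unambiguous bases. A minimal sketch follows, assuming the sequence is available as a plain string; `gc_content` is a hypothetical helper and the toy sequences are not P. delenatii data.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return 100.0 * sum(base in "GC" for base in counted) / len(counted)

# Toy sequences (hypothetical): ambiguous bases such as N are ignored.
toy_gc = gc_content("ATGCATGCAT")   # 4 of 10 bases are G or C
toy_gc_with_n = gc_content("ATGCNN")  # 2 of 4 counted bases are G or C

# The gene categories quoted in the abstract add up to the stated total.
annotated = 77 + 39 + 8 + 6  # coding + tRNA + rRNA + pseudogenes
```

Applying the same function separately to the LSC, SSC, and IR slices of a genome would give the per-region values that several abstracts in this listing compare.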

Comparative Analysis of the Chloroplast Genome for Four Pennisetum Species: Molecular Structure and Phylogenetic Relationships

Frontiers in Genetics ◽

10.3389/fgene.2021.687844 ◽

2021 ◽

Vol 12 ◽

Author(s):

Jin Xu ◽

Chen Liu ◽

Yun Song ◽

Mingfu Li

Keyword(s):

Comparative Analysis ◽

Chloroplast Genome ◽

Phylogenetic Relationships ◽

GC Content ◽

Single Copy ◽

rRNA Genes ◽

Future Research ◽

tRNA Genes ◽

Evolutionary Analysis ◽

Chloroplast Genomes

The genus Pennisetum (Poaceae) is both a forage crop and staple food crop in the tropics. In this study, we obtained chloroplast genome sequences of four species of Pennisetum (P. alopecuroides, P. clandestinum, P. glaucum, and P. polystachion) using Illumina sequencing. These chloroplast genomes have circular structures of 136,346–138,119 bp, including a large single-copy region (LSC, 79,380–81,186 bp), a small single-copy region (SSC, 12,212–12,409 bp), and a pair of inverted repeat regions (IRs, 22,284–22,372 bp). The overall GC content of these chloroplast genomes was 38.6–38.7%. The complete chloroplast genomes contained 110 different genes, including 76 protein-coding genes, 30 transfer RNA (tRNA) genes, and four ribosomal RNA (rRNA) genes. Comparative analysis of nucleotide variability identified nine intergenic spacer regions (psbA-matK, matK-rps16, trnN-trnT, trnY-trnD-psbM, petN-trnC, rbcL-psaI, petA-psbJ, psbE-petL, and rpl32-trnL), which may be used as potential DNA barcodes in future species identification and evolutionary analysis of Pennisetum. The phylogenetic analysis revealed a close relationship between P. polystachion and P. glaucum, followed by P. clandestinum and P. alopecuroides. The completed genomes of this study will help facilitate future research on the phylogenetic relationships and evolution of Pennisetum species.
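
The nucleotide variability comparison mentioned above is usually expressed as nucleotide diversity (π), the mean proportion of differing sites over all sequence pairs, computed in sliding windows along an alignment. The abstract does not name the software or window settings used, so the sketch below is an illustration under assumptions: the function names, the window and step sizes, and the toy alignment are all hypothetical, and gaps are simply treated as mismatches.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Mean pairwise proportion of differing sites across equal-length aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    if not pairs or length == 0:
        return 0.0
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)

def sliding_pi(seqs, window, step):
    """Return (window_start, pi) for each full window along the alignment."""
    length = len(seqs[0])
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]

# Toy alignment (hypothetical): the third sequence differs at one site.
aln = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"]
overall_pi = nucleotide_diversity(aln)        # 2 differences / (3 pairs * 10 sites)
windows = sliding_pi(aln, window=5, step=5)   # the first window holds the variable site
```

Windows with the highest π would be the candidate hotspot regions, in the same sense as the nine intergenic spacers listed in the abstract.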

Initial Complete Chloroplast Genomes of Alchemilla (Rosaceae): Comparative Analysis and Phylogenetic Relationships

Frontiers in Genetics ◽

10.3389/fgene.2020.560368 ◽

2020 ◽

Vol 11 ◽

Author(s):

Peninah Cheptoo Rono ◽

Xiang Dong ◽

Jia-Xin Yang ◽

Fredrick Munyao Mutie ◽

Millicent A. Oulo ◽

...

Keyword(s):

Tandem Repeats ◽

Single Copy ◽

rRNA Genes ◽

tRNA Genes ◽

Coding Region ◽

Base Pairs ◽

Protein Coding ◽

Chloroplast Genomes ◽

The Family ◽

cp Genome

The genus Alchemilla L., known for its medicinal and ornamental value, is widely distributed in the Holarctic regions, with a few species found in Asia and Africa. Delimitation of species within Alchemilla is difficult due to hybridization, autonomous apomixis, and polyploidy, necessitating efficient molecular-based characterization. Herein, we report the first complete chloroplast (cp) genomes of Alchemilla. The cp genomes of two African (Afromilla) species, Alchemilla pedata and Alchemilla argyrophylla, were sequenced, and phylogenetic and comparative analyses were conducted within the family Rosaceae. The cp genomes showed a typical circular quadripartite structure, with lengths of 152,438 and 152,427 base pairs (bp) in A. pedata and A. argyrophylla, respectively. The Alchemilla cp genomes were composed of a pair of inverted repeat regions (IRa/IRb) of 25,923 and 25,915 bp, separating a small single-copy (SSC) region of 17,980 and 17,981 bp from a large single-copy (LSC) region of 82,612 and 82,616 bp in A. pedata and A. argyrophylla, respectively. The cp genomes encoded 114 unique genes including 88 protein-coding genes, 37 transfer RNA (tRNA) genes, and 4 ribosomal RNA (rRNA) genes. Additionally, 88 and 95 simple sequence repeats (SSRs) and 37 and 40 tandem repeats were identified in A. pedata and A. argyrophylla, respectively. Notably, loss of the group II intron in the atpF gene was detected in the Alchemilla species. Phylogenetic analysis based on 26 whole cp genome sequences and 78 protein-coding gene sequences of 27 Rosaceae species revealed a monophyletic clustering of Alchemilla nested within subfamily Rosoideae. Based on the protein-coding regions, negative selective pressure (Ka/Ks < 1) was detected, with an average Ka/Ks value of 0.1322 in A. argyrophylla and 0.1418 in A. pedata. The availability of complete cp genomes for the genus Alchemilla will contribute to species delineation and to further phylogenetic and evolutionary studies in the family Rosaceae.
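
The Ka/Ks reading used in this abstract follows the usual convention: a ratio below 1 indicates negative (purifying) selection, a ratio near 1 indicates neutrality, and a ratio above 1 indicates positive selection. A minimal sketch of that rule is shown below; `classify_selection` and its tolerance are hypothetical, and real analyses would also test whether the ratio differs significantly from 1.

```python
def classify_selection(ka_ks, tol=1e-9):
    """Map a Ka/Ks ratio to the conventional selection category."""
    if ka_ks < 1 - tol:
        return "negative (purifying)"
    if ka_ks > 1 + tol:
        return "positive"
    return "neutral"

# Average Ka/Ks values quoted in the abstract.
averages = {"A. argyrophylla": 0.1322, "A. pedata": 0.1418}
calls = {species: classify_selection(value) for species, value in averages.items()}
```

Both averages fall well below 1, consistent with the negative selective pressure the authors report.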

Characterization of the Complete Chloroplast Genome of Buddleja Lindleyana

Journal of AOAC International ◽

10.1093/jaoacint/qsab066 ◽

2021 ◽

Author(s):

Shanshan Liu ◽

Shiyin Feng ◽

Yuying Huang ◽

Wenli An ◽

Zerui Yang ◽

...

Keyword(s):

GC Content ◽

Single Copy ◽

rRNA Genes ◽

Future Research ◽

tRNA Genes ◽

Similar Species ◽

Protein Coding ◽

Genome Data ◽

cp Genome ◽

Genomic Resource

Background: Buddleja lindleyana Fort., which belongs to the Loganiaceae and is distributed throughout the tropics, is widely used as an ornamental plant in China. Buddleja contains several morphologically similar species that require molecular identification, but there has been little molecular research on the genus. Objective: To sequence and analyze the complete chloroplast (cp) genome of B. lindleyana using molecular biology techniques. Methods: The genome was sequenced by next-generation sequencing, and a series of bioinformatics tools were used to assemble the cp genome of B. lindleyana and analyze its molecular structure. Results: The complete cp genome of B. lindleyana is a circular 154,487-bp-long molecule with a GC content of 38.1%. It has a typical quadripartite structure, including a large single-copy region (LSC; 85,489 bp), a small single-copy region (SSC; 17,898 bp), and a pair of inverted repeats (IRs; 25,550 bp). A total of 133 genes were identified in the genome, including 86 protein-coding genes, 37 tRNA genes, 8 rRNA genes, and 2 pseudogenes. Conclusions: These results suggest that the B. lindleyana cp genome could be used as a genomic resource to resolve the phylogenetic positions and relationships of Loganiaceae, and that it will offer valuable information for future research on the identification of Buddleja species and contribute to genomic investigations of these species.

Comprehensive Analysis of Rhodomyrtus tomentosa Chloroplast Genome

Plants ◽

10.3390/plants8040089 ◽

2019 ◽

Vol 8 (4) ◽

pp. 89 ◽

Cited By ~ 7

Author(s):

Yuying Huang ◽

Zerui Yang ◽

Song Huang ◽

Wenli An ◽

Jing Li ◽

...

Keyword(s):

GC Content ◽

Single Copy ◽

rRNA Genes ◽

tRNA Genes ◽

Sister Relationship ◽

Protein Coding ◽

Protein Coding Genes ◽

Plastid Genomes ◽

cp Genome ◽

Rhodomyrtus tomentosa

In the last decade, several studies have relied on a small number of plastid genomes to deduce deep phylogenetic relationships in the species-rich Myrtaceae. Nevertheless, the plastome of Rhodomyrtus tomentosa, an important representative of the genus Rhodomyrtus (DC.), has not yet been reported. Here, we sequenced and analyzed the complete chloroplast (CP) genome of R. tomentosa, which is a 156,129-bp-long circular molecule with 37.1% GC content. This CP genome displays a typical quadripartite structure with two inverted repeats (IRa and IRb) of 25,824 bp each, separated by a small single-copy region (SSC, 18,183 bp) and a large single-copy region (LSC, 86,298 bp). The CP genome encodes 129 genes, including 84 protein-coding genes, 37 tRNA genes, eight rRNA genes, and three pseudogenes (ycf1, rps19, ndhF). Most protein-coding genes have the universal ATG start codon; the exceptions are psbL and ndhD. Premature termination codons (PTCs) were found in one protein-coding gene, atpE, which is rarely reported in plant CP genomes. Phylogenetic analysis revealed that R. tomentosa has a sister relationship with Eugenia uniflora and Psidium guajava. In conclusion, this study identified unique characteristics of the R. tomentosa CP genome, providing valuable information for further investigations on species identification and the phylogenetic evolution of R. tomentosa and related species.
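
The two coding-sequence checks described in this abstract (an ATG start codon and the absence of in-frame stop codons before the end) can be sketched as follows. The abstract does not describe the authors' annotation pipeline, so `scan_cds` is a hypothetical helper, the sequences are toy examples rather than R. tomentosa genes, and the stop codons assumed are the standard TAA, TAG, and TGA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def scan_cds(cds):
    """Return (starts_with_atg, premature_stop_indices) for a coding sequence.

    premature_stop_indices lists the codon positions (0-based) of in-frame
    stop codons that occur before the final codon.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    starts_with_atg = bool(codons) and codons[0] == "ATG"
    premature = [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]
    return starts_with_atg, premature

# Toy sequences (hypothetical).
with_ptc = scan_cds("ATGAAATAAGGGTGA")   # stop codon TAA at codon 2, before the final TGA
non_atg_start = scan_cds("ACGAAATGA")    # starts with ACG instead of ATG
```

Genes flagged by either check would correspond to the exceptions noted above (non-ATG starts in psbL and ndhD, and the premature termination codon in atpE), although confirming them requires the actual annotated sequences.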

Complete Chloroplast Genome Sequencing and Phylogenetic Analysis of Two Dracocephalum Plants

BioMed Research International ◽

10.1155/2020/4374801 ◽

2020 ◽

Vol 2020 ◽

pp. 1-9

Author(s):

Junjun Yao ◽

Fangyu Zhao ◽

Yuanjiang Xu ◽

Kaihui Zhao ◽

Hong Quan ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Chloroplast Genome ◽

De Novo ◽

GC Content ◽

Single Copy ◽

rRNA Genes ◽

tRNA Genes ◽

Complete Chloroplast Genome ◽

SSR Analysis ◽

Chloroplast Genomes

Dracocephalum tanguticum and Dracocephalum moldavica are important herbs in the Lamiaceae and have great medicinal value. We used Illumina sequencing technology to sequence the complete chloroplast genomes of D. tanguticum and D. moldavica and then conducted de novo assembly. The two chloroplast genomes have a typical quadripartite structure, with gene lengths of 82,221 bp and 81,450 bp, large single-copy (LSC) region lengths of 82,221 bp and 81,450 bp, small single-copy (SSC) region lengths of 17,363 bp and 17,066 bp, and inverted repeat (IR) region lengths of 51,370 bp and 51,352 bp, respectively. The GC content of the two chloroplast genomes was 37.80% and 37.83%, respectively. The chloroplast genomes of the two plants encode 133 and 132 genes, respectively, including 88 and 87 protein-coding genes, as well as 37 tRNA genes and 8 rRNA genes. Among them, the rps2 gene is found in D. tanguticum but not in D. moldavica. Through SSR analysis, we also found 6 mutation hotspot regions, which can be used as molecular markers for taxonomic studies. Phylogenetic analysis showed that Dracocephalum was more closely related to Mentha.

Comparison and Phylogenetic Analysis of Chloroplast Genomes of Three Medicinal and Edible Amomum Species

International Journal of Molecular Sciences ◽

10.3390/ijms20164040 ◽

2019 ◽

Vol 20 (16) ◽

pp. 4040 ◽

Cited By ~ 12

Author(s):

Yingxian Cui ◽

Xinlian Chen ◽

Liping Nie ◽

Wei Sun ◽

Haoyu Hu ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Morphological Characteristics ◽

Single Copy ◽

rRNA Genes ◽

tRNA Genes ◽

Edible Plant ◽

Protein Coding ◽

Chloroplast Genomes ◽

cp Genome ◽

Pharmacologically Active

Amomum villosum is an important medicinal and edible plant with several pharmacologically active volatile oils. However, distinguishing A. villosum from A. villosum var. xanthioides and A. longiligulare, which exhibit similar morphological characteristics, is difficult. The main goal of this study, therefore, is to mine genetic resources and improve molecular methods that could be used to distinguish these species. A total of eight complete chloroplast (cp) genomes of these Amomum species, collected from the main producing areas in China, were determined to be 163,608–164,069 bp in size. All genomes displayed a typical quadripartite structure with a pair of inverted repeat (IR) regions (29,820–29,959 bp) that separated a large single-copy (LSC) region (88,680–88,857 bp) from a small single-copy (SSC) region (15,288–15,369 bp). Each genome encodes 113 different genes, including 79 protein-coding genes, 30 tRNA genes, and four rRNA genes. More than 150 SSRs were identified in the entire cp genomes of these three species. The Sanger sequencing results based on 32 Amomum samples indicated that five highly divergent regions screened from cp genomes could not be used to distinguish Amomum species. Phylogenetic analysis showed that the cp genomes could not only accurately identify Amomum species, but also provide a solid foundation for the establishment of phylogenetic relationships of Amomum species. The availability of cp genome resources and the comparative analysis are beneficial for species authentication and phylogenetic analysis in Amomum.

Chloroplast genome sequence of Chongming lima bean (Phaseolus lunatus L.) and comparative analyses with other legume chloroplast genomes

BMC Genomics ◽

10.1186/s12864-021-07467-8 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Shoubo Tian ◽

Panling Lu ◽

Zhaohui Zhang ◽

Jian Qiang Wu ◽

Hui Zhang ◽

...

Keyword(s):

Chloroplast Genome ◽

Lima Bean ◽

Single Copy ◽

rRNA Genes ◽

Phaseolus lunatus ◽

tRNA Genes ◽

Chloroplast Genomes ◽

Lima Beans ◽

cp Genome ◽

First Time

Background: Lima bean (Phaseolus lunatus L.) is a member of subfamily Phaseolinae belonging to the family Leguminosae and an important source of plant proteins for the human diet. Lima beans have important economic value and great diversity. However, knowledge of the lima bean chloroplast genome is limited. Results: The chloroplast genome of lima bean was obtained by Illumina sequencing technology for the first time. The cp genome is 150,902 bp long and includes a pair of inverted repeats (IRA and IRB; 26,543 bp each), a large single-copy region (LSC; 80,218 bp), and a small single-copy region (SSC; 17,598 bp). In total, 124 unique genes, including 82 protein-coding genes, 34 tRNA genes, and 8 rRNA genes, were identified in the P. lunatus cp genome. A total of 61 long repeats and 290 SSRs were detected in the lima bean cp genome. It has the 50 kb inversion typical of the Leguminosae family and a 70 kb inversion characteristic of subtribe Phaseolinae. Significant variation was found in the coding regions of the rpl16, accD, petB, rps16, clpP, ndhA, ndhF, and ycf1 genes, and a high degree of divergence was found in the intergenic regions trnK-rbcL, rbcL-atpB, ndhJ-rps4, psbD-rpoB, atpI-atpA, atpA-accD, accD-psbJ, psbE-psbB, rps11-rps19, and ndhF-ccsA. A phylogenetic analysis showed that P. lunatus appears to be more closely related to P. vulgaris, V. unguiculata, and V. radiata. Conclusions: The characteristics of the lima bean cp genome were identified for the first time. These results will provide useful insights for species identification, evolutionary studies, and molecular biology research.
