Chloroplast genome sequence of Chongming lima bean (Phaseolus lunatus L.) and comparative analyses with other legume chloroplast genomes

Abstract Background Lima bean (Phaseolus lunatus L.) is a member of subtribe Phaseolinae in the family Leguminosae and an important source of plant protein for the human diet. Lima beans have considerable economic value and great diversity, but knowledge of their chloroplast genome is limited. Results The chloroplast (Cp) genome of lima bean was obtained by Illumina sequencing for the first time. The Cp genome is 150,902 bp in length and comprises a pair of inverted repeats (IRA and IRB; 26,543 bp each), a large single-copy region (LSC; 80,218 bp), and a small single-copy region (SSC; 17,598 bp). In total, 124 unique genes, including 82 protein-coding genes, 34 tRNA genes, and 8 rRNA genes, were identified in the P. lunatus Cp genome. A total of 61 long repeats and 290 SSRs were detected in the lima bean Cp genome. The genome carries the 50 kb inversion typical of the Leguminosae family and a 70 kb inversion characteristic of subtribe Phaseolinae. Significant variation was found in the coding regions of the rpl16, accD, petB, rps16, clpP, ndhA, ndhF, and ycf1 genes, and a high degree of divergence was found in the intergenic regions trnK-rbcL, rbcL-atpB, ndhJ-rps4, psbD-rpoB, atpI-atpA, atpA-accD, accD-psbJ, psbE-psbB, rps11-rps19, and ndhF-ccsA. A phylogenetic analysis showed that P. lunatus is closely related to P. vulgaris, V. unguiculata, and V. radiata. Conclusions The characteristics of the lima bean Cp genome were identified for the first time; these results will provide useful insights for species identification, evolutionary studies, and molecular biology research.
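The region lengths reported in this abstract can be checked against the total length with simple arithmetic, because a quadripartite chloroplast genome consists of the LSC, the SSC, and two copies of the IR. The following is a minimal sketch in Python that uses only the figures quoted in the abstract; the function name is illustrative and not taken from the paper.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total genome length: LSC + SSC + two copies of the inverted repeat."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) as reported for the P. lunatus Cp genome.
total = quadripartite_length(lsc=80_218, ssc=17_598, ir=26_543)
print(total)  # 150902, which matches the reported genome length
```

The same check applies to any of the quadripartite genomes listed on this page, as long as the IR length is given per copy.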


Comparative Analyses of Euonymus Chloroplast Genomes: Genetic Structure, Screening for Loci With Suitable Polymorphism, Positive Selection Genes, and Phylogenetic Relationships Within Celastrineae

Frontiers in Plant Science ◽

10.3389/fpls.2020.593984 ◽

2021 ◽

Vol 11 ◽

Author(s):

Yongtan Li ◽

Yan Dong ◽

Yichao Liu ◽

Xiaoyue Yu ◽

Minsheng Yang ◽

...

Keyword(s):

Positive Selection ◽

Chloroplast Genome ◽

Gc Content ◽

Single Copy ◽

Rrna Genes ◽

Evolutionary Relationships ◽

Trna Genes ◽

Protein Coding ◽

Chloroplast Genomes ◽

Cp Genome

In this study, we assembled and annotated the chloroplast (cp) genomes of the Euonymus species Euonymus fortunei, Euonymus phellomanus, and Euonymus maackii, and performed a series of analyses to investigate gene structure, GC content, sequence alignment, and nucleic acid diversity, with the objectives of identifying positive selection genes and understanding evolutionary relationships. The results indicated that the Euonymus cp genome was 156,860–157,611 bp in length and exhibited a typical circular quadripartite structure. Similar to the majority of angiosperm chloroplast genomes, it consisted of a large single-copy region (LSC; 85,826–86,299 bp) and a small single-copy region (SSC; 18,319–18,536 bp), separated by a pair of inverted repeats (IRA and IRB; 26,341–26,700 bp), which carry the same sequence in opposite orientations. The chloroplast genome was annotated with 130–131 genes, including 85–86 protein-coding genes, 37 tRNA genes, and eight rRNA genes, with GC contents of 37.26–37.31%. The GC content varied among regions and was highest in the inverted repeat (IR) region. The IR boundary in Euonymus had expanded, so that rps19 fell entirely within the IR region and was fully duplicated. Such fluctuations at the border positions might be helpful in determining evolutionary relationships among Euonymus. The simple-sequence repeats (SSRs) of Euonymus species were composed primarily of the single-nucleotide repeats (A)n and (T)n, and were mostly 10–12 bp in length, with an obvious A/T bias. We identified several loci with suitable polymorphism for potential use as molecular markers for inferring the phylogeny within the genus Euonymus. Signatures of positive selection were seen in the protein-coding gene rpoB. Based on data from the whole chloroplast genome, common single-copy genes, and the LSC, SSC, and IR regions, we constructed an evolutionary tree of Euonymus and related species, the results of which were consistent with traditional taxonomic classifications. The tree placed E. fortunei as sister to Euonymus japonicus, whereas E. maackii appeared as sister to Euonymus hamiltonianus. Our study provides important genetic information to support further investigations into the phylogenetic development and adaptive evolution of Euonymus species.
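To illustrate what a mononucleotide SSR such as (A)n or (T)n looks like in sequence data, the sketch below scans a toy string for runs of a single base. It is not the authors' pipeline: the function name, the toy sequence, and the 10 bp minimum length are assumptions chosen for illustration (the abstract only says that most SSRs were 10–12 bp long).

```python
import re
from collections import Counter


def find_mono_ssrs(seq: str, min_len: int = 10):
    """Return (start, base, length) for every single-base run of at least min_len.

    min_len=10 is an assumed threshold, not a value taken from the paper.
    """
    # One captured base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


# Toy sequence (hypothetical): an 11 bp A run, a 10 bp T run, and a 5 bp A run.
toy = "GC" + "A" * 11 + "CGC" + "T" * 10 + "GGC" + "A" * 5
hits = find_mono_ssrs(toy)
print(hits)                                   # [(2, 'A', 11), (16, 'T', 10)]
print(Counter(base for _, base, _ in hits))   # one A repeat and one T repeat
```

The 5 bp run is ignored because it falls below the assumed threshold. Tallying hits by base in this way is how an A/T bias like the one described above would show up.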


Complete Chloroplast Genomes from Sanguisorba: Identity and Variation Among Four Species

Molecules ◽

10.3390/molecules23092137 ◽

2018 ◽

Vol 23 (9) ◽

pp. 2137 ◽

Cited By ~ 6

Author(s):

Xiang-Xiao Meng ◽

Yan-Fang Xian ◽

Li Xiang ◽

Dong Zhang ◽

Yu-Hua Shi ◽

...

Keyword(s):

Gc Content ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Protein Coding ◽

Future Studies ◽

Chloroplast Genomes ◽

Close Relationship ◽

Cp Genome ◽

Sanguisorba Officinalis

The genus Sanguisorba, which contains about 30 species around the world and seven species in China, is the source of the medicinal plant Sanguisorba officinalis, which is commonly used as a hemostatic agent as well as to treat burns and scalds. Here we report the complete chloroplast (cp) genome sequences of four Sanguisorba species (S. officinalis, S. filiformis, S. stipulata, and S. tenuifolia var. alba). These four Sanguisorba cp genomes exhibit typical quadripartite and circular structures, and are 154,282 to 155,479 bp in length, consisting of large single-copy regions (LSC; 84,405–85,557 bp), small single-copy regions (SSC; 18,550–18,768 bp), and a pair of inverted repeats (IRs; 25,576–25,615 bp). The average GC content was ~37.24%. The four Sanguisorba cp genomes harbored 112 different genes arranged in the same order; these identical sections include 78 protein-coding genes, 30 tRNA genes, and four rRNA genes, if duplicated genes in IR regions are counted only once. A total of 39–53 long repeats and 79–91 simple sequence repeats (SSRs) were identified in the four Sanguisorba cp genomes, which provides opportunities for future studies of the population genetics of Sanguisorba medicinal plants. A phylogenetic analysis using the maximum parsimony (MP) method strongly supports a close relationship between S. officinalis and S. tenuifolia var. alba, followed by S. stipulata, and finally S. filiformis. The availability of these cp genomes provides valuable genetic information for future studies of Sanguisorba identification and provides insights into the evolution of the genus Sanguisorba.


Complete Chloroplast Genome of Argania spinosa: Structural Organization and Phylogenetic Relationships in Sapotaceae

Plants ◽

10.3390/plants9101354 ◽

2020 ◽

Vol 9 (10) ◽

pp. 1354

Author(s):

Slimane Khayi ◽

Fatima Gaboun ◽

Stacy Pirro ◽

Tatiana Tatusova ◽

Abdelhamid El Mousadik ◽

...

Keyword(s):

Chloroplast Genome ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Protein Coding ◽

Important Species ◽

Complete Chloroplast Genome ◽

Argania Spinosa ◽

Protein Coding Genes ◽

Cp Genome

Argania spinosa (Sapotaceae), an important endemic Moroccan oil tree, is a primary source of argan oil, which has numerous dietary and medicinal properties. The plant species occupies the mid-western part of Morocco and provides great environmental and socioeconomic benefits. The complete chloroplast (cp) genome of A. spinosa was sequenced, assembled, and analyzed in comparison with those of two Sapotaceae members. The A. spinosa cp genome is 158,848 bp long, with an average GC content of 36.8%. The cp genome exhibits a typical quadripartite and circular structure consisting of a pair of inverted repeats (IR) of 25,945 bp in length separating small single-copy (SSC) and large single-copy (LSC) regions of 18,591 and 88,367 bp, respectively. The annotation of the A. spinosa cp genome predicted 130 genes, including 85 protein-coding genes (CDS), 8 ribosomal RNA (rRNA) genes, and 37 transfer RNA (tRNA) genes. A total of 44 long repeats and 88 simple sequence repeats (SSR) divided into mononucleotides (76), dinucleotides (7), trinucleotides (3), tetranucleotides (1), and hexanucleotides (1) were identified in the A. spinosa cp genome. Phylogenetic analyses using the maximum likelihood (ML) method were performed based on 69 protein-coding genes from 11 species of Ericales. The results confirmed the close position of A. spinosa to the Sideroxylon genus, supporting the revisiting of its taxonomic status. The complete chloroplast genome sequence will be valuable for further studies on the conservation and breeding of this medicinally and culinarily important species and will also contribute to clarifying the phylogenetic position of the species within Sapotaceae.


Complete Chloroplast Genome of Paphiopedilum delenatii and Phylogenetic Relationships among Orchidaceae

Plants ◽

10.3390/plants9010061 ◽

2020 ◽

Vol 9 (1) ◽

pp. 61 ◽

Cited By ~ 5

Author(s):

Huyen-Trang Vu ◽

Ngan Tran ◽

Thanh-Diem Nguyen ◽

Quoc-Luan Vu ◽

My-Huyen Bui ◽

...

Keyword(s):

Chloroplast Genome ◽

Inverted Repeat ◽

Gc Content ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Complete Chloroplast Genome ◽

Critically Endangered Species ◽

Plastid Genomes ◽

Chloroplast Genomes

Paphiopedilum delenatii is a native orchid of Vietnam with highly attractive floral traits. Unfortunately, it is now listed as a critically endangered species with a few hundred individuals remaining in nature. In this study, we performed next-generation sequencing of P. delenatii and assembled its complete chloroplast genome. The whole chloroplast genome of P. delenatii was 160,955 bp in size, with a GC content of 35.6%, and exhibited the typical quadripartite structure of plastid genomes with four distinct regions, including the large and small single-copy regions and a pair of inverted repeat regions. There were, in total, 130 genes annotated in the genome: 77 coding genes, 39 tRNA genes, 8 rRNA genes, and 6 pseudogenes. The loss of ndh genes and variation in inverted repeat (IR) boundaries, as well as data on simple sequence repeats (SSRs) and divergent hotspots, provided useful information for identification applications and phylogenetic studies of Paphiopedilum species. Whole chloroplast genomes could be used as an effective super barcode for species identification or for developing other identification markers, which in turn serves the conservation of Paphiopedilum species.
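GC content, reported in this and several other abstracts on this page, is the percentage of G and C bases among all unambiguous bases in a sequence. The sketch below shows the calculation on a toy string; the function name and the choice to ignore ambiguous characters such as N are assumptions for illustration, not details taken from any of the papers.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Toy sequence (hypothetical): 2 of 10 bases are G or C.
print(gc_content("ATGCATATTA"))  # 20.0
```

Applied to a full chloroplast genome sequence, the same ratio gives the whole-genome figure, and applying it to each region separately gives the per-region values that some of these studies compare.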


Complete Chloroplast Genomes of Chlorophytum comosum and Chlorophytum gallabatense: Genome Structures, Comparative and Phylogenetic Analysis

Plants ◽

10.3390/plants9030296 ◽

2020 ◽

Vol 9 (3) ◽

pp. 296 ◽

Cited By ~ 3

Author(s):

Jacinta N. Munyao ◽

Xiang Dong ◽

Jia-Xin Yang ◽

Elijah M. Mbandi ◽

Vincent O. Wanga ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Important Species ◽

Genomic Resources ◽

Phylogenetic Studies ◽

Chloroplast Genomes ◽

Cp Genome ◽

High Level

The genus Chlorophytum includes many economically important species well-known for medicinal, ornamental, and horticultural values. However, to date, few molecular genomic resources have been reported for this genus. Therefore, there is limited knowledge of phylogenetic studies, and the available chloroplast (cp) genome of Chlorophytum (C. rhizopendulum) does not provide enough information on this genus. In this study, we present genomic resources for C. comosum and C. gallabatense, which had lengths of 154,248 and 154,154 base pairs (bp), respectively. They had a pair of inverted repeats (IRa and IRb) of 26,114 and 26,254 bp each in size, separating the large single-copy (LSC) region of 84,004 and 83,686 bp from the small single-copy (SSC) region of 18,016 and 17,960 bp in C. comosum and C. gallabatense, respectively. There were 112 distinct genes in each cp genome, which were comprised of 78 protein-coding genes, 30 tRNA genes, and four rRNA genes. The comparative analysis with five other selected species displayed a generally high level of sequence resemblance in structural organization, gene content, and arrangement. Additionally, the phylogenetic analysis confirmed the previous phylogeny and produced a phylogenetic tree with similar topology. It showed that the Chlorophytum species (C. comosum, C. gallabatense and C. rhizopendulum) were clustered together in the same clade with a closer relationship than other plants to the Anthericum ramosum. This research, therefore, presents valuable records for further molecular evolutionary and phylogenetic studies which help to fill the gap in genomic resources and resolve the taxonomic complexes of the genus.


Seven Complete Chloroplast Genomes from Symplocos: Genome Organization and Comparative Analysis

Forests ◽

10.3390/f12050608 ◽

2021 ◽

Vol 12 (5) ◽

pp. 608

Author(s):

Sang-Chul Kim ◽

Jei-Wan Lee ◽

Byoung-Ki Choi

Keyword(s):

Chloroplast Genome ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Coding Regions ◽

Chloroplast Genomes ◽

Ion Torrent Sequencing ◽

Species Specific ◽

Taxonomic Characterization

In the present study, chloroplast genome sequences of four species of Symplocos (S. chinensis for. pilosa, S. prunifolia, S. coreana, and S. tanakana) from South Korea were obtained by Ion Torrent sequencing and compared with the sequences of three previously reported Symplocos chloroplast genomes from different species. The length of the Symplocos chloroplast genome ranged from 156,961 to 157,365 bp. Overall, 132 genes including 87 functional genes, 37 tRNA genes, and eight rRNA genes were identified in all Symplocos chloroplast genomes. The gene order and contents were highly similar across the seven species. The coding regions were more conserved than the non-coding regions, and the large single-copy and small single-copy regions were less conserved than the inverted repeat regions. We identified five new hotspot regions (rbcL, ycf4, psaJ, rpl22, and ycf1) that can be used as barcodes or species-specific Symplocos molecular markers. These four novel chloroplast genomes provide basic information on the plastid genome of Symplocos and enable better taxonomic characterization of this genus.


Comparative Analysis of the Chloroplast Genome for Four Pennisetum Species: Molecular Structure and Phylogenetic Relationships

Frontiers in Genetics ◽

10.3389/fgene.2021.687844 ◽

2021 ◽

Vol 12 ◽

Author(s):

Jin Xu ◽

Chen Liu ◽

Yun Song ◽

Mingfu Li

Keyword(s):

Comparative Analysis ◽

Chloroplast Genome ◽

Phylogenetic Relationships ◽

Gc Content ◽

Single Copy ◽

Rrna Genes ◽

Future Research ◽

Trna Genes ◽

Evolutionary Analysis ◽

Chloroplast Genomes

The genus Pennisetum (Poaceae) is both a forage crop and staple food crop in the tropics. In this study, we obtained chloroplast genome sequences of four species of Pennisetum (P. alopecuroides, P. clandestinum, P. glaucum, and P. polystachion) using Illumina sequencing. These chloroplast genomes have circular structures of 136,346–138,119 bp, including a large single-copy region (LSC, 79,380–81,186 bp), a small single-copy region (SSC, 12,212–12,409 bp), and a pair of inverted repeat regions (IRs, 22,284–22,372 bp). The overall GC content of these chloroplast genomes was 38.6–38.7%. The complete chloroplast genomes contained 110 different genes, including 76 protein-coding genes, 30 transfer RNA (tRNA) genes, and four ribosomal RNA (rRNA) genes. Comparative analysis of nucleotide variability identified nine intergenic spacer regions (psbA-matK, matK-rps16, trnN-trnT, trnY-trnD-psbM, petN-trnC, rbcL-psaI, petA-psbJ, psbE-petL, and rpl32-trnL), which may be used as potential DNA barcodes in future species identification and evolutionary analysis of Pennisetum. The phylogenetic analysis revealed a close relationship between P. polystachion and P. glaucum, followed by P. clandestinum and P. alopecuroides. The completed genomes of this study will help facilitate future research on the phylogenetic relationships and evolution of Pennisetum species.


Initial Complete Chloroplast Genomes of Alchemilla (Rosaceae): Comparative Analysis and Phylogenetic Relationships

Frontiers in Genetics ◽

10.3389/fgene.2020.560368 ◽

2020 ◽

Vol 11 ◽

Author(s):

Peninah Cheptoo Rono ◽

Xiang Dong ◽

Jia-Xin Yang ◽

Fredrick Munyao Mutie ◽

Millicent A. Oulo ◽

...

Keyword(s):

Tandem Repeats ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Coding Region ◽

Base Pairs ◽

Protein Coding ◽

Chloroplast Genomes ◽

The Family ◽

Cp Genome

The genus Alchemilla L., known for its medicinal and ornamental value, is widely distributed in the Holarctic regions with a few species found in Asia and Africa. Delimitation of species within Alchemilla is difficult due to hybridization, autonomous apomixis, and polyploidy, necessitating efficient molecular-based characterization. Herein, we report the initial complete chloroplast (cp) genomes of Alchemilla. The cp genomes of two African (Afromilla) species, Alchemilla pedata and Alchemilla argyrophylla, were sequenced, and phylogenetic and comparative analyses were conducted in the family Rosaceae. The cp genomes displayed a typical circular quadripartite structure with lengths of 152,438 and 152,427 base pairs (bp) in A. pedata and A. argyrophylla, respectively. Alchemilla cp genomes were composed of a pair of inverted repeat regions (IRa/IRb) of length 25,923 and 25,915 bp, separating the small single copy (SSC) region of 17,980 and 17,981 bp and a large single copy (LSC) region of 82,612 and 82,616 bp in A. pedata and A. argyrophylla, respectively. The cp genomes encoded 114 unique genes including 88 protein-coding genes, 37 transfer RNA (tRNA) genes, and 4 ribosomal RNA (rRNA) genes. Additionally, 88 and 95 simple sequence repeats (SSRs) and 37 and 40 tandem repeats were identified in A. pedata and A. argyrophylla, respectively. Significantly, the loss of the group II intron in the atpF gene was detected in Alchemilla species. Phylogenetic analysis based on 26 whole cp genome sequences and 78 protein-coding gene sequences of 27 Rosaceae species revealed a monophyletic clustering of Alchemilla nested within subfamily Rosoideae. Based on the protein-coding regions, negative selective pressure (Ka/Ks < 1) was detected, with an average Ka/Ks value of 0.1322 in A. argyrophylla and 0.1418 in A. pedata. The availability of complete cp genomes in the genus Alchemilla will contribute to species delineation and further phylogenetic and evolutionary studies in the family Rosaceae.
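The Ka/Ks ratio quoted above compares the rate of nonsynonymous substitutions (Ka) with the rate of synonymous substitutions (Ks). By the usual convention, a ratio below 1 indicates negative (purifying) selection, a ratio near 1 indicates neutral evolution, and a ratio above 1 indicates positive selection. The sketch below applies that convention to the two averages reported in the abstract; the function name and the tolerance band around 1 are assumptions for illustration, not part of the study.

```python
def classify_selection(ka_ks: float, tol: float = 0.05) -> str:
    """Interpret a Ka/Ks ratio under the usual convention.

    tol is an assumed width for the "approximately neutral" band around 1.
    """
    if ka_ks < 0:
        raise ValueError("Ka/Ks cannot be negative")
    if ka_ks < 1 - tol:
        return "negative (purifying) selection"
    if ka_ks > 1 + tol:
        return "positive selection"
    return "approximately neutral"


# Average Ka/Ks values reported in the abstract.
for species, ratio in [("A. argyrophylla", 0.1322), ("A. pedata", 0.1418)]:
    print(species, ratio, classify_selection(ratio))
```

Both reported averages fall well below 1, which is consistent with the abstract's conclusion of negative selective pressure.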


Unlocking the Complete Chloroplast Genome of a Native Tree Species from the Amazon Basin, Capirona (Calycophyllum spruceanum Benth., Rubiaceae), and Its Comparative Analysis with Other Ixoroideae Species

10.20944/preprints202111.0533.v1 ◽

2021 ◽

Author(s):

Carla L. Saldaña ◽

Pedro Rodriguez-Grados ◽

Julio C. Chávez-Galarza ◽

Shefferson Feijoo ◽

Juan Carlos Guerrero Abad ◽

...

Keyword(s):

Chloroplast Genome ◽

Amazon Basin ◽

Single Copy ◽

Rrna Genes ◽

Genome Comparison ◽

Trna Genes ◽

Large Single Copy ◽

Illumina Hiseq ◽

Cp Genome ◽

Small Single Copy

Capirona (Calycophyllum spruceanum Benth.) belongs to subfamily Ixoroideae, one of the major lineages in the Rubiaceae family. It is an important timber tree that originated in the Amazon Basin and is widely distributed in Bolivia, Peru, Colombia, and Brazil. In this study, we obtained the first complete chloroplast (cp) genome of capirona from the department of Madre de Dios in the Peruvian Amazon. High-quality genomic DNA was used to construct libraries. Paired-end clean reads were obtained from a PE 150 library on the Illumina HiSeq 2500 platform. The complete cp genome of C. spruceanum is 154,480 bp in length with a typical quadripartite structure, containing a large single-copy (LSC) region (84,813 bp) and a small single-copy (SSC) region (18,101 bp), separated by two inverted repeat (IR) regions (25,783 bp). The annotation of the C. spruceanum cp genome predicted 87 protein-coding genes (CDS), 8 ribosomal RNA (rRNA) genes, 37 transfer RNA (tRNA) genes, and 1 pseudogene. A total of 41 simple sequence repeats (SSR) in this cp genome were divided into mononucleotides (29), dinucleotides (5), trinucleotides (3), and tetranucleotides (4). Most of these repeats were distributed in the noncoding regions. Whole chloroplast genome comparison with six other Ixoroideae species revealed that the small single-copy and large single-copy regions showed more divergence than the inverted repeat regions. Finally, phylogenetic analysis resolved C. spruceanum as a sister species to Emmenopterys henryi and confirmed its position within the subfamily Ixoroideae. This study reports for the first time the genome organization, gene content, and structural features of the chloroplast genome of C. spruceanum, providing valuable information for genetic and evolutionary studies in the genus Calycophyllum and beyond.


Comparative analysis of four Zantedeschia chloroplast genomes: expansion and contraction of the IR region, phylogenetic analyses and SSR genetic diversity assessment

PeerJ ◽

10.7717/peerj.9132 ◽

2020 ◽

Vol 8 ◽

pp. e9132

Author(s):

Shuilian He ◽

Yang Yang ◽

Ziwei Li ◽

Xuejiao Wang ◽

Yanbing Guo ◽

...

Keyword(s):

Genetic Diversity ◽

Phylogenetic Analyses ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Herbaceous Perennials ◽

Coding Regions ◽

Chloroplast Genomes ◽

Diversity Assessment ◽

Cp Genome

The horticulturally important genus Zantedeschia (Araceae) comprises eight species of herbaceous perennials. We sequenced, assembled and analyzed the chloroplast (cp) genomes of four species of Zantedeschia (Z. aethiopica, Z. odorata, Z. elliottiana, and Z. rehmannii) to investigate the structure of the cp genome in the genus. According to our results, the cp genome of Zantedeschia ranges in size from 169,065 bp (Z. aethiopica) to 175,906 bp (Z. elliottiana). We identified a total of 112 unique genes, including 78 protein-coding genes, 30 transfer RNA (tRNA) genes and four ribosomal RNA (rRNA) genes. Comparison of our results with cp genomes from other species in the Araceae suggests that the relatively large sizes of the Zantedeschia cp genomes may result from expansion of the inverted repeat (IR) region. The sampled Zantedeschia species formed a monophyletic clade in our phylogenetic analysis. Furthermore, the large single copy (LSC) and small single copy (SSC) regions in Zantedeschia are more divergent than the IR regions in the same genus, and non-coding regions showed generally higher divergence than coding regions. We identified a total of 410 cpSSR sites from the four Zantedeschia species studied. Genetic diversity analyses based on four polymorphic SSR markers from 134 cultivars of Zantedeschia suggested that high genetic diversity (I = 0.934; Ne = 2.371) is present in the Zantedeschia cultivars. High genetic polymorphism from the cpSSR region suggests that cpSSR could be an effective tool for genetic diversity assessment and identification of Zantedeschia varieties.
