Analyzing and characterization of the chloroplast genome of Salix suchowensis

By screening sequence reads from the chloroplast (cp) genome of S. suchowensis that were generated by next-generation sequencing platforms, we built the complete circular pseudomolecule for its cp genome. This pseudomolecule is 155,508 bp in length and has a typical quadripartite structure containing two single-copy regions, a large single-copy region (LSC, 84,385 bp) and a small single-copy region (SSC, 16,209 bp), separated by inverted repeat regions (IRs, 27,457 bp each). Gene annotation revealed that the cp genome of S. suchowensis encodes 119 unique genes, including 4 ribosomal RNA genes, 30 transfer RNA genes, 82 protein-coding genes and 3 pseudogenes. Analysis of the repetitive sequences detected 15 tandem repeats, 16 forward repeats and 5 palindromic repeats. In addition, a total of 188 perfect microsatellites were detected, with A/T predominating in their nucleotide composition. Comparison of this cp genome with those of other rosid plants revealed significant shifting of the IR/SSC boundaries. We also built phylogenetic trees to demonstrate the phylogenetic position of S. suchowensis in Rosidae, using 66 orthologous protein-coding genes present in the cp genomes of 32 species. By sequencing 30 amplicons based on the pseudomolecule, experimental verification showed that the accuracy of the cp genome assembly of S. suchowensis was up to 99.84%. In conclusion, this study built a high-quality pseudomolecule for the cp genome of S. suchowensis, which is a useful resource for facilitating the development of this shrub willow into a more productive bioenergy crop.
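The region sizes quoted in this abstract can be cross-checked with simple arithmetic: a quadripartite cp genome consists of one LSC, one SSC and two IR copies. The sketch below is only an illustration; the function name `quadripartite_length` is hypothetical and not taken from the paper.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length (bp) of a circular cp genome: one LSC, one SSC, two IR copies."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as reported in the S. suchowensis abstract.
total = quadripartite_length(lsc=84_385, ssc=16_209, ir=27_457)
print(total)  # 155508, matching the reported pseudomolecule length
```

The same check can be applied to the other cp genomes listed on this page, provided the IR length is given per copy.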


Analyzing and characterization of the chloroplast genome of Salix suchowensis

10.7287/peerj.preprints.2388 ◽

2016 ◽

Author(s):

Congrui Sun ◽

Jie Li ◽

Xiaogang Dai ◽

Yingnan Chen

Keyword(s):

Tandem Repeats ◽

Gene Annotation ◽

Repetitive Sequences ◽

Single Copy ◽

Phylogenetic Position ◽

Shrub Willow ◽

Protein Coding ◽

Protein Coding Genes ◽

Cp Genome ◽

Rna Genes


Complete chloroplast genome features and phylogenetic analysis of Eruca sativa (Brassicaceae)

PLoS ONE ◽

10.1371/journal.pone.0248556 ◽

2021 ◽

Vol 16 (3) ◽

pp. e0248556

Author(s):

Bin Zhu ◽

Fang Qian ◽

Yunfeng Hou ◽

Weicheng Yang ◽

Mengxian Cai ◽

...

Keyword(s):

Phylogenetic Analysis ◽

De Novo ◽

Repetitive Sequences ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Eruca Sativa ◽

Protein Coding ◽

Protein Coding Genes ◽

Cp Genome

Eruca sativa Mill. (Brassicaceae) is an important edible vegetable and a potential medicinal plant due to the antibacterial activity of its seed oil. Here, the complete chloroplast (cp) genome of E. sativa was de novo assembled with a combination of long PacBio reads and short Illumina reads. The E. sativa cp genome had a quadripartite structure that was 153,522 bp in size, consisting of one large single-copy region of 83,320 bp and one small single-copy region of 17,786 bp, which were separated by two inverted repeat regions (IRa and IRb) of 26,208 bp each. This complete cp genome harbored 113 unique genes: 79 protein-coding genes, 30 tRNA genes, and four rRNA genes. Forty-nine long repetitive sequences and 69 simple sequence repeats were identified in the E. sativa cp genome. A codon usage analysis of the E. sativa cp genome showed a bias toward codons ending in A/T. The E. sativa cp genome was similar in size, gene composition, and linearity of the structural region when compared with other Brassicaceae cp genomes. Moreover, the analysis of the synonymous (Ks) and non-synonymous (Ka) substitution rates demonstrated that protein-coding genes generally underwent purifying selection, except ycf1, ycf2, and rps12. A phylogenetic analysis determined that E. sativa is evolutionarily close to important Brassica species, indicating that it may be possible to transfer favorable E. sativa alleles into other Brassica species. Our results will help to advance the genetic improvement and breeding of E. sativa, and will provide valuable information for utilizing E. sativa as an important resource to improve other Brassica species.
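The Ka/Ks reasoning in this abstract follows a common convention: a ratio below 1 suggests purifying selection, a ratio near 1 suggests neutral evolution, and a ratio above 1 suggests positive selection. A minimal sketch of that classification follows; the function name, the `tolerance` band around 1 and the example values are all hypothetical and are not taken from the paper.

```python
def classify_selection(ka: float, ks: float, tolerance: float = 0.05) -> str:
    """Label a gene by its Ka/Ks ratio.

    `tolerance` is an arbitrary, assumed band around 1.0 treated as neutral.
    """
    if ks <= 0:
        # Ka/Ks is undefined when no synonymous substitutions are observed.
        return "undefined"
    ratio = ka / ks
    if ratio < 1 - tolerance:
        return "purifying"
    if ratio > 1 + tolerance:
        return "positive"
    return "neutral"


# Made-up rates for illustration only.
print(classify_selection(ka=0.02, ks=0.20))  # purifying
print(classify_selection(ka=0.30, ks=0.20))  # positive
```

In practice the Ka and Ks estimates themselves come from a codon-aware substitution model, which this sketch does not attempt to implement.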


Complete Chloroplast Genome of Argania spinosa: Structural Organization and Phylogenetic Relationships in Sapotaceae

Plants ◽

10.3390/plants9101354 ◽

2020 ◽

Vol 9 (10) ◽

pp. 1354

Author(s):

Slimane Khayi ◽

Fatima Gaboun ◽

Stacy Pirro ◽

Tatiana Tatusova ◽

Abdelhamid El Mousadik ◽

...

Keyword(s):

Chloroplast Genome ◽

Single Copy ◽

Rrna Genes ◽

Trna Genes ◽

Protein Coding ◽

Important Species ◽

Complete Chloroplast Genome ◽

Argania Spinosa ◽

Protein Coding Genes ◽

Cp Genome

Argania spinosa (Sapotaceae), an important endemic Moroccan oil tree, is a primary source of argan oil, which has numerous dietary and medicinal properties. The species occupies the mid-western part of Morocco and provides great environmental and socioeconomic benefits. The complete chloroplast (cp) genome of A. spinosa was sequenced, assembled, and analyzed in comparison with those of two Sapotaceae members. The A. spinosa cp genome is 158,848 bp long, with an average GC content of 36.8%. The cp genome exhibits a typical quadripartite and circular structure consisting of a pair of inverted repeat (IR) regions of 25,945 bp each separating small single-copy (SSC) and large single-copy (LSC) regions of 18,591 and 88,367 bp, respectively. The annotation of the A. spinosa cp genome predicted 130 genes, including 85 protein-coding genes (CDS), 8 ribosomal RNA (rRNA) genes, and 37 transfer RNA (tRNA) genes. A total of 44 long repeats and 88 simple sequence repeats (SSRs), divided into mononucleotides (76), dinucleotides (7), trinucleotides (3), tetranucleotides (1), and hexanucleotides (1), were identified in the A. spinosa cp genome. Phylogenetic analyses using the maximum likelihood (ML) method were performed based on 69 protein-coding genes from 11 species of Ericales. The results confirmed the close position of A. spinosa to the genus Sideroxylon, supporting the revisiting of its taxonomic status. The complete chloroplast genome sequence will be valuable for further studies on the conservation and breeding of this medicinally and culinarily important species and will also contribute to clarifying the phylogenetic position of the species within Sapotaceae.
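The breakdown of SSRs by motif length (mono- to hexanucleotides) can be illustrated with a small scanner for perfect repeats. This is a sketch under stated assumptions, not the tool or thresholds used in the paper: the minimum repeat counts in `MIN_REPEATS` and the names `find_ssrs` and `demo` are hypothetical.

```python
from collections import Counter

# Assumed minimum repeat counts per motif length (1-6 bp); not from the paper.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs, scanning left to right."""
    seq = seq.upper()
    hits = []
    i = 0
    while i < len(seq):
        found = None
        for k in sorted(min_repeats):  # prefer the shortest motif
            motif = seq[i:i + k]
            if len(motif) < k or set(motif) - set("ACGT"):
                continue
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            count = 1
            while seq[i + count * k:i + (count + 1) * k] == motif:
                count += 1
            if count >= min_repeats[k]:
                found = (i, motif, count)
                break
        if found:
            hits.append(found)
            i += len(found[1]) * found[2]  # jump past the repeat just recorded
        else:
            i += 1
    return hits


demo = "GC" + "A" * 12 + "CG" + "AT" * 6 + "GGC"  # made-up test sequence
ssrs = find_ssrs(demo)
print(ssrs)                                # [(2, 'A', 12), (16, 'AT', 6)]
print(Counter(len(m) for _, m, _ in ssrs)) # one mono-, one dinucleotide SSR
```

Counting hits by motif length, as in the last line, gives the kind of per-class tally reported in the abstract.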


Complete Chloroplast Genome Sequence of Erigeron breviscapus and Characterization of Chloroplast Regulatory Elements

Frontiers in Plant Science ◽

10.3389/fpls.2021.758290 ◽

2021 ◽

Vol 12 ◽

Author(s):

Yifan Yu ◽

Zhen Ouyang ◽

Juan Guo ◽

Wen Zeng ◽

Yujun Zhao ◽

...

Keyword(s):

Chloroplast Genome ◽

Single Copy ◽

Regulatory Elements ◽

Rrna Genes ◽

Expression Vectors ◽

Protein Coding ◽

Protein Coding Genes ◽

Flanking Sequences ◽

Erigeron Breviscapus ◽

Cp Genome

Erigeron breviscapus is a well-known medicinal plant. However, the limited chloroplast genome information available for E. breviscapus, especially chloroplast DNA sequence resources, has hindered the study of E. breviscapus chloroplast genome transformation. Here, the complete chloroplast (cp) genome of E. breviscapus is reported. This genome was 152,164 bp in length, had a GC content of 37.2% and was structurally arranged into two 24,699 bp inverted repeats (IRs) and two single-copy areas. The sizes of the large single-copy region and the small single-copy region were 84,657 and 18,109 bp, respectively. The E. breviscapus cp genome consisted of 127 coding genes, including 83 protein-coding genes, 36 transfer RNA (tRNA) genes, and eight ribosomal RNA (rRNA) genes. Of these, 95 were single-copy genes and 16 were duplicated in the two inverted repeat regions (seven tRNAs, four rRNAs, and five protein-coding genes). Then, genomic DNA of E. breviscapus was used as a template, and the endogenous 5' and 3' flanking sequences of the trnI gene and trnA gene were selected as homologous recombinant fragments in vector construction and cloned through PCR. The endogenous 5' flanking sequences of the psbA gene and rrn16S gene, the endogenous 3' flanking sequences of the psbA gene, rbcL gene, and rps16 gene and one sequence element from the psbN-psbH chloroplast operon were cloned, and certain chloroplast regulatory elements were identified. Two homologous recombination fragments and all of these elements were constructed into the cloning vector pBluescript SK (+) to yield a series of chloroplast expression vectors, which harbored the reporter gene EGFP and the selectable marker gene aadA. After identification, the chloroplast expression vectors were transformed into Escherichia coli and the function of the predicted regulatory elements was confirmed by a spectinomycin resistance test and fluorescence intensity measurement. The results indicated that the aadA and EGFP genes were efficiently expressed under the regulation of the predicted regulatory elements and that the chloroplast expression vector had been successfully constructed, thereby providing a solid foundation for establishing a subsequent E. breviscapus chloroplast transformation system and for the genetic improvement of E. breviscapus.


Plastome Diversity and Phylogenomic Relationships in Asteraceae

Plants ◽

10.3390/plants10122699 ◽

2021 ◽

Vol 10 (12) ◽

pp. 2699

Author(s):

Joan Pere Pascual-Díaz ◽

Sònia Garcia ◽

Daniel Vitales

Keyword(s):

Ribosomal Rna ◽

Evolutionary Rate ◽

Plastid Dna ◽

Single Copy ◽

Phylogenomic Analysis ◽

Protein Coding ◽

Protein Coding Genes ◽

Plastid Genomes ◽

The Family ◽

Rna Genes

Plastid genomes are in general highly conserved given their slow evolutionary rate, and thus large changes in their structure are unusual. However, when specific rearrangements are present, they are often phylogenetically informative. Asteraceae is a highly diverse family whose evolution has long been driven by polyploidy (up to 48x) and hybridization, both processes usually complicating systematic inferences. In this study, we generated one of the most comprehensive plastome-based phylogenies of the family Asteraceae, providing information about the structure, genetic diversity and repeat composition of these sequences. By comparing the whole-plastome sequences obtained, we confirmed the double inversion located in the long single-copy region for most of the species analyzed (with the exception of basal tribes), a well-known feature of Asteraceae plastomes. We also showed that genome size, gene order and gene content are highly conserved across the family. However, species representative of the basal subfamily Barnadesioideae, as well as those of the sister family Calyceraceae, lack the pseudogene rps19 located in one inverted repeat. The phylogenomic analysis conducted here, based on 63 protein-coding genes, 30 transfer RNA genes and 21 ribosomal RNA genes from 36 species of Asteraceae, was overall consistent with the general consensus for the family’s phylogeny while resolving the position of tribe Senecioneae and revealing some incongruences at tribe level between reconstructions based on nuclear and plastid DNA data.


Characterization of the complete chloroplast genome sequence and phylogenetic analysis of B. oleracea var. italica

10.21203/rs.2.20976/v1 ◽

2020 ◽

Author(s):

Zhenchao Zhang ◽

Zhongliang Dai ◽

Yuemei Yao ◽

Yongfei Pan ◽

Guosheng Sun ◽

...

Keyword(s):

Chloroplast Genome ◽

Genome Sequence ◽

Genomic Structure ◽

Gc Content ◽

Single Copy ◽

Biological Research ◽

Protein Coding ◽

Protein Coding Genes ◽

Cp Genome ◽

Functional Components

Abstract Backgrounds: Broccoli (Brassica oleracea var. italica L.) is known as one of the most nutritionally rich vegetables and is rich in functional components that benefit health. The main purposes of this research were to sequence, assemble and annotate the chloroplast genome of broccoli using the Illumina HiSeq2500 sequencing platform. Results: The size of the broccoli cp genome is 153,364 bp, including two inverted repeat (IR) regions of 26,197 bp each, separated by a small single copy (SSC) region of 17,834 bp and a large single copy (LSC) region of 83,136 bp. The GC content of the complete genome is 36.36%, while those of SSC, LSC, and IR are 29.1%, 34.15% and 42.35%, respectively. It harbors 134 functional genes, including 87 protein-coding genes, 39 tRNAs and 8 rRNAs, with 31 duplicates in the IRs. The most abundant amino acid in the protein-coding genes is leucine, while the least abundant is cysteine. Codon usage frequency showed a bias for A/T-ending codons in the cp genome. In the repeat structure analysis, a total of 34 repeat sequences and 291 simple sequence repeats (SSRs) were detected. Although cp genomic structure and size are highly conserved, the SC-IR boundary regions are variable among the 7 cp genomes. The phylogenetic relationships based on complete cp genomes from 9 species suggest that B. oleracea var. italica is closely related to Brassica juncea. Conclusions: The complete cp genome sequence was obtained and annotated for broccoli for the first time. The information acquired from this research will be useful for further species identification, population genetics and biological research of broccoli.
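The whole-genome GC content quoted in this abstract can be related to the per-region values by a length-weighted average, counting the IR twice. The sketch below assumes that the reported IR figure applies to each IR copy; the function name `weighted_gc` is hypothetical.

```python
def weighted_gc(regions):
    """Length-weighted GC percentage from (length_bp, gc_percent) pairs."""
    total_len = sum(length for length, _ in regions)
    if total_len == 0:
        raise ValueError("regions must have a non-zero total length")
    return sum(length * gc for length, gc in regions) / total_len


# Lengths and GC percentages as reported for the broccoli cp genome.
broccoli = [
    (83_136, 34.15),  # LSC
    (17_834, 29.10),  # SSC
    (26_197, 42.35),  # IRa
    (26_197, 42.35),  # IRb (assumed to share the IR value)
]
print(round(weighted_gc(broccoli), 2))  # 36.36, consistent with the reported value
```

The higher GC content of the IRs relative to the single-copy regions is what pulls the genome-wide figure above the LSC and SSC values.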


Complete Chloroplast Genome Sequence of Justicia flava: Genome Comparative Analysis and Phylogenetic Relationships among Acanthaceae

BioMed Research International ◽

10.1155/2019/4370258 ◽

2019 ◽

Vol 2019 ◽

pp. 1-17 ◽

Cited By ~ 4

Author(s):

Samaila S. Yaradua ◽

Dhafer A. Alzahrani ◽

Enas J. Albokhary ◽

Abidina Abba ◽

Abubakar Bello

Keyword(s):

Comparative Analysis ◽

Chloroplast Genome ◽

Phylogenetic Relationships ◽

Inverted Repeat ◽

Gc Content ◽

Single Copy ◽

Protein Coding ◽

Complete Chloroplast Genome ◽

Protein Coding Genes ◽

Cp Genome

The complete chloroplast genome of J. flava, an endangered medicinal plant in Saudi Arabia, was sequenced and compared with the cp genomes of three Acanthaceae species to characterize the cp genome, identify SSRs, and detect variation among the cp genomes of the sampled Acanthaceae. NOVOPlasty was used to assemble the complete chloroplast genome from the whole-genome data. The cp genome of J. flava was 150,888 bp in length with a GC content of 38.2% and has a quadripartite structure; the genome harbors one pair of inverted repeats (IRa and IRb, 25,500 bp each) separated by a large single copy region (LSC, 82,995 bp) and a small single copy region (SSC, 16,893 bp). There are 132 genes in the genome, which include 80 protein coding genes, 30 tRNA, and 4 rRNA; 113 are unique while the remaining 19 are duplicated in the IR regions. The repeat analysis indicates that the genome contains all types of repeats, with palindromic repeats occurring most frequently; the analysis also identified a total of 98 simple sequence repeats (SSRs), of which the majority are A/T mononucleotides found in the intergenic spacers. The comparative analysis with the other cp genomes sampled indicated that the inverted repeat regions are more conserved than the single copy regions and that the noncoding regions show a higher rate of variation than the coding regions. All the genomes have the ndhF and ycf1 genes at the border junction of IRb and SSC. Sequence divergence analysis of the protein coding genes showed that seven genes (petB, atpF, psaI, rpl32, rpl16, ycf1, and clpP) are under positive selection. The phylogenetic analysis revealed that Justiceae is sister to Ruellieae. This study reported the first cp genome of the largest genus in Acanthaceae and provided resources for studying the genetic diversity of J. flava as well as resolving phylogenetic relationships within the core Acanthaceae.


Highly rearranged mitochondrial genome in Falcolipeurus lice (Phthiraptera: Philopteridae) from endangered eagles

Parasites & Vectors ◽

10.1186/s13071-021-04776-5 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Yu Nie ◽

Yi-Tian Fu ◽

Yu Zhang ◽

Yuan-Ping Deng ◽

Wei Wang ◽

...

Keyword(s):

Phylogenetic Analyses ◽

Phylogenetic Position ◽

Gene Rearrangements ◽

Posterior Probabilities ◽

Protein Coding ◽

Protein Coding Genes ◽

The Family ◽

Louse Species ◽

Mt Genome ◽

Rna Genes

Abstract Background: Fragmented mitochondrial (mt) genomes and extensive mt gene rearrangements have been frequently reported from parasitic lice (Insecta: Phthiraptera). However, relatively little is known about the mt genomes from the family Philopteridae, the most species-rich family within the suborder Ischnocera. Methods: Herein, we use next-generation sequencing to decode the mt genome of Falcolipeurus suturalis and compare it with the mt genome of F. quadripustulatus. Phylogenetic relationships within the family Philopteridae were inferred from the concatenated 13 protein-coding genes of the two Falcolipeurus lice and members of the family Philopteridae using Bayesian inference (BI) and maximum likelihood (ML) methods. Results: The complete mt genome of F. suturalis is a circular, double-stranded DNA molecule 16,659 bp in size that contains 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes, and three non-coding regions. The gene order of the F. suturalis mt genome is rearranged relative to that of F. quadripustulatus, and is radically different from both other louse species and the putative ancestral insect. Phylogenetic analyses revealed clear genetic distinctiveness between F. suturalis and F. quadripustulatus (Bayesian posterior probabilities = 1.0 and bootstrapping frequencies = 100), and that the genus Falcolipeurus is sister to the genus Ibidoecus (Bayesian posterior probabilities = 1.0 and bootstrapping frequencies = 100). Conclusions: These datasets help to better understand gene rearrangements in lice and the phylogenetic position of Falcolipeurus and provide useful genetic markers for systematic studies of bird lice.


Complete chloroplast genome of Stephania tetrandra (Menispermaceae) from Zhejiang Province: insights into molecular structures, comparative genome analysis, mutational hotspots and phylogenetic relationships

BMC Genomics ◽

10.1186/s12864-021-08193-x ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Shujie Dong ◽

Zhiqi Ying ◽

Shuisheng Yu ◽

Qirui Wang ◽

Guanghui Liao ◽

...

Keyword(s):

Comparative Analysis ◽

Rna Editing ◽

Geographical Origin ◽

Single Copy ◽

Zhejiang Province ◽

Protein Coding ◽

Complete Chloroplast Genome ◽

Protein Coding Genes ◽

Mutational Hotspots ◽

Cp Genome

Abstract Background: Stephania tetrandra S. Moore (S. tetrandra) is a plant of the family Menispermaceae that has high medicinal value and merits further exploration. The wild resources of S. tetrandra are widely distributed in tropical and subtropical regions of China, generating potential genetic diversity and unique population structures. The geographical origin of S. tetrandra is an important factor influencing its quality and price in the market. In addition, the species relationships within the genus Stephania remain uncertain due to high morphological similarity and the low support values of molecular analysis approaches. Complete chloroplast (cp) genome data have become a promising means to determine geographical origin and understand species evolution for closely related plant species. Herein, we sequenced the complete cp genome of S. tetrandra from Zhejiang Province and conducted a comparative analysis within Stephania plants to reveal the structural variations, informative markers and phylogenetic relationships of Stephania species. Results: The cp genome of S. tetrandra voucher ZJ was 157,725 bp, consisting of a large single copy region (89,468 bp), a small single copy region (19,685 bp) and a pair of inverted repeat regions (24,286 bp each). A total of 134 genes were identified in the cp genome of S. tetrandra, including 87 protein-coding genes, 8 rRNA genes, 37 tRNA genes and 2 pseudogene copies (ycf1 and rps19). The gene order and GC content were highly consistent in the Stephania species according to the comparative analysis results, with the highest RSCU value in arginine (1.79) and the lowest RSCU value in serine of S. tetrandra. A total of 90 SSRs were identified in the cp genome of S. tetrandra, among which repeats consisting of A or T bases were much more numerous than those consisting of G or C bases. In addition, 92 potential RNA editing sites were identified in 25 protein-coding genes, with the most predicted RNA editing sites in the ndhB gene. Variations in the length of the ycf1 gene and in the extent of its expansion into the junction were observed between S. tetrandra vouchers from different regions, indicating potential markers for further geographical origin discrimination. Moreover, the values of the transition to transversion ratio (Ts/Tv) in the Stephania species were significantly higher than 1 using Pericampylus glaucus as the reference. Comparative analysis of the Stephania cp genomes revealed 5 highly variable regions, including 3 intergenic regions (trnH-psbA, trnD-trnY, trnP) and two protein coding genes (rps16 and ndhA). The identified mutational hotspots of Stephania plants exhibited multiple SNP sites and gaps, as well as different Ka/Ks ratio values. In addition, five pairs of specific primers targeting the divergent regions were designed, which could be utilized as potential molecular markers for species identification and for population genetic and phylogenetic analyses in Stephania species. Phylogenetic tree analysis based on the conserved chloroplast protein coding genes indicated a sister relationship between S. tetrandra and the monophyletic group of S. japonica and S. kwangsiensis with high support values, suggesting a close genetic relationship within Stephania plants. However, two S. tetrandra vouchers from different regions failed to cluster into one clade, confirming the occurrence of genetic diversity and requiring further investigation for a geographical tracing strategy. Conclusions: Overall, we provided comprehensive and detailed information on the complete chloroplast genome and identified nucleotide diversity hotspots of Stephania species. The obtained genetic resource of S. tetrandra from Zhejiang Province would facilitate future studies on DNA barcoding, species discrimination, intraspecific and interspecific variability and the phylogenetic relationships of Stephania plants.
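The transition to transversion ratio (Ts/Tv) mentioned in this abstract can be illustrated by counting substitution types between a reference and a query in a pairwise alignment. This is a minimal sketch assuming two pre-aligned sequences of equal length; the function name `ts_tv_counts` and the example sequences are hypothetical, and the paper's actual pipeline is not described here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_counts(ref: str, query: str):
    """Count transitions and transversions between two aligned sequences.

    Columns containing gaps or ambiguous bases are ignored.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for a, b in zip(ref.upper(), query.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            tv += 1  # purine<->pyrimidine
    return ts, tv


# Made-up aligned sequences for illustration only.
ts, tv = ts_tv_counts("AACCGGTT", "GACTGATA")
print(ts, tv)  # 3 1
```

A Ts/Tv value above 1, as reported for the Stephania species, means that transitions outnumber transversions even though there are twice as many possible transversion types.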


Characterization of the complete chloroplast genome sequence of Vitis vinifera ‘Guifeimeigui’

Scientific Journal of Genetics and Gene Therapy ◽

10.17352/sjggt.000019 ◽

2021 ◽

pp. 001-003

Author(s):

Liu Li ◽

Yang Yang ◽

Li Xiujie ◽

Li Bo

Keyword(s):

Vitis Vinifera ◽

Gc Content ◽

Single Copy ◽

Protein Coding ◽

Complete Chloroplast Genome ◽

Chloroplast Genome Sequence ◽

Cp Genome ◽

Eurasian Species ◽

Rna Genes

Vitis vinifera ‘Guifeimeigui’ is a diploid table grape of a Eurasian species. This study reports, for the first time, the complete chloroplast (cp) genome of Vitis vinifera ‘Guifeimeigui’. The complete cp genome is 160,928 bp in size with a GC content of 37.38%, and comprises a pair of inverted repeats (26,353 bp each) separated by large (89,150 bp) and small (19,072 bp) single-copy regions. It encodes 85 genes, including 40 protein coding genes, 37 transfer RNA (tRNA) genes, and 8 ribosomal RNA (rRNA) genes. The maximum likelihood (ML) phylogenetic tree demonstrated that Vitis vinifera ‘Guifeimeigui’ is close to Vitis vinifera.
