The chromosome-level reference genome of Coptis chinensis provides insights into genomic evolution and berberine biosynthesis

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Da-xia Chen ◽  
Yuan Pan ◽  
Yu Wang ◽  
Yan-Ze Cui ◽  
Ying-Jun Zhang ◽  
...  

Abstract Coptis chinensis Franch, a perennial herb, is mainly distributed in southeastern China. The rhizome of C. chinensis has been used as a traditional medicine for more than 2000 years in China and many other Asian countries. The pharmacological activities of C. chinensis have been validated by research. Here, we present a de novo high-quality genome of C. chinensis with a chromosome-level genome of ~958.20 Mb, a contig N50 of 1.58 Mb, and a scaffold N50 of 4.53 Mb. We found that the relatively large genome size of C. chinensis was caused by the amplification of long terminal repeat (LTR) retrotransposons. In addition, a whole-genome duplication event in ancestral Ranunculales was discovered. Comparative genomic analysis revealed that the tyrosine decarboxylase (TYDC) and (S)-norcoclaurine synthase (NCS) genes were expanded and that the aspartate aminotransferase gene (ASP5) was positively selected in the berberine metabolic pathway. Expression level and HPLC analyses showed that the berberine content was highest in the roots of C. chinensis in the third and fourth years. The chromosome-level reference genome of C. chinensis provides important genomic data for molecular-assisted breeding and active ingredient biosynthesis.

2021 ◽  
Author(s):  
Xinxin Yi ◽  
Jing Liu ◽  
Shengcai Chen ◽  
Hao Wu ◽  
Min Liu ◽  
...  

Cultivated soybean (Glycine max) is an important source of protein and oil. Many elite cultivars with different traits have been developed for different conditions. Each soybean strain has its own genetic diversity, and the availability of more high-quality soybean genomes can enhance comparative genomic analysis for identifying the genetic underpinnings of its unique traits. In this study, we constructed a high-quality de novo assembly of an elite soybean cultivar, Jidou 17 (JD17), with chromosome contiguity and high accuracy. We annotated 52,840 gene models and reconstructed 74,054 high-quality full-length transcripts. We performed a genome-wide comparative analysis based on the reference genome of JD17 with three published soybeans (WM82, ZH13 and W05), which identified five large inversions and two large translocations specific to JD17, 20,984 - 46,912 PAVs spanning 13.1 - 46.9 Mb in size, and 5 - 53 large PAV clusters larger than 500 kb. Between JD17 and these three genomes, 1,695,741 - 3,664,629 SNPs and 446,689 - 800,489 Indels were identified and annotated. Symbiotic nitrogen fixation (SNF) genes were identified and the effects of these variants were further evaluated. It was found that the coding sequences of 9 nitrogen fixation-related genes were greatly affected. The high-quality genome assembly of JD17 can serve as a valuable reference for soybean functional genomics research.


2020 ◽  
Author(s):  
Xinxin Yi ◽  
Jing Liu ◽  
Shengcai Chen ◽  
Hao Wu ◽  
Min Liu ◽  
...  

Abstract Background: Cultivated soybean (Glycine max) is an important source of protein and oil. Each soybean strain has its own genetic diversity, and the availability of more soybean genomes may enhance comparative genomic analysis of soybean. Results: In this study, we constructed a high-quality de novo assembly of an elite soybean cultivar, Jidou 17 (JD17), with high contiguity, completeness, and accuracy. We annotated 59,629 gene models and reconstructed 235,109 high-quality full-length transcripts. We have molecularly characterized the genotypes of some important agronomic traits of JD17 by taking advantage of these newly established genomic resources. Conclusions: We reported a high-quality genome and annotations of a wide range of cultivars, and used them to analyze the genotypes of genes related to important agronomic traits of soybean in JD17. We have demonstrated that a high-quality genome assembly can serve as a valuable reference for the soybean genomics and breeding research community.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Baohua Chen ◽  
Zhixiong Zhou ◽  
Qiaozhen Ke ◽  
Yidi Wu ◽  
Huaqiang Bai ◽  
...  

Abstract Larimichthys crocea is an endemic marine fish in East Asia that belongs to Sciaenidae in Perciformes. L. crocea is now recognized as an “iconic” marine fish species in China: it is a popular food fish, it is a representative victim of overfishing, and it still provides high-value fish products supported by the modern large-scale mariculture industry. Here, we report a chromosome-level reference genome of L. crocea generated by employing the PacBio single molecule sequencing technique (SMRT) and high-throughput chromosome conformation capture (Hi-C) technologies. The genome sequences were assembled into 1,591 contigs with a total length of 723.86 Mb and a contig N50 length of 2.83 Mb. After chromosome-level scaffolding, 24 scaffolds were constructed with a total length of 668.67 Mb (92.48% of the total length). Genome annotation identified 23,657 protein-coding genes and 7,262 ncRNAs. This highly accurate, chromosome-level reference genome of L. crocea provides an essential genome resource to support the development of genome-scale selective breeding and restocking strategies of L. crocea.


mSystems ◽  
2019 ◽  
Vol 4 (5) ◽  
Author(s):  
Haijian Du ◽  
Wenyan Zhang ◽  
Wensi Zhang ◽  
Weijia Zhang ◽  
Hongmiao Pan ◽  
...  

ABSTRACT The evolution of microbial magnetoreception (or magnetotaxis) is of great interest in the fields of microbiology, evolutionary biology, biophysics, geomicrobiology, and geochemistry. Current genomic data from magnetotactic bacteria (MTB), the only prokaryotes known to be capable of sensing the Earth’s geomagnetic field, suggests an ancient origin of magnetotaxis in the domain Bacteria. Vertical inheritance, followed by multiple independent magnetosome gene cluster loss, is considered to be one of the major forces that drove the evolution of magnetotaxis at or above the class or phylum level, although the evolutionary trajectories at lower taxonomic ranks (e.g., within the class level) remain largely unstudied. Here we report the isolation, cultivation, and sequencing of a novel magnetotactic spirillum belonging to the genus Terasakiella (Terasakiella sp. strain SH-1) within the class Alphaproteobacteria. The complete genome sequence of Terasakiella sp. strain SH-1 revealed an unexpected duplication event of magnetosome genes within the mamAB operon, a group of genes essential for magnetosome biomineralization and magnetotaxis. Intriguingly, further comparative genomic analysis suggests that the duplication of mamAB genes is a common feature in the genomes of alphaproteobacterial MTB. Taken together, with the additional finding that gene duplication appears to have also occurred in some magnetotactic members of the Deltaproteobacteria, our results indicate that gene duplication plays an important role in the evolution of magnetotaxis in the Alphaproteobacteria and perhaps the domain Bacteria. IMPORTANCE A diversity of organisms can sense the geomagnetic field for the purpose of navigation. Magnetotactic bacteria are the most primitive magnetism-sensing organisms known thus far and represent an excellent model system for the study of the origin, evolution, and mechanism of microbial magnetoreception (or magnetotaxis). The present study is the first report focused on magnetosome gene cluster duplication in the Alphaproteobacteria, which suggests the important role of gene duplication in the evolution of magnetotaxis in the Alphaproteobacteria and perhaps the domain Bacteria. A novel scenario for the evolution of magnetotaxis in the Alphaproteobacteria is proposed and may provide new insights into evolution of magnetoreception of higher species.


mSphere ◽  
2019 ◽  
Vol 4 (6) ◽  
Author(s):  
Marian Dominguez-Mirazo ◽  
Rong Jin ◽  
Joshua S. Weitz

ABSTRACT Huanglongbing disease (HLB; yellow shoot disease) is a severe worldwide infectious disease for citrus family plants. The pathogen “Candidatus Liberibacter asiaticus” is an alphaproteobacterium of the Rhizobiaceae family that has been identified as the causative agent of HLB. The virulence of “Ca. Liberibacter asiaticus” has been attributed, in part, to prophage-carried genes. Prophage and prophage-like elements have been identified in 12 of the 15 available “Ca. Liberibacter asiaticus” genomes and are classified into three prophage types. Here, we reexamined all 15 “Ca. Liberibacter asiaticus” genomes using a de novo prediction approach and expanded the number of prophage-like elements from 16 to 33. Further, we found that all of the “Ca. Liberibacter asiaticus” genomes contained at least one prophage-like sequence. Comparative analysis revealed a prevalent, albeit previously unknown, prophage-like sequence type that is a remnant of an integrated prophage. Notably, this remnant prophage is found in the Ishi-1 “Ca. Liberibacter asiaticus” strain that had previously been reported as lacking prophages. Our findings provide both a resource for data and new insights into the evolutionary relationship between phage and “Ca. Liberibacter asiaticus” pathogenicity. IMPORTANCE Huanglongbing (HLB) disease is threatening citrus production worldwide. The causative agent is “Candidatus Liberibacter asiaticus.” Prior work using mapping-based approaches identified prophage-like sequences in some “Ca. Liberibacter asiaticus” genomes but not all. Here, we utilized a de novo approach that expands the number of prophage-like elements found in “Ca. Liberibacter asiaticus” from 16 to 33 and identified at least one prophage-like sequence in all “Ca. Liberibacter asiaticus” strains. Furthermore, we identified a prophage-like sequence type that is a remnant of an integrated prophage, expanding the number of prophage types in “Ca. Liberibacter asiaticus” from 3 to 4. Overall, the findings will help researchers investigate the role of prophage in the ecology, evolution, and pathogenicity of “Ca. Liberibacter asiaticus.”


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Xian-Gui Yi ◽  
Xia-Qing Yu ◽  
Jie Chen ◽  
Min Zhang ◽  
Shao-Wei Liu ◽  
...  

Abstract Cerasus serrulata is a flowering cherry germplasm resource for ornamental purposes. In this work, we present a de novo chromosome-scale genome assembly of C. serrulata by the use of Nanopore and Hi-C sequencing technologies. The assembled C. serrulata genome is 265.40 Mb across 304 contigs and 67 scaffolds, with a contig N50 of 1.56 Mb and a scaffold N50 of 31.12 Mb. It contains 29,094 coding genes, 27,611 (94.90%) of which are annotated in at least one functional database. Synteny analysis indicated that C. serrulata and C. avium have 333 syntenic blocks composed of 14,072 genes. Blocks on chromosome 01 of C. serrulata are distributed on all chromosomes of C. avium, implying that chromosome 01 is the most ancient or active of the chromosomes. The comparative genomic analysis confirmed that C. serrulata has 740 expanded gene families, 1031 contracted gene families, and 228 rapidly evolving gene families. By the use of 656 single-copy orthologs, a phylogenetic tree composed of 10 species was constructed. The present C. serrulata species diverged from Prunus yedoensis ~17.34 million years ago (Mya), while the divergence of C. serrulata and C. avium was estimated to have occurred ∼21.44 Mya. In addition, a total of 148 MADS-box family gene members were identified in C. serrulata, accompanying the loss of the AGL32 subfamily and the expansion of the SVP subfamily. The MYB and WRKY gene families comprising 372 and 66 genes could be divided into seven and eight subfamilies in C. serrulata, respectively, based on clustering analysis. Nine hundred forty-one plant disease-resistance genes (R-genes) were detected by searching C. serrulata within the PRGdb. This research provides high-quality genomic information about C. serrulata as well as insights into the evolutionary history of Cerasus species.


Microbiology ◽  
2014 ◽  
Vol 160 (9) ◽  
pp. 1953-1963
Author(s):  
Nityananda Chowdhury ◽  
Joseph J. Kingston ◽  
W. Brian Whitaker ◽  
Megan R. Carpenter ◽  
Analuisa Cohen ◽  
...  

Heat-shock proteins are molecular chaperones essential for protein folding, degradation and trafficking. The human pathogen Vibrio vulnificus encodes a copy of the groESEL operon in both chromosomes and these genes share <80 % similarity with each other. Comparative genomic analysis was used to determine whether this duplication is prevalent among Vibrionaceae specifically or Gammaproteobacteria in general. Among the Vibrionaceae complete genome sequences in the database (31 species), seven Vibrio species contained a copy of groESEL in each chromosome, including the human pathogens Vibrio cholerae, Vibrio parahaemolyticus and V. vulnificus. Phylogenetic analysis of GroEL among the Gammaproteobacteria indicated that GroESEL-1 encoded in chromosome I was the ancestral copy and GroESEL-2 in chromosome II arose by an ancient gene duplication event. Interestingly, outside of the Vibrionaceae within the Gammaproteobacteria, groESEL chromosomal duplications were rare among the 296 genomes examined; only five additional species contained two or more copies. Examination of the expression pattern of groEL from V. vulnificus cells grown under different conditions revealed differential expression between the copies. The data demonstrate that groEL-1 was more highly expressed during growth in exponential phase than groEL-2 and a similar pattern was also found in both V. cholerae and V. parahaemolyticus. Overall these data suggest that retention of both copies of groESEL in Vibrio species may confer an evolutionary advantage.


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Dmitrii E. Polev ◽  
Iuliia K. Karnaukhova ◽  
Larisa L. Krukovskaya ◽  
Andrei P. Kozlov

Human gene LOC100505644 (uncharacterized LOC100505644 [Homo sapiens]; Entrez Gene ID 100505644) is abundantly expressed in tumors but weakly expressed in few normal tissues. Until now, the function of this gene has remained unknown. Here we identified the chromosomal borders of the transcribed region and the major splice form of the LOC100505644-specific transcript. We characterised the major regulatory motifs of the gene and its splice sites. Analysis of the secondary structure of the major transcript variant revealed a hairpin-like structure characteristic of precursor microRNAs. Comparative genomic analysis of the locus showed that it originated in primates de novo. Taken together, our data indicate that human gene LOC100505644 encodes some non-protein coding RNA, likely a microRNA. It was assigned the gene symbol ELFN1-AS1 (ELFN1 antisense RNA 1 (non-protein coding)). This gene combines features of evolutionary novelty and predominant expression in tumors.


2017 ◽  
Author(s):  
Alex B. Brohammer ◽  
Thomas JY. Kono ◽  
Nathan M. Springer ◽  
Suzanne E. McGaugh ◽  
Candice N. Hirsch

SUMMARY Maize is a diverse paleotetraploid species with widespread presence/absence variation and copy number variation. One mechanism through which presence/absence variation can arise is differential fractionation. Fractionation refers to the loss of duplicate gene pairs from one of the maize subgenomes during diploidization and differential fractionation refers to non-shared gene loss events between individuals. We investigated the prevalence of presence/absence variation resulting from differential fractionation in the syntenic portion of the genome using two whole genome de novo assemblies of the inbred lines B73 and PH207. Between these two genomes, syntenic genes were highly conserved with less than 1% of syntenic genes being subject to differential fractionation. The few variable syntenic genes that were identified are unlikely to contribute to functional phenotypic variation, as there is a significant depletion of these genes in annotated gene sets. In further comparisons of 60 diverse inbred lines, non-syntenic genes were six times more likely to be variable compared to syntenic genes, suggesting that comparisons among additional genome assemblies are not likely to result in the discovery of large-scale presence/absence variation among syntenic genes. SIGNIFICANCE STATEMENT There is a large amount of presence/absence variation for gene content in maize. One mechanism that has been hypothesized to contribute to this variation is differential fractionation between individuals following the maize whole genome duplication event. Using comparative genomics, with sorghum and rice representing the ancestral state, we observed little evidence of differential fractionation among elite inbred lines and the few differentially fractionated genes identified did not appear to confer functional significance.


2014 ◽  
Author(s):  
Rajiv C McCoy ◽  
Ryan W Taylor ◽  
Timothy A Blauwkamp ◽  
Joanna L Kelley ◽  
Michael Kertesz ◽  
...  

High-throughput DNA sequencing technologies have revolutionized genomic analysis, including the de novo assembly of whole genomes. Nevertheless, assembly of complex genomes remains challenging, in part due to the presence of dispersed repeats which introduce ambiguity during genome reconstruction. Transposable elements (TEs) can be particularly problematic, especially for TE families exhibiting high sequence identity, high copy number, or presence in complex genomic arrangements. While TEs strongly affect genome function and evolution, most current de novo assembly approaches cannot resolve long, identical, and abundant families of TEs. Here, we applied a novel Illumina technology called TruSeq synthetic long-reads, which are generated through highly parallel library preparation and local assembly of short read data and achieve lengths of 1.5-18.5 Kbp with an extremely low error rate (∼0.03% per base). To test the utility of this technology, we sequenced and assembled the genome of the model organism Drosophila melanogaster (reference genome strain y; cn, bw, sp), achieving an N50 contig size of 69.7 Kbp and covering 96.9% of the euchromatic chromosome arms of the current reference genome. TruSeq synthetic long-read technology enables placement of individual TE copies in their proper genomic locations as well as accurate reconstruction of TE sequences. We entirely recovered and accurately placed 4,229 (77.8%) of the 5,434 annotated transposable elements with perfect identity to the current reference genome. As TEs are ubiquitous features of genomes of many species, TruSeq synthetic long-reads, and likely other methods that generate long reads, offer a powerful approach to improve de novo assemblies of whole genomes.

