Chromosome level genome assembly and annotation of highly invasive Japanese stiltgrass (Microstegium vimineum)

2021 ◽  
Author(s):  
Dhanushya Ramachandran ◽  
Cynthia D Huebner ◽  
Mark Daly ◽  
Jasmine Haimovitz ◽  
Thomas Swale ◽  
...  

The invasive Japanese stiltgrass (Microstegium vimineum) affects a wide range of ecosystems and threatens biodiversity across the eastern USA. However, the mechanisms underlying rapid adaptation, plasticity, and epigenetics in the invasive range are largely unknown. We present a chromosome-level assembly for M. vimineum to investigate genome dynamics, evolution, adaptation, and the genomics of phenotypic plasticity. We generated a 1.12 Gb genome with a scaffold N50 length of 53.44 Mb, taking a de novo assembly approach that combined PacBio and Dovetail Genomics Omni-C sequencing. The assembly contains 23 pseudochromosomes, representing 99.96% of the genome. BUSCO assessment indicated that 80.3% of Poales gene groups are present in the assembly. The genome is predicted to contain 39,604 protein-coding genes, of which 26,288 are functionally annotated. Furthermore, 66.68% of the genome is repetitive, with unclassified repeats (35.63%) and long terminal repeat (LTR) retrotransposons (26.90%) predominant. Similar to other grasses, Gypsy (41.07%) and Copia (32%) are the most abundant LTR-retrotransposon families. The majority of LTR-retrotransposons are derived from a significant expansion in the past 1-2 million years, suggesting the presence of relatively young LTR-retrotransposon lineages. We find corroborating evidence from Ks plots for a stiltgrass-specific duplication event, distinct from the more ancient grass-specific duplication event. The assembly and annotation of M. vimineum will serve as an essential genomic resource facilitating studies of the invasion process and the history and consequences of polyploidy in grasses, and will provide a crucial tool for natural resource managers.

2019 ◽  
Author(s):  
Ryan Bracewell ◽  
Anita Tran ◽  
Kamalakar Chatla ◽  
Doris Bachtrog

The Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species which represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193 Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromeres, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests a metacentric ancestral X fused to a telocentric Muller D and created the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Baohua Chen ◽  
Zhixiong Zhou ◽  
Qiaozhen Ke ◽  
Yidi Wu ◽  
Huaqiang Bai ◽  
...  

Larimichthys crocea is an endemic marine fish in East Asia that belongs to Sciaenidae in Perciformes. L. crocea is now recognized as an “iconic” marine fish species in China because it is not only a popular food fish but also a representative victim of overfishing that still provides high-value fish products, supported by the modern large-scale mariculture industry. Here, we report a chromosome-level reference genome of L. crocea generated by employing the PacBio single molecule sequencing technique (SMRT) and high-throughput chromosome conformation capture (Hi-C) technologies. The genome sequences were assembled into 1,591 contigs with a total length of 723.86 Mb and a contig N50 length of 2.83 Mb. After chromosome-level scaffolding, 24 scaffolds were constructed with a total length of 668.67 Mb (92.48% of the total length). Genome annotation identified 23,657 protein-coding genes and 7,262 ncRNAs. This highly accurate, chromosome-level reference genome of L. crocea provides an essential genome resource to support the development of genome-scale selective breeding and restocking strategies of L. crocea.
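
For readers unfamiliar with the contiguity statistic quoted throughout these abstracts: N50 is the length L such that sequences of length L or longer together cover at least half of the total assembly length. The following is a minimal Python sketch of that definition. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from any of the listed papers or their pipelines.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Hypothetical helper for illustration: sorts lengths from longest to
    shortest and returns the length at which the running sum first
    reaches half of the total.
    """
    if not lengths:
        raise ValueError("n50() needs at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example (made-up lengths, arbitrary units): the total is 32,
# and the two longest sequences (8 + 8 = 16) already cover half of it.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))  # 8
```

Because N50 depends only on the length distribution, it says nothing about assembly correctness, which is why the abstracts also report measures such as BUSCO completeness or the fraction of sequence anchored to chromosomes.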


2020 ◽  
Vol 10 (3) ◽  
pp. 891-897 ◽  
Author(s):  
Ryan Bracewell ◽  
Anita Tran ◽  
Kamalakar Chatla ◽  
Doris Bachtrog

The Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species which represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193 Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromeres, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests a metacentric ancestral X fused to a telocentric Muller D and created the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.


GigaScience ◽  
2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Shubo Jin ◽  
Chao Bian ◽  
Sufei Jiang ◽  
Kai Han ◽  
Yiwei Xiong ◽  
...  

Background: The oriental river prawn, Macrobrachium nipponense, is an economically important shrimp in China. Male prawns have higher commercial value than females because the former grow faster and reach larger sizes. It is therefore important to reveal sex-differentiation and development mechanisms of the oriental river prawn to enable genetic improvement. Results: We sequenced 293.3 Gb of raw Illumina short reads and 405.7 Gb of Pacific Biosciences long reads. The final whole-genome assembly of the oriental river prawn was ∼4.5 Gb in size, with predictions of 44,086 protein-coding genes. A total of 49 chromosomes were determined, with an anchor ratio of 94.7% and a scaffold N50 of 86.8 Mb. A whole-genome duplication event was deduced to have happened 109.8 million years ago. By integration of genome and transcriptome data, 21 genes were predicted as sex-related candidate genes. Conclusion: The first high-quality chromosome-level genome assembly of the oriental river prawn was obtained. These genomic data, along with transcriptome sequences, are essential for understanding sex-differentiation and development mechanisms in the oriental river prawn, as well as providing genetic resources for in-depth studies on developmental and evolutionary biology in arthropods.


2012 ◽  
Vol 2012 ◽  
pp. 1-17 ◽  
Author(s):  
Anton Novikov ◽  
Georgiy Smyshlyaev ◽  
Olga Novikova

Chromodomain-containing LTR retrotransposons are one of the most successful groups of mobile elements in plant genomes. Previously, we demonstrated that two types of chromodomains (CHDs) are carried by plant LTR retrotransposons. Chromodomains from group I (CHD_I) were detected only in Tcn1-like LTR retrotransposons from nonseed plants such as mosses (including the model moss species Physcomitrella) and lycophytes (the Selaginella species). LTR retrotransposon chromodomains from group II (CHD_II) have been described from a wide range of higher plants. In the present study, we performed computer-based mining of plant LTR retrotransposon CHDs from diverse plants with an emphasis on the spike-moss Selaginella. Our extended comparative and phylogenetic analysis demonstrated that both types of CHDs are present together only in the Selaginella genome, which puts this species in a unique position among plants. It appears that a transition from CHD_I to CHD_II and further diversification occurred in the evolutionary history of plant LTR retrotransposons approximately 400 MYA and most probably was associated with the evolution of chromatin organization.


GigaScience ◽  
2020 ◽  
Vol 9 (8) ◽  
Author(s):  
Zhou Hong ◽  
Jiang Li ◽  
Xiaojin Liu ◽  
Jinmin Lian ◽  
Ningnan Zhang ◽  
...  

Background: Dalbergia odorifera T. Chen (Fabaceae) is an International Union for Conservation of Nature red-listed tree. This tree is of high medicinal and commercial value owing to its officinal, insect-proof, durable heartwood. However, the lack of a reference genome has hindered studies of heartwood formation. Findings: We present the first chromosome-scale genome assembly of D. odorifera obtained on the basis of Illumina paired-end sequencing, Pacific Biosciences single-molecule real-time sequencing, 10x Genomics linked reads, and Hi-C technology. We assembled 97.68% of the 653.45 Mb D. odorifera genome with scaffold N50 and contig sizes of 56.16 and 5.92 Mb, respectively. Ten super-scaffolds corresponding to the 10 chromosomes were assembled, with the longest scaffold reaching 79.61 Mb. Repetitive elements account for 54.17% of the genome, and 30,310 protein-coding genes were predicted from the genome, of which ∼92.6% were functionally annotated. The phylogenetic tree showed that D. odorifera diverged from the ancestor of Arabidopsis thaliana and Populus trichocarpa and then separated from Glycine max and Cajanus cajan. Conclusions: We report the first chromosome-level de novo genome of D. odorifera. This work provides valuable genomic resources for research on heartwood formation in D. odorifera and other timber trees. The high-quality assembled genome can also be used as a reference for comparative genomics analysis and future population genetic studies of D. odorifera.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Xuchen Yang ◽  
Minghui Kang ◽  
Yanting Yang ◽  
Haifeng Xiong ◽  
Mingcheng Wang ◽  
...  

The deciduous Chinese tupelo (Nyssa sinensis Oliv.) is a popular ornamental tree for its spectacular autumn leaf color. Here, using single-molecule sequencing and chromosome conformation capture data, we report a high-quality, chromosome-level genome assembly of N. sinensis. PacBio long reads were de novo assembled into 647 polished contigs with a total length of 1,001.42 megabases (Mb) and an N50 size of 3.62 Mb, which is in line with genome sizes estimated using flow cytometry and k-mer analysis. These contigs were further clustered and ordered into 22 pseudo-chromosomes based on Hi-C data, matching the chromosome counts in Nyssa obtained from previous cytological studies. In addition, a total of 664.91 Mb of repetitive elements were identified and a total of 37,884 protein-coding genes were predicted in the genome of N. sinensis. All data were deposited in publicly available repositories, and should be a valuable resource for genomics, evolution, and conservation biology.


2017 ◽  
Author(s):  
Steven T. Hill ◽  
Rachael Kuintzle ◽  
Amy Teegarden ◽  
Erich Merrill ◽  
Padideh Danaee ◽  
...  

The current deluge of newly identified RNA transcripts presents a singular opportunity for improved assessment of coding potential, a cornerstone of genome annotation, and for machine-driven discovery of biological knowledge. While traditional, feature-based methods for RNA classification are limited by current scientific knowledge, deep learning methods can independently discover complex biological rules in the data de novo. We trained a gated recurrent neural network (RNN) on human messenger RNA (mRNA) and long noncoding RNA (lncRNA) sequences. Our model, mRNA RNN (mRNN), surpasses state-of-the-art methods at predicting protein-coding potential. To understand what mRNN learned, we probed the network and uncovered several context-sensitive codons highly predictive of coding potential. Our results suggest that gated RNNs can learn complex and long-range patterns in full-length human transcripts, making them ideal for performing a wide range of difficult classification tasks and, most importantly, for harvesting new biological insights from the rising flood of sequencing data.


GigaScience ◽  
2020 ◽  
Vol 9 (3) ◽  
Author(s):  
Fanming Meng ◽  
Zhuoying Liu ◽  
Han Han ◽  
Dmitrijs Finkelbergs ◽  
Yangshuai Jiang ◽  
...  

Background: Blowflies (Diptera: Calliphoridae) are the most commonly found entomological evidence in forensic investigations. Distinguished from other blowflies, Aldrichina grahami has some unique biological characteristics and is a species of forensic importance. Its development rate, pattern, and life cycle can provide valuable information for the estimation of the minimum postmortem interval. Findings: Herein we provide a chromosome-level genome assembly of A. grahami that was generated using the Pacific Biosciences sequencing platform and chromosome conformation capture (Hi-C) technology. A total of 50.15 Gb of clean reads of the A. grahami genome were generated. FALCON and Wtdbg were used to construct the genome of A. grahami, resulting in an assembly of 600 Mb and 1,604 contigs with an N50 size of 1.93 Mb. We predicted 12,823 protein-coding genes, 99.8% of which were functionally annotated on the basis of the de novo genome (SRA: PRJNA513084) and transcriptome (SRA: SRX5207346) of A. grahami. Clustering and phylogenetic reconstruction of gene families were performed in a co-analysis with 11 other insect species. Using Hi-C sequencing, a chromosome-level assembly of 6 chromosomes was generated with scaffold N50 of 104.7 Mb. Of these scaffolds, 96.4% were anchored to the total A. grahami genome contig bases. Conclusions: The present study provides a robust genome reference for A. grahami that supplements vital genetic information for nonhuman forensic genomics and facilitates the future research of A. grahami and other necrophagous blowfly species used in forensic medicine.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Da-xia Chen ◽  
Yuan Pan ◽  
Yu Wang ◽  
Yan-Ze Cui ◽  
Ying-Jun Zhang ◽  
...  

Coptis chinensis Franch, a perennial herb, is mainly distributed in southeastern China. The rhizome of C. chinensis has been used as a traditional medicine for more than 2000 years in China and many other Asian countries. The pharmacological activities of C. chinensis have been validated by research. Here, we present a de novo high-quality genome of C. chinensis with a chromosome-level genome of ~958.20 Mb, a contig N50 of 1.58 Mb, and a scaffold N50 of 4.53 Mb. We found that the relatively large genome size of C. chinensis was caused by the amplification of long terminal repeat (LTR) retrotransposons. In addition, a whole-genome duplication event in ancestral Ranunculales was discovered. Comparative genomic analysis revealed that the tyrosine decarboxylase (TYDC) and (S)-norcoclaurine synthase (NCS) genes were expanded and that the aspartate aminotransferase gene (ASP5) was positively selected in the berberine metabolic pathway. Expression level and HPLC analyses showed that the berberine content was highest in the roots of C. chinensis in the third and fourth years. The chromosome-level reference genome of C. chinensis provides important genomic data for molecular-assisted breeding and active ingredient biosynthesis.

