Chromosome-level genome assembly reveals the unique genome evolution of the swimming crab (Portunus trituberculatus)

GigaScience ◽  
2020 ◽  
Vol 9 (1) ◽  
Author(s):  
Boping Tang ◽  
Daizhen Zhang ◽  
Haorong Li ◽  
Senhao Jiang ◽  
Huabin Zhang ◽  
...  

Abstract Background: The swimming crab, Portunus trituberculatus, is an important commercial species in China and is widely distributed in the coastal waters of Asia-Pacific countries. Despite increasing interest in swimming crab research, a high-quality chromosome-level genome is still lacking. Findings: Here, we assembled the first chromosome-level reference genome of P. trituberculatus by combining short reads, Nanopore long reads, and Hi-C data. The genome assembly size was 1.00 Gb with a contig N50 length of 4.12 Mb. In addition, BUSCO assessment indicated that 94.7% of core eukaryotic genes were present in the genome assembly. Approximately 54.52% of the genome was identified as repetitive sequences, with a total of 16,796 annotated protein-coding genes. In addition, we anchored contigs into chromosomes and identified 50 chromosomes with an N50 length of 21.80 Mb by Hi-C technology. Conclusions: We anticipate that this chromosome-level assembly of the P. trituberculatus genome will not only promote study of basic development and evolution but also provide important resources for swimming crab reproduction.

2021 ◽  
Vol 12 ◽  
Author(s):  
Jielong Zhou ◽  
Peifu Wu ◽  
Zhongping Xiong ◽  
Naiyong Liu ◽  
Ning Zhao ◽  
...  

A high-quality genome is of significant value when seeking to control forest pests such as Dendrolimus kikuchii, a destructive member of the order Lepidoptera that is widespread in China. Herein, a high-quality, chromosome-level reference genome for D. kikuchii based on Nanopore, PacBio HiFi sequencing and the Hi-C capture system is presented. Overall, a final genome assembly of 705.51 Mb with contig and scaffold N50 values of 20.89 and 24.73 Mb, respectively, was obtained. Of these contigs, 95.89% had unique locations on 29 chromosomes. In silico analysis revealed that the genome contained 15,323 protein-coding genes and 63.44% repetitive sequences. Phylogenetic analyses indicated that D. kikuchii may have diverged from the common ancestor of Thaumetopoea pityocampa, Thaumetopoea ni, Heliothis virescens, Hyphantria armigera, Spodoptera frugiperda, and Spodoptera litura approximately 122.05 million years ago. Many gene families were expanded in the D. kikuchii genome, particularly those of the Toll and IMD signaling pathways, which included 10 genes in peptidoglycan recognition protein, 19 genes in MODSP, and 11 genes in Toll. The findings from this study will help to elucidate the mechanisms involved in protection of D. kikuchii against foreign substances and pathogens, and may highlight a potential channel to control this pest.


2021 ◽  
Vol 13 (2) ◽  
Author(s):  
Linlin Zhao ◽  
Shengyong Xu ◽  
Zhiqiang Han ◽  
Qi Liu ◽  
Wensi Ke ◽  
...  

Abstract Argyrosomus japonicus is an economically and ecologically important fish species in the family Sciaenidae with a wide distribution in the world’s oceans. Here, we report a high-quality, chromosome-level genome assembly of A. japonicus based on PacBio and Hi-C sequencing technology. A 673.7-Mb genome containing 282 contigs with an N50 length of 18.4 Mb was obtained based on PacBio long reads. These contigs were further ordered and clustered into 24 chromosome groups based on Hi-C data. In addition, a total of 217.2 Mb (32.24% of the assembled genome) of sequences were identified as repeat elements, and 23,730 protein-coding genes were predicted based on multiple approaches. More than 97% of BUSCO genes were identified in the A. japonicus genome. The high-quality genome assembled in this work not only provides a valuable genomic resource for future population genetics, conservation biology and selective breeding studies of A. japonicus but also lays a solid foundation for the study of Sciaenidae evolution.


2020 ◽  
Author(s):  
Yun Sun ◽  
Dongdong Zhang ◽  
Jianzhi Shi ◽  
Guisen Chen ◽  
Ying Wu ◽  
...  

Abstract Cromileptes altivelas, which belongs to Serranidae in the order Perciformes, is widely distributed throughout the tropical waters of the Indo-West Pacific region. Owing to its excellent food quality and abundant nutrients, it has become a popular marine food fish with high market value. Here, we report a chromosome-level genome assembly and annotation of the humpback grouper genome using more than 103X PacBio long reads and high-throughput chromosome conformation capture (Hi-C) technologies. The contig N50 of the assembly is 4.14 Mbp; the final assembly is 1.07 Gb with a scaffold N50 of 44.78 Mb, and 99.24% of the scaffold sequences were anchored into 24 chromosomes. The high-quality genome assembly also showed high gene completeness, with 27,067 protein-coding genes and 3,710 ncRNAs. This highly accurate genome assembly and annotation will not only provide an essential genome resource for C. altivelas breeding and restocking, but will also serve as a key resource for studying fish genomics and genetics.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Xuchen Yang ◽  
Minghui Kang ◽  
Yanting Yang ◽  
Haifeng Xiong ◽  
Mingcheng Wang ◽  
...  

Abstract The deciduous Chinese tupelo (Nyssa sinensis Oliv.) is a popular ornamental tree for its spectacular autumn leaf color. Here, using single-molecule sequencing and chromosome conformation capture data, we report a high-quality, chromosome-level genome assembly of N. sinensis. PacBio long reads were de novo assembled into 647 polished contigs with a total length of 1,001.42 megabases (Mb) and an N50 size of 3.62 Mb, which is in line with genome sizes estimated using flow cytometry and k-mer analysis. These contigs were further clustered and ordered into 22 pseudo-chromosomes based on Hi-C data, matching the chromosome counts in Nyssa obtained from previous cytological studies. In addition, a total of 664.91 Mb of repetitive elements were identified and a total of 37,884 protein-coding genes were predicted in the genome of N. sinensis. All data were deposited in publicly available repositories, and should be a valuable resource for genomics, evolution, and conservation biology.


2020 ◽  
Vol 12 (12) ◽  
pp. 2486-2490
Author(s):  
Bangxing Han ◽  
Yi Jing ◽  
Jun Dai ◽  
Tao Zheng ◽  
Fangli Gu ◽  
...  

Abstract Dendrobium huoshanense is used to treat various diseases in traditional Chinese medicine. Recent studies have identified active components. However, the lack of genomic data limits research on the biosynthesis and application of these therapeutic ingredients. To address this issue, we generated the first chromosome-level genome assembly and annotation of D. huoshanense. We integrated PacBio sequencing data, Illumina paired-end sequencing data, and Hi-C sequencing data to assemble a 1.285 Gb genome, with contig and scaffold N50 lengths of 598 kb and 71.79 Mb, respectively. We annotated 21,070 protein-coding genes and 0.96 Gb transposable elements, constituting 74.92% of the whole assembly. In addition, we identified 252 genes responsible for polysaccharide biosynthesis by Kyoto Encyclopedia of Genes and Genomes functional annotation. Our data provide a basis for further functional studies, particularly those focused on genes related to glycan biosynthesis and metabolism, and have implications for both conservation and medicine.


2021 ◽  
Author(s):  
Shengjun Bai ◽  
Hainan Wu ◽  
Jinpeng Zhang ◽  
Zhiliang Pan ◽  
Wei Zhao ◽  
...  

Abstract Populus deltoides has important ecological and economic value and is widely used in poplar breeding programs because of superior characteristics such as rapid growth and disease resistance. Although the genome sequence of P. deltoides WV94 is available, the assembly is fragmented. Here, we report an improved chromosome-level assembly of the P. deltoides cultivar I-69 by combining Nanopore sequencing and chromosome conformation capture (Hi-C) technologies. The assembly was 429.3 Mb in size and contained 657 contigs with a contig N50 length of 2.62 Mb. Hi-C scaffolding of the contigs generated 19 chromosome-level sequences, which covered 97.4% (418 Mb) of the total assembly size. Moreover, repetitive sequence annotation showed that 39.28% of the P. deltoides genome was composed of interspersed elements, including retroelements (23.66%), DNA transposons (6.83%), and unclassified elements (8.79%). We also identified a total of 44,362 protein-coding genes in the current P. deltoides assembly. Compared with the previous genome assembly of P. deltoides WV94, the current assembly is significantly improved: the contig N50 increased 3.5-fold and the proportion of gaps decreased from 3.2% to 0.08%. This high-quality, well-annotated genome assembly provides a reliable genomic resource for identifying genome variants among individuals, mining candidate genes that control growth and wood quality traits, and facilitating further application of genomics-assisted breeding in populations related to P. deltoides.


2019 ◽  
Author(s):  
Ryan Bracewell ◽  
Anita Tran ◽  
Kamalakar Chatla ◽  
Doris Bachtrog

Abstract The Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species which represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193 Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromeres, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests a metacentric ancestral X fused to a telocentric Muller D and created the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.


2020 ◽  
Author(s):  
Yichun Xie ◽  
Yiyi Zhong ◽  
Jinhui Chang ◽  
Hoi Shan Kwan

Abstract The homokaryotic Coprinopsis cinerea strain A43mut B43mut pab1-1 #326 is a widely used experimental model for developmental studies in mushroom-forming fungi. It can grow on defined artificial media and complete the whole life cycle within two weeks. The mutations in mating type factors A and B result in the special feature of clamp formation and fruiting without mating. This feature allows investigations and manipulations with a homokaryotic genetic background. The current genome assembly of strain #326 was based on short-read sequencing data and was highly fragmented, leading to bias in gene annotation and downstream analyses. Here, we report a chromosome-level genome assembly of strain #326. Oxford Nanopore Technology (ONT) MinION sequencing was used to obtain long reads. Illumina short reads were used to polish the sequences. A combined assembly yielded 13 chromosomes and a mitochondrial genome as individual scaffolds. The assembly has 15,250 annotated genes with high synteny with the C. cinerea strain Okayama-7 #130. This assembly greatly improves contiguity and annotation. It is a suitable reference for further genomic studies, especially for genetic, genomic and transcriptomic analyses using ONT long reads. Single nucleotide variants and structural variants in six mutagenized and cisplatin-screened mutants could be identified and validated. A 66 bp deletion in Ras GTPase-activating protein (RasGAP) was found in all mutants. To make better use of the ONT sequencing platform, we modified a high-molecular-weight genomic DNA isolation protocol based on magnetic beads for filamentous fungi. This study showed the use of MinION to construct a fungal reference genome and to perform downstream studies in an individual laboratory. An experimental workflow was proposed, from DNA isolation and whole genome sequencing, to genome assembly and variant calling. Our results provided solutions and parameters for fungal genomic analysis on the MinION sequencing platform.
Highlights: A chromosome-level genome assembly of C. cinerea #326; a fast and efficient high-molecular-weight fungal genomic DNA isolation protocol; structural variant and single nucleotide variant calling using Nanopore reads; a series of solutions and reference parameters for fungal genomic analysis on MinION.


2021 ◽  
Author(s):  
Chi Yang ◽  
Lu Ma ◽  
Donglai Xiao ◽  
Xiaoyu Liu ◽  
Xiaoling Jiang ◽  
...  

Sparassis latifolia is a valuable edible mushroom cultivated in China. In 2018, our research group reported an incomplete, low-quality genome of S. latifolia obtained by Illumina HiSeq 2500 sequencing. The limitations of that genome have constrained genetic and genomic studies of this mushroom resource. Herein, an updated draft genome sequence of S. latifolia was generated by Oxford Nanopore sequencing and the Hi-C technique. A total of 8.24 Gb of Oxford Nanopore long reads representing ~198.08X coverage of the S. latifolia genome were generated. Subsequently, a high-quality genome of 41.41 Mb, with scaffold and contig N50 sizes of 3.31 Mb and 1.51 Mb, respectively, was assembled. Hi-C scaffolding of the genome resulted in 12 pseudochromosomes containing 93.56% of the bases in the assembled genome. Genome annotation further revealed that 17.47% of the genome was composed of repetitive sequences. In addition, 13,103 protein-coding genes were predicted, among which 98.72% were functionally annotated. BUSCO assessment further revealed 92.07% complete BUSCOs. The improved chromosome-scale assembly and genome features described here will aid further molecular elucidation of various traits, breeding of S. latifolia, and evolutionary studies with related taxa.


2020 ◽  
Vol 10 (3) ◽  
pp. 891-897 ◽  
Author(s):  
Ryan Bracewell ◽  
Anita Tran ◽  
Kamalakar Chatla ◽  
Doris Bachtrog

The Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species which represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193 Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromeres, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests a metacentric ancestral X fused to a telocentric Muller D and created the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.

