A chromosome-level genome assembly and annotation of the humpback grouper Cromileptes altivelas

2020
Author(s): Yun Sun, Dongdong Zhang, Jianzhi Shi, Guisen Chen, Ying Wu, ...

Abstract Cromileptes altivelas, which belongs to the family Serranidae in the order Perciformes, is widely distributed throughout the tropical waters of the Indo-West Pacific region. Owing to its excellent food quality and abundant nutrients, it has become a popular marine food fish with high market value. Here, we report a chromosome-level genome assembly and annotation of the humpback grouper using more than 103X PacBio long reads and high-throughput chromosome conformation capture (Hi-C) technologies. The contig N50 of the assembly is 4.14 Mb; the final assembly is 1.07 Gb with a scaffold N50 of 44.78 Mb, and 99.24% of the scaffold sequences were anchored into 24 chromosomes. The high-quality genome assembly also showed high gene completeness, with 27,067 protein-coding genes and 3,710 ncRNAs. This highly accurate genome assembly and annotation will not only provide an essential genome resource for C. altivelas breeding and restocking, but will also serve as a key resource for studying fish genomics and genetics.

2019
Vol 6 (1)
Author(s): Xuchen Yang, Minghui Kang, Yanting Yang, Haifeng Xiong, Mingcheng Wang, ...

Abstract The deciduous Chinese tupelo (Nyssa sinensis Oliv.) is a popular ornamental tree, valued for its spectacular autumn leaf color. Here, using single-molecule sequencing and chromosome conformation capture data, we report a high-quality, chromosome-level genome assembly of N. sinensis. PacBio long reads were de novo assembled into 647 polished contigs with a total length of 1,001.42 megabases (Mb) and an N50 size of 3.62 Mb, which is in line with genome sizes estimated using flow cytometry and k-mer analysis. These contigs were further clustered and ordered into 22 pseudo-chromosomes based on Hi-C data, matching the chromosome counts in Nyssa obtained from previous cytological studies. In addition, a total of 664.91 Mb of repetitive elements were identified and a total of 37,884 protein-coding genes were predicted in the genome of N. sinensis. All data were deposited in publicly available repositories and should be a valuable resource for genomics, evolution, and conservation biology.


GigaScience
2020
Vol 9 (1)
Author(s): Boping Tang, Daizhen Zhang, Haorong Li, Senhao Jiang, Huabin Zhang, ...

Abstract Background The swimming crab, Portunus trituberculatus, is an important commercial species in China and is widely distributed in the coastal waters of Asia-Pacific countries. Despite increasing interest in swimming crab research, a high-quality chromosome-level genome is still lacking. Findings Here, we assembled the first chromosome-level reference genome of P. trituberculatus by combining the short reads, Nanopore long reads, and Hi-C data. The genome assembly size was 1.00 Gb with a contig N50 length of 4.12 Mb. In addition, BUSCO assessment indicated that 94.7% of core eukaryotic genes were present in the genome assembly. Approximately 54.52% of the genome was identified as repetitive sequences, with a total of 16,796 annotated protein-coding genes. In addition, we anchored contigs into chromosomes and identified 50 chromosomes with an N50 length of 21.80 Mb by Hi-C technology. Conclusions We anticipate that this chromosome-level assembly of the P. trituberculatus genome will not only promote study of basic development and evolution but also provide important resources for swimming crab reproduction.


GigaScience
2019
Vol 8 (8)
Author(s): Lu Wang, Jinwei Wu, Xiaomei Liu, Dandan Di, Yuhong Liang, ...

Abstract Background The golden snub-nosed monkey (Rhinopithecus roxellana) is an endangered colobine species endemic to China, which has several distinct traits including a unique social structure. Although a genome assembly for R. roxellana is available, it is incomplete and fragmented because it was constructed using short-read sequencing technology. Thus, important information such as genome structural variation and repeat sequences may be absent. Findings To obtain a high-quality chromosomal assembly for R. roxellana qinlingensis, we used 5 methods: Pacific Biosciences single-molecule real-time sequencing, Illumina paired-end sequencing, BioNano optical maps, 10X Genomics linked-reads, and high-throughput chromosome conformation capture. The assembled genome was ∼3.04 Gb, with a contig N50 of 5.72 Mb and a scaffold N50 of 144.56 Mb. This represented a 100-fold improvement over the previously published genome. In the new genome, 22,497 protein-coding genes were predicted, of which 22,053 were functionally annotated. Gene family analysis showed that 993 and 2,745 gene families were expanded and contracted, respectively. The reconstructed phylogeny recovered a close relationship between R. roxellana and Macaca mulatta, and these 2 species diverged ∼13.4 million years ago. Conclusion We constructed a high-quality genome assembly of the Qinling golden snub-nosed monkey; it had superior continuity and accuracy, which might be useful for future genetic studies in this species and as a new standard reference genome for colobine primates. In addition, the updated genome assembly might improve our understanding of this species and could assist conservation efforts.


2021
Vol 13 (2)
Author(s): Linlin Zhao, Shengyong Xu, Zhiqiang Han, Qi Liu, Wensi Ke, ...

Abstract Argyrosomus japonicus is an economically and ecologically important fish species in the family Sciaenidae with a wide distribution in the world’s oceans. Here, we report a high-quality, chromosome-level genome assembly of A. japonicus based on PacBio and Hi-C sequencing technology. A 673.7-Mb genome containing 282 contigs with an N50 length of 18.4 Mb was obtained based on PacBio long reads. These contigs were further ordered and clustered into 24 chromosome groups based on Hi-C data. In addition, a total of 217.2 Mb (32.24% of the assembled genome) of sequences were identified as repeat elements, and 23,730 protein-coding genes were predicted based on multiple approaches. More than 97% of BUSCO genes were identified in the A. japonicus genome. The high-quality genome assembled in this work not only provides a valuable genomic resource for future population genetics, conservation biology and selective breeding studies of A. japonicus but also lays a solid foundation for the study of Sciaenidae evolution.


2021
Vol 99 (Supplement_3)
pp. 241-242
Author(s): Karim Karimi, Duy Ngoc Do, Younes Miar

Abstract Developing genome-enabled selection and gaining new insights into the genetic architecture of economically important traits are essential parts of mink breeding programs. A contiguous genome assembly would underpin fundamental genomic studies in American mink (Neovison vison). Advances in long-read sequencing technologies have provided the opportunity to obtain high-quality, gap-free assemblies for different species. The objective of this study was to generate an accurate genome assembly for American mink using Single Molecule High-Fidelity (HiFi) sequencing. The whole-genome sequences of 100 mink were analyzed to select the most homozygous individual. A black American mink from Millbank Fur Farm (Rockwood, ON, Canada) was selected for PacBio sequencing. A total of 2,884,047 HiFi reads with an average size of 20 kb were generated using three libraries on the PacBio Sequel II System. Three de novo assemblers (wtdbg, Flye and IPA) were used to obtain initial draft assemblies from the long reads. The draft generated using Flye was selected as the final assembly based on contiguity and completeness metrics. The final assembly included 3,529 contigs with an N50 of 18.26 Mb and a largest contig of 62.16 Mb. The length of the genome assembly was 2.66 Gb, with 85 gaps. These results confirmed that high-coverage, accurate long reads significantly improved the American mink genome assembly and successfully generated a more contiguous assembly. Chromosome conformation capture data will be integrated into the current draft to obtain a chromosome-level genome assembly for American mink in the next step of the project.
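
The abstracts in this listing compare assemblies mainly by N50, the standard contiguity metric: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The following is a minimal sketch of that calculation, not the procedure used by any of the listed studies; the function name `n50` and the toy contig lengths are assumptions made purely for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)   # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical toy assembly (lengths in arbitrary units), not real data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half of the 300 total, so this prints 70
```

Because N50 depends only on the length distribution, it says nothing about correctness or gene content, which is why the abstracts also report completeness measures such as BUSCO scores.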


GigaScience
2020
Vol 9 (3)
Author(s): Fanming Meng, Zhuoying Liu, Han Han, Dmitrijs Finkelbergs, Yangshuai Jiang, ...

Abstract Background Blowflies (Diptera: Calliphoridae) are the most commonly found entomological evidence in forensic investigations. Distinguished from other blowflies, Aldrichina grahami has some unique biological characteristics and is a species of forensic importance. Its development rate, pattern, and life cycle can provide valuable information for the estimation of the minimum postmortem interval. Findings Herein we provide a chromosome-level genome assembly of A. grahami that was generated using the Pacific Biosciences sequencing platform and chromosome conformation capture (Hi-C) technology. A total of 50.15 Gb of clean reads of the A. grahami genome were generated. FALCON and Wtdbg were used to construct the genome of A. grahami, resulting in an assembly of 600 Mb and 1,604 contigs with an N50 size of 1.93 Mb. We predicted 12,823 protein-coding genes, 99.8% of which were functionally annotated on the basis of the de novo genome (SRA: PRJNA513084) and transcriptome (SRA: SRX5207346) of A. grahami. Clustering and phylogenetic reconstruction of gene families were performed in a co-analysis with 11 other insect species. Using Hi-C sequencing, a chromosome-level assembly of 6 chromosomes was generated with a scaffold N50 of 104.7 Mb. Of these scaffolds, 96.4% were anchored to the total A. grahami genome contig bases. Conclusions The present study provides a robust genome reference for A. grahami that supplements vital genetic information for nonhuman forensic genomics and facilitates future research on A. grahami and other necrophagous blowfly species used in forensic medicine.


2020
Vol 12 (12)
pp. 2486-2490
Author(s): Bangxing Han, Yi Jing, Jun Dai, Tao Zheng, Fangli Gu, ...

Abstract Dendrobium huoshanense is used to treat various diseases in traditional Chinese medicine. Recent studies have identified active components. However, the lack of genomic data limits research on the biosynthesis and application of these therapeutic ingredients. To address this issue, we generated the first chromosome-level genome assembly and annotation of D. huoshanense. We integrated PacBio sequencing data, Illumina paired-end sequencing data, and Hi-C sequencing data to assemble a 1.285 Gb genome, with contig and scaffold N50 lengths of 598 kb and 71.79 Mb, respectively. We annotated 21,070 protein-coding genes and 0.96 Gb transposable elements, constituting 74.92% of the whole assembly. In addition, we identified 252 genes responsible for polysaccharide biosynthesis by Kyoto Encyclopedia of Genes and Genomes functional annotation. Our data provide a basis for further functional studies, particularly those focused on genes related to glycan biosynthesis and metabolism, and have implications for both conservation and medicine.


2021
Author(s): Shengjun Bai, Hainan Wu, Jinpeng Zhang, Zhiliang Pan, Wei Zhao, ...

Abstract Populus deltoides has important ecological and economic value and is widely used in poplar breeding programs because of superior characteristics such as rapid growth and disease resistance. Although the genome sequence of P. deltoides WV94 is available, the assembly is fragmented. Here, we report an improved chromosome-level assembly of the P. deltoides cultivar I-69 obtained by combining Nanopore sequencing and chromosome conformation capture (Hi-C) technologies. The assembly was 429.3 Mb in size and contained 657 contigs with a contig N50 length of 2.62 Mb. Hi-C scaffolding of the contigs generated 19 chromosome-level sequences, which covered 97.4% (418 Mb) of the total assembly size. Moreover, repetitive sequence annotation showed that 39.28% of the P. deltoides genome was composed of interspersed elements, including retroelements (23.66%), DNA transposons (6.83%), and unclassified elements (8.79%). We also identified a total of 44,362 protein-coding genes in the current P. deltoides assembly. Compared with the previous genome assembly of P. deltoides WV94, the current assembly is substantially improved: the contig N50 increased 3.5-fold and the proportion of gaps decreased from 3.2% to 0.08%. This high-quality, well-annotated genome assembly provides a reliable genomic resource for identifying genome variants among individuals, mining candidate genes that control growth and wood quality traits, and facilitating further application of genomics-assisted breeding in populations related to P. deltoides.


Diversity
2021
Vol 13 (12)
pp. 668
Author(s): Euna Jo, Seung Jae Lee, Jeong-Hoon Kim, Steven J. Parker, Eunkyung Choi, ...

Trematomus species (suborder Notothenioidei; family Nototheniidae) are widely distributed in the southern oceans near Antarctica. There are 11 recognized species in the genus Trematomus, and notothenioids are known to have high chromosomal diversity (2n = 24–58) because of relatively recent and rapid adaptive radiation. Herein, we report the chromosomal-level genome assembly of T. loennbergii, the first characterized genome representative of the genus Trematomus. The final genome assembly of T. loennbergii was obtained using a Pacific Biosciences long-read sequencing platform and high-throughput chromosome conformation capture technology. Twenty-three chromosomal-level scaffolds were assembled to 940 Mb in total size, with a longest contig size of 48.5 Mb and contig N50 length of 24.7 Mb. The genome contained 42.03% repeat sequences, and a total of 24,525 protein-coding genes were annotated. We produced a high-quality genome assembly of T. loennbergii. Our results provide a first reference genome for the genus Trematomus and will serve as a basis for studying the molecular taxonomy and evolution of Antarctic fish.


Toxins
2018
Vol 10 (12)
pp. 488
Author(s): Shiyong Zhang, Jia Li, Qin Qin, Wei Liu, Chao Bian, ...

Naturally derived toxins from animals are good raw materials for drug development. As a representative venomous teleost, Chinese yellow catfish (Pelteobagrus fulvidraco) can provide valuable resources for studies on toxin genes. Its venom glands are located in the pectoral and dorsal fins. Despite these interesting biological traits and its great economic value, Chinese yellow catfish still lacks a sequenced genome. Here, we report a high-quality genome assembly of Chinese yellow catfish using a combination of next-generation Illumina and third-generation PacBio sequencing platforms. The final assembly reached 714 Mb, with a contig N50 of 970 kb and a scaffold N50 of 3.65 Mb. We also annotated 21,562 protein-coding genes, of which 97.59% were assigned at least one functional annotation. Based on the genome sequence, we analyzed toxin genes in Chinese yellow catfish, identifying 207 toxin genes and classifying them into three major groups. Interestingly, we also expanded a previously reported sex-related region (to ≈6 Mb) in the resulting genome assembly and localized two important toxin genes within this region. In summary, we assembled a high-quality genome of Chinese yellow catfish and performed high-throughput identification of toxin genes from a genomic view. The limited number of toxin sequences in public databases will therefore be remarkably improved once multi-omics data from more sequenced species are integrated.

