A chromosome-level genome assembly of the Chinese tupelo Nyssa sinensis

2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Xuchen Yang ◽  
Minghui Kang ◽  
Yanting Yang ◽  
Haifeng Xiong ◽  
Mingcheng Wang ◽  
...  

Abstract The deciduous Chinese tupelo (Nyssa sinensis Oliv.) is a popular ornamental tree prized for its spectacular autumn leaf color. Here, using single-molecule sequencing and chromosome conformation capture data, we report a high-quality, chromosome-level genome assembly of N. sinensis. PacBio long reads were de novo assembled into 647 polished contigs with a total length of 1,001.42 megabases (Mb) and an N50 size of 3.62 Mb, which is in line with genome sizes estimated using flow cytometry and k-mer analysis. These contigs were further clustered and ordered into 22 pseudo-chromosomes based on Hi-C data, matching the chromosome counts in Nyssa obtained from previous cytological studies. In addition, a total of 664.91 Mb of repetitive elements were identified and a total of 37,884 protein-coding genes were predicted in the genome of N. sinensis. All data were deposited in publicly available repositories, and should be a valuable resource for genomics, evolution, and conservation biology.

2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Zhixiong Zhou ◽  
Bo Liu ◽  
Baohua Chen ◽  
Yue Shi ◽  
Fei Pu ◽  
...  

Abstract Takifugu bimaculatus is a teleost species native to the southeast coast of China, where it has been cultivated as an important edible fish over the last decade. Genetic breeding programs, which have been recently initiated for improving the aquaculture performance of T. bimaculatus, urgently require a high-quality reference genome to facilitate genomic selection and related genetic studies. To address this need, we produced a chromosome-level reference genome of T. bimaculatus using the PacBio single molecule sequencing technique (SMRT) and high-throughput chromosome conformation capture (Hi-C) technologies. The genome was assembled into 2,193 contigs with a total length of 404.21 Mb and a contig N50 length of 1.31 Mb. After chromosome-level scaffolding, 22 chromosomes with a total length of 371.68 Mb were constructed. Moreover, a total of 21,117 protein-coding genes and 3,471 ncRNAs were annotated in the reference genome. The highly accurate, chromosome-level reference genome of T. bimaculatus provides an essential genome resource for not only the genome-scale selective breeding of T. bimaculatus but also the exploration of the evolutionary basis of the speciation and local adaptation of the Takifugu genus.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Baohua Chen ◽  
Zhixiong Zhou ◽  
Qiaozhen Ke ◽  
Yidi Wu ◽  
Huaqiang Bai ◽  
...  

Abstract Larimichthys crocea is an endemic marine fish in East Asia that belongs to Sciaenidae in Perciformes. L. crocea has now been recognized as an “iconic” marine fish species in China because it is not only a popular food fish in China but also a representative victim of overfishing that still provides high-value fish products supported by the modern large-scale mariculture industry. Here, we report a chromosome-level reference genome of L. crocea generated by employing the PacBio single molecule sequencing technique (SMRT) and high-throughput chromosome conformation capture (Hi-C) technologies. The genome sequences were assembled into 1,591 contigs with a total length of 723.86 Mb and a contig N50 length of 2.83 Mb. After chromosome-level scaffolding, 24 scaffolds were constructed with a total length of 668.67 Mb (92.48% of the total length). Genome annotation identified 23,657 protein-coding genes and 7,262 ncRNAs. This highly accurate, chromosome-level reference genome of L. crocea provides an essential genome resource to support the development of genome-scale selective breeding and restocking strategies of L. crocea.


2020 ◽  
Author(s):  
Yun Sun ◽  
Dongdong Zhang ◽  
Jianzhi Shi ◽  
Guisen Chen ◽  
Ying Wu ◽  
...  

Abstract Cromileptes altivelas, which belongs to Serranidae in the order Perciformes, is widely distributed throughout the tropical waters of the Indo-West Pacific regions. Due to its excellent food quality and abundant nutrients, it has become a popular marine food fish with high market value. Here, we report a chromosome-level genome assembly and annotation of the humpback grouper genome using more than 103X PacBio long reads and high-throughput chromosome conformation capture (Hi-C) technologies. The contig N50 length of the assembly is 4.14 Mbp, the final assembly is 1.07 Gb with a scaffold N50 of 44.78 Mb, and 99.24% of the scaffold sequences were anchored into 24 chromosomes. The high-quality genome assembly also showed high gene completeness, with 27,067 protein-coding genes and 3,710 ncRNAs. This highly accurate genome assembly and annotation will not only provide an essential genome resource for C. altivelas breeding and restocking, but will also serve as a key resource for studying fish genomics and genetics.


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 241-242
Author(s):  
Karim Karimi ◽  
Duy Ngoc Do ◽  
Younes Miar

Abstract Developing genome-enabled selection and gaining new insights into the genetic architecture of economically important traits are essential parts of mink breeding programs. The availability of a contiguous genome assembly would underpin fundamental genomic studies in American mink (Neovison vison). Advances in long-read sequencing technologies have provided the opportunity to obtain high-quality, gap-free assemblies for different species. The objective of this study was to generate an accurate genome assembly for American mink using Single Molecule High-Fidelity (HiFi) Sequencing. The whole genome sequences of 100 mink were analyzed to select the most homozygous individual. A black American mink from Millbank Fur Farm (Rockwood, ON, Canada) was selected for PacBio sequencing. A total of 2,884,047 HiFi reads with an average size of 20 kb were generated using three libraries on the PacBio Sequel II System. Three de novo assemblers (wtdbg, Flye, and IPA) were used to obtain the initial draft assemblies from the long reads. The draft generated using Flye was selected as the final assembly based on the metrics of contiguity and completeness. The final assembly included 3,529 contigs with an N50 of 18.26 Mb and a largest contig of 62.16 Mb. The length of the genome assembly was 2.66 Gb with 85 gaps. These results confirmed that high-coverage, accurate long reads significantly improved the American mink genome assembly and successfully generated a more contiguous assembly. Chromosome conformation capture data will be integrated into the current draft to obtain a chromosome-level genome assembly for American mink in the next step of the project.


GigaScience ◽  
2020 ◽  
Vol 9 (3) ◽  
Author(s):  
Fanming Meng ◽  
Zhuoying Liu ◽  
Han Han ◽  
Dmitrijs Finkelbergs ◽  
Yangshuai Jiang ◽  
...  

Abstract Background Blowflies (Diptera: Calliphoridae) are the most commonly found entomological evidence in forensic investigations. Distinguished from other blowflies, Aldrichina grahami has some unique biological characteristics and is a species of forensic importance. Its development rate, pattern, and life cycle can provide valuable information for the estimation of the minimum postmortem interval. Findings Herein we provide a chromosome-level genome assembly of A. grahami that was generated using the Pacific Biosciences sequencing platform and chromosome conformation capture (Hi-C) technology. A total of 50.15 Gb of clean reads of the A. grahami genome were generated. FALCON and Wtdbg were used to construct the genome of A. grahami, resulting in an assembly of 600 Mb and 1,604 contigs with an N50 size of 1.93 Mb. We predicted 12,823 protein-coding genes, 99.8% of which were functionally annotated on the basis of the de novo genome (SRA: PRJNA513084) and transcriptome (SRA: SRX5207346) of A. grahami. According to the co-analysis with 11 other insect species, clustering and phylogenetic reconstruction of gene families were performed. Using Hi-C sequencing, a chromosome-level assembly of 6 chromosomes was generated with a scaffold N50 of 104.7 Mb. Of these scaffolds, 96.4% were anchored to the total A. grahami genome contig bases. Conclusions The present study provides a robust genome reference for A. grahami that supplements vital genetic information for nonhuman forensic genomics and facilitates future research on A. grahami and other necrophagous blowfly species used in forensic medicine.


2020 ◽  
Author(s):  
Shangang Jia ◽  
Guoliang Wang ◽  
Guiming Liu ◽  
Jiangyong Qu ◽  
Beilun Zhao ◽  
...  

ABSTRACT The red alga Kappaphycus alvarezii is the most important aquaculture species in Kappaphycus, is widely distributed in tropical waters, and has become the main crop for carrageenan production. Its mechanisms of adaptation to high-temperature, high-salinity environments and its carbohydrate metabolism may provide important inspiration for the study of marine algae. Background knowledge such as genomic data will also be essential to improve the disease resistance and production traits of K. alvarezii. 43.28 Gb of short paired-end reads and 18.52 Gb of single-molecule long reads of K. alvarezii were generated on the Illumina HiSeq and PacBio RSII platforms, respectively. The de novo genome assembly was performed using Falcon_unzip and Canu software, and then improved with Pilon. The final assembled genome (336 Mb) consists of 888 scaffolds with a contig N50 of 849 Kb. Further annotation analyses predicted 21,422 protein-coding genes, with 61.28% functionally annotated. Here we report the draft genome and annotations of K. alvarezii, which are valuable resources for future genomic and genetic studies in Kappaphycus and other algae.


2019 ◽  
Author(s):  
Ryan Bracewell ◽  
Anita Tran ◽  
Kamalakar Chatla ◽  
Doris Bachtrog

ABSTRACT The Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species which represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193 Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromeres, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests a metacentric ancestral X fused to a telocentric Muller D and created the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.


2020 ◽  
Vol 10 (3) ◽  
pp. 891-897 ◽  
Author(s):  
Ryan Bracewell ◽  
Anita Tran ◽  
Kamalakar Chatla ◽  
Doris Bachtrog

The Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species which represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193 Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromeres, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests a metacentric ancestral X fused to a telocentric Muller D and created the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.


GigaScience ◽  
2020 ◽  
Vol 9 (3) ◽  
Author(s):  
Benjamin D Rosen ◽  
Derek M Bickhart ◽  
Robert D Schnabel ◽  
Sergey Koren ◽  
Christine G Elsik ◽  
...  

Abstract Background Major advances in selection progress for cattle have been made following the introduction of genomic tools over the past 10–12 years. These tools depend upon the Bos taurus reference genome (UMD3.1.1), which was created using now-outdated technologies and is hindered by a variety of deficiencies and inaccuracies. Results We present the new reference genome for cattle, ARS-UCD1.2, based on the same animal as the original to facilitate transfer and interpretation of results obtained from the earlier version, but applying a combination of modern technologies in a de novo assembly to increase continuity, accuracy, and completeness. The assembly includes 2.7 Gb and is >250× more continuous than the original assembly, with contig N50 >25 Mb and L50 of 32. We also greatly expanded supporting RNA-based data for annotation that identifies 30,396 total genes (21,039 protein coding). The new reference assembly is accessible in annotated form for public use. Conclusions We demonstrate that improved continuity of assembled sequence warrants the adoption of ARS-UCD1.2 as the new cattle reference genome and that increased assembly accuracy will benefit future research on this species.
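The contig N50 and L50 values quoted in these abstracts follow a standard definition: sort contigs from longest to shortest and accumulate lengths until at least half of the total assembly length is covered; N50 is the length of the contig that crosses that threshold, and L50 is the number of contigs needed to reach it. A minimal Python sketch of that calculation follows. The function name `n50_l50` and the toy contig lengths are illustrative assumptions, not taken from any of the listed studies.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths.

    Hypothetical helper for illustration; not from any listed paper.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half = sum(ordered) / 2                  # half of the total assembly length
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            # 'length' is the N50; 'count' contigs were needed, so it is the L50
            return length, count


# Toy assembly (made-up lengths): total 300, so the threshold is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_contigs)
print(n50, l50)  # 70 2
```

A larger N50 together with a smaller L50 indicates that fewer, longer contigs cover half of the assembly, which is the sense in which the abstracts describe an assembly as more continuous.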


GigaScience ◽  
2020 ◽  
Vol 9 (10) ◽  
Author(s):  
Yan Li ◽  
Guangliang Gao ◽  
Yu Lin ◽  
Silu Hu ◽  
Yi Luo ◽  
...  

ABSTRACT Background The domestic goose is an economically important and scientifically valuable waterfowl; however, a lack of high-quality genomic data has hindered research concerning its genome, genetics, and breeding. As domestic goose breeds derive from both the swan goose (Anser cygnoides) and the graylag goose (Anser anser), we selected a female Tianfu goose for genome sequencing. We generated a chromosome-level goose genome assembly by adopting a hybrid de novo assembly approach that combined Pacific Biosciences single-molecule real-time sequencing, high-throughput chromatin conformation capture mapping, and Illumina short-read sequencing. Findings We generated a 1.11-Gb goose genome with contig and scaffold N50 values of 1.85 and 33.12 Mb, respectively. The assembly contains 39 pseudo-chromosomes (2n = 78) accounting for ∼88.36% of the goose genome. Compared with previous goose assemblies, our assembly has greater continuity, completeness, and accuracy; the annotation of core eukaryotic genes and universal single-copy orthologs has also been improved. We have identified 17,568 protein-coding genes and a repeat content of 8.67% (96.57 Mb) in this genome assembly. We also explored the spatial organization of chromatin and gene expression in goose liver tissues, in terms of inter-pseudo-chromosomal interaction patterns, compartments, topologically associating domains, and promoter-enhancer interactions. Conclusions We present the first chromosome-level assembly of the goose genome. This will be a valuable resource for future genetic and genomic studies on geese.

