The First Draft Genome Assembly of Snow Sheep (Ovis nivicola)

2020 ◽  
Vol 12 (8) ◽  
pp. 1330-1336 ◽  
Author(s):  
Maulik Upadhyay ◽  
Andreas Hauser ◽  
Elisabeth Kunz ◽  
Stefan Krebs ◽  
Helmut Blum ◽  
...  

Abstract The snow sheep (Ovis nivicola), which is endemic to the mountain ranges of northeastern Siberia, is well adapted to the harsh, cold climatic conditions of its habitat. In this study, whole-genome sequencing, assembly, and gene annotation of a snow sheep were carried out using long reads from Nanopore sequencing technology. Additionally, RNA-seq reads from several tissues were generated to supplement gene prediction in the snow sheep genome. The assembled genome was ∼2.62 Gb in length and was represented by 7,157 scaffolds with an N50 of about 2 Mb. Repetitive sequences comprised 41% of the total genome. BUSCO analysis revealed that the snow sheep assembly contained full-length or partial fragments of 97% of mammalian universal single-copy orthologs (n = 4,104), illustrating the completeness of the assembly. In addition, a total of 20,045 protein-coding sequences were identified using a comprehensive gene prediction pipeline, of which 19,240 (∼96%) were annotated using protein databases. Moreover, homology-based searches and de novo identification detected 1,484 tRNAs; 243 rRNAs; 1,931 snRNAs; and 782 miRNAs in the snow sheep genome. To conclude, we generated the first de novo genome of the snow sheep using long reads; these data are expected to contribute significantly to our understanding of evolution and adaptation within the Ovis genus.
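The N50 statistic quoted above is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The following is a minimal sketch of that standard definition, not the authors' pipeline; the function names `n50` and `l50` are hypothetical, and the input is assumed to be a plain list of scaffold or contig lengths.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths (standard definition)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half is the N50.
        if running >= half_total:
            return length


def l50(lengths):
    """Return the L50: how many of the longest sequences reach half the total."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return count


# Toy example with made-up lengths (not data from any assembly listed here).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths), l50(toy_lengths))
```

In the toy example the total is 32, so the two longest sequences (8 and 8) already reach half of it; a larger N50 with a smaller L50 therefore indicates a more contiguous assembly.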

2018 ◽  
Vol 6 (16) ◽  
pp. e00265-18 ◽  
Author(s):  
Stewart T. G. Burgess ◽  
Kathryn Bartley ◽  
Edward J. Marr ◽  
Harry W. Wright ◽  
Robert J. Weaver ◽  
...  

ABSTRACT Sheep scab, caused by infestation with Psoroptes ovis, is highly contagious, results in intense pruritus, and represents a major welfare and economic concern. Here, we report the first draft genome assembly and gene prediction of P. ovis based on PacBio de novo sequencing. The ∼63.2-Mb genome encodes 12,041 protein-coding genes.


Genes ◽  
2019 ◽  
Vol 10 (9) ◽  
pp. 708 ◽  
Author(s):  
Julien Alban Nguinkal ◽  
Ronald Marco Brunner ◽  
Marieke Verleih ◽  
Alexander Rebl ◽  
Lidia de los Ríos-Pérez ◽  
...  

The pikeperch (Sander lucioperca) is a fresh and brackish water Percid fish natively inhabiting the northern hemisphere. This species is emerging as a promising candidate for intensive aquaculture production in Europe. Specific traits like cannibalism, growth rate and meat quality require a genomics-based understanding for an optimal husbandry and domestication process. Still, the aquaculture community is lacking an annotated genome sequence to facilitate genome-wide studies on pikeperch. Here, we report the first highly contiguous draft genome assembly of Sander lucioperca. In total, 413 and 66 gigabase pairs of raw DNA sequencing data were generated with the Illumina platform and PacBio Sequel System, respectively. The PacBio data were assembled into a final assembly size of ~900 Mb covering 89% of the 1,014 Mb estimated genome size. The draft genome consisted of 1,966 contigs ordered into 1,313 scaffolds. The contig and scaffold N50 lengths are 3.0 Mb and 4.9 Mb, respectively. The identified repetitive structures accounted for 39% of the genome. We utilized homologies to other ray-finned fishes and ab initio gene prediction methods to predict 21,249 protein-coding genes in the Sander lucioperca genome, of which 88% were functionally annotated by either sequence homology or a search of protein domains and signatures. The assembled genome spans 97.6% and 96.3% of Vertebrate and Actinopterygii single-copy orthologs, respectively. The outstanding mapping rate (99.9%) of genomic PE-reads on the assembly suggests an accurate and nearly complete genome reconstruction. This draft genome sequence is the first genomic resource for this promising aquaculture species. It will provide an impetus for genomic-based breeding studies targeting phenotypic and performance traits of captive pikeperch.


2019 ◽  
Vol 12 (1) ◽  
Author(s):  
Ruby Dhar ◽  
Karthikeyan Pethusamy ◽  
Sunil Singh ◽  
Indrani Mukherjee ◽  
Ashikh Seethy ◽  
...  

Abstract Objective Pabda (Ompok bimaculatus) is a freshwater catfish, widely available in Asian countries, especially in Bangladesh, India, Pakistan and Nepal. This fish is highly valued for its taste and high nutritional value and is very popular as a rich source of proteins, omega-3 and omega-6 fatty acids, vitamins and minerals for growing children, pregnant women and the elderly. We performed de novo sequencing of Ompok bimaculatus using a hybrid approach and present here a draft assembly for this species for the first time. Data Description The genome of Ompok bimaculatus (Fig. 1: Table 1, Data file 3) from the Ganges river has been sequenced by a hybrid approach using Illumina short reads and PacBio long reads, followed by structural annotation. The draft genome assembly was found to be 718 Mb with an N50 size of 81 kb. The MAKER gene annotation tool predicted 21,371 genes.


2018 ◽  
Vol 7 (18) ◽  
Author(s):  
Stewart T. G. Burgess ◽  
Kathryn Bartley ◽  
Francesca Nunn ◽  
Harry W. Wright ◽  
Margaret Hughes ◽  
...  

The poultry red mite, Dermanyssus gallinae, is a major worldwide concern in the egg-laying industry. Here, we report the first draft genome assembly and gene prediction of Dermanyssus gallinae, based on combined PacBio and MinION long-read de novo sequencing.


2019 ◽  
Author(s):  
Mengyang Xu ◽  
Xiaoshan Su ◽  
Mengqi Zhang ◽  
Ming Li ◽  
Xiaoyun Huang ◽  
...  

Abstract The long-spine porcupinefish, Diodon holocanthus (Diodontidae, Tetraodontiformes, Actinopterygii), also known as the freckled porcupinefish, is of great ecological and economic interest. Its distinct characteristics, including the inflation reaction, spiny skin and tetrodotoxin, have however not been fully studied for lack of a complete genome assembly. In this study, the whole genome of a single individual was sequenced using single tube-Long Fragment Read co-barcode reads, generating 154.3 Gb of paired-end data (219.8× depth). Gaps were further filled using a small Oxford Nanopore MinION long-read dataset (11.4 Gb, 15.9× depth). Making full use of long-, medium- and short-range genome assembly information, the final assembled sequences, with a total length of 650.02 Mb, reached contig and scaffold N50 sizes of 2.15 Mb and 8.13 Mb, respectively, despite the high repetitive content. To assess completeness, Benchmarking Universal Single-Copy Orthologs captured 95.7% (2,474) of core genes. In addition, 206.5 Mb (32.10%) of repetitive sequences were identified, and 20,840 protein-coding genes were annotated, among which 18,281 (87.72%) proteins were assigned possible functions. This is the first demonstration of a de novo genome of the porcupinefish, which will benefit downstream analysis of ontogeny, phylogeny, and evolution, and improve the exploration of its unique defensive mechanism.


2018 ◽  
Author(s):  
Florencia Diaz-Viraque ◽  
Sebastian Pita ◽  
Gonzalo Greif ◽  
Rita de Cassia Moreira de Souza ◽  
Gregorio Iraola ◽  
...  

Chagas disease was described by Carlos Chagas, who first identified the parasite Trypanosoma cruzi in a two-year-old girl called Berenice. Many T. cruzi sequencing projects based on short reads have demonstrated that genome assembly and downstream comparative analyses are extremely challenging in this species, given that half of its genome is composed of repetitive sequences. Here, we report de novo assemblies, annotation and comparative analyses of the Berenice strain using a combination of Illumina short reads and MinION long reads. Our work demonstrates that Nanopore sequencing improves T. cruzi assembly contiguity and increases the assembly size by ~16 Mb. Specifically, we found that assembly improvement also refines the completeness of coding regions for both single-copy genes and repetitive transposable elements. Beyond its historical and epidemiological importance, Berenice constitutes a fundamental resource since it now represents the best-quality assembly available for TcII, a highly prevalent lineage causing human infections in South America. The availability of the Berenice genome expands the known genetic diversity of T. cruzi and facilitates more comprehensive evolutionary inferences. Our work represents the first report of Nanopore technology used to resolve complex protozoan genomes, supporting its subsequent application for improving trypanosomatid and other highly repetitive genomes.


GigaScience ◽  
2020 ◽  
Vol 9 (4) ◽  
Author(s):  
Matt A Field ◽  
Benjamin D Rosen ◽  
Olga Dudchenko ◽  
Eva K F Chan ◽  
Andre E Minoche ◽  
...  

Abstract Background The German Shepherd Dog (GSD) is one of the most common breeds on earth and has been bred for its utility and intelligence. It is often the first choice for police and military work, as well as protection, disability assistance, and search-and-rescue. Yet, GSDs are well known to be susceptible to a range of genetic diseases that can interfere with their training. Such diseases are of particular concern when they occur later in life, and fully trained animals are not able to continue their duties. Findings Here, we provide the draft genome sequence of a healthy German Shepherd female as a reference for future disease and evolutionary studies. We generated this improved canid reference genome (CanFam_GSD) utilizing a combination of Pacific Biosciences, Oxford Nanopore, 10X Genomics, Bionano, and Hi-C technologies. The GSD assembly is ∼80 times as contiguous as the current canid reference genome, CanFam v3.1 (20.9 vs 0.267 Mb contig N50), and contains far fewer gaps (306 vs 23,876) and fewer scaffolds (429 vs 3,310). Two chromosomes (4 and 35) are assembled into single scaffolds with no gaps. BUSCO analyses of the genome assembly results show that 93.0% of the conserved single-copy genes are complete in the GSD assembly compared with 92.2% for CanFam v3.1. Homology-based gene annotation increases this value to ∼99%. Detailed examination of the evolutionarily important pancreatic amylase region reveals that there are most likely 7 copies of the gene, indicative of a duplication of 4 ancestral copies and the disruption of 1 copy. Conclusions GSD genome assembly and annotation were produced with major improvement in completeness, continuity, and quality over the existing canid reference. This resource will enable further research related to canine diseases, the evolutionary relationships of canids, and other aspects of canid biology.


2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Chunqing Ou ◽  
Fei Wang ◽  
Jiahong Wang ◽  
Song Li ◽  
Yanjie Zhang ◽  
...  

Abstract ‘Zhongai 1’ [(Pyrus ussuriensis × communis) × spp.] is an excellent pear dwarfing rootstock common in China. It is itself dwarf and has a high dwarfing efficiency on most of the main cultivated Pyrus species when used as an interstock. Here we describe the draft genome sequence of ‘Zhongai 1’, which was assembled using PacBio long reads, Illumina short reads and Hi-C technology. We estimated the genome size to be approximately 511.33 Mb by K-mer analysis and obtained a final genome of 510.59 Mb with a contig N50 size of 1.28 Mb. Next, 506.31 Mb (99.16%) of contigs were clustered into 17 chromosomes with a scaffold N50 size of 23.45 Mb. We further predicted 309.86 Mb (60.68%) of repetitive sequences and 43,120 protein-coding genes. The assembled genome will be a valuable resource and reference for future pear breeding, genetic improvement, and comparative genomics among related species. Moreover, it will help identify genes involved in dwarfism, early flowering, stress tolerance, and commercially desirable fruit characteristics.


2021 ◽  
Author(s):  
Chi Yang ◽  
Lu Ma ◽  
Donglai Xiao ◽  
Xiaoyu Liu ◽  
Xiaoling Jiang ◽  
...  

Sparassis latifolia is a valuable edible mushroom cultivated in China. In 2018, our research group reported an incomplete, low-quality genome of S. latifolia obtained by Illumina HiSeq 2500 sequencing. The limitations of that genome have constrained genetic and genomic studies in this mushroom resource. Herein, an updated draft genome sequence of S. latifolia was generated by Oxford Nanopore sequencing and the Hi-C technique. A total of 8.24 Gb of Oxford Nanopore long reads representing ~198.08X coverage of the S. latifolia genome were generated. Subsequently, a high-quality genome of 41.41 Mb, with scaffold and contig N50 sizes of 3.31 Mb and 1.51 Mb, respectively, was assembled. Hi-C scaffolding of the genome resulted in 12 pseudochromosomes containing 93.56% of the bases in the assembled genome. Genome annotation further revealed that 17.47% of the genome was composed of repetitive sequences. In addition, 13,103 protein-coding genes were predicted, among which 98.72% were functionally annotated. BUSCO assay results further revealed that there were 92.07% complete BUSCOs. The improved chromosome-scale assembly and genome features described here will aid further molecular elucidation of various traits, breeding of S. latifolia, and evolutionary studies with related taxa.
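A coverage figure of this kind is, in its simplest form, the total number of sequenced bases divided by the genome size. The sketch below illustrates that arithmetic only; it assumes the assembled size (41.41 Mb) can stand in for the genome size, which the abstract does not state, and the function name `sequencing_depth` is hypothetical.

```python
def sequencing_depth(total_bases, genome_size):
    """Estimate average depth as sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


# Rounded values taken from the abstract; the genome size used here is the
# assembled size, which is an assumption made for illustration.
long_read_bases = 8.24e9    # 8.24 Gb of Nanopore reads
assembled_size = 41.41e6    # 41.41 Mb assembly

depth = sequencing_depth(long_read_bases, assembled_size)
print(f"approximate depth: {depth:.1f}x")
```

With these rounded inputs the estimate lands within about one unit of the reported ~198.08X; the small gap would be consistent with unrounded inputs or a slightly different genome size estimate, though the abstract does not say which was used.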


2017 ◽  
Author(s):  
Timothy H Webster ◽  
Greer A. Dolby ◽  
Melissa Wilson Sayres ◽  
Kenro Kusumi

Exogenous sequence contamination presents a challenge in first-draft genomes because it can lead to non-contiguous, chimeric assembled sequences. This can mislead downstream analyses reliant on synteny, such as linkage-based analyses. Recently, the Mojave Desert Tortoise (Gopherus agassizii) draft genome was published as a resource to advance conservation efforts for the threatened species and discover more about chelonian biology and evolution. Here, we illustrate steps taken to improve the desert tortoise draft genome by removing contaminating sequences, actions that are typically carried out after the initial release of a draft genome assembly. We used information from NCBI’s Vecscreen output to remove intra-scaffold contamination and trim leading and trailing Ns. We then reordered and renamed scaffolds, and transferred the gene annotation onto this assembly. Finally, we describe the tools developed for this pipeline, freely available on GitHub (https://github.com/thw17/G_agassizii_reference_update), which facilitate post-assembly processing of other draft genomes. The new gopAga1.1 genome has an N50 of 251 kb, L50 of 2592 scaffolds, and its annotation retains 17,201 of the original 20,172 genes that were unaffected by the scaffold processing.
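The step of trimming leading and trailing Ns from a scaffold can be illustrated with a short sketch. This is not the code from the authors' repository, whose implementation is not described here; the function name `trim_flanking_ns` is hypothetical, and returning the left offset is an assumption about what would be needed to shift annotation coordinates afterwards.

```python
def trim_flanking_ns(sequence):
    """Strip runs of N/n from both ends of a scaffold sequence.

    Returns the trimmed sequence and the number of bases removed from the
    start, so that feature coordinates could be shifted by that offset.
    Internal runs of N (gaps between contigs) are left untouched.
    """
    left_trimmed = sequence.lstrip("Nn")
    offset = len(sequence) - len(left_trimmed)
    return left_trimmed.rstrip("Nn"), offset


# Toy scaffold (made up): three leading Ns, an internal gap, two trailing Ns.
trimmed, offset = trim_flanking_ns("NNNACGTNNACGNN")
print(trimmed, offset)
```

Keeping the internal gap intact matters because those Ns encode the scaffold's structure, whereas flanking Ns carry no sequence information and only inflate scaffold length.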

