Chromosome-level genome assembly and annotation of the loquat (Eriobotrya japonica) genome

Shuang Jiang; Haishan An; Fangjie Xu; Xueying Zhang

doi:10.1093/gigascience/giaa015

GigaScience ◽

10.1093/gigascience/giaa015 ◽

2020 ◽

Vol 9 (3) ◽

Cited By ~ 6

Author(s):

Shuang Jiang ◽

Haishan An ◽

Fangjie Xu ◽

Xueying Zhang

Keyword(s):

Genome Assembly ◽

High Throughput Sequencing ◽

Chromosome Rearrangement ◽

Flowering Plant ◽

Eriobotrya Japonica ◽

Sequencing Data ◽

African Countries ◽

Short Reads ◽

A Genome ◽

Chromosome Level

Abstract Background: The loquat (Eriobotrya japonica) is a flowering plant in the family Rosaceae that is widely cultivated in Asian, European, and African countries. It blossoms in the winter and ripens in the early summer. The loquat genome has not been published to date, which limits molecular biology studies of this cultivated species. Here, we used Nanopore third-generation sequencing and Hi-C technology to sequence the genome of Eriobotrya. Findings: We generated 100.10 Gb of long reads using Oxford Nanopore sequencing technologies. Three types of Illumina high-throughput sequencing data, namely genome short reads (47.42 Gb), transcriptome short reads (11.06 Gb), and Hi-C short reads (67.25 Gb), were also generated to help construct the loquat genome. All data were assembled into a 760.1-Mb genome assembly. The contigs were anchored to chromosomes using Hi-C contact information between contigs, yielding an assembly with 17 chromosomes and a scaffold N50 length of 39.7 Mb. A total of 45,743 protein-coding genes were annotated in the Eriobotrya genome, and we investigated the phylogenetic relationships between Eriobotrya and 6 other Rosaceae species. Eriobotrya shows a close relationship with Malus and Pyrus, with Eriobotrya and Malus estimated to have diverged 6.76 million years ago. Furthermore, chromosome rearrangement was found in Eriobotrya and Malus. Conclusions: We constructed the first high-quality chromosome-level Eriobotrya genome using Illumina, Nanopore, and Hi-C technologies. This work provides a valuable reference genome for molecular studies of the loquat and new insight into chromosome evolution in this species.
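
The data volumes quoted in this abstract can be turned into approximate fold coverage by dividing sequenced bases by genome length. The sketch below is illustrative only and is not taken from the paper: it assumes 1 Gb = 1,000 Mb and uses the 760.1-Mb assembly length as a stand-in for the true genome size; the function name `sequencing_depth` is hypothetical.

```python
def sequencing_depth(total_bases_gb: float, genome_size_mb: float) -> float:
    """Approximate fold coverage = sequenced bases / genome length.

    Assumes 1 Gb = 1,000 Mb. The genome length is an assumption supplied by
    the caller (here, the assembly length reported in the abstract).
    """
    if genome_size_mb <= 0:
        raise ValueError("genome_size_mb must be positive")
    return (total_bases_gb * 1000.0) / genome_size_mb


# Values quoted in the abstract; the assembly length is used as a proxy
# for genome size, which is an assumption of this sketch.
ASSEMBLY_MB = 760.1
datasets_gb = {
    "Nanopore long reads": 100.10,
    "Illumina genome short reads": 47.42,
    "Hi-C short reads": 67.25,
}

depths = {name: sequencing_depth(gb, ASSEMBLY_MB) for name, gb in datasets_gb.items()}

for name, depth in depths.items():
    print(f"{name}: ~{depth:.1f}x")
```

Under these assumptions the Nanopore data correspond to a depth above 100×. The transcriptome reads are left out because RNA-seq volume does not translate into genome coverage in this way.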

A Multireference-Based Whole Genome Assembly for the Obligate Ant-Following Antbird, Rhegmatorhina melanosticta (Thamnophilidae)

Diversity ◽

10.3390/d11090144 ◽

2019 ◽

Vol 11 (9) ◽

pp. 144 ◽

Cited By ~ 4

Author(s):

Laís Coelho ◽

Lukas Musher ◽

Joel Cracraft

Keyword(s):

Genome Assembly ◽

High Throughput Sequencing ◽

Population Genomics ◽

De Novo ◽

Structural Difference ◽

Whole Genome ◽

Sequencing Technology ◽

A Genome ◽

Avian Genomes ◽

Chromosome Level

Current-generation high-throughput sequencing technology has facilitated the generation of more genomic-scale data than ever before, thus greatly improving our understanding of avian biology across a range of disciplines. Recent developments in linked-read sequencing (Chromium 10×) and reference-based whole-genome assembly offer an exciting prospect of more accessible chromosome-level genome sequencing in the near future. We sequenced and assembled a genome of the Hairy-crested Antbird (Rhegmatorhina melanosticta), which represents the first publicly available genome for any antbird (Thamnophilidae). Our objectives were to (1) assemble scaffolds to chromosome level based on multiple reference genomes and report on differences relative to other genomes, (2) assess genome completeness and compare content to other related genomes, and (3) assess the suitability of linked-read sequencing technology for future studies in comparative phylogenomics and population genomics. Our R. melanosticta assembly was both highly contiguous (de novo scaffold N50 = 3.3 Mb, reference-based N50 = 53.3 Mb) and relatively complete (it contained close to 90% of evolutionarily conserved single-copy avian genes and known tetrapod ultraconserved elements). The high contiguity and completeness of this assembly enabled the genome to be successfully mapped to the chromosome level, which uncovered a consistent structural difference between R. melanosticta and other avian genomes. Our results are consistent with the observation that avian genomes are structurally conserved. Additionally, our results demonstrate the utility of linked-read sequencing for non-model genomics. Finally, we demonstrate the value of our R. melanosticta genome for future researchers by mapping reduced-representation sequencing data and by accurately reconstructing the phylogenetic relationships among a sample of thamnophilid species.

MaGuS: a tool for map-guided scaffolding and quality assessment of genome assemblies

10.1101/032045 ◽

2015 ◽

Author(s):

Mohammed-Amin Madoui ◽

Carole Dossat ◽

Leo d'Agata ◽

Edwin van der Vossen ◽

Jan van Oeveren ◽

...

Keyword(s):

High Throughput ◽

Genome Assembly ◽

High Throughput Sequencing ◽

Draft Genome ◽

Genetic Maps ◽

Sequencing Data ◽

A Genome ◽

Genome Map ◽

Genome Assemblies ◽

Complex Genome

Background: Scaffolding is a crucial step in the genome assembly process. Current methods based on large-fragment paired-end reads or long reads increase continuity but often lack consistency in repetitive regions, resulting in fragmented assemblies. Here, we describe a novel tool that links assemblies to a genome map to aid complex genome reconstruction by detecting assembly errors and allowing scaffold ordering and anchoring. Results: We present MaGuS (map-guided scaffolding), a modular tool that uses a draft genome assembly, a genome map, and high-throughput paired-end sequencing data to estimate the quality and enhance the continuity of an assembly. We generated several assemblies of the Arabidopsis genome using different scaffolding programs and applied MaGuS to select the best assembly using quality metrics. Then, we used MaGuS to perform map-guided scaffolding to increase continuity by creating new scaffold links in low-coverage and highly repetitive regions where other commonly used scaffolding methods lack consistency. Conclusions: MaGuS is a powerful reference-free evaluator of assembly quality and a map-guided scaffolder that is freely available at https://github.com/institut-de-genomique/MaGuS. Its use can be extended to other high-throughput sequencing data (e.g., long-read data) and also to other map data (e.g., genetic maps) to improve the quality and the continuity of large and complex genome assemblies.

Chromosome-level genome assembly of a regenerable maize inbred line A188

Genome Biology ◽

10.1186/s13059-021-02396-x ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Guifang Lin ◽

Cheng He ◽

Jun Zheng ◽

Dal-Hoe Koo ◽

Ha Le ◽

...

Keyword(s):

Inbred Line ◽

Genome Assembly ◽

Gene Function ◽

Maize Inbred Line ◽

Carotenoid Cleavage Dioxygenase ◽

Structural Variations ◽

Embryonic Callus ◽

Network Analyses ◽

A Genome ◽

Chromosome Level

Abstract Background: The maize inbred line A188 is an attractive model for elucidation of gene function and improvement due to its high embryogenic capacity and its many traits that contrast with the first maize reference genome, B73, and other elite lines. The lack of a genome assembly of A188 limits its use as a model for functional studies. Results: Here, we present a chromosome-level genome assembly of A188 using long reads and optical maps. Comparison of A188 with B73 using both whole-genome alignments and read depths from sequencing reads identifies approximately 1.1 Gb of syntenic sequences as well as extensive structural variation, including a 1.8-Mb duplication containing the Gametophyte factor1 locus for unilateral cross-incompatibility, and six inversions of 0.7 Mb or greater. Increased copy number of carotenoid cleavage dioxygenase 1 (ccd1) in A188 is associated with elevated expression during seed development. High ccd1 expression in seeds together with low expression of yellow endosperm 1 (y1) reduces carotenoid accumulation, accounting for the white seed phenotype of A188. Furthermore, transcriptome and epigenome analyses reveal enhanced expression of defense pathways and altered DNA methylation patterns of the embryonic callus. Conclusions: The A188 genome assembly provides a high-resolution sequence for a complex genome species and a foundational resource for analyses of genome variation and gene function in maize. The genome, in comparison to B73, contains extensive intra-species structural variations and other genetic differences. Expression and network analyses identify discrete profiles for embryonic callus and other tissues.

Detection and application of genome-wide variations in peach for association and genetic relationship analysis

BMC Genetics ◽

10.1186/s12863-019-0799-8 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 2

Author(s):

Liping Guan ◽

Ke Cao ◽

Yong Li ◽

Jian Guo ◽

Qiang Xu ◽

...

Keyword(s):

Genetic Relationship ◽

DNA Markers ◽

High Throughput Sequencing ◽

Prunus Persica ◽

Genetic Research ◽

Diploid Species ◽

Sequencing Data ◽

Relationship Analysis ◽

Genome Wide ◽

A Genome

Abstract Background: Peach (Prunus persica L.) is a diploid species and model plant of the Rosaceae family. In the past decade, significant progress has been made in peach genetic research via DNA markers, but the number of these markers remains limited. Results: In this study, we performed genome-wide detection of DNA markers based on sequencing data of six distantly related peach accessions. A total of 650,693–1,053,547 single nucleotide polymorphisms (SNPs), 114,227–178,968 small insertions/deletions (InDels), 8,386–12,298 structural variants (SVs), 2,111–2,581 copy number variants (CNVs), and 229,357–346,940 simple sequence repeats (SSRs) were detected and annotated. To demonstrate the application of DNA markers, 944 SNPs were filtered for an association study of fruit ripening time, and 15 highly polymorphic SSRs were selected to analyze the genetic relationship among 221 accessions. Conclusions: The results showed that the use of high-throughput sequencing to develop DNA markers is fast and effective. Comprehensive identification of DNA markers, including SVs and SSRs, would benefit genetic diversity evaluation, genetic mapping, and molecular breeding of peach.

Improving the Chromosome-Level Genome Assembly of the Siamese Fighting Fish (Betta splendens) in a University Master’s Course

G3 Genes|Genomes|Genetics ◽

10.1534/g3.120.401205 ◽

2020 ◽

Vol 10 (7) ◽

pp. 2179-2183 ◽

Cited By ~ 1

Author(s):

Stefan Prost ◽

Malte Petersen ◽

Martin Grethlein ◽

Sarah Joy Hahn ◽

Nina Kuschik-Maczollek ◽

...

Keyword(s):

Genome Assembly ◽

High Throughput Sequencing ◽

Siamese Fighting Fish ◽

Betta Splendens ◽

High Quality ◽

Sequencing Platform ◽

Sequencing Technologies ◽

Oxford Nanopore ◽

Long Read ◽

Chromosome Level

Ever-decreasing costs along with advances in sequencing and library preparation technologies enable even small research groups to generate chromosome-level assemblies today. Here we report the generation of an improved chromosome-level assembly for the Siamese fighting fish (Betta splendens) that was carried out during a practical university master’s course. The Siamese fighting fish is a popular aquarium fish and an emerging model species for research on aggressive behavior. We updated the current genome assembly by generating a new long-read nanopore-based assembly with subsequent scaffolding to chromosome level using previously published Hi-C data. The use of ∼35× nanopore-based long-read data sequenced on a MinION platform (Oxford Nanopore Technologies) allowed us to generate a baseline assembly of only 1,276 contigs with a contig N50 of 2.1 Mbp and a total length of 441 Mbp. Scaffolding using the Hi-C data resulted in 109 scaffolds with a scaffold N50 of 20.7 Mbp. More than 99% of the assembly is contained in 21 scaffolds. The assembly showed the presence of 96.1% complete BUSCO genes from the Actinopterygii dataset, indicating a high-quality assembly. We present an improved full chromosome-level assembly of the Siamese fighting fish generated during a university master’s course. The use of ∼35× long-read nanopore data drastically improved the baseline assembly in terms of continuity. We show that relatively inexpensive high-throughput sequencing technologies such as the long-read MinION sequencing platform can be used in educational settings, allowing students to gain practical skills in modern genomics and generate high-quality results that benefit downstream research projects.
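
Contig and scaffold N50 values, as reported in this and several other abstracts in this listing, follow a standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are hypothetical and are not data from any of the listed papers.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are sorted from longest to shortest and accumulated until the
    running sum reaches at least half of the total length; the length of the
    sequence that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: the running sum always reaches the total")


# Toy lengths chosen for illustration only (arbitrary units).
toy_lengths = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_lengths)
print(toy_n50)  # 80 + 70 = 150 >= 290 / 2, so this prints 70
```

Because N50 depends only on the length distribution, joining contigs into longer scaffolds raises it without adding sequence, which is why scaffold N50 is typically much larger than contig N50 for the same assembly.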

Marsupial chromosomics: bridging the gap between genomes and chromosomes

Reproduction, Fertility and Development ◽

10.1071/rd18201 ◽

2019 ◽

Vol 31 (7) ◽

pp. 1189 ◽

Cited By ~ 1

Author(s):

Janine E. Deakin ◽

Sally Potter

Keyword(s):

DNA Sequence ◽

Genome Assembly ◽

Genome Architecture ◽

Sequence Information ◽

Full Potential ◽

Tasmanian Devil ◽

Sequencing Technology ◽

A Genome ◽

Devil Facial Tumour Disease ◽

Chromosome Level

Marsupials have unique features that make them particularly interesting to study, and sequencing of marsupial genomes is helping us to understand their evolution. A decade ago, it was a huge feat to sequence the first marsupial genome. Now, advances in sequencing technology have made the sequencing of many more marsupial genomes possible. However, the DNA sequence is only one component of the structures it is packaged into: chromosomes. Knowing the arrangement of the DNA sequence on each chromosome is essential for a genome assembly to be used to its full potential. The importance of combining sequence information with cytogenetics has previously been demonstrated for rapidly evolving regions of the genome, such as the sex chromosomes, as well as for reconstructing the ancestral marsupial karyotype and understanding the chromosome rearrangements involved in the Tasmanian devil facial tumour disease. Although recent advances in sequencing technology assist genome assembly, physical anchoring of the sequence to chromosomes is still required to achieve a chromosome-level assembly. Once chromosome-level assemblies are achieved for more marsupials, we will be able to investigate changes in the packaging and interactions between chromosomes to gain an understanding of the role genome architecture has played during marsupial evolution.

An Improved Genome Assembly of Azadirachta indica A. Juss.

10.1101/033290 ◽

2015 ◽

Author(s):

Neeraja M Krishnan ◽

Prachi Jain ◽

Saurabh Gupta ◽

Arun K Hariharan ◽

Binay Panda

Keyword(s):

Azadirachta Indica ◽

Genome Assembly ◽

Draft Genome ◽

Fold Increase ◽

Sequencing Data ◽

Short Read ◽

Short Reads ◽

Short Read Sequencing ◽

Long Reads ◽

NCBI Short Read Archive

Neem (Azadirachta indica A. Juss.), an evergreen tree of the Meliaceae family, is known for its medicinal, cosmetic, pesticidal and insecticidal properties. We had previously sequenced and published the draft genome of the plant, using mainly short-read sequencing data. In this report, we present an improved genome assembly generated using additional short reads from Illumina and long reads from the Pacific Biosciences SMRT sequencer. We assembled short reads and error-corrected long reads using Platanus, an assembler designed to perform well for heterozygous genomes. Compared with the earlier assembly (v1.0), the updated genome assembly (v2.0) yielded 3- and 3.5-fold increases in N50 and N75, respectively; a 2.6-fold decrease in the total number of scaffolds; a 1.25-fold increase in the number of valid transcriptome alignments; a 13.4-fold reduction in mis-assembly; and a 1.85-fold increase in the percentage of repeats. The current assembly also maps better to the genes known to be involved in the terpenoid biosynthesis pathway. Together, the data represent an improved assembly of the A. indica genome. The raw data described in this manuscript are submitted to the NCBI Short Read Archive under the accession numbers SRX1074131, SRX1074132, SRX1074133, and SRX1074134 (SRP013453).

Detection and application of genome-wide variations in peach for association and genetic relationship analysis

10.21203/rs.2.10634/v3 ◽

2019 ◽

Author(s):

Liping Guan ◽

Ke Cao ◽

Yong Li ◽

Jian Guo ◽

Qiang Xu ◽

...

Keyword(s):

Genetic Relationship ◽

DNA Markers ◽

High Throughput Sequencing ◽

Prunus Persica ◽

Genetic Research ◽

Diploid Species ◽

Sequencing Data ◽

Relationship Analysis ◽

Genome Wide ◽

A Genome

Abstract Background: Peach (Prunus persica L.) is a diploid species and model plant of the Rosaceae family. In the past decade, significant progress has been made in peach genetic research via DNA markers, but the number of these markers remains limited. Results: In this study, we performed genome-wide detection of DNA markers based on sequencing data of six distantly related peach accessions. A total of 650,693–1,053,547 single nucleotide polymorphisms (SNPs), 114,227–178,968 small insertions/deletions (InDels), 8,386–12,298 structural variants (SVs), 2,111–2,581 copy number variants (CNVs), and 229,357–346,940 simple sequence repeats (SSRs) were detected and annotated. To demonstrate the application of DNA markers, 944 SNPs were filtered for an association study of fruit ripening time, and 15 highly polymorphic SSRs were selected to analyze the genetic relationship among 221 accessions. Conclusions: The results showed that the use of high-throughput sequencing to develop DNA markers is fast and effective. Comprehensive identification of DNA markers, including SVs and SSRs, would benefit genetic diversity evaluation, genetic mapping, and molecular breeding of peach.

Detection and application of genome-wide variations in peach for association and genetic relationship analysis

10.21203/rs.2.10634/v2 ◽

2019 ◽

Author(s):

Liping Guan ◽

Ke Cao ◽

Yong Li ◽

Jian Guo ◽

Qiang Xu ◽

...

Keyword(s):

Genetic Relationship ◽

DNA Markers ◽

High Throughput Sequencing ◽

Prunus Persica ◽

Genetic Research ◽

Diploid Species ◽

Sequencing Data ◽

Relationship Analysis ◽

Genome Wide ◽

A Genome

Abstract Background: Peach (Prunus persica L.) is a diploid species and model plant of the Rosaceae family. In the past decade, significant progress has been made in peach genetic research via DNA markers, but the number of these markers remains limited. Results: In this study, we performed genome-wide detection of DNA markers based on sequencing data of six distantly related peach accessions. A total of 650,693–1,053,547 single nucleotide polymorphisms (SNPs), 114,227–178,968 small insertions/deletions (InDels), 8,386–12,298 structural variants (SVs), 2,111–2,581 copy number variants (CNVs), and 229,357–346,940 simple sequence repeats (SSRs) were detected and annotated. To demonstrate the application of DNA markers, 944 SNPs were filtered for an association study of fruit ripening time, and 15 highly polymorphic SSRs were selected to analyze the genetic relationship among 221 accessions. Conclusions: The results showed that the use of high-throughput sequencing to develop DNA markers is fast and effective. Comprehensive identification of DNA markers, including SVs and SSRs, would benefit genetic diversity evaluation, genetic mapping, and molecular breeding of peach.

A genome-wide association study, supported by a new chromosome-level genome assembly, suggests sox2 as a main driver of the undifferentiated ZZ/ZW sex determination of turbot (Scophthalmus maximus)

Genomics ◽

10.1016/j.ygeno.2021.04.007 ◽

2021 ◽

Author(s):

Paulino Martínez ◽

Diego Robledo ◽

Xoana Taboada ◽

Andrés Blanco ◽

Michel Moser ◽

...

Keyword(s):

Genome Assembly ◽

Genome Wide Association Study ◽

Scophthalmus Maximus ◽

Genome Wide Association ◽

Genome Wide ◽

A Genome ◽

Main Driver ◽

Turbot Scophthalmus Maximus ◽

Chromosome Level
