Genome Assembly and Sex-Determining Region of Male and Female Populus × sibirica

2021 ◽  
Vol 12 ◽  
Author(s):  
Nataliya V. Melnikova ◽  
Elena N. Pushkova ◽  
Ekaterina M. Dvorianinova ◽  
Artemy D. Beniaminov ◽  
Roman O. Novakovskiy ◽  
...  

The genus Populus comprises dioecious species and has become a promising object for studying the genetics of sex in plants. In this work, genomes of male and female Populus × sibirica individuals were sequenced for the first time. To achieve high-quality genome assemblies, we used Oxford Nanopore Technologies and Illumina platforms. A protocol for the isolation of long and pure DNA from young poplar leaves was developed, which enabled us to obtain 31 Gb (N50 = 21 kb) for the male poplar and 23 Gb (N50 = 24 kb) for the female one using the MinION sequencer. Genome assembly was performed with different tools, and Canu provided the most complete and accurate assemblies with a length of 818 Mb (N50 = 1.5 Mb) for the male poplar and 816 Mb (N50 = 0.5 Mb) for the female one. After polishing with Racon and Medaka (Nanopore reads) and then with POLCA (Illumina reads), assembly completeness was 98.45% (87.48% duplicated) for the male and 98.20% (76.77% duplicated) for the female according to BUSCO (benchmarking universal single-copy orthologs). A high proportion of duplicated BUSCOs and the increased genome size (about 300 Mb above the expected) pointed to the separation of haplotypes in a large part of the male and female genomes of P. × sibirica. Because of this, we were able to identify two haplotypes of the sex-determining region (SDR) in both assemblies; one of these four SDR haplotypes, in the male genome, contained partial repeats of the ARR17 gene (Y haplotype), while the other three did not (X haplotypes). The analysis of the male P. × sibirica SDR suggested that the Y haplotype originated from P. nigra, while the X haplotype is close to P. trichocarpa and P. balsamifera. Moreover, we revealed a Populus-specific repeat that could be involved in translocation of the ARR17 gene or its part to the SDR of P. × sibirica and other Populus species. The obtained results expand our knowledge of SDR features in the genus Populus and of poplar phylogeny.
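The N50 values quoted above (for reads and for contigs) follow the usual contiguity definition: the length L such that sequences of length L or longer contain at least half of all bases. The sketch below is only an illustration of that definition; the function name `n50` and the toy contig lengths are hypothetical and are not taken from the paper or from any assembler's output.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running sum to the total to avoid float division.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical toy contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)
print(toy_n50)
```

In practice the lengths would be read from a FASTA index or an assembler report; that step is omitted here because it depends on the tooling used.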

Author(s):  
Wayne Xu ◽  
James R Tucker ◽  
Wubishet A Bekele ◽  
Frank M You ◽  
Yong-Bi Fu ◽  
...  

Abstract Barley (Hordeum vulgare L.) is one of the most important global crops. The six-row barley cultivar Morex reference genome has been used by the barley research community worldwide. However, this reference genome can have limitations for genomic and genetic diversity analyses, gene discovery, and marker development in two-row germplasm, which is more common in Canadian barley. Here we assembled, for the first time, the genome sequence of a Canadian two-row malting barley, cultivar AAC Synergy. We applied deep Illumina paired-end reads, long mate-pair reads, PacBio sequences, 10x Chromium linked-read libraries, and chromosome conformation capture sequencing (Hi-C) to generate a contiguous assembly. The genome assembled from super-scaffolds had a size of 4.85 Gb, an N50 of 2.32 Mb, and an estimated 93.9% of complete genes from a plant database (BUSCO, benchmarking universal single-copy orthologous genes). After removal of small scaffolds (< 300 kb), the assembly was arranged into pseudomolecules of 4.14 Gb in size with seven chromosomes plus unanchored scaffolds. The completeness and annotation of the assembly were assessed by comparing it with the updated version of the six-row Morex and the recently released two-row Golden Promise genome assemblies.


GigaScience ◽  
2019 ◽  
Vol 8 (10) ◽  
Author(s):  
Sarah B Kingan ◽  
Julie Urban ◽  
Christine C Lambert ◽  
Primo Baybayan ◽  
Anna K Childers ◽  
...  

ABSTRACT Background A high-quality reference genome is an essential tool for applied and basic research on arthropods. Long-read sequencing technologies may be used to generate more complete and contiguous genome assemblies than alternate technologies; however, long-read methods have historically had greater input DNA requirements and higher costs than next-generation sequencing, which are barriers to their use on many samples. Here, we present a 2.3 Gb de novo genome assembly of a field-collected adult female spotted lanternfly (Lycorma delicatula) using a single Pacific Biosciences SMRT Cell. The spotted lanternfly is an invasive species recently discovered in the northeastern United States that threatens to damage economically important crop plants in the region. Results The DNA from 1 individual was used to make 1 standard, size-selected library with an average DNA fragment size of ∼20 kb. The library was run on 1 Sequel II SMRT Cell 8M, generating a total of 132 Gb of long-read sequences, of which 82 Gb were from unique library molecules, representing ∼36× coverage of the genome. The assembly had high contiguity (contig N50 length = 1.5 Mb), completeness, and sequence level accuracy as estimated by conserved gene set analysis (96.8% of conserved genes both complete and without frame shift errors). Furthermore, it was possible to segregate more than half of the diploid genome into the 2 separate haplotypes. The assembly also recovered 2 microbial symbiont genomes known to be associated with L. delicatula, each microbial genome being assembled into a single contig. Conclusions We demonstrate that field-collected arthropods can be used for the rapid generation of high-quality genome assemblies, an attractive approach for projects on emerging invasive species, disease vectors, or conservation efforts of endangered species.
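The "∼36× coverage" figure in the abstract above can be checked with simple arithmetic: fold coverage is the number of sequenced bases divided by the genome size. The sketch below uses only the numbers quoted in the abstract (82 Gb of unique-molecule sequence, 132 Gb of total sequence, a 2.3 Gb assembly); the function name `fold_coverage` is a hypothetical helper, and using the assembly size as the genome size is an assumption.

```python
GB = 1_000_000_000  # bases per gigabase


def fold_coverage(sequenced_bases: float, genome_size_bases: float) -> float:
    """Return average fold coverage: sequenced bases per genome base."""
    if genome_size_bases <= 0:
        raise ValueError("genome size must be positive")
    return sequenced_bases / genome_size_bases


# Figures quoted in the abstract; the 2.3 Gb assembly size stands in
# for the true genome size (an assumption).
genome_size = 2.3 * GB
unique_yield = 82 * GB
total_yield = 132 * GB

unique_cov = fold_coverage(unique_yield, genome_size)
total_cov = fold_coverage(total_yield, genome_size)
print(f"unique-molecule coverage: {unique_cov:.1f}x")
print(f"total-yield coverage: {total_cov:.1f}x")
```

The unique-molecule figure rounds to 36×, in line with the abstract. The total-yield figure is higher because it includes the bases the abstract excludes from its unique-molecule count.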


2020 ◽  
Author(s):  
Bernard Y Kim ◽  
Jeremy Wang ◽  
Danny E. Miller ◽  
Olga Barmina ◽  
Emily K. Delaney ◽  
...  

Over 100 years of studies in Drosophila melanogaster and related species in the genus Drosophila have facilitated key discoveries in genetics, genomics, and evolution. While high-quality genome assemblies exist for several species in this group, they only encompass a small fraction of the genus. Recent advances in long read sequencing allow high quality genome assemblies for tens or even hundreds of species to be generated. Here, we utilize Oxford Nanopore sequencing to build an open community resource of high-quality assemblies for 101 lines of 95 drosophilid species encompassing 14 species groups and 35 sub-groups with an average contig N50 of 10.5 Mb and greater than 97% BUSCO completeness in 97/101 assemblies. These assemblies, along with detailed wet lab protocol and assembly pipelines, are released as a public resource and will serve as a starting point for addressing broad questions of genetics, ecology, and evolution within this key group.


2021 ◽  
Author(s):  
Arang Rhie ◽  
Ann Mc Cartney ◽  
Kishwar Shafin ◽  
Michael Alonge ◽  
Andrey Bzikadze ◽  
...  

Abstract Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first Telomere-to-Telomere (T2T) human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Though derived from highly accurate sequencing, evaluation revealed that the initial T2T draft assembly had evidence of small errors and structural misassemblies. To correct these errors, we designed a novel repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly QV to 73.9. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both PacBio HiFi and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.


2020 ◽  
Author(s):  
Kumar Paritosh ◽  
Akshay Kumar Pradhan ◽  
Deepak Pental

Abstract Brassica nigra (BB), also called black mustard, is grown as a condiment crop in India. B. nigra represents the B genome of U’s triangle and is one of the progenitor species of B. juncea (AABB), an important oilseed crop of the Indian subcontinent. We report here a highly contiguous genome assembly of B. nigra variety Sangam. The genome assembly has been carried out using Oxford Nanopore long-read sequencing and optical mapping. The resulting chromosome-scale assembly is a significant improvement over the previous draft assemblies of B. nigra; five out of the eight pseudochromosomes were represented by one scaffold each. The assembled genome was annotated for transposons, centromeric repeats, and genes. The B. nigra genome was compared with the recently available contiguous genome assemblies of B. rapa (AA), B. oleracea (CC), and B. juncea (AABB). Based on the maximum homology among the three diploid genomes of U’s triangle, we propose a new nomenclature for B. nigra pseudochromosomes, taking the B. rapa pseudochromosome nomenclature as the reference.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Caroline Belser ◽  
Franc-Christophe Baurens ◽  
Benjamin Noel ◽  
Guillaume Martin ◽  
Corinne Cruaud ◽  
...  

Abstract Long-read technologies hold the promise of producing more complete genome assemblies and of making them easier to obtain. Coupled with long-range technologies, they can reveal the architecture of complex regions, like centromeres or rDNA clusters. These technologies also make it possible to determine the complete organization of chromosomes, which previously remained difficult even when using genetic maps. However, generating a gapless and telomere-to-telomere assembly is still not trivial, and requires a combination of several technologies and the choice of suitable software. Here, we report a chromosome-scale assembly of a banana genome (Musa acuminata) generated using Oxford Nanopore long reads. We generated a genome coverage of 177X from a single PromethION flowcell, with nearly 17X from reads longer than 75 kbp. Of the 11 chromosomes, 5 were entirely reconstructed in a single contig from telomere to telomere, revealing for the first time the content of complex regions like centromeres or clusters of paralogous genes.


2015 ◽  
Author(s):  
Rene L Warren ◽  
Benjamin P Vandervalk ◽  
Steven JM Jones ◽  
Inanc Birol

Owing to the complexity of the assembly problem, we do not yet have complete genome sequences. The difficulty in assembling reads into finished genomes is exacerbated by sequence repeats and the inability of short reads to capture sufficient genomic information to resolve those problematic regions. Established and emerging long-read technologies show great promise in this regard, but their currently higher error rates typically require computational base correction and/or additional bioinformatics pre-processing before they can be of value. We present LINKS, the Long Interval Nucleotide K-mer Scaffolder algorithm, a solution that makes use of the information in error-rich long reads, without the need for read alignment or base correction. We show how the contiguity of an ABySS E. coli K-12 genome assembly could be increased over five-fold by the use of beta-released Oxford Nanopore Ltd. (ONT) long reads and how LINKS leverages long-range information in S. cerevisiae W303 ONT reads to yield an assembly with less than half the errors of competing applications. We also present the re-scaffolding of the colossal white spruce assembly draft (PG29, 20 Gbp) and show how LINKS scales to larger genomes. We expect LINKS to have broad utility in harnessing the potential of long reads in connecting high-quality sequences of small and large genome assembly drafts. Availability: http://www.bcgsc.ca/bioinfo/software/links


2021 ◽  
Author(s):  
Jiandong Bao ◽  
Rong Wang ◽  
Shilei Gao ◽  
Zhe Wang ◽  
Yu Fang ◽  
...  

Ustilaginoidea virens is the fungal pathogen causing rice false smut, resulting in not only yield loss but also grain contamination with mycotoxins. Here we deployed PacBio Sequel II HiFi-read sequencing technology to generate a near-complete genome assembly for the U. virens isolate UV-FJ-1 (38.48 Mb), which was isolated from Fujian province, China. The genome assembly contains 116 contigs with an N50 of 0.65 Mb and a maximum length of 2.10 Mb, and the genome completeness is ≥98% as assessed by benchmarking universal single-copy orthologs (BUSCOs) and the mapping rate of Illumina short reads. Excluding 35.78% repeat sequences, we identified a total of 7,164 protein-coding genes, of which 5,818 were functionally annotated and 223 encode putative effector proteins. Moreover, 21 secondary metabolite biosynthesis gene clusters were found in the UV-FJ-1 genome. Taken together, this high-quality genome assembly and gene annotation resource will provide better insight into the biological and pathogenic mechanisms of U. virens.


Author(s):  
Hengyuan Guo ◽  
Jiandong Bao ◽  
Lianyu Lin ◽  
Zhixin Wang ◽  
Mingyue Shi ◽  
...  

Peronophythora litchii is an oomycete pathogen that exclusively infects litchi, with infection stages affecting a broad range of tissues. In this study, we obtained a near chromosome-level genome assembly of P. litchii strain ZL2018 from China using Oxford Nanopore Technologies (ONT) long-read sequencing and Illumina short-read sequencing. The genome assembly was 64.15 Mb in size and consisted of 81 contigs with an N50 of 1.43 Mb and a maximum length of 4.74 Mb. Excluding 34.67% of repeat sequences, a total of 14,857 protein-coding genes were identified, among which 14,447 genes were annotated. We also predicted 306 candidate RXLR effectors in the assembly. The high-quality genome assembly and annotation resources reported in this study will provide new insight into the infection mechanisms of P. litchii.

