Unraveling the Genome of a High Yielding Colombian Sugarcane Hybrid

2021 ◽  
Vol 12 ◽  
Author(s):  
Jhon Henry Trujillo-Montenegro ◽  
María Juliana Rodríguez Cubillos ◽  
Cristian Darío Loaiza ◽  
Manuel Quintero ◽  
Héctor Fabio Espitia-Navarro ◽  
...  

Recent developments in High Throughput Sequencing (HTS) technologies and bioinformatics, including improved read lengths and genome assemblers, allow the reconstruction of complex genomes with unprecedented quality and contiguity. Sugarcane has one of the most complicated genomes among grasses, with a haploid length of 1 Gbp and a ploidy between 8 and 12. In this work, we present a genome assembly of the Colombian sugarcane hybrid CC 01-1940. Three types of sequencing technologies were combined for this assembly: PacBio long reads, Illumina paired short reads, and Hi-C reads. We achieved a median contig length of 34.94 Mbp and a total genome assembly of 903.2 Mbp. We annotated a total of 63,724 protein coding genes and performed a reconstruction and comparative analysis of the sucrose metabolism pathway. Nucleotide evolution measurements between orthologs in closely related species suggest that divergence between Saccharum officinarum and Saccharum spontaneum occurred <2 million years ago. Synteny analysis between CC 01-1940 and the S. spontaneum genome confirms the presence of translocation events between the species and a random contribution throughout the entire genome in current sugarcane hybrids. Analysis of RNA-Seq data from leaf and root tissue of contrasting sugarcane genotypes subjected to water stress treatments revealed 17,490 differentially expressed genes, of which 3,633 correspond to genes expressed exclusively in tolerant genotypes. We expect the resources presented here to serve as a source of information to improve the selection of new varieties in sugarcane breeding programs.

Author(s):  
Xiaolin Zhao ◽  
Zhichao Zhang ◽  
Sujiao Zheng ◽  
Wenwu Ye ◽  
Xiaobo Zheng ◽  
...  

The Diaporthe-Phomopsis disease complex causes considerable yield losses in soybean production worldwide. As one of the major pathogens, Phomopsis longicolla T. W. Hobbs (syn. Diaporthe longicolla) is not only the primary agent of Phomopsis seed decay, but also one of the agents of Phomopsis pod and stem blight, and Phomopsis stem canker. We performed both PacBio long read sequencing and Illumina short read sequencing, and obtained a genome assembly for the P. longicolla strain YC2-1, which was isolated from a soybean stem with Phomopsis stem blight disease. The 63.1 Mb genome assembly contains 87 scaffolds, with a minimum, maximum, and N50 scaffold length of 20 kb, 4.6 Mb, and 1.5 Mb, respectively, and a total of 17,407 protein-coding genes. The high-quality data expand the genomic resources for P. longicolla and will provide a solid foundation for a better understanding of its genetic diversity and pathogenic mechanisms.
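
For readers unfamiliar with the N50 statistic reported in assembly summaries like the one above, the following is a minimal sketch of how it is commonly computed: sort the sequence lengths from longest to shortest and return the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are hypothetical and used only for illustration; they are not data or code from any of the listed papers.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the largest length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one sequence length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # defensive fallback; the loop always returns first


# Hypothetical toy scaffold lengths (in bp), NOT taken from any assembly above.
toy_lengths = [50, 40, 30, 20, 10]
print(min(toy_lengths), max(toy_lengths), n50(toy_lengths))
```

Because N50 depends only on the length distribution, it describes contiguity and says nothing about correctness, which is why the abstracts in this listing pair it with completeness measures such as BUSCO.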


2021 ◽  
Vol 6 ◽  
pp. 258
Author(s):  
Konrad Lohse ◽  
Alexander Mackintosh ◽  
Roger Vila ◽  
...  

We present a genome assembly from an individual male Aglais io (also known as Inachis io and Nymphalis io) (the European peacock; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 384 megabases in span. The majority (99.91%) of the assembly is scaffolded into 31 chromosomal pseudomolecules, with the Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 11,420 protein coding genes.


2020 ◽  
Vol 10 (7) ◽  
pp. 2179-2183 ◽  
Author(s):  
Stefan Prost ◽  
Malte Petersen ◽  
Martin Grethlein ◽  
Sarah Joy Hahn ◽  
Nina Kuschik-Maczollek ◽  
...  

Ever decreasing costs along with advances in sequencing and library preparation technologies enable even small research groups to generate chromosome-level assemblies today. Here we report the generation of an improved chromosome-level assembly for the Siamese fighting fish (Betta splendens) that was carried out during a practical university master’s course. The Siamese fighting fish is a popular aquarium fish and an emerging model species for research on aggressive behavior. We updated the current genome assembly by generating a new long-read nanopore-based assembly with subsequent scaffolding to chromosome-level using previously published Hi-C data. The use of ∼35× nanopore-based long-read data sequenced on a MinION platform (Oxford Nanopore Technologies) allowed us to generate a baseline assembly of only 1,276 contigs with a contig N50 of 2.1 Mbp, and a total length of 441 Mbp. Scaffolding using the Hi-C data resulted in 109 scaffolds with a scaffold N50 of 20.7 Mbp. More than 99% of the assembly is contained in 21 scaffolds. The assembly showed the presence of 96.1% complete BUSCO genes from the Actinopterygii dataset, indicating a high quality of the assembly. We present an improved full chromosome-level assembly of the Siamese fighting fish generated during a university master’s course. The use of ∼35× long-read nanopore data drastically improved the baseline assembly in terms of continuity. We show that relatively inexpensive high-throughput sequencing technologies such as the long-read MinION sequencing platform can be used in educational settings, allowing the students to gain practical skills in modern genomics and generate high quality results that benefit downstream research projects.


Diversity ◽  
2019 ◽  
Vol 11 (9) ◽  
pp. 144 ◽  
Author(s):  
Laís Coelho ◽  
Lukas Musher ◽  
Joel Cracraft

Current generation high-throughput sequencing technology has facilitated the generation of more genomic-scale data than ever before, thus greatly improving our understanding of avian biology across a range of disciplines. Recent developments in linked-read sequencing (Chromium 10×) and reference-based whole-genome assembly offer an exciting prospect of more accessible chromosome-level genome sequencing in the near future. We sequenced and assembled a genome of the Hairy-crested Antbird (Rhegmatorhina melanosticta), which represents the first publicly available genome for any antbird (Thamnophilidae). Our objectives were to (1) assemble scaffolds to chromosome level based on multiple reference genomes, and report on differences relative to other genomes, (2) assess genome completeness and compare content to other related genomes, and (3) assess the suitability of linked-read sequencing technology for future studies in comparative phylogenomics and population genomics. Our R. melanosticta assembly was both highly contiguous (de novo scaffold N50 = 3.3 Mb, reference-based N50 = 53.3 Mb) and relatively complete (contained close to 90% of evolutionarily conserved single-copy avian genes and known tetrapod ultraconserved elements). The high contiguity and completeness of this assembly enabled the genome to be successfully mapped to the chromosome level, which uncovered a consistent structural difference between R. melanosticta and other avian genomes. Our results are consistent with the observation that avian genomes are structurally conserved. Additionally, our results demonstrate the utility of linked-read sequencing for non-model genomics. Finally, we demonstrate the value of our R. melanosticta genome for future researchers by mapping reduced representation sequencing data, and by accurately reconstructing the phylogenetic relationships among a sample of thamnophilid species.


2017 ◽  
Author(s):  
Jia-Xing Yue ◽  
Gianni Liti

Long-read sequencing technologies have become increasingly popular in genome projects due to their strengths in resolving complex genomic regions. As a leading model organism with small genome size and great biotechnological importance, the budding yeast, Saccharomyces cerevisiae, has many isolates currently being sequenced with long reads. However, analyzing long-read sequencing data to produce high-quality genome assembly and annotation remains challenging. Here we present LRSDAY, the first one-stop solution to streamline this process. LRSDAY can produce chromosome-level end-to-end genome assembly and comprehensive annotations for various genomic features (including centromeres, protein-coding genes, tRNAs, transposable elements and telomere-associated elements) that are ready for downstream analysis. Although tailored for S. cerevisiae, we designed LRSDAY to be highly modular and customizable, making it adaptable for virtually any eukaryotic organism. Applying LRSDAY to an S. cerevisiae strain takes ∼43 hrs to generate a complete and well-annotated genome from ∼100X Pacific Biosciences (PacBio) reads using four threads.


2021 ◽  
Vol 6 ◽  
pp. 266
Author(s):  
Roger Vila ◽  
Alex Hayward ◽  
Konrad Lohse ◽  
Charlotte Wright ◽  
...  

We present a genome assembly from an individual male Melitaea cinxia (the Glanville fritillary; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 499 megabases in span. The complete assembly is scaffolded into 31 chromosomal pseudomolecules, with the Z sex chromosome assembled. Gene annotation of this assembly on Ensembl has identified 13,666 protein coding genes.


2019 ◽  
Vol 23 (1) ◽  
pp. 38-48 ◽  
Author(s):  
M. K. Bragina ◽  
D. A. Afonnikov ◽  
E. A. Salina

Since the first plant genome, that of Arabidopsis thaliana, was sequenced and published, genome sequencing technologies have undergone significant changes. New algorithms, sequencing technologies and bioinformatic approaches were adopted to obtain genome, transcriptome and exome sequences for model and crop species, which have permitted deep inferences into plant biology. As a result of improved genome assembly and analysis methods, genome sequencing costs plummeted and the number of high-quality plant genome sequences is constantly growing. Consequently, more than 300 plant genome sequences have been published over the past twenty years. Although many of the published genomes are considered incomplete, they have proved to be a valuable tool for identifying genes involved in the formation of economically valuable plant traits, for marker-assisted and genomic selection and for comparative analysis of plant genomes in order to determine the basic patterns of origin of various plant species. Since high coverage and resolution of a genome sequence are not enough to detect all changes in complex samples, targeted sequencing, which consists of the isolation and sequencing of a specific region of the genome, has begun to develop. Targeted sequencing has a higher detection power (the ability to identify new differences/variants) and resolution (up to one base). In addition, exome sequencing (the method of sequencing only protein-coding gene regions) is being actively developed, which allows for the sequencing of non-expressed alleles and genes that cannot be found with RNA-seq. In this review, we analyze the development of sequencing technologies and the construction of “reference” genomes of plants, and compare methods of targeted sequencing that rely on a reference DNA sequence.


2021 ◽  
Vol 6 ◽  
pp. 304
Author(s):  
Alex Hayward ◽  
Roger Vila ◽  
Dominik R. Laetsch ◽  
Konrad Lohse ◽  
Tobias Baril ◽  
...  

We present a genome assembly from an individual female Melitaea athalia (also known as Mellicta athalia; the heath fritillary; Arthropoda; Insecta; Lepidoptera; Nymphalidae). The genome sequence is 610 megabases in span. In total, 99.98% of the assembly is scaffolded into 32 chromosomal pseudomolecules, with the W and Z sex chromosomes assembled. Gene annotation of this assembly on Ensembl has identified 12,824 protein coding genes.


GigaScience ◽  
2019 ◽  
Vol 8 (8) ◽  
Author(s):  
Xin Jiang ◽  
Qian Zhang ◽  
Yaoguo Qin ◽  
Hang Yin ◽  
Siyu Zhang ◽  
...  

Background: Sitobion miscanthi is an ideal model for studying host plant specificity, parthenogenesis-based phenotypic plasticity, and interactions between insects and other species of various trophic levels, such as viruses, bacteria, plants, and natural enemies. However, the genome of this species has not yet been sequenced and published. Here, we analyzed the entire genome of a parthenogenetic female aphid colony using Pacific Biosciences long-read sequencing and Hi-C data to generate chromosome-length scaffolds and a highly contiguous genome assembly. Results: The final draft genome assembly from 33.88 Gb of raw data was ∼397.90 Mb in size, with a 2.05 Mb contig N50. Nine chromosomes were further assembled based on Hi-C data to a 377.19 Mb final size with a 36.26 Mb scaffold N50. The identified repeat sequences accounted for 26.41% of the genome, and 16,006 protein-coding genes were annotated. According to the phylogenetic analysis, S. miscanthi is closely related to Acyrthosiphon pisum, with S. miscanthi diverging from their common ancestor ∼25.0–44.9 million years ago. Conclusions: We generated a high-quality draft of the S. miscanthi genome. This genome assembly should help promote research on the lifestyle and feeding specificity of aphids and their interactions with each other and species at other trophic levels. It can serve as a resource for accelerating genome-assisted improvements in insecticide-resistance management and environmentally safe aphid management.


2019 ◽  
Vol 12 (1) ◽  
pp. 3580-3585 ◽  
Author(s):  
Luis Rodriguez-Caro ◽  
Jennifer Fenner ◽  
Caleb Benson ◽  
Steven M Van Belleghem ◽  
Brian A Counterman

Comparisons of high-quality reference butterfly and moth genomes have been instrumental to advancing our understanding of how hybridization and natural selection drive genomic change during the origin of new species and novel traits. Here, we present a genome assembly of the Southern Dogface butterfly, Zerene cesonia (Pieridae), whose brilliant wing colorations have been implicated in developmental plasticity, hybridization, sexual selection, and speciation. We assembled 266,407,278 bp of the Z. cesonia genome, which accounts for 98.3% of the estimated 271 Mb genome size. Using a hybrid approach involving Chicago libraries with Hi-Rise assembly and a diploid Meraculous assembly, the final haploid genome was assembled. In the final assembly, nearly all autosomes and the Z chromosome were assembled into single scaffolds. The largest 29 scaffolds accounted for 91.4% of the genome assembly, with the remaining ∼8% distributed among another 247 scaffolds and an overall N50 of 9.2 Mb. Tissue-specific RNA-seq informed annotations identified 16,442 protein-coding genes, which included 93.2% of the arthropod Benchmarking Universal Single-Copy Orthologs (BUSCO). The Z. cesonia genome assembly had ∼9% identified as repetitive elements, with a transposable element landscape rich in helitrons. Similar to other Lepidoptera genomes, Z. cesonia showed a high conservation of chromosomal synteny. The Z. cesonia assembly provides a high-quality reference for studies of chromosomal arrangements in the Pierid family, as well as for population, phylo, and functional genomic studies of adaptation and speciation.

