Mapping Genomic Scaffolds to Chromosomes Using Laser Capture Microdissection in Application to Hawaiian Picture-Winged Drosophila

2017 ◽  
Vol 152 (4) ◽  
pp. 204-212 ◽  
Author(s):  
Lin Kang ◽  
Phillip George ◽  
Donald K. Price ◽  
Igor Sharakhov ◽  
Pawel Michalak

Next-generation sequencing technologies have led to a decreased cost and an increased throughput in genome sequencing. Yet, many genome assemblies based on short sequencing reads have been assembled only to the scaffold level due to the lack of sufficient chromosome mapping information. Traditional ways of mapping scaffolds to chromosomes require a large amount of laboratory work and time to generate genetic and/or physical maps. To address this problem, we applied a rapid technique that uses laser capture microdissection and enables mapping scaffolds of de novo genome assemblies directly to chromosomes in Hawaiian picture-winged Drosophila. We isolated and sequenced intact chromosome arms from larvae of D. differens. By mapping the reads of each chromosome to the recently assembled scaffolds from 3 Hawaiian picture-winged Drosophila species, we successfully assigned at least 67% of the scaffolds to chromosome arms. Even though the scaffolds are not ordered within a chromosome, the rapidly generated chromosome information allows for chromosome-related analyses after genome assembly. We use this new information to test the faster-X evolution effect for the first time in these Hawaiian picture-winged Drosophila species.
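
The assignment step described in this abstract can be illustrated with a minimal sketch. It assumes that reads from each microdissected chromosome-arm library have already been mapped and counted per scaffold; the function name `assign_scaffolds`, the thresholds `min_fraction` and `min_reads`, and the example counts are all hypothetical and are not taken from the authors' pipeline.

```python
from typing import Dict, Optional


def assign_scaffolds(
    counts: Dict[str, Dict[str, int]],
    min_fraction: float = 0.5,   # hypothetical threshold, not from the paper
    min_reads: int = 10,         # hypothetical threshold, not from the paper
) -> Dict[str, Optional[str]]:
    """Assign each scaffold to the arm that contributes most of its mapped reads.

    counts maps scaffold name -> {arm name -> number of mapped reads}.
    A scaffold is left unassigned (None) when it has too few reads or
    when no single arm accounts for at least min_fraction of them.
    """
    assignments: Dict[str, Optional[str]] = {}
    for scaffold, per_arm in counts.items():
        total = sum(per_arm.values())
        if total < min_reads:
            assignments[scaffold] = None  # too little evidence
            continue
        best_arm = max(per_arm, key=per_arm.get)
        if per_arm[best_arm] / total >= min_fraction:
            assignments[scaffold] = best_arm
        else:
            assignments[scaffold] = None  # ambiguous between arms
    return assignments


# Made-up counts for illustration only.
example = {
    "scaffold_1": {"X": 950, "2": 30, "3": 20},
    "scaffold_2": {"X": 40, "2": 45, "3": 42},
    "scaffold_3": {"X": 1, "2": 2, "3": 0},
}
print(assign_scaffolds(example))
```

In practice the raw counts would likely need normalizing by library size before comparison, since arm libraries sequenced to different depths would otherwise bias the majority vote.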

Author(s):  
Valentina Peona ◽  
Mozes P.K. Blom ◽  
Luohao Xu ◽  
Reto Burri ◽  
Shawn Sullivan ◽  
...  

Genome assemblies are currently being produced at an impressive rate by consortia and individual laboratories. The low costs and increasing efficiency of sequencing technologies have opened up a whole new world of genomic biodiversity. Although these technologies generate high-quality genome assemblies, there are still genomic regions difficult to assemble, like repetitive elements and GC-rich regions (genomic “dark matter”). In this study, we compare the efficiency of currently used sequencing technologies (short/linked/long reads and proximity ligation maps) and combinations thereof in assembling genomic dark matter starting from the same sample. By adopting different de novo assembly strategies, we were able to compare each individual draft assembly to a curated multiplatform one and identify the nature of the previously missing dark matter with a particular focus on transposable elements, multi-copy MHC genes, and GC-rich regions. Thanks to this multiplatform approach, we demonstrate the feasibility of producing a high-quality chromosome-level assembly for a non-model organism (paradise crow) for which only suboptimal samples are available. Our approach was able to reconstruct complex chromosomes like the repeat-rich W sex chromosome and several GC-rich microchromosomes. Telomere-to-telomere assemblies are not a reality yet for most organisms, but by leveraging technology choice it is possible to minimize genome assembly gaps for downstream analysis. We provide a roadmap to tailor sequencing projects around the completeness of both the coding and non-coding parts of the genomes.


2019 ◽  
Author(s):  
Bráulio S.M.L. Silva ◽  
Pedro Heringer ◽  
Guilherme B. Dias ◽  
Marta Svartman ◽  
Gustavo C.S. Kuhn

Satellite DNAs are among the most abundant repetitive DNAs found in eukaryote genomes, where they participate in a variety of biological roles, from being components of important chromosome structures to gene regulation. Experimental methodologies used before the genomic era were insufficient, and too laborious and time-consuming, to recover the collection of all satDNAs from a genome. Today, the availability of whole sequenced genomes combined with the development of specific bioinformatic tools is expected to foster the identification of virtually all of the “satellitome” from a particular species. While whole genome assemblies are important to obtain a global view of genome organization, most assemblies are incomplete and lack repetitive regions. Here, we applied short-read sequencing and similarity clustering in order to perform a de novo identification of the most abundant satellite families in two Drosophila species from the virilis group: Drosophila virilis and D. americana. These species were chosen because they have been used as a model to understand satDNA biology since the early 1970s. We combined computational tandem repeat detection via similarity-based read clustering (implemented in Tandem Repeat Analyzer pipeline – “TAREAN”) with data from the literature and chromosome mapping to obtain an overview of satDNAs in D. virilis and D. americana. The fact that all of the abundant tandem repeats we detected were previously identified in the literature allowed us to evaluate the efficiency of TAREAN in correctly identifying true satDNAs. Our results indicate that raw sequencing reads can be efficiently used to detect satDNAs, but that abundant tandem repeats present in dispersed arrays or associated with transposable elements are frequent false positives. We demonstrate that TAREAN, together with its parent method RepeatExplorer, may be used as a resource to detect tandem repeats associated with transposable elements and also to reveal families of dispersed tandem repeats.


Marine Drugs ◽  
2019 ◽  
Vol 17 (8) ◽  
pp. 466 ◽  
Author(s):  
Ronghua Li ◽  
Michaël Bekaert ◽  
Luning Wu ◽  
Changkao Mu ◽  
Weiwei Song ◽  
...  

The marine gastropod Hemifusus tuba is served as a luxury food in Asian countries and used in traditional Chinese medicine to treat lumbago and deafness. The lack of genomic data on H. tuba is a barrier to aquaculture development, and the functional characteristics of potential bioactive molecules are poorly understood. In the present study, we used high-throughput sequencing technologies to generate the first transcriptomic database of H. tuba. A total of 41 unique conopeptides were retrieved from 44 unigenes, containing 6-cysteine frameworks belonging to four superfamilies. Duplication of mature regions and alternative splicing were also found in some of the conopeptides, and the de novo assembly identified a total of 76,306 transcripts with an average length of 824.6 nt, of which 75,620 (99.1%) were annotated. In addition, simple sequence repeat (SSR) detection identified 14,000 unigenes containing 20,735 SSRs, among which 23 polymorphic SSRs were screened. Thirteen of these markers could be amplified in Hemifusus ternatanus and seven in Rapana venosa. This study reports conopeptide genes in Buccinidae for the first time and provides genomic resources for further drug development, gene discovery and population resource studies of this species.


2019 ◽  
Vol 20 (5) ◽  
pp. 1172 ◽  
Author(s):  
Karine Gousset ◽  
Ana Gordon ◽  
Shravan Kumar Kannan ◽  
Joey Tovar

Cell–cell communication is vital to multicellular organisms, and distinct types of cellular protrusions play critical roles during development, cell signaling, and the spreading of pathogens and cancer. The differences in the structure and protein composition of these different types of protrusions and their specific functions have not been elucidated due to the lack of a method for their specific isolation and analysis. In this paper, we describe, for the first time, a method to specifically isolate distinct protrusion subtypes, based on their morphological structures or fluorescent markers, using laser capture microdissection (LCM). Combined with a unique fixation and protein extraction protocol, this method pushes the limits of microproteomics, and we demonstrate that proteins from LCM-isolated protrusions can successfully and reproducibly be identified by mass spectrometry using ultra-high field Orbitrap technologies. Our method confirms that different types of protrusions have distinct proteomes, and it promises to advance the characterization and understanding of these unique structures and to shed light on their possible role in health and disease.


2017 ◽  
Author(s):  
Ian T. Fiddes ◽  
Joel Armstrong ◽  
Mark Diekhans ◽  
Stefanie Nachtweide ◽  
Zev N. Kronenberg ◽  
...  

The recent introductions of low-cost, long-read, and read-cloud sequencing technologies coupled with intense efforts to develop efficient algorithms have made affordable, high-quality de novo sequence assembly a realistic proposition. The result is an explosion of new, ultra-contiguous genome assemblies. To compare these genomes we need robust methods for genome annotation. We describe the fully open source Comparative Annotation Toolkit (CAT), which provides a flexible way to simultaneously annotate entire clades and identify orthology relationships. We show that CAT can be used to improve annotations on the rat genome, annotate the great apes, annotate a diverse set of mammals, and annotate personal, diploid human genomes. We demonstrate the resulting discovery of novel genes, isoforms and structural variants, even in genomes as well studied as rat and the great apes, and how these annotations improve cross-species RNA expression experiments.


2016 ◽  
Author(s):  
Karyn Meltz Steinberg ◽  
Tina Graves Lindsay ◽  
Valerie A. Schneider ◽  
Mark J.P. Chaisson ◽  
Chad Tomlinson ◽  
...  

De novo assembly of human genomes is now a tractable effort due in part to advances in sequencing and mapping technologies. We use PacBio single-molecule, real-time (SMRT) sequencing and BioNano genomic maps to construct the first de novo assembly of NA19240, a Yoruban individual from Africa. This chromosome-scaffolded assembly of 3.08 Gb with a contig N50 of 7.25 Mb and a scaffold N50 of 78.6 Mb represents one of the most contiguous high-quality human genomes. We utilize a BAC library derived from NA19240 DNA and novel haplotype-resolving sequencing technologies and algorithms to characterize regions of complex genomic architecture that are normally lost due to compression to a linear haploid assembly. Our results demonstrate that multiple technologies are still necessary for complete genomic representation, particularly in regions of highly identical segmental duplications. Additionally, we show that diploid assembly has utility in improving the quality of de novo human genome assemblies.
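
The contig and scaffold N50 values quoted in this abstract follow the standard definition: the length of the sequence at which the cumulative length, counted from the longest sequence downward, first reaches half of the total assembly length. A minimal sketch of that calculation, with a hypothetical function name and made-up lengths, might look like this:

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running sum to half
        # of the assembly length defines the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Made-up lengths (total 300): 80 + 70 = 150 reaches half, so N50 is 70.
print(n50([80, 70, 50, 40, 30, 20, 10]))  # 70
```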


2021 ◽  
Author(s):  
Alexander Bury ◽  
Angela Pyle ◽  
Fabio Marcuccio ◽  
Doug Turnbull ◽  
Amy Vincent ◽  
...  

Intracellular heterogeneity contributes significantly to cellular physiology and, in a number of debilitating diseases, cellular pathophysiology. This is greatly influenced by distinct organelle populations, and to understand the aetiology of disease it is important to have tools able to isolate and differentially analyse organelles from precise locations within tissues. Here we report the development of a subcellular biopsy technology that facilitates the isolation of organelles, such as mitochondria, from human tissue. We compared the subcellular biopsy technology to laser capture microdissection (LCM), which is the state-of-the-art technique for the isolation of cells from their surrounding tissues. We demonstrate an operational limit (>20 micron) for LCM and then, for the first time in human tissue, show that subcellular biopsy can be used to isolate mitochondria beyond this limit.


2016 ◽  
Author(s):  
Govinda M. Kamath ◽  
Ilan Shomorony ◽  
Fei Xia ◽  
Thomas A. Courtade ◽  
David N. Tse

Long-read sequencing technologies have the potential to produce gold-standard de novo genome assemblies, but fully exploiting error-prone reads to resolve repeats remains a challenge. Aggressive approaches to repeat resolution often produce mis-assemblies, and conservative approaches lead to unnecessary fragmentation. We present HINGE, an assembler that seeks to achieve optimal repeat resolution by distinguishing repeats that can be resolved given the data from those that cannot. This is accomplished by adding "hinges" to reads for constructing an overlap graph where only unresolvable repeats are merged. As a result, HINGE combines the error resilience of overlap-based assemblers with repeat-resolution capabilities of de Bruijn graph assemblers. HINGE was evaluated on the long-read bacterial datasets from the NCTC project. HINGE produces more finished assemblies than Miniasm and the manual pipeline of NCTC based on the HGAP assembler and Circlator. HINGE also allows us to identify 40 datasets where unresolvable repeats prevent the reliable construction of a unique finished assembly. In these cases, HINGE outputs a visually interpretable assembly graph that encodes all possible finished assemblies consistent with the reads, while other approaches such as the NCTC pipeline and FALCON either fragment the assembly or resolve the ambiguity arbitrarily.


2019 ◽  
Author(s):  
Alex Di Genova ◽  
Elena Buena-Atienza ◽  
Stephan Ossowski ◽  
Marie-France Sagot

The continuous improvement of long-read sequencing technologies along with the development of ad hoc algorithms has launched a new de novo assembly era that promises high-quality genomes. However, it has proven difficult to use only long reads to generate accurate genome assemblies of large, repeat-rich human genomes. To date, most of the human genomes assembled from long error-prone reads add accurate short reads to further polish the consensus quality. Here, we report the development of a novel algorithm for hybrid assembly, WENGAN, and the de novo assembly of four human genomes using a combination of sequencing data generated on ONT PromethION, PacBio Sequel, Illumina and MGI technology. WENGAN implements efficient algorithms that exploit the sequence information of short and long reads to tackle assembly contiguity as well as consensus quality. The resulting genome assemblies have high contiguity (contig NG50:16.67-62.06 Mb), few assembly errors (contig NGA50:10.9-45.91 Mb), good consensus quality (QV:27.79-33.61), and high gene completeness (BUSCO complete: 94.6-95.1%), while consuming low computational resources (CPU hours:153-1027). In particular, the WENGAN assembly of the haploid CHM13 sample achieved a contig NG50 of 62.06 Mb (NGA50:45.91 Mb), which surpasses the contiguity of the current human reference genome (GRCh38 contig NG50:57.88 Mb). Providing the highest quality at low computational cost, WENGAN is an important step towards the democratization of the de novo assembly of human genomes. The WENGAN assembler is available at https://github.com/adigenova/wengan
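
The NG50 statistic used in this abstract differs from N50 in its denominator: the cumulative contig length is compared against half of an estimated genome size rather than half of the assembly length, which makes values comparable across assemblies of the same genome. A minimal sketch under that standard definition, with a hypothetical function name and made-up numbers:

```python
from typing import Iterable, Optional


def ng50(lengths: Iterable[int], genome_size: int) -> Optional[int]:
    """Return the NG50 of contig lengths for a given estimated genome size.

    Returns None when the assembly covers less than half of the genome,
    in which case NG50 is undefined.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    half_genome = genome_size / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_genome:
            return length
    return None


# Made-up lengths with an assumed genome size of 400:
# 80 + 70 + 50 = 200 reaches half of the genome, so NG50 is 50.
print(ng50([80, 70, 50, 40, 30, 20, 10], genome_size=400))  # 50
```

NGA50, also reported in the abstract, additionally requires splitting contigs at misassembly breakpoints against a reference, which this sketch does not attempt.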


2016 ◽  
Author(s):  
Valerie A. Schneider ◽  
Tina Graves-Lindsay ◽  
Kerstin Howe ◽  
Nathan Bouk ◽  
Hsiu-Chuan Chen ◽  
...  

The human reference genome assembly plays a central role in nearly all aspects of today’s basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009 and reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that while the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.

