Mapping Genomic Scaffolds to Chromosomes Using Laser Capture Microdissection in Application to Hawaiian Picture-Winged Drosophila

Cytogenetic and Genome Research; doi:10.1159/000481790; 2017; Vol 152 (4); pp. 204-212; Cited By ~ 2

Author(s): Lin Kang; Phillip George; Donald K. Price; Igor Sharakhov; Pawel Michalak

Keyword(s): Laser Capture Microdissection; De Novo; Chromosome Mapping; Drosophila Species; Sequencing Technologies; Physical Maps; New Information; Genome Assemblies; First Time; Laser Capture

Next-generation sequencing technologies have lowered the cost and increased the throughput of genome sequencing. Yet many genome assemblies based on short sequencing reads have been assembled only to the scaffold level because sufficient chromosome mapping information is lacking. Traditional ways of mapping scaffolds to chromosomes require a large amount of laboratory work and time to generate genetic and/or physical maps. To address this problem, we applied a rapid technique that uses laser capture microdissection to map scaffolds of de novo genome assemblies directly to chromosomes in Hawaiian picture-winged Drosophila. We isolated and sequenced intact chromosome arms from larvae of D. differens. By mapping the reads of each chromosome to the recently assembled scaffolds from 3 Hawaiian picture-winged Drosophila species, we successfully assigned at least 67% of the scaffolds to chromosome arms. Even though the scaffolds are not ordered within a chromosome, the rapidly generated chromosome information allows for chromosome-related analyses after genome assembly. We use this new information to test the faster-X evolution effect for the first time in these Hawaiian picture-winged Drosophila species.
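
The assignment step in this abstract can be illustrated with a minimal sketch. It assumes the reads from each microdissected chromosome-arm library have already been aligned to the scaffolds and reduced to (arm, scaffold) pairs, one per mapped read. The function name `assign_scaffolds`, the thresholds `min_reads` and `min_fraction`, and the depth normalisation are illustrative assumptions, not the authors' published pipeline.

```python
from collections import defaultdict


def assign_scaffolds(read_hits, min_reads=10, min_fraction=0.6):
    """Assign each scaffold to the arm library that contributes most of its reads.

    read_hits: iterable of (arm, scaffold) pairs, one per mapped read.
    Returns {scaffold: arm or None}; None means "left unassigned".
    The thresholds are hypothetical defaults chosen for illustration.
    """
    arm_totals = defaultdict(int)                   # mapped reads per arm library
    counts = defaultdict(lambda: defaultdict(int))  # scaffold -> arm -> read count
    for arm, scaffold in read_hits:
        arm_totals[arm] += 1
        counts[scaffold][arm] += 1

    assignments = {}
    for scaffold, per_arm in counts.items():
        # Too few reads overall: do not trust any assignment.
        if sum(per_arm.values()) < min_reads:
            assignments[scaffold] = None
            continue
        # Normalise by library size so a deeply sequenced arm does not dominate.
        norm = {arm: n / arm_totals[arm] for arm, n in per_arm.items()}
        best_arm = max(norm, key=norm.get)
        share = norm[best_arm] / sum(norm.values())
        assignments[scaffold] = best_arm if share >= min_fraction else None
    return assignments
```

Scaffolds with few reads, or with reads split evenly between arm libraries, stay unassigned in this sketch, which is one plausible reason a fraction of scaffolds would remain unplaced.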


Identifying the causes and consequences of assembly gaps using a multiplatform genome assembly of a bird-of-paradise

doi:10.1101/2019.12.19.882399; 2019; Cited By ~ 5

Author(s): Valentina Peona; Mozes P.K. Blom; Luohao Xu; Reto Burri; Shawn Sullivan; ...

Keyword(s): Dark Matter; Genome Assembly; Sex Chromosome; De Novo; Model Organism; Technology Choice; High Quality; Sequencing Technologies; Downstream Analysis; Genome Assemblies

Genome assemblies are currently being produced at an impressive rate by consortia and individual laboratories. The low costs and increasing efficiency of sequencing technologies have opened up a whole new world of genomic biodiversity. Although these technologies generate high-quality genome assemblies, some genomic regions remain difficult to assemble, such as repetitive elements and GC-rich regions (genomic “dark matter”). In this study, we compare the efficiency of currently used sequencing technologies (short/linked/long reads and proximity ligation maps) and combinations thereof in assembling genomic dark matter starting from the same sample. By adopting different de novo assembly strategies, we were able to compare each individual draft assembly to a curated multiplatform one and identify the nature of the previously missing dark matter, with a particular focus on transposable elements, multi-copy MHC genes, and GC-rich regions. Thanks to this multiplatform approach, we demonstrate the feasibility of producing a high-quality chromosome-level assembly for a non-model organism (paradise crow) for which only suboptimal samples are available. Our approach was able to reconstruct complex chromosomes like the repeat-rich W sex chromosome and several GC-rich microchromosomes. Telomere-to-telomere assemblies are not a reality yet for most organisms, but by leveraging technology choice it is possible to minimize genome assembly gaps for downstream analysis. We provide a roadmap to tailor sequencing projects around the completeness of both the coding and non-coding parts of the genomes.
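
One simple way to put a number on assembly gaps when comparing draft assemblies is sketched below. It assumes gaps are encoded as runs of N in the assembled sequence, which is a common convention but is not stated in the abstract. The function name `gap_stats` and the returned keys are hypothetical.

```python
import re


def gap_stats(sequence):
    """Summarise runs of N (assumed here to mark assembly gaps) in one sequence."""
    gaps = [m.end() - m.start() for m in re.finditer(r"[Nn]+", sequence)]
    return {
        "gap_count": len(gaps),            # number of separate N runs
        "gap_bases": sum(gaps),            # total bases inside gaps
        "longest_gap": max(gaps, default=0),
    }
```

Running this over each scaffold of two assemblies of the same sample gives a rough, like-for-like view of how many gaps one technology combination closes relative to another.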


De novo identification of satellite DNAs in the sequenced genomes of Drosophila virilis and D. americana using the RepeatExplorer and TAREAN pipelines

doi:10.1101/781146; 2019

Author(s): Bráulio S.M.L. Silva; Pedro Heringer; Guilherme B. Dias; Marta Svartman; Gustavo C.S. Kuhn

Keyword(s): Transposable Elements; Tandem Repeat; Tandem Repeats; De Novo; Chromosome Mapping; Drosophila Virilis; Satellite DNAs; Bioinformatic Tools; A Genome; Genome Assemblies

Satellite DNAs (satDNAs) are among the most abundant repetitive DNAs found in eukaryote genomes, where they participate in a variety of biological roles, from being components of important chromosome structures to gene regulation. Experimental methodologies used before the genomic era were laborious and time-consuming, yet still insufficient to recover the full collection of satDNAs from a genome. Today, the availability of whole sequenced genomes combined with the development of specific bioinformatic tools is expected to foster the identification of virtually the entire “satellitome” of a particular species. While whole genome assemblies are important to obtain a global view of genome organization, most assemblies are incomplete and lack repetitive regions. Here, we applied short-read sequencing and similarity clustering to perform a de novo identification of the most abundant satellite families in two Drosophila species from the virilis group: Drosophila virilis and D. americana. These species were chosen because they have been used as a model to understand satDNA biology since the early 1970s. We combined computational tandem repeat detection via similarity-based read clustering (implemented in the Tandem Repeat Analyzer pipeline, “TAREAN”) with data from the literature and chromosome mapping to obtain an overview of satDNAs in D. virilis and D. americana. The fact that all of the abundant tandem repeats we detected were previously identified in the literature allowed us to evaluate the efficiency of TAREAN in correctly identifying true satDNAs. Our results indicate that raw sequencing reads can be efficiently used to detect satDNAs, but that abundant tandem repeats present in dispersed arrays or associated with transposable elements are frequent false positives. We demonstrate that TAREAN, together with its parent method RepeatExplorer, may be used as a resource to detect tandem repeats associated with transposable elements and also to reveal families of dispersed tandem repeats.


Transcriptomic Analysis of Marine Gastropod Hemifusus tuba Provides Novel Insights into Conotoxin Genes

Marine Drugs; doi:10.3390/md17080466; 2019; Vol 17 (8); pp. 466; Cited By ~ 2

Author(s): Ronghua Li; Michaël Bekaert; Luning Wu; Changkao Mu; Weiwei Song; ...

Keyword(s): High Throughput Sequencing; De Novo; Average Length; Bioactive Molecules; Asian Countries; Marine Gastropod; Sequencing Technologies; First Time; Aquaculture Development; Simple Sequence

The marine gastropod Hemifusus tuba is served as a luxury food in Asian countries and used in traditional Chinese medicine to treat lumbago and deafness. The lack of genomic data on H. tuba is a barrier to aquaculture development, and the functional characteristics of potential bioactive molecules are poorly understood. In the present study, we used high-throughput sequencing technologies to generate the first transcriptomic database of H. tuba. A total of 41 unique conopeptides were retrieved from 44 unigenes, containing 6-cysteine frameworks belonging to four superfamilies. Duplication of mature regions and alternative splicing were also found in some of the conopeptides. The de novo assembly identified a total of 76,306 transcripts with an average length of 824.6 nt, of which 75,620 (99.1%) were annotated. In addition, simple sequence repeat (SSR) detection identified 14,000 unigenes containing 20,735 SSRs, among which 23 polymorphic SSRs were screened. Thirteen of these markers could be amplified in Hemifusus ternatanus and seven in Rapana venosa. This study provides the first report of conopeptide genes in Buccinidae, as well as genomic resources for further drug development, gene discovery and population resource studies of this species.
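
SSR detection of the kind mentioned in this abstract can be sketched with regular expressions. The abstract does not say which tool or thresholds were used, so the minimum repeat counts in `MIN_REPEATS` and the function name `find_ssrs` are assumptions for illustration only.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def _is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return all(motif != motif[:d] * (n // d) for d in range(1, n) if n % d == 0)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in min_repeats.items():
        # One motif of `unit` bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if _is_primitive(motif):  # skip e.g. 'AA' matched as a dinucleotide
                hits.append((m.start(), motif, len(m.group(0)) // unit))
    return sorted(hits)
```

This sketch only finds perfect repeats; compound or interrupted SSRs, which dedicated tools usually report as well, are out of scope here.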


A Novel Microproteomic Approach Using Laser Capture Microdissection to Study Cellular Protrusions

International Journal of Molecular Sciences; doi:10.3390/ijms20051172; 2019; Vol 20 (5); pp. 1172; Cited By ~ 4

Author(s): Karine Gousset; Ana Gordon; Shravan Kumar Kannan; Joey Tovar

Keyword(s): Laser Capture Microdissection; Protein Extraction; Cell Communication; Protein Composition; Fluorescent Markers; Different Types; Health And Disease; Multicellular Organisms; First Time; Laser Capture

Cell–cell communication is vital to multicellular organisms, and distinct types of cellular protrusions play critical roles during development, cell signaling, and the spreading of pathogens and cancer. The differences in the structure and protein composition of these different types of protrusions and their specific functions have not been elucidated due to the lack of a method for their specific isolation and analysis. In this paper, we describe, for the first time, a method to specifically isolate distinct protrusion subtypes, based on their morphological structures or fluorescent markers, using laser capture microdissection (LCM). Combined with a unique fixation and protein extraction protocol, this method pushes the limits of microproteomics, and we demonstrate that proteins from LCM-isolated protrusions can be successfully and reproducibly identified by mass spectrometry using ultra-high field Orbitrap technologies. Our method confirmed that different types of protrusions have distinct proteomes, and it promises to advance the characterization and understanding of these unique structures and to shed light on their possible role in health and disease.


Comparative Annotation Toolkit (CAT) - simultaneous clade and personal genome annotation

doi:10.1101/231118; 2017; Cited By ~ 6

Author(s): Ian T. Fiddes; Joel Armstrong; Mark Diekhans; Stefanie Nachtweide; Zev N. Kronenberg; ...

Keyword(s): Genome Annotation; De Novo; Low Cost; Great Apes; Personal Genome; Sequencing Technologies; Human Genomes; Long Read; Genome Assemblies; Rat Genome

The recent introductions of low-cost, long-read, and read-cloud sequencing technologies coupled with intense efforts to develop efficient algorithms have made affordable, high-quality de novo sequence assembly a realistic proposition. The result is an explosion of new, ultra-contiguous genome assemblies. To compare these genomes we need robust methods for genome annotation. We describe the fully open source Comparative Annotation Toolkit (CAT), which provides a flexible way to simultaneously annotate entire clades and identify orthology relationships. We show that CAT can be used to improve annotations on the rat genome, annotate the great apes, annotate a diverse set of mammals, and annotate personal, diploid human genomes. We demonstrate the resulting discovery of novel genes, isoforms and structural variants, even in genomes as well studied as rat and the great apes, and how these annotations improve cross-species RNA expression experiments.


High-Quality Assembly of an Individual of Yoruban Descent

doi:10.1101/067447; 2016; Cited By ~ 9

Author(s): Karyn Meltz Steinberg; Tina Graves Lindsay; Valerie A. Schneider; Mark J.P. Chaisson; Chad Tomlinson; ...

Keyword(s): Single Molecule; De Novo; BAC Library; Segmental Duplications; High Quality; Sequencing Technologies; Human Genomes; Genome Assemblies; Complete Genomic

De novo assembly of human genomes is now a tractable effort due in part to advances in sequencing and mapping technologies. We use PacBio single-molecule, real-time (SMRT) sequencing and BioNano genomic maps to construct the first de novo assembly of NA19240, a Yoruban individual from Africa. This chromosome-scaffolded assembly of 3.08 Gb with a contig N50 of 7.25 Mb and a scaffold N50 of 78.6 Mb represents one of the most contiguous high-quality human genomes. We utilize a BAC library derived from NA19240 DNA and novel haplotype-resolving sequencing technologies and algorithms to characterize regions of complex genomic architecture that are normally lost due to compression to a linear haploid assembly. Our results demonstrate that multiple technologies are still necessary for complete genomic representation, particularly in regions of highly identical segmental duplications. Additionally, we show that diploid assembly has utility in improving the quality of de novo human genome assemblies.
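
For readers unfamiliar with the contiguity statistic quoted here, N50 is the length L such that sequences of length L or longer together contain at least half of all assembled bases. A minimal sketch of the standard definition follows; the function name `n50` is chosen for illustration and the code is not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no sequence lengths given")
```

The same function applies to contigs and scaffolds alike; only the input list of lengths changes.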


A subcellular cookie cutter for spatial genomics in human tissue

doi:10.1101/2021.11.29.470247; 2021

Author(s): Alexander Bury; Angela Pyle; Fabio Marcuccio; Doug Turnbull; Amy Vincent; ...

Keyword(s): Laser Capture Microdissection; Human Tissue; The State; Precise Location; Cellular Physiology; First Time; Laser Capture; State Of Art; Biopsy Technology

Intracellular heterogeneity contributes significantly to cellular physiology and, in a number of debilitating diseases, cellular pathophysiology. This heterogeneity is greatly influenced by distinct organelle populations, and to understand the aetiology of disease it is important to have tools able to isolate and differentially analyse organelles from precise locations within tissues. Here we report the development of a subcellular biopsy technology that facilitates the isolation of organelles, such as mitochondria, from human tissue. We compared the subcellular biopsy technology to laser capture microdissection (LCM), the state-of-the-art technique for isolating cells from their surrounding tissues. We demonstrate an operational limit (>20 µm) for LCM and then, for the first time in human tissue, show that subcellular biopsy can be used to isolate mitochondria beyond this limit.


HINGE: Long-Read Assembly Achieves Optimal Repeat Resolution

doi:10.1101/062117; 2016; Cited By ~ 4

Author(s): Govinda M. Kamath; Ilan Shomorony; Fei Xia; Thomas A. Courtade; David N. Tse

Keyword(s): Gold Standard; De Novo; Error Resilience; De Bruijn Graph; Sequencing Technologies; Long Read; De Bruijn; Genome Assemblies

Long-read sequencing technologies have the potential to produce gold-standard de novo genome assemblies, but fully exploiting error-prone reads to resolve repeats remains a challenge. Aggressive approaches to repeat resolution often produce mis-assemblies, and conservative approaches lead to unnecessary fragmentation. We present HINGE, an assembler that seeks to achieve optimal repeat resolution by distinguishing repeats that can be resolved given the data from those that cannot. This is accomplished by adding "hinges" to reads for constructing an overlap graph where only unresolvable repeats are merged. As a result, HINGE combines the error resilience of overlap-based assemblers with repeat-resolution capabilities of de Bruijn graph assemblers. HINGE was evaluated on the long-read bacterial datasets from the NCTC project. HINGE produces more finished assemblies than Miniasm and the manual pipeline of NCTC based on the HGAP assembler and Circlator. HINGE also allows us to identify 40 datasets where unresolvable repeats prevent the reliable construction of a unique finished assembly. In these cases, HINGE outputs a visually interpretable assembly graph that encodes all possible finished assemblies consistent with the reads, while other approaches such as the NCTC pipeline and FALCON either fragment the assembly or resolve the ambiguity arbitrarily.


WENGAN: Efficient and high quality hybrid de novo assembly of human genomes

doi:10.1101/840447; 2019; Cited By ~ 1

Author(s): Alex Di Genova; Elena Buena-Atienza; Stephan Ossowski; Marie-France Sagot

Keyword(s): De Novo; Computational Cost; Sequence Information; Sequencing Data; High Quality; Sequencing Technologies; Human Genomes; Long Reads; Long Read; Genome Assemblies

The continuous improvement of long-read sequencing technologies along with the development of ad hoc algorithms has launched a new de novo assembly era that promises high-quality genomes. However, it has proven difficult to use only long reads to generate accurate genome assemblies of large, repeat-rich human genomes. To date, most of the human genomes assembled from long error-prone reads add accurate short reads to further polish the consensus quality. Here, we report the development of a novel algorithm for hybrid assembly, WENGAN, and the de novo assembly of four human genomes using a combination of sequencing data generated on ONT PromethION, PacBio Sequel, Illumina and MGI technology. WENGAN implements efficient algorithms that exploit the sequence information of short and long reads to tackle assembly contiguity as well as consensus quality. The resulting genome assemblies have high contiguity (contig NG50: 16.67-62.06 Mb), few assembly errors (contig NGA50: 10.9-45.91 Mb), good consensus quality (QV: 27.79-33.61), and high gene completeness (BUSCO complete: 94.6-95.1%), while consuming low computational resources (CPU hours: 153-1027). In particular, the WENGAN assembly of the haploid CHM13 sample achieved a contig NG50 of 62.06 Mb (NGA50: 45.91 Mb), which surpasses the contiguity of the current human reference genome (GRCh38 contig NG50: 57.88 Mb). Providing the highest quality at low computational cost, WENGAN is an important step towards the democratization of the de novo assembly of human genomes. The WENGAN assembler is available at https://github.com/adigenova/wengan
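
The consensus quality values above are easier to interpret as error rates. The sketch below assumes QV is Phred-scaled, that is QV = -10 * log10(p), where p is the per-base error probability. That is the usual convention for assembly consensus quality, but the abstract does not define it. The function names are illustrative.

```python
import math


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Convert a per-base error probability (0 < p <= 1) to a Phred-scaled QV."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in (0, 1]")
    return -10 * math.log10(error_rate)
```

Under that assumption, the reported range of QV 27.79-33.61 corresponds to roughly one consensus error per 600 to 2,300 bases.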


Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly

doi:10.1101/072116; 2016; Cited By ~ 8

Author(s): Valerie A. Schneider; Tina Graves-Lindsay; Kerstin Howe; Nathan Bouk; Hsiu-Chuan Chen; ...

Keyword(s): De Novo; Genome Mapping; Population Variation; Reference Assembly; Sequence Generation; Long Read; Genomic Regions; Genome Assemblies; First Time

The human reference genome assembly plays a central role in nearly all aspects of today’s basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009 and reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that while the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.
