A Chromosome-Scale Assembly of the Enormous (32 Gb) Axolotl Genome

Mapping Intimacies ◽

10.1101/373548 ◽

2018 ◽

Cited By ~ 3

Author(s):

Jeramiah J. Smith ◽

Nataliya Timoshevskaya ◽

Vladimir A. Timoshevskiy ◽

Melissa C. Keinath ◽

Drew Hardy ◽

...

Keyword(s):

Large Scale ◽

Genome Structure ◽

Large Deletion ◽

Ambystoma mexicanum ◽

Biological Research ◽

Genome Wide ◽

Genetic Stocks ◽

Gene Structures ◽

And Function ◽

Genome Assemblies

Abstract: The axolotl (Ambystoma mexicanum) provides critical models for studying regeneration, evolution and development. However, its large genome (~32 gigabases) presents a formidable barrier to genetic analyses. Recent efforts have yielded genome assemblies consisting of thousands of unordered scaffolds that resolve gene structures, but do not yet permit large scale analyses of genome structure and function. We adapted an established mapping approach to leverage dense SNP typing information and for the first time assemble the axolotl genome into 14 chromosomes. Moreover, we used fluorescence in situ hybridization to verify the structure of these 14 scaffolds and assign each to its corresponding physical chromosome. This new assembly covers 27.3 gigabases and encompasses 94% of annotated gene models on chromosomal scaffolds. We show the assembly’s utility by resolving genome-wide orthologies between the axolotl and other vertebrates, identifying the footprints of historical introgression events that occurred during the development of axolotl genetic stocks, and precisely mapping several phenotypes including a large deletion underlying the cardiac mutant. This chromosome-scale assembly will greatly facilitate studies of the axolotl in biological research.

Large-scale genome-wide analysis identifies genetic variants associated with cardiac structure and function

Journal of Clinical Investigation ◽

10.1172/jci84840 ◽

2017 ◽

Vol 127 (5) ◽

pp. 1798-1812 ◽

Cited By ~ 59

Author(s):

Philipp S. Wild ◽

Janine F. Felix ◽

Arne Schillert ◽

Alexander Teumer ◽

Ming-Huei Chen ◽

...

Keyword(s):

Genetic Variants ◽

Large Scale ◽

Structure And Function ◽

Cardiac Structure ◽

Genome Wide Analysis ◽

Genome Wide ◽

Cardiac Structure And Function ◽

And Function

A Deep Dive into Genome Assemblies of Non-vertebrate Animals

10.20944/preprints202111.0170.v1 ◽

2021 ◽

Author(s):

Nadège Guiglielmoni ◽

Ramón Rivera-Vicéns ◽

Romain Koszul ◽

Jean-François Flot

Keyword(s):

Genome Assembly ◽

Current Knowledge ◽

Genome Structure ◽

Deep Dive ◽

Sequencing Technologies ◽

Current State ◽

Animal Diversity ◽

And Function ◽

Genome Assemblies ◽

Genome Projects

Non-vertebrate species represent about 95% of known metazoan (animal) diversity. They remain to this day relatively unexplored genetically, but understanding their genome structure and function is pivotal for expanding our current knowledge of evolution, ecology and biodiversity. Following the continuous improvements and decreasing costs of sequencing technologies, many genome assembly tools have been released, leading to a significant number of genome projects being completed in recent years. In this review, we examine the current state of genome projects of non-vertebrate animal species. We present an overview of available sequencing technologies, assembly approaches, pre- and post-processing steps, and genome assembly evaluation methods, and their application to non-vertebrate animal genomes.

Functional organization of the human 4D Nucleome

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1505822112 ◽

2015 ◽

Vol 112 (26) ◽

pp. 8002-8007 ◽

Cited By ~ 71

Author(s):

Haiming Chen ◽

Jie Chen ◽

Lindsey A. Muir ◽

Scott Ronquist ◽

Walter Meixner ◽

...

Keyword(s):

3D Imaging ◽

Functional Organization ◽

Genome Structure ◽

Structural Features ◽

Topological Stability ◽

Chromosome Conformation ◽

Dynamical Interaction ◽

Gene Pairs ◽

Genome Wide ◽

And Function

The 4D organization of the interphase nucleus, or the 4D Nucleome (4DN), reflects a dynamical interaction between 3D genome structure and function and its relationship to phenotype. We present initial analyses of the human 4DN, capturing genome-wide structure using chromosome conformation capture and 3D imaging, and function using RNA-sequencing. We introduce a quantitative index that measures underlying topological stability of a genomic region. Our results show that structural features of genomic regions correlate with function with surprising persistence over time. Furthermore, constructing genome-wide gene-level contact maps aided in identifying gene pairs with high potential for coregulation and colocalization in a manner consistent with expression via transcription factories. We additionally use 2D phase planes to visualize patterns in 4DN data. Finally, we evaluated gene pairs within a circadian gene module using 3D imaging, and found periodicity in the movement of clock circadian regulator and period circadian clock 2 relative to each other that followed a circadian rhythm and entrained with their expression.

A mechanistic model of linkage analysis in allohexaploids

10.1101/035139 ◽

2015 ◽

Author(s):

Huan Li ◽

Xuli Zhu ◽

Ke Mao ◽

Rongling Wu ◽

Qin Yan

Keyword(s):

Linkage Analysis ◽

Mechanistic Model ◽

Genome Structure ◽

Real Data ◽

Biological Research ◽

Preferential Pairing ◽

Modeling Framework ◽

Ploidy Levels ◽

The EM Algorithm ◽

And Function

Despite their pivotal role in agriculture and biological research, polyploids, a group of organisms with more than two sets of chromosomes, are very difficult to study. A growing number of studies have used high-density genetic linkage maps to investigate the genome structure and function of polyploids and to identify genes underlying polyploid traits. However, although models for linkage analysis have been well established for diploids, with some essential modifications for tetraploids, no models have been available thus far for polyploids at higher ploidy levels. The linkage analysis of polyploids typically requires knowledge about their meiotic mechanisms, which depend on the origin of polyploidy. Here we describe a computational modeling framework for linkage analysis in allohexaploids by integrating their preferential chromosomal-pairing meiotic feature into a mixture model setting. The framework, implemented with the EM algorithm, allows the simultaneous estimation of preferential pairing factors and the recombination fraction. We investigated statistical properties of the framework through extensive computer simulation and validated its usefulness by analyzing real data from a full-sib family of allohexaploid persimmon. Our attempt at linkage analysis of allohexaploids by incorporating their meiotic mechanism lays a foundation for allohexaploid genetic mapping and also provides a new horizon to explore allohexaploid parental kinship.

Large-Scale Profiling of RBP-circRNA Interactions from Public CLIP-Seq Datasets

10.20944/preprints201911.0202.v1 ◽

2019 ◽

Author(s):

Minzhe Zhang ◽

Tao Wang ◽

Guanghua Xiao ◽

Yang Xie

Keyword(s):

Large Scale ◽

RNA Binding ◽

RNA Binding Proteins ◽

Read Length ◽

Circular RNAs ◽

Binding Partners ◽

Genome Wide ◽

Wide Scale ◽

And Function ◽

Short Read Length

Circular RNAs (circRNAs) are a special type of RNA that has recently attracted considerable research interest in its formation and function. RNA binding proteins (RBPs) that bind circRNAs are important in these processes but are relatively less studied. CLIP-Seq technology was developed to profile RBP-RNA interactions on a genome-wide scale. While mRNAs are usually the focus of CLIP-Seq experiments, RBP-circRNA interactions can also be identified through specialized analysis of CLIP-Seq datasets. However, this process involves many technical difficulties, such as the usually short length of CLIP-Seq reads. In this study, we created a pipeline called Clirc, specialized for profiling circRNAs in CLIP-Seq data and analyzing the characteristics of RBP-circRNA interactions. To our knowledge, this is one of the first studies to investigate circRNAs and their binding partners by repurposing CLIP-Seq datasets, and we hope our work will become a valuable resource for future studies into the biogenesis and function of circRNAs. Clirc software is available at https://github.com/Minzhe/Clirc

From single nuclei to whole genome assemblies

10.1101/625814 ◽

2019 ◽

Cited By ~ 3

Author(s):

Merce Montoliu-Nerin ◽

Marisol Sánchez-García ◽

Claudia Bergin ◽

Manfred Grabherr ◽

Barbara Ellis ◽

...

Keyword(s):

Single Cell ◽

Large Scale ◽

Genomic Data ◽

Life Cycles ◽

Genomic Research ◽

Metagenomic Data ◽

Model Organisms ◽

Genomic Study ◽

And Function ◽

Genome Assemblies

Summary: A large proportion of Earth's biodiversity consists of organisms that cannot be cultured, have cryptic life cycles and/or live submerged within their substrates [1–4]. Genomic data are key to unravelling both their identity and function [5]. The development of metagenomic methods [6,7] and the advent of single-cell sequencing [8–10] have revolutionized the study of the life and function of cryptic organisms by removing the need for large amounts of pure biological material and allowing the generation of genomic data from complex or limited environmental samples. Genome assemblies from metagenomic data have so far been restricted to organisms with small genomes, such as bacteria [11], archaea [12] and certain eukaryotes [13]. On the other hand, single-cell technologies have allowed the targeting of unicellular organisms, attaining better resolution than metagenomics [8,9,14–16]; moreover, they have allowed the genomic study of cells from complex organisms one cell at a time [17,18]. However, single-cell genomics is not easily applied to multicellular organisms formed by consortia of diverse taxa, and specific workflows for sequencing and data analysis are needed to expand genomic research to the entire tree of life, including sponges [19], lichens [3,20], intracellular parasites [21,22], and plant endophytes [23,24]. Among the most important plant endophytes are the obligate mutualistic symbionts, arbuscular mycorrhizal (AM) fungi, which pose an additional challenge with their multinucleate coenocytic mycelia [25]. Here, the development of a novel single-nuclei sequencing and assembly workflow is reported. This workflow allows, for the first time, the generation of reference genome assemblies from large-scale, unbiased sorted and sequenced AM fungal nuclei, circumventing tedious, and often impossible, culturing efforts. This method opens infinite possibilities for studies of evolution and adaptation in these important plant symbionts and demonstrates that reference genomes can be generated from complex non-model organisms by isolating only a handful of their nuclei.

High-density linkage map and QTLs for growth in snapper (Chrysophrys auratus)

10.1101/376012 ◽

2018 ◽

Author(s):

David T. Ashton ◽

Peter A. Ritchie ◽

Maren Wellenreuther

Keyword(s):

Linkage Map ◽

Genome Structure ◽

SNP Markers ◽

Biological Research ◽

Phenotypic Traits ◽

Strongly Correlated ◽

Fish Family ◽

Significant QTLs ◽

SNP Data ◽

Genome Wide

Abstract: Characterizing the genetic variation underlying phenotypic traits is a central objective in biological research. This research has been hampered in the past by the limited genomic resources available for most non-model species. However, recent advances in sequencing technology and related genotyping methods are rapidly changing this. Here we report the use of genome-wide SNP data from the ecologically and commercially important marine fish species Chrysophrys auratus (snapper) to 1) construct the first linkage map for this species, 2) scan for growth QTLs, and 3) search for candidate genes in the surrounding QTL regions. The newly constructed linkage map contained ~11K SNP markers and is the densest map to date in the fish family Sparidae. Comparisons with available genome scaffolds indicated that overall marker placement was strongly correlated between the scaffolds and linkage map (R = 0.7), but at fine scales (< 5 cM) there were some precision limitations. Of the 24 linkage groups, which reflect the 24 chromosomes of this species, three were found to contain QTLs with genome-wide significance for growth-related traits. A scan for 13 known candidate growth genes located the genes for growth hormone, parvalbumin, and myogenin within 13.2, 2.6, and 5.0 cM of these genome-wide significant QTLs, respectively. The linkage map and QTLs found in this study will advance the investigation of genome structure and selective breeding in snapper.
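
As a rough illustration of the scaffold-versus-linkage-map comparison mentioned in the abstract above, the sketch below computes a correlation coefficient between physical and genetic marker positions. It assumes a Pearson coefficient, which the abstract does not specify, and the marker coordinates are hypothetical toy values, not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical markers on one scaffold: (physical position in bp, map position in cM).
# The fourth marker is deliberately out of order to mimic a fine-scale discrepancy.
markers = [
    (10_000, 0.0),
    (250_000, 1.8),
    (900_000, 4.1),
    (1_400_000, 3.6),
    (2_100_000, 9.5),
]
bp = [m[0] for m in markers]
cm = [m[1] for m in markers]
r = pearson_r(bp, cm)
print(f"R = {r:.2f}")
```

A value near 1 would indicate that marker order on the scaffold largely agrees with order on the linkage map; local inversions such as the toy one above lower the coefficient without changing its sign.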

Reconstruction of proto-vertebrate, proto-cyclostome and proto-gnathostome genomes provides new insights into early vertebrate evolution

Nature Communications ◽

10.1038/s41467-021-24573-z ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Yoichiro Nakatani ◽

Prashant Shingate ◽

Vydianathan Ravi ◽

Nisha E. Pillai ◽

Aravind Prasad ◽

...

Keyword(s):

Evolutionary History ◽

Gene Loss ◽

De Novo ◽

Genome Structure ◽

Origin And Evolution ◽

Long Read ◽

History Of ◽

And Function ◽

Genome Assemblies ◽

Key Questions

Abstract: Ancient polyploidization events have had a lasting impact on vertebrate genome structure, organization and function. Some key questions regarding the number of ancient polyploidization events and their timing in relation to the cyclostome-gnathostome divergence have remained contentious. Here we generate de novo long-read-based chromosome-scale genome assemblies for the Japanese lamprey and elephant shark. Using these and other representative genomes and developing algorithms for the probabilistic macrosynteny model, we reconstruct high-resolution proto-vertebrate, proto-cyclostome and proto-gnathostome genomes. Our reconstructions resolve key questions regarding the early evolutionary history of vertebrates. First, cyclostomes diverged from the lineage leading to gnathostomes after a shared tetraploidization (1R) but before a gnathostome-specific tetraploidization (2R). Second, the cyclostome lineage experienced an additional hexaploidization. Third, 2R in the gnathostome lineage was an allotetraploidization event, and biased gene loss from one of the subgenomes shaped the gnathostome genome by giving rise to remarkably conserved microchromosomes. Thus, our reconstructions reveal the major evolutionary events and offer new insights into the origin and evolution of vertebrate genomes.

Genome-wide identification and expression analysis of the ERF transcription factor family in pineapple (Ananas comosus (L.) Merr.)

PeerJ ◽

10.7717/peerj.10014 ◽

2020 ◽

Vol 8 ◽

pp. e10014 ◽

Cited By ~ 1

Author(s):

Youmei Huang ◽

Yanhui Liu ◽

Man Zhang ◽

Mengnan Chai ◽

Qing He ◽

...

Keyword(s):

Developmental Stages ◽

Developmental Process ◽

Economic Value ◽

Ananas comosus ◽

Functional Verification ◽

Tropical Fruit ◽

Genome Wide ◽

ERF Family ◽

Gene Structures ◽

And Function

Pineapple (Ananas comosus (L.) Merr.) is an important tropical fruit with high economic value. The quality and yield of pineapple are affected by various environmental conditions. Under adverse conditions, plants can mount complex responses to enhance their resistance. Members of the ethylene-responsive transcription factor (ERF) family have been reported to play crucial roles in plant developmental processes and stress responses. However, knowledge of the function of these proteins in pineapple remains limited. In this study, a total of 74 ERF genes (AcoERFs) were identified in the pineapple genome, named AcoERF1 to AcoERF74, and divided into 13 groups based on phylogenetic analysis. We also analyzed the gene structure, conserved motifs and chromosomal location of the AcoERFs; AcoERFs within the same group possess similar gene structures and motif compositions. Three genes (AcoERF71, AcoERF73 and AcoERF74) were present on unanchored scaffolds, so they could not be conclusively mapped to a chromosome. Synteny and cis-element analyses of ERF genes provided insight into the evolution and function of pineapple ERF genes. Furthermore, we analyzed the expression profiles of AcoERFs in different tissues and developmental stages: 22 AcoERF genes were expressed in all examined tissues, of which five (AcoERF13, AcoERF16, AcoERF31, AcoERF42, and AcoERF65) had high expression levels. Additionally, nine AcoERF genes were selected for functional verification by qRT-PCR. These results provide useful information for further investigating the evolution and functions of the ERF family in pineapple.

Cancer mutational signatures representation by large-scale context embedding

Bioinformatics ◽

10.1093/bioinformatics/btaa433 ◽

2020 ◽

Vol 36 (Supplement_1) ◽

pp. i309-i316 ◽

Cited By ~ 1

Author(s):

Yang Zhang ◽

Yunxuan Xiao ◽

Muyu Yang ◽

Jian Ma

Keyword(s):

Breast Cancer ◽

Large Scale ◽

Somatic Mutations ◽

Genome Structure ◽

Superior Performance ◽

Supplementary Information ◽

Patient Specific ◽

Cancer Subtypes ◽

Cancer Heterogeneity ◽

And Function

Motivation: The accumulation of somatic mutations plays critical roles in cancer development and progression. However, the global patterns of somatic mutations, especially non-coding mutations, and their roles in defining molecular subtypes of cancer have not been well characterized due to the computational challenges in analysing the complex mutational patterns. Results: Here, we develop a new algorithm, called MutSpace, to effectively extract patient-specific mutational features using an embedding framework for larger sequence context. Our method is motivated by the observation that the mutation rate at megabase scale and the local mutational patterns jointly contribute to distinguishing cancer subtypes, both of which can be simultaneously captured by MutSpace. Simulation evaluations show that MutSpace can effectively characterize mutational features from known patient subgroups and achieve superior performance compared with previous methods. As a proof-of-principle, we apply MutSpace to 560 breast cancer patient samples and demonstrate that our method achieves high accuracy in subtype identification. In addition, the learned embeddings from MutSpace reflect intrinsic patterns of breast cancer subtypes and other features of genome structure and function. MutSpace is a promising new framework to better understand cancer heterogeneity based on somatic mutations. Availability and implementation: Source code of MutSpace can be accessed at https://github.com/ma-compbio/MutSpace. Supplementary information: Supplementary data are available at Bioinformatics online.
