Tandem Duplicate Genes in Maize are Abundant and Date to Two Distinct Periods of Time

Mapping Intimacies ◽

10.1101/238121 ◽

2017 ◽

Cited By ~ 1

Author(s):

Thomas J. Y. Kono ◽

Alex B. Brohammer ◽

Suzanne E. McGaugh ◽

Candice N. Hirsch

Keyword(s):

Size Distribution ◽

Genomic Instability ◽

Phenotypic Variation ◽

Evolutionary History ◽

De Novo ◽

Bimodal Distribution ◽

Duplicate Genes ◽

Nonhomologous Recombination ◽

Genome Assemblies ◽

Genomic Neighborhood

Abstract Tandem duplicate genes are proximally duplicated and as such occur in the same genomic neighborhood. Using the maize B73 and PH207 de novo genome assemblies, we identified thousands of tandem gene duplicates that account for ~10% of the genes. These tandem duplicates have a bimodal distribution of estimated ages corresponding to known periods of genomic instability. Tandem duplicates had a number of associated features that suggest origins in nonhomologous recombination based on smaller size distribution and higher rate of containing LTRs than non-tandem duplicates. Within relatively recent tandem duplicate genes, ~26% appear to be undergoing degeneration or divergence in function from the ancestral copy. Our results show that tandem duplicates are abundant in maize, arose in bursts throughout maize evolutionary history under multiple potential mechanisms, and may provide a substrate for novel phenotypic variation.
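The notion of duplicates sharing a genomic neighborhood can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `find_tandem_clusters`, the tuple layout, the precomputed `family` labels, and the `max_intervening` threshold are all assumptions made for illustration.

```python
from collections import defaultdict


def find_tandem_clusters(genes, max_intervening=0):
    """Group same-family genes that sit close together on a chromosome.

    genes: iterable of (gene_id, chromosome, start, family) tuples
           (hypothetical layout; family labels are assumed to be given).
    max_intervening: how many unrelated genes may separate two members
           of a cluster (an assumed, tunable threshold).
    Returns a list of clusters, each a list of gene_ids in genomic order.
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, start, family in genes:
        by_chrom[chrom].append((start, gene_id, family))

    clusters = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom])  # genomic order along this chromosome
        last_seen = {}  # family -> (index of last member, its cluster)
        for idx, (_, gene_id, family) in enumerate(ordered):
            previous = last_seen.get(family)
            if previous and idx - previous[0] - 1 <= max_intervening:
                cluster = previous[1]  # close enough: extend the cluster
                cluster.append(gene_id)
            else:
                cluster = [gene_id]  # too far away: start a new cluster
                clusters.append(cluster)
            last_seen[family] = (idx, cluster)

    # Singletons are not duplicates, so keep only clusters of two or more.
    return [c for c in clusters if len(c) > 1]


example_genes = [
    ("g1", "chr1", 100, "A"),
    ("g2", "chr1", 200, "A"),
    ("g3", "chr1", 300, "B"),
    ("g4", "chr1", 400, "A"),
    ("g5", "chr2", 100, "A"),
]
strict_clusters = find_tandem_clusters(example_genes)
relaxed_clusters = find_tandem_clusters(example_genes, max_intervening=1)
```

With strict adjacency only `g1` and `g2` form a cluster; allowing one intervening gene also pulls in `g4`, while `g5` stays out because it lies on a different chromosome.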

Reconstruction of proto-vertebrate, proto-cyclostome and proto-gnathostome genomes provides new insights into early vertebrate evolution

Nature Communications ◽

10.1038/s41467-021-24573-z ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Yoichiro Nakatani ◽

Prashant Shingate ◽

Vydianathan Ravi ◽

Nisha E. Pillai ◽

Aravind Prasad ◽

...

Keyword(s):

Evolutionary History ◽

Gene Loss ◽

De Novo ◽

Genome Structure ◽

Origin And Evolution ◽

Long Read ◽

History Of ◽

And Function ◽

Genome Assemblies ◽

Key Questions

Abstract Ancient polyploidization events have had a lasting impact on vertebrate genome structure, organization and function. Some key questions regarding the number of ancient polyploidization events and their timing in relation to the cyclostome-gnathostome divergence have remained contentious. Here we generate de novo long-read-based chromosome-scale genome assemblies for the Japanese lamprey and elephant shark. Using these and other representative genomes and developing algorithms for the probabilistic macrosynteny model, we reconstruct high-resolution proto-vertebrate, proto-cyclostome and proto-gnathostome genomes. Our reconstructions resolve key questions regarding the early evolutionary history of vertebrates. First, cyclostomes diverged from the lineage leading to gnathostomes after a shared tetraploidization (1R) but before a gnathostome-specific tetraploidization (2R). Second, the cyclostome lineage experienced an additional hexaploidization. Third, 2R in the gnathostome lineage was an allotetraploidization event, and biased gene loss from one of the subgenomes shaped the gnathostome genome by giving rise to remarkably conserved microchromosomes. Thus, our reconstructions reveal the major evolutionary events and offer new insights into the origin and evolution of vertebrate genomes.

Mycena genomes resolve the evolution of fungal bioluminescence

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2010761117 ◽

2020 ◽

Vol 117 (49) ◽

pp. 31267-31277

Author(s):

Huei-Mien Ke ◽

Hsin-Han Lee ◽

Chan-Yi Ivy Lin ◽

Yu-Ching Liu ◽

Min R. Lu ◽

...

Keyword(s):

Dna Methylation ◽

Evolutionary History ◽

Common Ancestor ◽

Developmental Stages ◽

De Novo ◽

Fungal Species ◽

Last Common Ancestor ◽

Diverse Group ◽

Mushroom Species ◽

Genome Assemblies

Mushroom-forming fungi in the order Agaricales represent an independent origin of bioluminescence in the tree of life; yet the diversity, evolutionary history, and timing of the origin of fungal luciferases remain elusive. We sequenced the genomes and transcriptomes of five bonnet mushroom species (Mycena spp.), a diverse lineage comprising the majority of bioluminescent fungi. Two species with haploid genome assemblies ∼150 Mb are among the largest in Agaricales, and we found that a variety of repeats between Mycena species were differentially mediated by DNA methylation. We show that bioluminescence evolved in the last common ancestor of mycenoid and the marasmioid clade of Agaricales and was maintained through at least 160 million years of evolution. Analyses of synteny across genomes of bioluminescent species resolved how the luciferase cluster was derived by duplication and translocation, frequently rearranged and lost in most Mycena species, but conserved in the Armillaria lineage. Luciferase cluster members were coexpressed across developmental stages, with the highest expression in fruiting body caps and stipes, suggesting fruiting-related adaptive functions. Our results contribute to understanding a de novo origin of bioluminescence and the corresponding gene cluster in a diverse group of enigmatic fungal species.

Correcting bias from stochastic insert size in read pair data — applications to structural variation detection and genome assembly

10.1101/023929 ◽

2015 ◽

Cited By ~ 1

Author(s):

Kristoffer Sahlin ◽

Mattias Frånberg ◽

Lars Arvestad

Keyword(s):

Size Distribution ◽

Genome Assembly ◽

Structural Variation ◽

De Novo ◽

State Of The Art ◽

Size Distributions ◽

Insert Size ◽

Genome Assemblies ◽

Paired Read ◽

Insert Size Distribution

Insert size distributions from paired read protocols are used for inference in bioinformatic applications such as genome assembly and structural variation detection. However, many of the models that are being used are subject to bias. This bias arises when we assume that all insert sizes within a distribution are equally likely to be observed, when in fact, size matters. These systematic errors exist in popular software even when the assumptions made about data are true. We have previously shown that bias occurs for scaffolders in genome assembly. Here, we generalize the theory and demonstrate that it is applicable in other contexts. We provide examples of bias in state-of-the-art software and improve them using our model. One key application of our theory is structural variation detection using read pairs. We show that an incorrect null-hypothesis is commonly used in popular tools and can be corrected using our theory. Furthermore, we approximate the smallest size of indels that are possible to discover given an insert size distribution. Two other applications are inference of insert size distribution on de novo genome assemblies and error correction of genome assemblies using mated reads. Our theory is implemented in a tool called GetDistr (https://github.com/ksahlin/GetDistr).
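The kind of bias described in this abstract can be sketched with a toy calculation. The code below is not GetDistr's model. It assumes a simplified length-biased weighting in which the number of ways a fragment can span a gap grows roughly with its size minus the gap length, ignoring read length and contig-end effects. The names `spanning_pmf` and `mean_size` are hypothetical.

```python
def spanning_pmf(library_pmf, gap):
    """Distribution of insert sizes among fragments that span a gap.

    library_pmf: dict mapping insert size -> probability in the library.
    gap: length of the region the fragment must span.
    Assumption: a fragment of a given size has about (size - gap) valid
    placements, so longer fragments are over-represented.
    """
    weights = {
        size: prob * max(0, size - gap)
        for size, prob in library_pmf.items()
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no fragment in the library can span the gap")
    return {size: weight / total for size, weight in weights.items()}


def mean_size(pmf):
    """Expected insert size under a discrete distribution."""
    return sum(size * prob for size, prob in pmf.items())


# Toy library: half the fragments are 200 bp, half are 400 bp.
library = {200: 0.5, 400: 0.5}
observed = spanning_pmf(library, gap=100)

library_mean = mean_size(library)
observed_mean = mean_size(observed)
```

In this toy library the spanning fragments average 350 bp although the library mean is 300 bp, so an estimator that treats the two distributions as identical would be systematically off.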

Venom-gland transcriptomic, venomic, and antivenomic profiles of the spine-bellied sea snake (Hydrophis curtus) from the South China Sea

BMC Genomics ◽

10.1186/s12864-021-07824-7 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Hong-Yan Zhao ◽

Lin Wen ◽

Yu-Feng Miao ◽

Yu Du ◽

Yan Sun ◽

...

Keyword(s):

South China Sea ◽

South China ◽

Evolutionary History ◽

De Novo ◽

Venom Gland ◽

The South China Sea ◽

Protein Families ◽

The South ◽

Widespread Species ◽

China Sea

Abstract Background: A comprehensive evaluation of the -omic profiles of venom is important for understanding the potential function and evolution of snake venom. Here, we conducted an integrated multi-omics analysis to unveil the venom-transcriptomic and venomic profiles in the same group of spine-bellied sea snakes (Hydrophis curtus) from the South China Sea, where the snake is a widespread species and might generate regionally-specific venom potentially harmful to human activities. The capacity of two heterologous antivenoms to immunocapture the H. curtus venom was determined for an in-depth evaluation of their rationality in treatment of H. curtus envenomation. In addition, a phylogenetic analysis by maximum likelihood was used to detect the adaptive molecular evolution of full-length toxin-coding unigenes.
Results: A total of 90,909,384 pairs of clean reads were generated via Illumina sequencing from a pooled cDNA library of six specimens, yielding 148,121 unigenes through de novo assembly. Sequence similarity searching harvested 63,845 valid annotations, including 63,789 non-toxin-coding and 56 toxin-coding unigenes belonging to 22 protein families. Three protein families, three-finger toxins (3-FTx), phospholipase A2 (PLA2), and cysteine-rich secretory protein, were detected in the venom proteome. 3-FTx (27.15% in the transcriptome/41.94% in the proteome) and PLA2 (59.71%/49.36%) were identified as the most abundant families in the venom-gland transcriptome and venom proteome. In addition, 24 unigenes from 11 protein families were shown to have experienced positive selection in their evolutionary history, whereas four were relatively conserved throughout evolution. Commercial Naja atra antivenom exhibited a stronger capacity than Bungarus multicinctus antivenom to immunocapture H. curtus venom components, especially short neurotoxins, with the capacity of both antivenoms to immunocapture short neurotoxins being weaker than that for PLA2s.
Conclusions: Our study clarified the venom-gland transcriptomic and venomic profiles along with the within-group divergence of a H. curtus population from the South China Sea. Adaptive evolution of most venom components driven by natural selection appeared to occur rapidly during evolutionary history. Notably, the utility of commercial N. atra and B. multicinctus antivenoms against H. curtus toxins was not comprehensive; thus, the development of species-specific antivenom is urgently needed.

Extended haplotype-phasing of long-read de novo genome assemblies using Hi-C

Nature Communications ◽

10.1038/s41467-020-20536-y ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Zev N. Kronenberg ◽

Arang Rhie ◽

Sergey Koren ◽

Gregory T. Concepcion ◽

Paul Peluso ◽

...

Keyword(s):

Zebra Finch ◽

Cultured Cells ◽

De Novo ◽

Single Cells ◽

Variant Calling ◽

Chromatin Interaction ◽

Extended Haplotype ◽

Benchmark Datasets ◽

And Performance ◽

Genome Assemblies

Abstract Haplotype-resolved genome assemblies are important for understanding how combinations of variants impact phenotypes. To date, these assemblies have been best created with complex protocols, such as cultured cells that contain a single-haplotype (haploid) genome, single cells where haplotypes are separated, or co-sequencing of parental genomes in a trio-based approach. These approaches are impractical in most situations. To address this issue, we present FALCON-Phase, a phasing tool that uses ultra-long-range Hi-C chromatin interaction data to extend phase blocks of partially-phased diploid assemblies to chromosome or scaffold scale. FALCON-Phase uses the inherent phasing information in Hi-C reads, skipping variant calling, and reduces the computational complexity of phasing. Our method is validated on three benchmark datasets generated as part of the Vertebrate Genomes Project (VGP), including human, cow, and zebra finch, for which high-quality, fully haplotype-resolved assemblies are available using the trio-based approach. FALCON-Phase is accurate without having parental data and performance is better in samples with higher heterozygosity. For cow and zebra finch the accuracy is 97% compared to 80–91% for human. FALCON-Phase is applicable to any draft assembly that contains long primary contigs and phased associate contigs.

Systematic evidence from gene duplication and regulation

Australian Systematic Botany ◽

10.1071/sb9900145 ◽

1990 ◽

Vol 3 (1) ◽

pp. 145

Author(s):

DJ Colgan

Keyword(s):

Gene Duplication ◽

Evolutionary History ◽

Tandem Duplication ◽

Fragment Length Polymorphism ◽

Duplicate Genes ◽

Polymorphism Analysis ◽

Multiple Origins ◽

The Family ◽

History Of ◽

Phylogenetic Hypotheses

This paper is a review of the use of information regarding the presence of duplicate genes and their regulation in systematics. The review concentrates on data derived from protein electrophoresis and restriction fragment length polymorphism analysis. The appearance of a duplication in a subset of a group of species implies that the members of the subset belong to the same clade. Suppression of the duplication may render this clade apparently paraphyletic, but may itself be informative of relations within the lineage through patterns of loss of expression in all, or some tissues, or through restrictions of the formation of functional heteropolymers in polymeric enzymes. Examples are given of studies which have used such information to establish phylogenetic hypotheses at the family level, to identify an auto- or allo-polyploid origin of polyploid species and to determine whether there have been single or multiple origins of such species. The likelihood of homoplasy in the patterns of appearance and regulation of duplicates depends on the molecular basis of the duplication. In particular, the contrast between the expected consequences of tandem duplication and the expression of pseudogenes emphasises the value of determining the mechanism of the original duplication. Many instances of sporadic gene duplication are now known, and polyploidisation is a common event in the evolutionary history of both plants and animals. So the opportunities to discover duplication-related characters will arise in many systematic studies. A program is presented to increase the chances that such useful information will be recognisable during the studies.

Macrosynteny analysis between Lentinula edodes and Lentinula novae-zelandiae reveals signals of domestication in Lentinula edodes

Scientific Reports ◽

10.1038/s41598-021-89146-y ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Christopher Alan Smith

Keyword(s):

Evolutionary History ◽

Lentinula Edodes ◽

Reference Genome ◽

Genomic Change ◽

Diverse Species ◽

The World ◽

Cultivated Mushroom ◽

Genome Assemblies ◽

High Degree ◽

Chromosome Level

Abstract The basidiomycete fungus Lentinula novae-zelandiae is endemic to New Zealand and is a sister taxon to Lentinula edodes, the second most cultivated mushroom in the world. To explore the biology of this organism, a high-quality chromosome level reference genome of L. novae-zelandiae was produced. Macrosyntenic comparisons between the genome assembly of L. novae-zelandiae, L. edodes and a set of three genome assemblies of diverse species from the Agaricomycota reveal a high degree of macrosyntenic restructuring within L. edodes consistent with a signal of domestication. These results show L. edodes has undergone significant genomic change during the course of its evolutionary history, likely a result of its cultivation and domestication over the last 1000 years.

Donkey genome and insight into the imprinting of fast karyotype evolution

Scientific Reports ◽

10.1038/srep14106 ◽

2015 ◽

Vol 5 (1) ◽

Cited By ~ 12

Author(s):

Jinlong Huang ◽

Yiping Zhao ◽

Dongyi Bai ◽

Wunierfu Shiraigol ◽

Bei Li ◽

...

Keyword(s):

Target Genes ◽

Karyotype Evolution ◽

De Novo ◽

Cycle Phase ◽

Satellite Sequences ◽

Chromatid Segregation ◽

Karyotypic Instability ◽

Wild Ass ◽

Genome Assemblies ◽

Insight Into

Abstract The donkey, like the horse, is a promising model for exploring karyotypic instability. We report the de novo whole-genome assemblies of the donkey and the Asiatic wild ass. Our results reflect the distinct characteristics of donkeys, including more effective energy metabolism and better immunity than horses. The donkey shows a steady demographic trajectory. We detected abundant satellite sequences in some inactive centromere regions but not in neocentromere regions, while ribosomal RNAs frequently emerged in neocentromere regions but not in the obsolete centromere regions. Expanded miRNA families and five newly discovered miRNA target genes involved in meiosis may be associated with fast karyotype evolution. APC/C, which controls sister chromatid segregation, cytokinesis and the establishment of the G1 cell cycle phase, was identified by analysis of miRNA targets and rapidly evolving genes.

Monte Carlo simulation for soot dynamics

Thermal Science ◽

10.2298/tsci1205391z ◽

2012 ◽

Vol 16 (5) ◽

pp. 1391-1394 ◽

Cited By ~ 3

Author(s):

Kun Zhou

Keyword(s):

Particle Size ◽

Monte Carlo ◽

Particle Size Distribution ◽

Size Distribution ◽

Gas Phase ◽

Volume Fraction ◽

Soot Formation ◽

Bimodal Distribution ◽

Number Density ◽

Stochastic Error

A new Monte Carlo method termed Comb-like frame Monte Carlo is developed to simulate soot dynamics. Detailed stochastic error analysis is provided. Comb-like frame Monte Carlo is coupled with the gas phase solver Chemkin II to simulate soot formation in a 1-D premixed burner stabilized flame. The simulated soot number density, volume fraction, and particle size distribution all agree well with the measurements available in the literature. The origin of the bimodal particle size distribution is revealed with quantitative proof.

Contiguity: Contig adjacency graph construction and visualisation

10.7287/peerj.preprints.1037v1 ◽

2015 ◽

Cited By ~ 8

Author(s):

Mitchell J Sullivan ◽

Nouri L Ben Zakour ◽

Brian M Forde ◽

Mitchell Stanton-Cook ◽

Scott A Beatson

Keyword(s):

De Novo ◽

Reference Sequence ◽

De Bruijn Graph ◽

Interactive Software ◽

Graph Exploration ◽

Adjacency Graph ◽

Highly Sensitive ◽

Long Read ◽

Genome Assemblies ◽

Adjacency Graphs

Contiguity is interactive software for the visualization and manipulation of de novo genome assemblies. Contiguity creates and displays information on contig adjacency which is contextualized by the simultaneous display of a comparison between assembled contigs and reference sequence. Where scaffolders allow unambiguous connections between contigs to be resolved into a single scaffold, Contiguity allows the user to create all potential scaffolds in ambiguous regions of the genome. This enables the resolution of novel sequence or structural variants from the assembly. In addition, Contiguity provides a sequencing and assembly agnostic approach for the creation of contig adjacency graphs. To maximize the number of contig adjacencies determined, Contiguity combines information from read pair mappings, sequence overlap and De Bruijn graph exploration. We demonstrate how highly sensitive graphs can be achieved using this method. Contig adjacency graphs allow the user to visualize potential arrangements of contigs in unresolvable areas of the genome. By combining adjacency information with comparative genomics, Contiguity provides an intuitive approach for exploring and improving sequence assemblies. It is also useful in guiding manual closure of long read sequence assemblies. Contiguity is an open source application, implemented using Python and the Tkinter GUI package, that runs on Unix, OS X and Windows operating systems. It has been designed and optimized for bacterial assemblies. Contiguity is available at http://mjsull.github.io/Contiguity.
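As a rough illustration of what a contig adjacency graph records, the sketch below builds one from exact suffix-prefix overlaps alone. It is not Contiguity's code and covers only one of the three evidence sources named in the abstract. The function names, the `min_overlap` default, and the toy sequences are assumptions.

```python
def overlap_length(left, right, min_overlap):
    """Length of the longest suffix of `left` equal to a prefix of `right`.

    Returns 0 when no exact overlap of at least `min_overlap` bases exists.
    """
    longest_possible = min(len(left), len(right))
    for length in range(longest_possible, min_overlap - 1, -1):
        if left[-length:] == right[:length]:
            return length
    return 0


def adjacency_graph(contigs, min_overlap=3):
    """Directed graph: contig name -> list of (successor name, overlap).

    contigs: dict mapping contig name -> sequence string.
    An edge a -> b means the end of a overlaps the start of b.
    """
    graph = {name: [] for name in contigs}
    for name_a, seq_a in contigs.items():
        for name_b, seq_b in contigs.items():
            if name_a == name_b:
                continue  # skip self-overlaps in this simple sketch
            length = overlap_length(seq_a, seq_b, min_overlap)
            if length:
                graph[name_a].append((name_b, length))
    return graph


# Toy contigs: c1 ends in TGCA, and both c2 and c3 start with TGCA.
toy_contigs = {
    "c1": "ACGTTGCA",
    "c2": "TGCAGGTC",
    "c3": "TGCATTTT",
}
toy_graph = adjacency_graph(toy_contigs)
```

Here `c1` has two candidate successors, which is the kind of ambiguous region where a tool would keep every potential arrangement rather than commit to a single scaffold.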