The impact of epigenetic information on genome evolution

Abstract. Evidence for biological adaptation is often obtained by studying DNA sequence evolution. Because such analyses are affected by both positive and negative selection, studies usually assume that negative selection is constant over the time span of interest. For this reason, hundreds of studies that conclude adaptive evolution might have reported false signals caused by relaxed negative selection. We test this suspicion in two ways. First, we analyze the fluctuation in population size, N, during evolution. For example, the evolutionary rate in the primate phylogeny could vary by as much as 2000-fold due to the variation in N alone. Second, we measure the variation in negative selection directly by analyzing polymorphism data from four taxa (Drosophila, Arabidopsis, primates, and birds, with 64 species in total). The strength of negative selection, as measured by the ratio of nonsynonymous to synonymous polymorphisms, fluctuates strongly and at multiple time scales. The two approaches suggest that variation in the strength of negative selection may be responsible for the bulk of the adaptive genome evolution reported in the last two decades. This study corroborates the recent report (ref. 1) on the inconsistent patterns of adaptive genome evolution. Finally, we discuss the path forward in detecting adaptive sequence evolution.
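The ratio of nonsynonymous to synonymous polymorphisms mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard genetic code and takes each polymorphism as a hypothetical (reference codon, alternative codon) pair; the function names and the toy input are illustrative and do not come from the study. It returns a raw count ratio, whereas published analyses often normalize by the number of nonsynonymous and synonymous sites, which the abstract does not specify.

```python
from typing import Iterable, Optional, Tuple

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

def is_synonymous(ref_codon: str, alt_codon: str) -> bool:
    """True if both codons encode the same amino acid (or both are stops)."""
    return CODON_TABLE[ref_codon.upper()] == CODON_TABLE[alt_codon.upper()]

def nonsyn_syn_ratio(snps: Iterable[Tuple[str, str]]) -> Optional[float]:
    """Raw count ratio of nonsynonymous to synonymous polymorphisms.

    Returns None when there are no synonymous polymorphisms, because the
    ratio is undefined in that case.
    """
    nonsyn = syn = 0
    for ref_codon, alt_codon in snps:
        if is_synonymous(ref_codon, alt_codon):
            syn += 1
        else:
            nonsyn += 1
    return nonsyn / syn if syn else None

# Toy input (hypothetical): two synonymous and two nonsynonymous changes.
example_snps = [("CTT", "CTC"), ("GAA", "GAG"), ("ATG", "ATA"), ("GAT", "GAA")]
print(nonsyn_syn_ratio(example_snps))
```

Under the reasoning of the abstract, a higher ratio in one lineage or time window than in another would point to weaker negative selection there, rather than necessarily to adaptation.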


Intrinsic laws of k-mer spectra of genome sequences and evolution mechanism of genomes

10.21203/rs.3.rs-33500/v2 ◽

2020 ◽

Author(s):

Zhenhua Yang ◽

Hong Li ◽

Yun Jia ◽

Yan Zheng ◽

Hu Meng ◽

...

Keyword(s):

Genome Evolution ◽

Genome Sequence ◽

Dna Sequences ◽

Extreme Environments ◽

Sequence Evolution ◽

Genome Sequences ◽

Mutual Inhibition ◽

Evolution Mechanism ◽

Sequence Composition

Abstract. Background: K-mer spectra of DNA sequences contain important information about sequence composition and sequence evolution. We aim to reveal the rules of genome sequence evolution by studying the k-mer spectra of genome sequences. Results: We analyzed the intrinsic laws of the k-mer spectra of 920 genome sequences, ranging from primates to prokaryotes. We found two types of evolutionary selection modes in genome sequences, which we name CG Independent Selection and TA Independent Selection. The two modes mutually inhibit each other. The intensities of CG and TA independent selection correlate closely with genome evolution and with the G+C content of genome sequences. The living habits of species are closely related to the independent selection modes adopted by their genomes. Consequently, we propose an evolution mechanism in which genome evolution is determined by the intensities of the CG and TA independent selections and by the mutual inhibition between them. Based on this mechanism, we also speculate on the evolution modes of prokaryotes in mild and extreme environments during the anaerobic age, on the transition of prokaryotes from anaerobic to aerobic environments on Earth, and on the origins of different eukaryotes. Conclusion: Genome sequences exhibit two independent selection modes. The evolution of a genome sequence is determined by these two modes and the mutual inhibition relationship between them.
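For readers unfamiliar with the term, a k-mer spectrum can be sketched as follows. This is a minimal illustration, assuming the common definition in which the spectrum records how many distinct k-mers occur a given number of times; the abstract does not state the authors' exact procedure. The helper that separates k-mers by whether they contain a dinucleotide such as CG or TA is a hypothetical addition for illustration only.

```python
from collections import Counter
from typing import Tuple

def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer that consists only of A, C, G, T."""
    seq = seq.upper()
    counts: Counter = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip windows with N or other symbols
            counts[kmer] += 1
    return counts

def kmer_spectrum(counts: Counter) -> Counter:
    """Map an occurrence count to the number of distinct k-mers with that count."""
    return Counter(counts.values())

def split_by_dinucleotide(counts: Counter, dinuc: str) -> Tuple[Counter, Counter]:
    """Separate k-mers that contain `dinuc` from those that do not.

    Hypothetical helper: it only shows one way to look at CG- or
    TA-containing k-mers separately.
    """
    with_d = Counter({kmer: n for kmer, n in counts.items() if dinuc in kmer})
    without_d = Counter({kmer: n for kmer, n in counts.items() if dinuc not in kmer})
    return with_d, without_d

counts = kmer_counts("ACGTACGT", 2)
print(dict(kmer_spectrum(counts)))
cg_kmers, other_kmers = split_by_dinucleotide(counts, "CG")
print(dict(cg_kmers), dict(other_kmers))
```

For real genomes k is usually larger and the sequence is read from a FASTA file; a dedicated k-mer counter would be more practical at that scale than this in-memory sketch.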


Generating Stable Knockout Zebrafish Lines by Deleting Large Chromosomal Fragments Using Multiple gRNAs

G3 Genes|Genome|Genetics ◽

10.1534/g3.119.401035 ◽

2020 ◽

Vol 10 (3) ◽

pp. 1029-1037 ◽

Cited By ~ 3

Author(s):

Brian H. Kim ◽

GuangJun Zhang

Keyword(s):

Protein Translation ◽

Loss Of Function ◽

Mrna Transcription ◽

Cell Stage ◽

Knock Out ◽

Dna Mutation ◽

Guide Rna ◽

Genetic Tool ◽

Dna Mutations ◽

The Impact

The CRISPR (clustered regularly interspaced short palindromic repeats) and Cas9 (CRISPR-associated protein 9) system has been successfully adopted as a versatile genetic tool for functional manipulations, owing to its convenience and effectiveness. Genetic lesions induced by a single guide RNA (gRNA) are usually small indel (insertion-deletion) DNA mutations. The impact of this type of CRISPR-induced DNA mutation on the transcription and processing of the encoded mRNA and on protein translation can be complex. Unexpected or unknown transcripts, generated through alternative splicing, may impede the generation of successful loss-of-function mutants. To create null or null-like loss-of-function mutant zebrafish, we employed simultaneous injection of multiple gRNAs into single-cell stage embryos. We demonstrated that DNA composed of multiple exons, up to 78 kb in length, can be deleted in the smarca2 gene locus. Additionally, two different genes (rnf185 and rnf215) were successfully mutated in F1 fish with multiple exon deletions using this multiplex gRNA injection strategy. We expect this approach will be useful for knock-out studies in zebrafish and other vertebrate organisms, especially when the phenotype of a single gRNA-induced mutant is not clear.


Intrinsic laws of k-mer spectra of genome sequences and evolution mechanism of genomes

BMC Evolutionary Biology ◽

10.1186/s12862-020-01723-3 ◽

2020 ◽

Vol 20 (1) ◽

Author(s):

Zhenhua Yang ◽

Hong Li ◽

Yun Jia ◽

Yan Zheng ◽

Hu Meng ◽

...

Keyword(s):

Genome Evolution ◽

Genome Sequence ◽

Dna Sequences ◽

Extreme Environments ◽

Sequence Evolution ◽

Genome Sequences ◽

Mutual Inhibition ◽

Evolution Mechanism ◽

Sequence Composition

Abstract. Background: K-mer spectra of DNA sequences contain important information about sequence composition and sequence evolution. We aim to reveal the rules of genome sequence evolution by studying the k-mer spectra of genome sequences. Results: We analyzed the intrinsic laws of the k-mer spectra of 920 genome sequences, ranging from primates to prokaryotes. We found two types of evolutionary selection modes in genome sequences, which we name CG Independent Selection and TA Independent Selection. The two modes mutually inhibit each other. The intensities of CG and TA independent selection correlate closely with genome evolution and with the G+C content of genome sequences. The living habits of species are closely related to the independent selection modes adopted by their genomes. Consequently, we propose an evolution mechanism in which genome evolution is determined by the intensities of the CG and TA independent selections and by the mutual inhibition between them. Based on this mechanism, we also speculate on the evolution modes of prokaryotes in mild and extreme environments during the anaerobic age, on the transition of prokaryotes from anaerobic to aerobic environments on Earth, and on the origins of different eukaryotes. Conclusion: Genome sequences exhibit two independent selection modes. The evolution of a genome sequence is determined by these two modes and the mutual inhibition relationship between them.


In Silico Estimation of the Abundance and Phylogenetic Significance of the Composite Oct4-Sox2 Binding Motifs within a Wide Range of Species

Data ◽

10.3390/data5040111 ◽

2020 ◽

Vol 5 (4) ◽

pp. 111

Author(s):

Arman Kulyyassov ◽

Ruslan Kalendar

Keyword(s):

High Throughput ◽

Dna Sequences ◽

High Throughput Sequencing ◽

Regulatory Elements ◽

Scoring Method ◽

Sequencing Technologies ◽

Wide Range ◽

Multiple Species ◽

Eukaryotic Organisms ◽

The Impact

High-throughput sequencing technologies have greatly accelerated progress in genomics, transcriptomics, and metagenomics. A large and annually increasing volume of genomic data is now being generated from various organisms. Methods that allow rapid search and analysis of DNA sequences are therefore urgently needed. Here, we present a novel motif-based high-throughput sequence scoring method that generates genome information. We identified Utf1-like, Fgf4-like, and Hoxb1-like motifs, which are cis-regulatory elements for the pluripotency transcription factors Sox2 and Oct4, within the genomes of different eukaryotic organisms. We performed a genome-wide analysis of these motifs to understand the impact of their diversification on mammalian genome evolution. Utf1-like, Fgf4-like, and Hoxb1-like motif diversity was evaluated across genomes from multiple species.


Intrinsic laws of k-mer spectra of genome sequences and evolution mechanism of genomes

10.21203/rs.3.rs-33500/v1 ◽

2020 ◽

Author(s):

Yang Zhenhua ◽

Hong Li ◽

Jia Yun ◽

Zheng Yan ◽

Meng Hu ◽

...

Keyword(s):

Genome Evolution ◽

Genome Sequence ◽

Dna Sequences ◽

Extreme Environments ◽

Sequence Evolution ◽

Genome Sequences ◽

Mutual Inhibition ◽

Evolution Mechanism ◽

Sequence Composition

Abstract. Background: K-mer spectra of DNA sequences contain important information about sequence composition and sequence evolution. We aim to reveal the rules of genome sequence evolution by studying the k-mer spectra of genome sequences. Results: We analyzed the intrinsic laws of the k-mer spectra of 920 genome sequences, ranging from primates to prokaryotes. We found two types of evolutionary selection modes in genome sequences, which we name CG Independent Selection and TA Independent Selection. The two modes mutually inhibit each other. The intensities of CG and TA independent selection correlate closely with genome evolution and with the G+C content of genome sequences. The living habits of species are closely related to the independent selection modes adopted by their genomes. Consequently, we propose an evolution mechanism in which genome evolution is determined by the intensities of the CG and TA independent selections and by the mutual inhibition between them. Based on this mechanism, we also speculate on the evolution modes of prokaryotes in mild and extreme environments during the anaerobic age, on the transition of prokaryotes from anaerobic to aerobic environments on Earth, and on the origins of different eukaryotes. Conclusion: Genome sequences exhibit two independent selection modes. The evolution of a genome sequence is determined by these two modes and the mutual inhibition relationship between them.


DNA sequence mapping in interphase and metaphase chromosomes by fluorescence in situ hybridization

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100122885 ◽

1992 ◽

Vol 50 (1) ◽

pp. 496-497

Author(s):

Barbara Trask ◽

Susan Allen ◽

Anne Bergmann ◽

Mari Christensen ◽

Anne Fertitta ◽

...

Keyword(s):

In Situ Hybridization ◽

Dna Sequence ◽

Dna Sequences ◽

Dual Band ◽

Nick Translation ◽

Metaphase Chromosomes ◽

Band Pass ◽

Texas Red ◽

Fluorescent Spot

Using fluorescence in situ hybridization (FISH), the positions of DNA sequences can be discretely marked with a fluorescent spot. The efficiency of marking DNA sequences of the size cloned in cosmids is 90-95%, and the fluorescent spots produced after FISH are ≈0.3 μm in diameter. Sites of two sequences can be distinguished using two-color FISH. Different reporter molecules, such as biotin or digoxigenin, are incorporated into DNA sequence probes by nick translation. These reporter molecules are labeled after hybridization with different fluorochromes, e.g., FITC and Texas Red. The development of dual band pass filters (Chromatechnology) allows these fluorochromes to be photographed simultaneously without registration shift.


DNA thermodynamics shape chromosome organization and topology

Biochemical Society Transactions ◽

10.1042/bst20120334 ◽

2013 ◽

Vol 41 (2) ◽

pp. 548-553 ◽

Cited By ~ 13

Author(s):

Andrew A. Travers ◽

Georgi Muskhelishvili

Keyword(s):

Dna Sequence ◽

Dna Sequences ◽

Chromosome Organization ◽

Topological Properties ◽

Genetic Organization ◽

And Topology ◽

Dna Translocases

How much information is encoded in the DNA sequence of an organism? We argue that the informational, mechanical and topological properties of DNA are interdependent and act together to specify the primary characteristics of genetic organization and chromatin structures. Superhelicity generated in vivo, in part by the action of DNA translocases, can be transmitted to topologically sensitive regions encoded by less stable DNA sequences.


Systematic benchmark of ancient DNA read mapping

Briefings in Bioinformatics ◽

10.1093/bib/bbab076 ◽

2021 ◽

Author(s):

Adrien Oliva ◽

Raymond Tobler ◽

Alan Cooper ◽

Bastien Llamas ◽

Yassine Souilmi

Keyword(s):

Ancient Dna ◽

Dna Sequences ◽

Population Genetic ◽

Reference Genome ◽

Population Data ◽

Human Populations ◽

Current Standard ◽

Read Mapping ◽

Reference Bias ◽

The Impact

Abstract. The current standard practice for assembling individual genomes involves mapping millions of short DNA sequences (also known as DNA ‘reads’) against a pre-constructed reference genome. Mapping vast amounts of short reads in a timely manner is a computationally challenging task that inevitably produces artefacts, including biases against alleles not found in the reference genome. This reference bias and other mapping artefacts are expected to be exacerbated in ancient DNA (aDNA) studies, which rely on the analysis of low quantities of damaged and very short DNA fragments (~30–80 bp). Nevertheless, the current gold-standard mapping strategies for aDNA studies have remained effectively unchanged for nearly a decade, during which time new software has emerged. In this study, we used simulated aDNA reads from three different human populations to benchmark the performance of 30 distinct mapping strategies implemented across four read-mapping programs (BWA-aln, BWA-mem, NovoAlign and Bowtie2) and quantified the impact of reference bias in downstream population genetic analyses. We show that specific NovoAlign, BWA-aln and BWA-mem parameterizations achieve high mapping precision with low levels of reference bias, particularly after filtering out reads with low mapping qualities. However, unbiased NovoAlign results required the use of an IUPAC reference genome. While relevant only to aDNA projects where reference population data are available, the benefit of using an IUPAC reference demonstrates the value of incorporating population genetic information into the aDNA mapping process, echoing recent results based on graph genome representations.
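The idea of quantifying reference bias after filtering on mapping quality can be sketched in a few lines. This is an illustrative toy and not the benchmark's actual metric: it assumes each read overlapping a known heterozygous site has already been reduced to a hypothetical (allele, mapping quality) pair, and it reports the fraction of retained reads that carry the reference allele. Under that assumption a value near 0.5 would indicate little bias. The threshold of 25 and the read values below are made up.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical read summary: ("ref" or "alt", mapping quality).
Read = Tuple[str, int]

def reference_allele_fraction(reads: Iterable[Read], min_mapq: int = 0) -> Optional[float]:
    """Fraction of reads carrying the reference allele after a MAPQ filter.

    Reads with mapping quality below `min_mapq` are discarded first.
    Returns None when no reads pass the filter.
    """
    kept = [allele for allele, mapq in reads if mapq >= min_mapq]
    if not kept:
        return None
    return sum(allele == "ref" for allele in kept) / len(kept)

# Toy reads at one heterozygous site (made-up values).
toy_reads = [("ref", 37), ("ref", 37), ("ref", 12), ("alt", 37), ("alt", 30), ("ref", 5)]
print(reference_allele_fraction(toy_reads))               # no filtering
print(reference_allele_fraction(toy_reads, min_mapq=25))  # low-MAPQ reads removed
```

In practice the allele and mapping quality would be extracted from an alignment file with a BAM-parsing library, and the fraction would be aggregated over many sites.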


An application of slow feature analysis to the genetic sequences of coronaviruses and influenza viruses

Human Genomics ◽

10.1186/s40246-021-00327-2 ◽

2021 ◽

Vol 15 (1) ◽

Author(s):

Anastasios A. Tsonis ◽

Geli Wang ◽

Lvyi Zhang ◽

Wenxu Lu ◽

Aristotle Kayafas ◽

...

Keyword(s):

Mathematical Method ◽

Dna Sequence ◽

Dna Sequences ◽

Transmission Rate ◽

Complex Structure ◽

Mortality Rates ◽

Influenza Viruses ◽

Feature Analysis ◽

Slow Feature Analysis ◽

Genetic Sequences

Abstract. Background: Mathematical approaches have been used for decades to probe the structure of DNA sequences. This has led to the development of bioinformatics. In this exploratory work, a novel mathematical method is applied to probe the DNA structure of two related viral families: coronaviruses and influenza viruses. The coronaviruses are SARS-CoV-2, SARS-CoV-1, and MERS. The influenza viruses include H1N1-1918, H1N1-2009, H2N2-1957, and H3N2-1968. Methods: The mathematical method used is slow feature analysis (SFA), a rather new but promising method for delineating complex structure in DNA sequences. Results: The analysis indicates that the DNA sequences exhibit an elaborate and convoluted structure akin to complex networks. We define a measure of complexity and show that each DNA sequence exhibits a certain degree of complexity within itself, while at the same time there exist complex inter-relationships between the sequences within a family and between the two families. From these relationships, we find evidence, especially for the coronavirus family, that increasing complexity in a sequence is associated with a higher transmission rate but lower mortality. Conclusions: The complexity measure defined here may hold promise and could become a useful tool for predicting transmission and mortality rates in future viral strains.
