Application of Graph Theory in DNA similarity analysis of Evolutionary Closed Species.

DNA is a complex molecule that carries biological information passed down from generation to generation. Over the course of evolution, different species have evolved from a common ancestor through DNA sequence rearrangements. DNA sequence similarity analysis is a major challenge because the number of sequences in DNA databases is increasing rapidly. In this research, we build on a mathematical method that uses graph theory to analyze the similarity of two DNA sequences. The method starts by modeling a weighted directed graph for each DNA sequence, constructing its adjacency matrix, and converting that matrix into a representative vector for the graph. From these vectors, similarity is determined with distance measures such as Euclidean, cosine, and correlation distance. Taking this as the base method, we check whether it is applicable to any DNA fragments in the genomes considered and whether molecular similarity coefficients can be used as distance measures. We also obtain similarities using the graph spectrum instead of the representative vector, and then compare the results from the representative vector with those from the graph spectrum. The modified method is tested on the mitochondrial DNA of Human, Gorilla, and Orangutan. It gives the same result when the number of nucleotides in the DNA fragments is increased.
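The pipeline described in this abstract (sequence, weighted directed graph, adjacency matrix, representative vector, distance) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation. The edge weights here are simply normalized counts of adjacent nucleotide pairs, the representative vector is the flattened 4x4 matrix, and the two toy sequences are invented. All function names are hypothetical, and the graph-spectrum variant is not shown.

```python
import math

NUCLEOTIDES = "ACGT"


def adjacency_matrix(seq):
    """Return a 4x4 weighted adjacency matrix for a DNA string.

    Assumption: the weight of arc x -> y is the share of adjacent
    pairs (x, y) in the sequence. The paper's weighting may differ.
    Only the characters A, C, G, T are supported.
    """
    idx = {n: i for i, n in enumerate(NUCLEOTIDES)}
    a = [[0.0] * 4 for _ in range(4)]
    for x, y in zip(seq, seq[1:]):
        a[idx[x]][idx[y]] += 1.0
    total = max(len(seq) - 1, 1)  # number of adjacent pairs
    return [[w / total for w in row] for row in a]


def representative_vector(seq):
    """Flatten the adjacency matrix row by row into a 16-component vector."""
    return [w for row in adjacency_matrix(seq) for w in row]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    # Undefined for an all-zero vector (sequence shorter than 2).
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def correlation_distance(u, v):
    # Cosine distance between the mean-centred vectors.
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return cosine_distance([a - mu for a in u], [b - mv for b in v])


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    v1 = representative_vector("ACGTACGTTGCA")
    v2 = representative_vector("ACGTTCGTAGCA")
    print("Euclidean:  ", euclidean(v1, v2))
    print("Cosine:     ", cosine_distance(v1, v2))
    print("Correlation:", correlation_distance(v1, v2))
```

Smaller distances indicate more similar sequences under this simplified weighting. Real mitochondrial sequences would be read from a file and may contain ambiguity codes such as N, which this sketch does not handle.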


An application of slow feature analysis to the genetic sequences of coronaviruses and influenza viruses

Human Genomics ◽

10.1186/s40246-021-00327-2 ◽

2021 ◽

Vol 15 (1) ◽

Author(s):

Anastasios A. Tsonis ◽

Geli Wang ◽

Lvyi Zhang ◽

Wenxu Lu ◽

Aristotle Kayafas ◽

...

Keyword(s):

Mathematical Method ◽

Dna Sequence ◽

Dna Sequences ◽

Transmission Rate ◽

Complex Structure ◽

Mortality Rates ◽

Influenza Viruses ◽

Feature Analysis ◽

Slow Feature Analysis ◽

Genetic Sequences

Abstract Background: Mathematical approaches have been used for decades to probe the structure of DNA sequences. This has led to the development of Bioinformatics. In this exploratory work, a novel mathematical method is applied to probe the DNA structure of two related viral families: those of coronaviruses and those of influenza viruses. The coronaviruses are SARS-CoV-2, SARS-CoV-1, and MERS. The influenza viruses include H1N1-1918, H1N1-2009, H2N2-1957, and H3N2-1968. Methods: The mathematical method used is slow feature analysis (SFA), a rather new but promising method to delineate complex structure in DNA sequences. Results: The analysis indicates that the DNA sequences exhibit an elaborate and convoluted structure akin to complex networks. We define a measure of complexity and show that each DNA sequence exhibits a certain degree of complexity within itself, while at the same time there exist complex inter-relationships between the sequences within a family and between the two families. From these relationships, we find evidence, especially for the coronavirus family, that increasing complexity in a sequence is associated with a higher transmission rate but lower mortality. Conclusions: The complexity measure defined here may hold promise and could become a useful tool for predicting transmission and mortality rates in future viral strains.


A New Approach for DNA Sequence Similarity Analysis based on Triplets of Nucleic Acid Bases

International Journal of Nanotechnology and Molecular Computation ◽

10.4018/978-1-60960-064-8.ch006 ◽

2010 ◽

Vol 2 (4) ◽

pp. 1-11

Author(s):

Dan Wei ◽

Qingshan Jiang ◽

Sheng Li

Keyword(s):

Nucleic Acid ◽

Dna Sequence ◽

Dna Sequences ◽

Sequence Similarity ◽

Similarity Analysis ◽

Biological Sequence ◽

Nucleic Acid Bases ◽

New Approach ◽

Characteristic Distribution ◽

Sequence Similarity Analysis

Similarity analysis of DNA sequences is a fundamental research area in Bioinformatics. The characteristic distribution of L-tuples (tuples of length L) reflects valuable information contained in a biological sequence and thus may be used in DNA sequence similarity analysis. However, similarity analysis based on the characteristic distribution of L-tuples is not effective for comparing highly conserved sequences. In this paper, a new similarity measurement approach based on Triplets of Nucleic Acid Bases (TNAB) is introduced for DNA sequence similarity analysis. The new approach characterizes both the content feature and the position feature of a DNA sequence using the frequency and position of occurrence of each TNAB in the sequence. The experimental results show that the approach based on TNAB is effective for analysing DNA sequence similarity.
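The idea of combining a content feature (how often a triplet occurs) with a position feature (where it occurs) can be illustrated as follows. This is only a sketch under assumptions. The abstract does not give the exact TNAB formulas, so the position feature used here (the mean 1-based start position of a triplet, divided by the number of overlapping windows) and the function name `triplet_features` are hypothetical choices.

```python
from itertools import product

# All 64 triplets over the alphabet A, C, G, T.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def triplet_features(seq):
    """Map each triplet to (frequency, mean normalized position).

    Assumptions (not taken from the paper): windows overlap, positions
    are 1-based window starts, and the mean position is divided by the
    number of windows so that it lies in (0, 1]. Triplets that never
    occur get (0.0, 0.0). Windows with non-ACGT characters are skipped.
    """
    n_windows = len(seq) - 2
    counts = dict.fromkeys(TRIPLETS, 0)
    pos_sums = dict.fromkeys(TRIPLETS, 0)
    for i in range(n_windows):
        t = seq[i:i + 3]
        if t in counts:
            counts[t] += 1
            pos_sums[t] += i + 1
    features = {}
    for t in TRIPLETS:
        freq = counts[t] / n_windows if n_windows > 0 else 0.0
        mean_pos = pos_sums[t] / (counts[t] * n_windows) if counts[t] else 0.0
        features[t] = (freq, mean_pos)
    return features


if __name__ == "__main__":
    # Toy sequence, invented for illustration only.
    feats = triplet_features("ACGACGTTACG")
    for t, (freq, pos) in feats.items():
        if freq > 0:
            print(t, round(freq, 3), round(pos, 3))
```

Two sequences with identical triplet frequencies but different triplet orderings get different position features. This is the property the abstract highlights for comparing highly similar sequences.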


Estimation of Similarity between DNA Sequences and Its Graphical Representation

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f9389.059120 ◽

2020 ◽

Vol 9 (1) ◽

pp. 43-51

Keyword(s):

Sequence Analysis ◽

Molecular Biology ◽

Dna Sequence ◽

Dna Sequences ◽

Graphical Representation ◽

Similarity Analysis ◽

Biological Sequence ◽

Field Of Study ◽

Biological Sequence Analysis ◽

Wide Range

Bioinformatics, now a well-known field of study, originated in the context of biological sequence analysis. Recently, graphical representation has been adopted in research on DNA sequences. Research on biological sequences is mainly based on their function and structure. Bioinformatics finds a wide range of applications, specifically in the domain of molecular biology, which focuses on the analysis of molecules such as DNA, RNA, and proteins. In this review, we mainly deal with similarity analysis between sequences and the graphical representation of DNA sequences.


A graph-theoretical approach to DNA similarity analysis

10.1101/2021.08.05.455342 ◽

2021 ◽

Author(s):

Dong Quan Ngoc Nguyen ◽

Lin Xing ◽

Phuong Dong Tan Le ◽

Lizhen Lin

Keyword(s):

Dna Sequence ◽

Dna Sequences ◽

Influenza A ◽

Cartesian Product ◽

Space Structure ◽

Similarity Analysis ◽

Cartesian Products ◽

Research Areas ◽

Active Research ◽

Cartesian Products Of Graphs

One of the most active research areas in bioinformatics is DNA similarity analysis. There are several approaches, using alignment-based or alignment-free methods, to analyze similarities and dissimilarities between DNA sequences. In this work, we introduce a novel representation of DNA sequences using n-ary Cartesian products of graphs for arbitrary positive integers n. Each component graph in the Cartesian product representing a DNA sequence contains combinatorial information about certain tuples of nucleotides appearing in that sequence. We further introduce a metric space structure on the set of all Cartesian products of graphs that represent a given collection of DNA sequences, so that different Cartesian products of graphs can be compared; this comparison in turn indicates similarities and dissimilarities between DNA sequences. We test our proposed method on several datasets, including Human Papillomavirus, Human rhinovirus, Influenza A virus, and Mammals. We compare our method with other methods in the literature; the comparison indicates that our results are comparable in terms of time complexity and achieve high accuracy, and on one dataset our method performs best.
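For readers unfamiliar with the operation named in this abstract, the standard Cartesian product of two graphs can be sketched as below. The product has vertex set V(G) x V(H), and (u, v) is adjacent to (u2, v2) when either u = u2 and v is adjacent to v2 in H, or v = v2 and u is adjacent to u2 in G. The sketch shows only this textbook operation. The way the paper derives its component graphs from nucleotide tuples is not reproduced, and the two small component graphs below are invented.

```python
from itertools import product


def cartesian_product(g, h):
    """Cartesian product of two undirected graphs.

    Each graph is a dict mapping a vertex to the set of its neighbours.
    The result uses the same representation, with vertices (u, v).
    """
    result = {(u, v): set() for u, v in product(g, h)}
    for (u, v), neighbours in result.items():
        for u2 in g[u]:
            neighbours.add((u2, v))  # move along an edge of g, keep v fixed
        for v2 in h[v]:
            neighbours.add((u, v2))  # move along an edge of h, keep u fixed
    return result


def edge_count(graph):
    """Number of undirected edges (each edge is stored twice)."""
    return sum(len(nbrs) for nbrs in graph.values()) // 2


if __name__ == "__main__":
    # Hypothetical component graphs on nucleotide labels.
    g = {"A": {"C"}, "C": {"A", "G"}, "G": {"C"}}  # path A - C - G
    h = {"T": {"A"}, "A": {"T"}}                   # single edge T - A
    p = cartesian_product(g, h)
    print(len(p), "vertices,", edge_count(p), "edges")
```

An n-ary product can be built by applying `cartesian_product` repeatedly, for example with `functools.reduce`. In that case the vertices become nested tuples.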


An Application of Slow Feature Analysis to the Genetic Sequences of Coronaviruses and Influenza viruses

10.21203/rs.3.rs-294807/v1 ◽

2021 ◽

Author(s):

Anastasios Tsonis ◽

Geli Wang ◽

Lvyi Zhang ◽

Wenxu Lu ◽

Aristotle Kayafas ◽

...

Keyword(s):

Mathematical Method ◽

Dna Sequence ◽

Dna Sequences ◽

Transmission Rate ◽

Complex Structure ◽

Mortality Rates ◽

Influenza Viruses ◽

Feature Analysis ◽

Slow Feature Analysis ◽

Genetic Sequences

Abstract Background: Mathematical approaches have been used for decades to probe the structure of DNA sequences. This has led to the development of Bioinformatics. In this exploratory work, a novel mathematical method is applied to probe the DNA structure of two related viral families: those of coronaviruses and those of influenza viruses. The coronaviruses are SARS-CoV-2, SARS-CoV-1, and MERS. The influenza viruses include H1N1-1918, H1N1-2009, H2N2-1957, and H3N2-1968. Methods: The mathematical method used is Slow Feature Analysis (SFA), a rather new but promising method to delineate complex structure in DNA sequences. Results: The analysis indicates that the DNA sequences exhibit an elaborate and convoluted structure akin to complex networks. We define a measure of complexity and show that each DNA sequence exhibits a certain degree of complexity within itself, while at the same time there exist complex inter-relationships between the sequences within a family and between the two families. From these relationships, we find evidence, especially for the coronavirus family, that increasing complexity in a sequence is associated with a higher transmission rate but lower mortality. Conclusions: The complexity measure defined here may hold promise and could become a useful tool for predicting transmission and mortality rates in future viral strains.


The Similarity Analysis of DNA Sequence Model Based on Graph Theory and Blast Program

EDUCATUM Journal of Science, Mathematics and Technology ◽

10.37134/ejsmt.vol4.1.6.2017 ◽

2017 ◽

Vol 4 (1) ◽

pp. 41-51

Author(s):

Y.A. Lesnussa ◽

S. Kappuw ◽

B.P. Tomasouw ◽

E.R. Persulessy

Keyword(s):

Graph Theory ◽

Dna Sequence ◽

Similarity Analysis ◽

Model Based ◽

Blast Program


A Novel Model for DNA Sequence Similarity Analysis Based on Graph Theory

Evolutionary Bioinformatics ◽

10.4137/ebo.s7364 ◽

2011 ◽

Vol 7 ◽

pp. EBO.S7364 ◽

Cited By ~ 14

Author(s):

Xingqin Qi ◽

Qin Wu ◽

Yusen Zhang ◽

Eddie Fuller ◽

Cun-Quan Zhang

Keyword(s):

Graph Theory ◽

Dna Sequence ◽

Sequence Similarity ◽

Similarity Analysis ◽

Sequence Similarity Analysis ◽

Novel Model


DNA sequence mapping in interphase and metaphase chromosomes by fluorescence in situ hybridization

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100122885 ◽

1992 ◽

Vol 50 (1) ◽

pp. 496-497

Author(s):

Barbara Trask ◽

Susan Allen ◽

Anne Bergmann ◽

Mari Christensen ◽

Anne Fertitta ◽

...

Keyword(s):

In Situ Hybridization ◽

Dna Sequence ◽

Dna Sequences ◽

Dual Band ◽

Nick Translation ◽

Metaphase Chromosomes ◽

Band Pass ◽

Texas Red ◽

Fluorescent Spot

Using fluorescence in situ hybridization (FISH), the positions of DNA sequences can be discretely marked with a fluorescent spot. The efficiency of marking DNA sequences of the size cloned in cosmids is 90-95%, and the fluorescent spots produced after FISH are ≈0.3 μm in diameter. Sites of two sequences can be distinguished using two-color FISH. Different reporter molecules, such as biotin or digoxigenin, are incorporated into DNA sequence probes by nick translation. These reporter molecules are labeled after hybridization with different fluorochromes, e.g., FITC and Texas Red. The development of dual band pass filters (Chromatechnology) allows these fluorochromes to be photographed simultaneously without registration shift.


DNA thermodynamics shape chromosome organization and topology

Biochemical Society Transactions ◽

10.1042/bst20120334 ◽

2013 ◽

Vol 41 (2) ◽

pp. 548-553 ◽

Cited By ~ 13

Author(s):

Andrew A. Travers ◽

Georgi Muskhelishvili

Keyword(s):

Dna Sequence ◽

Dna Sequences ◽

Chromosome Organization ◽

Topological Properties ◽

Genetic Organization ◽

And Topology ◽

Dna Translocases

How much information is encoded in the DNA sequence of an organism? We argue that the informational, mechanical and topological properties of DNA are interdependent and act together to specify the primary characteristics of genetic organization and chromatin structures. Superhelicity generated in vivo, in part by the action of DNA translocases, can be transmitted to topologically sensitive regions encoded by less stable DNA sequences.


Similarity analysis of DNA sequences based on the weighted pseudo-entropy

Journal of Computational Chemistry ◽

10.1002/jcc.21656 ◽

2010 ◽

Vol 32 (4) ◽

pp. 675-680 ◽

Cited By ~ 14

Author(s):

Chun Li ◽

Hong Ma ◽

Yang Zhou ◽

Xiaolei Wang ◽

Xiaoqi Zheng

Keyword(s):

Dna Sequences ◽

Similarity Analysis
