Application of Graph Theory in DNA similarity analysis of Evolutionary Closed Species.

2021 ◽  
Vol 58 (1) ◽  
pp. 3428-3434
Author(s):  
W. W. P. M. T. M. Karunasena, G. S. Wijesiri

DNA is a complex molecule that carries biological information passed down from generation to generation. Over evolutionary time, different species have evolved from a common ancestor through DNA sequence rearrangements. DNA sequence similarity analysis is a major challenge because the number of sequences in DNA databases is increasing rapidly. In this research, we build on a mathematical method that analyzes the similarity of two DNA sequences using Graph Theory. This method starts by modeling a weighted directed graph for each DNA sequence, constructing its adjacency matrix, and converting it to a representative vector for each graph. From these vectors, similarity is determined by distance measurements such as Euclidean, Cosine, and Correlation. Keeping this method as the base method, we check whether it is applicable to any DNA fragments in the considered genomes and whether molecular similarity coefficients can be used as distance measurements. We also obtain similarities using the graph spectrum instead of the representative vector, and then compare the results from the representative vector with those from the graph spectrum. The modified method is tested using the mitochondrial DNA of Human, Gorilla, and Orangutan. It gives the same result when the number of nucleotides in the DNA fragments is increased.
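The pipeline described in this abstract (weighted directed graph, adjacency matrix, representative vector, distance measurement) can be illustrated with a minimal Python sketch. The abstract does not say how edge weights or the representative vector are defined, so the choices here are assumptions made only for illustration: nodes are the four nucleotides, the weight of edge i→j is the number of times nucleotide i is immediately followed by j, and the representative vector is the row-wise flattening of the adjacency matrix. All function names are hypothetical.

```python
import math

NUCLEOTIDES = "ACGT"


def transition_matrix(seq):
    """4x4 adjacency matrix of an assumed weighted directed graph.

    Entry [i][j] counts how often nucleotide i is immediately followed
    by nucleotide j. This weighting is an illustrative assumption, not
    the construction used in the paper.
    """
    index = {n: k for k, n in enumerate(NUCLEOTIDES)}
    matrix = [[0] * 4 for _ in range(4)]
    for a, b in zip(seq, seq[1:]):
        if a in index and b in index:  # skip ambiguous symbols such as N
            matrix[index[a]][index[b]] += 1
    return matrix


def representative_vector(matrix):
    """Flatten the adjacency matrix row by row (an assumed representation)."""
    return [float(w) for row in matrix for w in row]


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """1 - cosine similarity; undefined (ZeroDivisionError) for zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def correlation_distance(u, v):
    """Cosine distance between the mean-centred vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_distance([a - mean_u for a in u], [b - mean_v for b in v])


if __name__ == "__main__":
    # Toy fragments, not real mitochondrial DNA.
    v1 = representative_vector(transition_matrix("ACGTACGTTGCA"))
    v2 = representative_vector(transition_matrix("ACGTTCGTAGCA"))
    print(euclidean_distance(v1, v2))
    print(cosine_distance(v1, v2))
    print(correlation_distance(v1, v2))
```

A smaller distance is read as higher similarity. The graph-spectrum variant mentioned in the abstract would replace `representative_vector` with the eigenvalues of the adjacency matrix; it is omitted here because it needs a numerical linear algebra library.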

2021 ◽  
Vol 15 (1) ◽  
Author(s):  
Anastasios A. Tsonis ◽  
Geli Wang ◽  
Lvyi Zhang ◽  
Wenxu Lu ◽  
Aristotle Kayafas ◽  
...  

Abstract. Background: Mathematical approaches have been used for decades to probe the structure of DNA sequences. This has led to the development of Bioinformatics. In this exploratory work, a novel mathematical method is applied to probe the DNA structure of two related viral families: those of coronaviruses and those of influenza viruses. The coronaviruses are SARS-CoV-2, SARS-CoV-1, and MERS. The influenza viruses include H1N1-1918, H1N1-2009, H2N2-1957, and H3N2-1968. Methods: The mathematical method used is slow feature analysis (SFA), a rather new but promising method to delineate complex structure in DNA sequences. Results: The analysis indicates that the DNA sequences exhibit an elaborate and convoluted structure akin to complex networks. We define a measure of complexity and show that each DNA sequence exhibits a certain degree of complexity within itself, while at the same time there exist complex inter-relationships between the sequences within a family and between the two families. From these relationships, we find evidence, especially for the coronavirus family, that increasing complexity in a sequence is associated with a higher transmission rate but lower mortality. Conclusions: The complexity measure defined here may hold promise and could become a useful tool in the prediction of transmission and mortality rates in future new viral strains.


Author(s):  
Dan Wei ◽  
Qingshan Jiang ◽  
Sheng Li

Similarity analysis of DNA sequences is a fundamental research area in Bioinformatics. The characteristic distribution of L-tuple, which is the tuple of length L, reflects the valuable information contained in a biological sequence and thus may be used in DNA sequence similarity analysis. However, similarity analysis based on characteristic distribution of L-tuple is not effective for the comparison of highly conservative sequences. In this paper, a new similarity measurement approach based on Triplets of Nucleic Acid Bases (TNAB) is introduced for DNA sequence similarity analysis. The new approach characterizes both the content feature and position feature of a DNA sequence using the frequency and position of occurrence of TNAB in the sequence. The experimental results show that the approach based on TNAB is effective for analysing DNA sequence similarity.
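As a rough illustration of combining a content feature and a position feature for nucleotide triplets, the following Python sketch records, for each of the 64 triplets, its relative frequency and its mean normalized position in the sequence. The abstract does not give the exact TNAB definitions, so the overlapping windows, the mean-position feature, and the Euclidean comparison are all assumptions, and the function names are hypothetical.

```python
import math
from itertools import product

TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def triplet_features(seq):
    """Map each triplet to (relative frequency, mean normalized position).

    Overlapping windows of length 3 are used. Triplets that never occur
    get (0.0, 0.0). Both feature definitions are illustrative assumptions.
    """
    n_windows = len(seq) - 2
    if n_windows <= 0:
        raise ValueError("sequence must contain at least three nucleotides")
    positions = {t: [] for t in TRIPLETS}
    for i in range(n_windows):
        window = seq[i:i + 3]
        if window in positions:  # skip windows with ambiguous symbols
            positions[window].append(i)
    features = {}
    for t in TRIPLETS:
        hits = positions[t]
        freq = len(hits) / n_windows
        mean_pos = (sum(hits) / len(hits)) / n_windows if hits else 0.0
        features[t] = (freq, mean_pos)
    return features


def feature_distance(seq_a, seq_b):
    """Euclidean distance over the 128 frequency and position features."""
    fa = triplet_features(seq_a)
    fb = triplet_features(seq_b)
    return math.sqrt(sum(
        (fa[t][0] - fb[t][0]) ** 2 + (fa[t][1] - fb[t][1]) ** 2
        for t in TRIPLETS
    ))


if __name__ == "__main__":
    # Toy sequences: same triplet content, different arrangement.
    print(feature_distance("ACGACGTTT", "TTTACGACG"))
```

Because the position feature is included, two sequences with identical triplet counts but a different ordering still get a non-zero distance, which is the motivation the abstract gives for going beyond frequency-only L-tuple distributions.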


Bioinformatics, now a well-known field of study, originated in the context of biological sequence analysis. Recently, graphical representation has been adopted in research on DNA sequences. Research on biological sequences is mainly based on their function and structure. Bioinformatics finds a wide range of applications, specifically in molecular biology, which focuses on the analysis of molecules such as DNA, RNA, and proteins. In this review, we mainly deal with similarity analysis between sequences and graphical representation of DNA sequences.


2021 ◽  
Author(s):  
Dong Quan Ngoc Nguyen ◽  
Lin Xing ◽  
Phuong Dong Tan Le ◽  
Lizhen Lin

One of the very active research areas in bioinformatics is DNA similarity analysis. There are several approaches using alignment-based or alignment-free methods to analyze similarities/dissimilarities between DNA sequences. In this work, we introduce a novel representation of DNA sequences, using n-ary Cartesian products of graphs for arbitrary positive integers n. Each of the component graphs in the Cartesian product representing a DNA sequence contains combinatorial information about certain tuples of nucleotides appearing in the sequence. We further introduce a metric space structure on the set of all Cartesian products of graphs that represent a given collection of DNA sequences, so that different Cartesian products of graphs can be compared, which in turn signifies similarities/dissimilarities between DNA sequences. We test our proposed method on several datasets including Human Papillomavirus, Human rhinovirus, Influenza A virus, and Mammals. A comparison with other methods in the literature indicates that our results are comparable in terms of time complexity and accuracy, and on one dataset our method performs best.
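For readers unfamiliar with the operation, the standard Cartesian product of two undirected graphs can be sketched in Python as below. This shows only the graph-theoretic definition: vertices are pairs (u, v), and (u, v) is adjacent to (u', v') when u = u' and v is adjacent to v', or v = v' and u is adjacent to u'. How the component graphs are built from nucleotide tuples, and the metric used to compare products, are not given in the abstract and are not modeled here. The function names and the toy path graphs are hypothetical.

```python
from itertools import product


def cartesian_product(g, h):
    """Cartesian product of two undirected simple graphs.

    Each graph is a dict mapping a vertex to the set of its neighbours.
    """
    result = {(u, v): set() for u, v in product(g, h)}
    for u, v in result:
        for v2 in h[v]:          # move along an edge of h, keep u fixed
            result[(u, v)].add((u, v2))
        for u2 in g[u]:          # move along an edge of g, keep v fixed
            result[(u, v)].add((u2, v))
    return result


def edge_count(graph):
    """Number of undirected edges (each edge is stored in both directions)."""
    return sum(len(neighbours) for neighbours in graph.values()) // 2


if __name__ == "__main__":
    path2 = {0: {1}, 1: {0}}                # a single edge
    path3 = {0: {1}, 1: {0, 2}, 2: {1}}     # a path on three vertices
    grid = cartesian_product(path2, path3)  # a 2 x 3 grid graph
    print(len(grid), edge_count(grid))
```

An n-ary product can be obtained by applying `cartesian_product` repeatedly, since the operation is associative up to relabeling of vertices.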


2021 ◽  
Author(s):  
Anastasios Tsonis ◽  
Geli Wang ◽  
Lvyi Zhang ◽  
Wenxu Lu ◽  
Aristotle Kayafas ◽  
...  

Abstract. Background: Mathematical approaches have been used for decades to probe the structure of DNA sequences. This has led to the development of Bioinformatics. In this exploratory work, a novel mathematical method is applied to probe the DNA structure of two related viral families: those of coronaviruses and those of influenza viruses. The coronaviruses are SARS-CoV-2, SARS-CoV-1, and MERS. The influenza viruses include H1N1-1918, H1N1-2009, H2N2-1957, and H3N2-1968. Methods: The mathematical method used is Slow Feature Analysis (SFA), a rather new but promising method to delineate complex structure in DNA sequences. Results: The analysis indicates that the DNA sequences exhibit an elaborate and convoluted structure akin to complex networks. We define a measure of complexity and show that each DNA sequence exhibits a certain degree of complexity within itself, while at the same time there exist complex inter-relationships between the sequences within a family and between the two families. From these relationships, we find evidence, especially for the coronavirus family, that increasing complexity in a sequence is associated with a higher transmission rate but lower mortality. Conclusions: The complexity measure defined here may hold promise and could become a useful tool in the prediction of transmission and mortality rates in future new viral strains.


2011 ◽  
Vol 7 ◽  
pp. EBO.S7364 ◽  
Author(s):  
Xingqin Qi ◽  
Qin Wu ◽  
Yusen Zhang ◽  
Eddie Fuller ◽  
Cun-Quan Zhang

Author(s):  
Barbara Trask ◽  
Susan Allen ◽  
Anne Bergmann ◽  
Mari Christensen ◽  
Anne Fertitta ◽  
...  

Using fluorescence in situ hybridization (FISH), the positions of DNA sequences can be discretely marked with a fluorescent spot. The efficiency of marking DNA sequences of the size cloned in cosmids is 90-95%, and the fluorescent spots produced after FISH are ≈0.3 μm in diameter. Sites of two sequences can be distinguished using two-color FISH. Different reporter molecules, such as biotin or digoxigenin, are incorporated into DNA sequence probes by nick translation. These reporter molecules are labeled after hybridization with different fluorochromes, e.g., FITC and Texas Red. The development of dual band pass filters (Chromatechnology) allows these fluorochromes to be photographed simultaneously without registration shift.


2013 ◽  
Vol 41 (2) ◽  
pp. 548-553 ◽  
Author(s):  
Andrew A. Travers ◽  
Georgi Muskhelishvili

How much information is encoded in the DNA sequence of an organism? We argue that the informational, mechanical and topological properties of DNA are interdependent and act together to specify the primary characteristics of genetic organization and chromatin structures. Superhelicity generated in vivo, in part by the action of DNA translocases, can be transmitted to topologically sensitive regions encoded by less stable DNA sequences.


2010 ◽  
Vol 32 (4) ◽  
pp. 675-680 ◽  
Author(s):  
Chun Li ◽  
Hong Ma ◽  
Yang Zhou ◽  
Xiaolei Wang ◽  
Xiaoqi Zheng
