An application of slow feature analysis to the genetic sequences of coronaviruses and influenza viruses

2021 ◽  
Vol 15 (1) ◽  
Author(s):  
Anastasios A. Tsonis ◽  
Geli Wang ◽  
Lvyi Zhang ◽  
Wenxu Lu ◽  
Aristotle Kayafas ◽  
...  

Abstract. Background: Mathematical approaches have been used for decades to probe the structure of DNA sequences. This has led to the development of bioinformatics. In this exploratory work, a novel mathematical method is applied to probe the DNA structure of two related viral families: coronaviruses and influenza viruses. The coronaviruses are SARS-CoV-2, SARS-CoV-1, and MERS. The influenza viruses include H1N1-1918, H1N1-2009, H2N2-1957, and H3N2-1968. Methods: The mathematical method used is slow feature analysis (SFA), a rather new but promising method for delineating complex structure in DNA sequences. Results: The analysis indicates that the DNA sequences exhibit an elaborate and convoluted structure akin to complex networks. We define a measure of complexity and show that each DNA sequence exhibits a certain degree of complexity within itself, while at the same time there exist complex inter-relationships between the sequences within a family and between the two families. From these relationships, we find evidence, especially for the coronavirus family, that increasing complexity in a sequence is associated with a higher transmission rate but lower mortality. Conclusions: The complexity measure defined here may hold promise and could become a useful tool for predicting transmission and mortality rates in future viral strains.

2021 ◽  
Vol 58 (1) ◽  
pp. 3428-3434
Author(s):  
W. W. P. M. T. M. Karunasena, G. S. Wijesiri

DNA is a complex molecule that carries biological information passed down from generation to generation. Over the course of evolution, different species have evolved from a common ancestor through DNA sequence rearrangements. DNA sequence similarity analysis is a major challenge because the number of sequences in DNA databases is increasing rapidly. In this research, we build on a mathematical method that uses graph theory to analyze the similarity of two DNA sequences. The method models each DNA sequence as a weighted directed graph, constructs its adjacency matrix, and converts the matrix into a representative vector for the graph. From these vectors, similarity is determined by distance measures such as Euclidean, cosine, and correlation. Taking this method as the base method, we check whether it is applicable to any DNA fragments in the considered genomes and whether molecular similarity coefficients can be used as distance measures. We also obtain similarities using the graph spectrum instead of the representative vector, and compare the results from the two. The modified method is tested using the mitochondrial DNA of human, gorilla, and orangutan. It gives the same result when the number of nucleotides in the DNA fragments is increased.
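The pipeline described in this abstract (graph, adjacency matrix, representative vector, distance) can be illustrated with a minimal sketch. The abstract does not state how edge weights or the representative vector are defined, so the choices below are assumptions made only for illustration: nodes are the four bases, the weight of edge u→v is the number of times v directly follows u, and the representative vector is the flattened adjacency matrix normalised to frequencies. The function names and the toy sequences are hypothetical, and this is not the authors' implementation.

```python
import math

NUCLEOTIDES = "ACGT"


def adjacency_matrix(seq):
    """Assumed weighting: entry [i][j] counts how often base j directly follows base i."""
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    matrix = [[0] * len(NUCLEOTIDES) for _ in NUCLEOTIDES]
    for a, b in zip(seq, seq[1:]):
        if a in index and b in index:  # skip ambiguous symbols such as N
            matrix[index[a]][index[b]] += 1
    return matrix


def representative_vector(matrix):
    """Assumed conversion: flatten row by row and normalise to transition frequencies."""
    flat = [weight for row in matrix for weight in row]
    total = sum(flat)
    return [weight / total for weight in flat] if total else [0.0] * len(flat)


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """One minus the cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def correlation_distance(u, v):
    """Cosine distance between the mean-centred vectors (one minus Pearson correlation)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_distance([a - mean_u for a in u], [b - mean_v for b in v])


# Toy fragments for illustration only; these are not real mitochondrial sequences.
first = representative_vector(adjacency_matrix("ACGTACGTGA"))
second = representative_vector(adjacency_matrix("ACGTTCGTGA"))
print("Euclidean:  ", euclidean_distance(first, second))
print("Cosine:     ", cosine_distance(first, second))
print("Correlation:", correlation_distance(first, second))
```

Smaller distances indicate more similar transition profiles under these assumptions. A graph-spectrum variant, as mentioned in the abstract, would compare the eigenvalues of the adjacency matrices instead of the flattened vectors.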


Author(s):  
Barbara Trask ◽  
Susan Allen ◽  
Anne Bergmann ◽  
Mari Christensen ◽  
Anne Fertitta ◽  
...  

Using fluorescence in situ hybridization (FISH), the positions of DNA sequences can be discretely marked with a fluorescent spot. The efficiency of marking DNA sequences of the size cloned in cosmids is 90-95%, and the fluorescent spots produced after FISH are ≈0.3 μm in diameter. Sites of two sequences can be distinguished using two-color FISH. Different reporter molecules, such as biotin or digoxigenin, are incorporated into DNA sequence probes by nick translation. These reporter molecules are labeled after hybridization with different fluorochromes, e.g., FITC and Texas Red. The development of dual band pass filters (Chromatechnology) allows these fluorochromes to be photographed simultaneously without registration shift.


2013 ◽  
Vol 41 (2) ◽  
pp. 548-553 ◽  
Author(s):  
Andrew A. Travers ◽  
Georgi Muskhelishvili

How much information is encoded in the DNA sequence of an organism? We argue that the informational, mechanical and topological properties of DNA are interdependent and act together to specify the primary characteristics of genetic organization and chromatin structures. Superhelicity generated in vivo, in part by the action of DNA translocases, can be transmitted to topologically sensitive regions encoded by less stable DNA sequences.


2016 ◽  
Vol 23 (12) ◽  
pp. 1702-1706 ◽  
Author(s):  
Zhouzhou He ◽  
Xi Li ◽  
Zhongfei Zhang ◽  
Yaqing Zhang ◽  
Jun Xiao ◽  
...  

1999 ◽  
Vol 341 (1) ◽  
pp. 89-93 ◽  
Author(s):  
Gianluca TELL ◽  
Lucia PELLIZZARI ◽  
Gennaro ESPOSITO ◽  
Carlo PUCILLO ◽  
Paolo Emidio MACCHIA ◽  
...  

Pax proteins are transcriptional regulators that play important roles during embryogenesis. These proteins recognize specific DNA sequences via a conserved element: the paired domain (Prd domain). The low level of organized secondary structure, in the free state, is a general feature of Prd domains; however, these proteins undergo a dramatic gain in α-helical content upon interaction with DNA (‘induced fit’). Pax8 is expressed in the developing thyroid, kidney and several areas of the central nervous system. In humans, mutations of the Pax8 gene, which are mapped to the coding region of the Prd domain, give rise to congenital hypothyroidism. Here, we have investigated the molecular defects caused by a mutation in which leucine at position 62 is replaced by an arginine. Leu62 is conserved among Prd domains, and contributes towards the packing together of helices 1 and 3. The binding affinity of the Leu62Arg mutant for a specific DNA sequence (the C sequence of thyroglobulin promoter) is decreased 60-fold with respect to the wild-type Pax8 Prd domain. However, the affinities with which the wild-type and the mutant proteins bind to a non-specific DNA sequence are very similar. CD spectra demonstrate that, in the absence of DNA, both wild-type Pax8 and the Leu62Arg mutant possess a low α-helical content; however, in the Leu62Arg mutant, the gain in α-helical content upon interaction with DNA is greatly reduced with respect to the wild-type protein. Thus the molecular defect of the Leu62Arg mutant causes a reduced capability for induced fit upon DNA interaction.


1985 ◽  
Vol 5 (4) ◽  
pp. 619-627
Author(s):  
M Montoya-Zavala ◽  
J L Hamlin

We have isolated overlapping recombinant cosmids that represent 150 kilobases of contiguous DNA sequence from the amplified dihydrofolate reductase domain of a methotrexate-resistant Chinese hamster ovary cell line (CHOC 400). This sequence includes the 25-kilobase dihydrofolate reductase gene and an origin of DNA synthesis. Eight cosmids that span this domain have been utilized as radioactive hybridization probes to analyze the similarities among the dihydrofolate reductase amplicons in four independently derived methotrexate-resistant Chinese hamster cell lines. We have observed no significant differences among the four cell lines within the 150-kilobase DNA sequence that we have examined, except for polymorphisms that result from the amplification of one or the other of two possible alleles of the dihydrofolate reductase domain. We also show that the restriction patterns of the amplicons in these four resistant cell lines are virtually identical to that of the corresponding, unamplified sequence in drug-susceptible parental cells. Furthermore, measurements of the relative copy numbers of fragments from widely separated regions of the amplicon suggest that all fragments in this 150-kilobase region may be amplified in unison. Our data show that in methotrexate-resistant Chinese hamster cells, the amplified unit is large relative to the dihydrofolate reductase gene itself. Furthermore, within the 150-kilobase amplified consensus sequence that we have examined, significant rearrangements do not seem to occur during the amplification process.

