A Novel Approach For Identification Of Exon Locations In DNA Sequences Using GLC Window

Author(s):  
P. Kamala Kumari ◽  
J.B. Seventline

The application of signal processing techniques to the identification of exons in deoxyribonucleic acid (DNA) sequences is a challenging task. The objective of this paper is to introduce a combinational window approach for locating exons in DNA sequences. In contrast to the traditional single window function used to evaluate the short-time Fourier transform (STFT), this work proposes a novel method for evaluating STFT coefficients using a combinational window function comprising Gaussian, Lanczos and Chebyshev (GLC) windows. The chosen combinational GLC window has the highest relative side lobe attenuation values compared to other window functions introduced by various researchers. The proposed algorithm incorporates the GLC window function both for evaluating STFT coefficients and in the design of an FIR bandpass filter. Simulation results revealed its effectiveness in improving evaluation parameters such as sensitivity, specificity, accuracy, area under the curve (AUC), and discrimination measure (DM). Furthermore, the proposed algorithm has been applied successfully to universal benchmark datasets such as C. elegans and Homo sapiens, among others. The proposed method has been shown to be an efficient approach for the prediction of protein coding regions compared to other existing methods. All simulations were done using MATLAB 2016a.
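The windowed-STFT idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' GLC method: it uses a plain Gaussian window only, assumes the usual binary indicator mapping of the four bases, and assumes the STFT is evaluated at the period-3 frequency (1/3 cycle per base) that DFT-based exon detectors conventionally use. The function names `indicator_sequences` and `period3_power` and the parameters `win_len` and `sigma` are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def indicator_sequences(seq):
    """Map a DNA string to four binary indicator sequences, one per base."""
    seq = seq.upper()
    return {b: np.array([c == b for c in seq], dtype=float) for b in "ACGT"}


def period3_power(seq, win_len=351, sigma=0.4):
    """Sliding-window spectral power at 1/3 cycle per base.

    Assumption: a single Gaussian window stands in for the combinational
    GLC window described in the abstract.
    """
    n = np.arange(win_len)
    centre = (win_len - 1) / 2
    # Gaussian window whose standard deviation is sigma * half the frame length
    window = np.exp(-0.5 * ((n - centre) / (sigma * centre)) ** 2)
    # Windowed complex exponential at normalized frequency 1/3
    kernel = window * np.exp(-2j * np.pi * n / 3)

    power = np.zeros(len(seq) - win_len + 1)
    for x in indicator_sequences(seq).values():
        frames = sliding_window_view(x, win_len)  # one row per window position
        coeffs = frames @ kernel                  # STFT coefficient per position
        power += np.abs(coeffs) ** 2              # sum power over the four bases
    return power
```

Peaks in the returned array would then be thresholded to propose candidate coding regions; the threshold, the FIR bandpass stage, and the Lanczos and Chebyshev components are left out because the abstract gives no parameters for them.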

2014 ◽  
Vol 2014 ◽  
pp. 1-14 ◽  
Author(s):  
Guangchen Liu ◽  
Yihui Luan

The identification of protein coding regions (exons) plays a critical role in eukaryotic gene structure prediction. Many techniques have been introduced for discriminating between exons and introns in eukaryotic DNA sequences, such as discrete Fourier transform (DFT) based techniques, but these DFT-based methods rapidly lose their effectiveness in the case of short DNA sequences. In this paper, a novel integrated algorithm based on autoregressive spectrum analysis and the wavelet packet transform is presented to improve the efficiency and accuracy of coding region identification. Experimental results on the GENSCAN65, HMR195, and BG570 benchmark datasets show that the new algorithm distinctly outperforms conventional DFT-based approaches in the prediction accuracy of protein coding regions.
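As a rough illustration of the autoregressive spectrum analysis named in the abstract above, the sketch below fits an AR model by the Yule-Walker equations and evaluates its power spectrum at a chosen frequency. This is an assumption-laden stand-in: the abstract does not say which AR estimator or model order the authors use, the wavelet packet stage is omitted entirely, and the names `yule_walker_ar` and `ar_psd` are hypothetical.

```python
import numpy as np


def yule_walker_ar(x, order):
    """Estimate AR coefficients and noise variance via the Yule-Walker equations."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates r[0], ..., r[order]
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
    # Toeplitz autocorrelation matrix for the system R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    noise_var = r[0] - np.dot(a, r[1:])
    return a, noise_var


def ar_psd(a, noise_var, freq):
    """AR power spectral density at a normalized frequency (cycles per sample)."""
    k = np.arange(1, len(a) + 1)
    denom = 1.0 - np.sum(a * np.exp(-2j * np.pi * freq * k))
    return noise_var / np.abs(denom) ** 2
```

Applied to an indicator sequence from a short window, `ar_psd(a, noise_var, 1/3)` gives a parametric estimate of the period-3 component, which is the kind of quantity a DFT would estimate less sharply on short sequences.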


2005 ◽  
Vol 2005 (2) ◽  
pp. 139-146 ◽  
Author(s):  
Jianbo Gao ◽  
Yan Qi ◽  
Yinhe Cao ◽  
Wen-wen Tung

Most codon indices used today are based on highly biased nonrandom usage of codons in coding regions. The background of a coding or noncoding DNA sequence, however, is fairly random, and can be characterized as a random fractal. When a gene-finding algorithm incorporates multiple sources of information about coding regions, it becomes more successful. It is thus highly desirable to develop new and efficient codon indices by simultaneously characterizing the fractal and periodic features of a DNA sequence. In this paper, we describe a novel way of achieving this goal. The efficiency of the new codon index is evaluated by studying all of the 16 yeast chromosomes. In particular, we show that the method automatically and correctly identifies which of the three reading frames is the one that contains a gene.


Author(s):  
Barbara Trask ◽  
Susan Allen ◽  
Anne Bergmann ◽  
Mari Christensen ◽  
Anne Fertitta ◽  
...  

Using fluorescence in situ hybridization (FISH), the positions of DNA sequences can be discretely marked with a fluorescent spot. The efficiency of marking DNA sequences of the size cloned in cosmids is 90-95%, and the fluorescent spots produced after FISH are ≈0.3 μm in diameter. Sites of two sequences can be distinguished using two-color FISH. Different reporter molecules, such as biotin or digoxigenin, are incorporated into DNA sequence probes by nick translation. These reporter molecules are labeled after hybridization with different fluorochromes, e.g., FITC and Texas Red. The development of dual band pass filters (Chromatechnology) allows these fluorochromes to be photographed simultaneously without registration shift.


2013 ◽  
Vol 41 (2) ◽  
pp. 548-553 ◽  
Author(s):  
Andrew A. Travers ◽  
Georgi Muskhelishvili

How much information is encoded in the DNA sequence of an organism? We argue that the informational, mechanical and topological properties of DNA are interdependent and act together to specify the primary characteristics of genetic organization and chromatin structures. Superhelicity generated in vivo, in part by the action of DNA translocases, can be transmitted to topologically sensitive regions encoded by less stable DNA sequences.


2021 ◽  
Vol 15 (1) ◽  
Author(s):  
Anastasios A. Tsonis ◽  
Geli Wang ◽  
Lvyi Zhang ◽  
Wenxu Lu ◽  
Aristotle Kayafas ◽  
...  

Background: Mathematical approaches have been used for decades to probe the structure of DNA sequences. This has led to the development of bioinformatics. In this exploratory work, a novel mathematical method is applied to probe the DNA structure of two related viral families: those of coronaviruses and those of influenza viruses. The coronaviruses are SARS-CoV-2, SARS-CoV-1, and MERS. The influenza viruses include H1N1-1918, H1N1-2009, H2N2-1957, and H3N2-1968.
Methods: The mathematical method used is slow feature analysis (SFA), a rather new but promising method to delineate complex structure in DNA sequences.
Results: The analysis indicates that the DNA sequences exhibit an elaborate and convoluted structure akin to complex networks. We define a measure of complexity and show that each DNA sequence exhibits a certain degree of complexity within itself, while at the same time there exist complex inter-relationships between the sequences within a family and between the two families. From these relationships, we find evidence, especially for the coronavirus family, that increasing complexity in a sequence is associated with a higher transmission rate but with lower mortality.
Conclusions: The complexity measure defined here may hold promise and could become a useful tool in the prediction of transmission and mortality rates in future new viral strains.


Cell ◽  
1984 ◽  
Vol 38 (3) ◽  
pp. 667-673 ◽  
Author(s):  
Michael Levine ◽  
Gerald M. Rubin ◽  
Robert Tjian

1999 ◽  
Vol 341 (1) ◽  
pp. 89-93 ◽  
Author(s):  
Gianluca TELL ◽  
Lucia PELLIZZARI ◽  
Gennaro ESPOSITO ◽  
Carlo PUCILLO ◽  
Paolo Emidio MACCHIA ◽  
...  

Pax proteins are transcriptional regulators that play important roles during embryogenesis. These proteins recognize specific DNA sequences via a conserved element: the paired domain (Prd domain). The low level of organized secondary structure, in the free state, is a general feature of Prd domains; however, these proteins undergo a dramatic gain in α-helical content upon interaction with DNA (‘induced fit’). Pax8 is expressed in the developing thyroid, kidney and several areas of the central nervous system. In humans, mutations of the Pax8 gene, which are mapped to the coding region of the Prd domain, give rise to congenital hypothyroidism. Here, we have investigated the molecular defects caused by a mutation in which leucine at position 62 is replaced by an arginine. Leu62 is conserved among Prd domains, and contributes towards the packing together of helices 1 and 3. The binding affinity of the Leu62Arg mutant for a specific DNA sequence (the C sequence of thyroglobulin promoter) is decreased 60-fold with respect to the wild-type Pax8 Prd domain. However, the affinities with which the wild-type and the mutant proteins bind to a non-specific DNA sequence are very similar. CD spectra demonstrate that, in the absence of DNA, both wild-type Pax8 and the Leu62Arg mutant possess a low α-helical content; however, in the Leu62Arg mutant, the gain in α-helical content upon interaction with DNA is greatly reduced with respect to the wild-type protein. Thus the molecular defect of the Leu62Arg mutant causes a reduced capability for induced fit upon DNA interaction.


1991 ◽  
Vol 11 (1) ◽  
pp. 533-543
Author(s):  
R M Mulligan ◽  
P Leon ◽  
V Walbot

Lysed maize mitochondria synthesize RNA in the presence of radioactive nucleoside triphosphates, and this assay was utilized to compare the rates of transcription of seven genes. The rates of incorporation varied over a 14-fold range, with the following rank order: 18S rRNA > 26S rRNA > atp1 > atp6 > atp9 > cob > cox3. The products of run-on transcription hybridized specifically to known transcribed regions and selectively to the antisense DNA strand; thus, the isolated run-on transcription system appears to be an accurate representation of endogenous transcription. Although there were small differences in gene copy abundance, these differences cannot account for the differences in apparent transcription rates; we conclude that promoter strength is the main determinant. Among the protein coding genes, incorporation was greatest for atp1. The most active transcription initiation site of this gene was characterized by hybridization with in vitro-capped RNA and by primer extension analyses. The DNA sequences at this and other transcription initiation sites that we have previously mapped were analyzed with respect to the apparent promoter strengths. We propose that two short sequence elements just upstream of initiation sites form at least a portion of the sequence requirements for a maize mitochondrial promoter. In addition to modulation at the level of transcription, steady-state abundance of protein-coding mRNAs varied over a 20-fold range and did not correlate with transcriptional activity. These observations suggest that posttranscriptional processes are important in the modulation of mRNA abundance.


1987 ◽  
Vol 7 (8) ◽  
pp. 2933-2940
Author(s):  
H Honkawa ◽  
W Masahashi ◽  
S Hashimoto ◽  
T Hashimoto-Gotoh

A number of deletion mutants were isolated, including 5', 3', and internal deletions in the 5'-flanking region of the human cellular oncogene related to the Harvey sarcoma virus (c-H-ras), and their transforming activities were examined in NIH 3T3 cells. DNA sequences which could not be deleted without losing transforming activity were localized to a relatively short stretch upstream of the region which showed homology to the 5'-flanking region of the v-H-ras oncogene. S1 nuclease analysis indicated that there were two clusters of mRNA start sites at positions that were about 1,371 and 1,298 base pairs upstream of the first coding ATG. The minimum region required for promoter function was estimated to be a 51-base-pair-long (or less) DNA segment. The promoter was GC rich (78%) and did not contain the consensus sequences that are usually observed in PolII-directed promoters but contained a GC box within which one of the mRNA start sites was included. In addition, two sets of positive and negative elements seemed to be located between the promoter and the protein-coding region, which appeared to influence positively and negatively, respectively, the efficiency of transformation with the c-H-ras oncogene.

