A Biologically-Inspired Computational Solution for Protein Coding Regions Identification in Noisy DNA Sequences

Author(s):  
Muneer Ahmad

Biologically inspired computational solutions for protein coding region identification are regarded as optimized solutions that can enhance regions of interest in noisy DNA signals, in contrast to contemporary identification methods. Exponentially growing genomic data demands better protein translation. The solutions proposed so far rely on statistical, digital signal processing, and Fourier transform approaches, and do not reflect optimal, biologically inspired identification of coding regions. This paper presents a distinctive biologically inspired solution for coding region identification based on wavelet transforms together with a distinctive indicator sequence. DNA signal noise is reduced considerably, and exon peaks can be discriminated clearly from introns. A comparative analysis over datasets commonly used for protein coding identification shows that the proposed solution outperforms existing methods both in power spectral density estimation graphs and in numerical discrimination measures. The results show a 75% reduction in computational complexity compared with the Binary indicator sequence method and a 32% to 266% improvement over other methods in the literature (measured against the standard NCBI range). These results are achieved by efficiently denoising the target DNA signal using wavelets and the distinctive indicator sequence.

2013 ◽  
pp. 1745-1754
Author(s):  
Muneer Ahmad ◽  
Azween Abdullah ◽  
Noor Zaman

Significant improvement in coding region identification was observed over many real datasets obtained from the National Center for Biotechnology Information (NCBI). Quantitatively, the authors observed a gain of 80.5% in coding identification with the Complex method, 42.5% with the Binary method, and 15% with the EIIP indicator sequence method on Mus musculus domesticus (house mouse), NCBI accession number NC_006914, gene length 7700 bp, with 4 coding regions. Further improvement with dyadic wavelet transforms is expected in future work.
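As an illustration of the three indicator-sequence encodings named above, the following minimal Python sketch maps a DNA string to Binary, Complex, and EIIP numeric sequences. The EIIP values and the complex assignment are the ones commonly quoted in the genomic signal processing literature and are assumed here; the abstract does not state which values the authors used, and the function names are hypothetical.

```python
# Numeric encodings of a DNA string. The tables below are commonly quoted
# values and are ASSUMED here; the abstract does not list the ones it used.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}
COMPLEX = {"A": 1 + 1j, "C": -1 - 1j, "G": -1 + 1j, "T": 1 - 1j}


def binary_indicators(seq):
    """Return four 0/1 sequences, one per base (binary indicator encoding)."""
    seq = seq.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}


def eiip_sequence(seq):
    """Return one real-valued sequence using the EIIP table."""
    return [EIIP[s] for s in seq.upper()]


def complex_sequence(seq):
    """Return one complex-valued sequence using the COMPLEX table."""
    return [COMPLEX[s] for s in seq.upper()]


example = "ATGCGA"
binary = binary_indicators(example)   # four sequences of length 6
eiip = eiip_sequence(example)         # one sequence of length 6
cplx = complex_sequence(example)      # one sequence of length 6
```

Note that the binary encoding yields four sequences per gene, each of which has to be transformed separately, whereas the EIIP and complex encodings yield a single sequence.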


2019 ◽  
Vol 31 (01) ◽  
pp. 1950002
Author(s):  
Subhajit Kar ◽  
Madhabi Ganguly ◽  
Saptarshi Das

Digital signal processing (DSP) has become an important research platform in biomedical engineering and plays a vital role in predicting protein coding regions (exons) from genomic sequences with high accuracy. The protein coding regions of a DNA sequence can be located with the help of the period-3 property. The DFT is the algorithm most often used to detect the period-3 property; in this paper, we test the FFT algorithm instead. DSP is concerned with processing numerical sequences, so applying it to DNA sequence analysis requires converting the sequence of base characters into a numerical form. The choice of numerical representation strongly affects how well the biological properties are reflected in the numerical sequence. In this work, the proposed technique, based on the DIT-FFT algorithm, is used to identify exonic regions, with an integer value representation used to transform the DNA sequences. Digital filters are used to extract the period-3 components from the output spectrum and to eliminate unwanted high-frequency noise from the DNA sequences. Overcoming background noise here means suppressing the non-coding regions, i.e., introns. The proposed algorithm is tested on four nucleotide sequences having single or multiple exons.
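The period-3 idea described above can be sketched in a few lines of Python. This is a simplified illustration, not the authors' method: it evaluates the single DFT bin k = N/3 directly in a sliding window rather than running a full DIT-FFT, it omits the digital filtering stage, and the integer mapping, window length, and function name are assumptions made for the example.

```python
import cmath
import random

# HYPOTHETICAL integer mapping; the abstract does not give the values used.
INTEGER_MAP = {"T": 0, "C": 1, "A": 2, "G": 3}


def period3_power(seq, window=351, step=3):
    """Slide a window over seq and return |X[N/3]|^2 / N for each position."""
    if window % 3:
        raise ValueError("window must be a multiple of 3")
    x = [INTEGER_MAP[b] for b in seq.upper()]
    # For the bin k = N/3 the DFT twiddle factors repeat every three samples.
    w = [cmath.exp(-2j * cmath.pi * r / 3) for r in range(3)]
    powers = []
    for start in range(0, len(x) - window + 1, step):
        acc = sum(x[start + n] * w[n % 3] for n in range(window))
        powers.append(abs(acc) ** 2 / window)
    return powers


# Toy input: random background around a stretch with a strict 3-base repeat.
random.seed(0)
background = "".join(random.choice("ACGT") for _ in range(600))
coding_like = "GCA" * 200  # stands in for a region with period-3 structure
toy = background + coding_like + background

profile = period3_power(toy)
peak = max(range(len(profile)), key=profile.__getitem__)
peak_start = peak * 3  # sequence position where the strongest window begins
```

In this toy case the power profile rises sharply for windows that overlap the repeated stretch and stays low over the random background. Real exons show a much weaker period-3 component, which is why the filtering and noise suppression steps matter in practice.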


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
T. M. Inbamalar ◽  
R. Sivakumar

Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information associated with genetic materials such as deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and proteins. Fast and precise identification of the protein coding regions in a DNA sequence is one of the most important tasks in this analysis. Existing digital signal processing (DSP) methods are less accurate, computationally complex, and subject to greater background noise. Hence, improvements in accuracy and computational complexity, and a reduction in background noise, are essential for identifying the protein coding regions in DNA sequences. In this paper, a new DSP-based method is introduced to detect the protein coding regions in DNA sequences. The DNA sequences are converted into numeric sequences using the electron ion interaction potential (EIIP) representation. A discrete wavelet transform is then applied. The absolute value of the energy is computed, followed by suitable thresholding. The method is tested using the databases available on the National Centre for Biotechnology Information (NCBI) site. A comparative analysis is carried out, and it confirms the efficiency of the proposed system.
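The pipeline outlined above (EIIP mapping, discrete wavelet transform, energy, threshold) can be sketched as follows. The abstract does not name the wavelet family, the number of decomposition levels, which coefficients are used, or the threshold rule, so the one-level Haar transform, the use of detail coefficients, and the fractional threshold below are all assumptions chosen to keep the example short; the function names are hypothetical.

```python
import math

# Commonly quoted EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def haar_dwt(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    x = list(x)
    if len(x) % 2:
        x.append(x[-1])  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def thresholded_energy(seq, fraction=0.5):
    """EIIP-encode seq, take Haar detail energy, zero values below the cutoff.

    ASSUMPTION: the cutoff is a fixed fraction of the maximum energy; the
    abstract only says that a "proper threshold" is applied.
    """
    x = [EIIP[b] for b in seq.upper()]
    _, detail = haar_dwt(x)
    energy = [d * d for d in detail]
    cutoff = fraction * max(energy)
    return [e if e >= cutoff else 0.0 for e in energy]


approx, detail = haar_dwt([1.0, 1.0, 2.0, 0.0])
kept = thresholded_energy("AGAA")
```

Because the Haar transform used here is orthonormal, the summed energy of the approximation and detail coefficients equals the energy of the input, so thresholding the coefficient energy discards a known share of the signal.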


Zootaxa ◽  
2020 ◽  
Vol 4748 (1) ◽  
pp. 182-194 ◽  
Author(s):  
JING ZHANG ◽  
ERNST BROCKMANN ◽  
QIAN CONG ◽  
JINHUI SHEN ◽  
NICK V. GRISHIN

We obtained whole genome shotgun sequences and phylogenetically analyzed protein-coding regions of representative skipper butterflies from the genus Carcharodus Hübner, [1819] and its close relatives. Type species of all available genus-group names were sequenced. We find that species attributed to four exclusively Old World genera (Spialia Swinhoe, 1912, Gomalia Moore, 1879, Carcharodus Hübner, [1819] and Muschampia Tutt, 1906) form a monophyletic group that we call a subtribe Carcharodina Verity, 1940. In the phylogenetic trees built from various genomic regions, these species form 7 (not 4) groups that we treat as genera. We find that Muschampia Tutt, 1906 is not monophyletic, and the 5th group is formed by currently monotypic genus Favria Tutt, 1906 new status (type species Hesperia cribrellum Eversmann, 1841), which is sister to Gomalia. The 6th and 7th groups are composed of mostly African species presently placed in Spialia. These groups do not have names and are described here as Ernsta Grishin, gen. n. (type species Pyrgus colotes Druce, 1875) and Agyllia Grishin, gen. n. (type species Pyrgus agylla Trimen, 1889). Two subgroups are recognized in Ernsta: the nominal subgenus and a new one: Delaga Grishin, subgen. n. (type species Pyrgus delagoae Trimen, 1898). Next, we observe that Carcharodus is not monophyletic, and species formerly placed in subgenera Reverdinus Ragusa, 1919 and Lavatheria Verity, 1940 are here transferred to Muschampia. Furthermore, due to differences in male genitalia or DNA sequences, we reinstate Gomalia albofasciata Moore, 1879 and Gomalia jeanneli (Picard, 1949) as species, not subspecies or synonyms of Gomalia elma (Trimen, 1862), and Spialia bifida (Higgins, 1924) as a species, not subspecies of Spialia zebra (Butler, 1888). 
Sequencing of the type specimens reveals a 2.2-3.2% difference in COI barcodes, evidence that, combined with wing pattern differences, suggests a new status of species for Spialia lugens (Staudinger, 1886) and Spialia carnea (Reverdin, 1927), formerly subspecies of Spialia orbifer (Hübner, [1823]).


1982 ◽  
Vol 10 (17) ◽  
pp. 5303-5318 ◽  
Author(s):  
James W. Fickett

2017 ◽  
Vol 13 (4) ◽  
pp. 63-78
Author(s):  
حمیدرضا صابرکاری ◽  
موسی شمسی ◽  
Hossein صداقی ◽  
...  
