An Optimization to Protein Coding Regions Identification in Eukaryotes

2012 ◽

pp. 281-290

Author(s):

Muneer Ahmad ◽

Azween Abdullah ◽

Noor Zaman

Keyword(s):

Continuous Improvement ◽

Wavelet Transforms ◽

Complex Method ◽

Mus Musculus Domesticus ◽

Protein Coding ◽

Coding Regions ◽

Dyadic Wavelet ◽

Sequence Method ◽

Future Expectation ◽

Ncbi Accession Number

Significant improvement in coding-region identification was observed over many real datasets obtained from the National Center for Biotechnology Information (NCBI). Quantitatively, the authors report gains of 80.5% in coding identification with the Complex method, 42.5% with the Binary method, and 15% with the EIIP indicator sequence method on Mus musculus domesticus (house mouse), NCBI accession number NC_006914, gene length 7700 bp, with 4 coding regions. The authors expect continued improvement with dyadic wavelet transforms in future work.
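The Complex, Binary, and EIIP methods named in this abstract are ways of turning a DNA string into a numeric signal before any transform is applied. The following is a minimal sketch of the three mappings, not the authors' implementation. The EIIP values are the standard electron-ion interaction potentials for the four nucleotides; the complex assignment shown is one common convention and is an assumption here, since the abstract does not state which mapping was used. The function names are hypothetical.

```python
# Standard EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}

# One common complex mapping (assumed; other conventions exist in the literature).
COMPLEX = {"A": 1 + 1j, "C": -1 - 1j, "G": -1 + 1j, "T": 1 - 1j}


def binary_indicators(seq):
    """Return four 0/1 sequences, one per base, marking where that base occurs."""
    seq = seq.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}


def eiip_sequence(seq):
    """Map each base to its EIIP value, giving a single real-valued signal."""
    return [EIIP[s] for s in seq.upper()]


def complex_sequence(seq):
    """Map each base to a complex number, giving a single complex-valued signal."""
    return [COMPLEX[s] for s in seq.upper()]
```

The Binary mapping yields four signals per sequence, while the EIIP and Complex mappings yield one each, which is one reason the methods differ in computational cost.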

A Biologically-Inspired Computational Solution for Protein Coding Regions Identification in Noisy DNA Sequences

Advances in Environmental Engineering and Green Technologies - Biologically-Inspired Energy Harvesting through Wireless Sensor Technologies ◽

10.4018/978-1-4666-9792-8.ch010 ◽

2016 ◽

pp. 201-216

Author(s):

Muneer Ahmad

Keyword(s):

Dna Sequences ◽

Wavelet Transforms ◽

Fourier Transforms ◽

Digital Signal ◽

Biologically Inspired ◽

Protein Coding ◽

Spectral Density Estimation ◽

Coding Regions ◽

Power Spectral ◽

Sequence Method

Biologically inspired computational solutions for protein coding region identification are described as optimized solutions that can enhance regions of interest in noisy DNA signals, in contrast to contemporary identification methods. Exponentially growing genomic data needs better protein translation. The solutions proposed so far rely on statistical, digital signal processing, and Fourier transform approaches, which do not aim at optimal biologically inspired identification of coding regions. This paper presents a biologically inspired solution for coding region identification based on wavelet transforms with the notion of a peculiar indicator sequence. DNA signal noise is reduced considerably, and exon peaks can be discriminated from introns significantly. A comparative analysis over datasets commonly used for protein coding identification showed that the proposed solution outperforms others in power spectral density estimation graphs and in numerical discrimination measures. The results show a 75% reduction in computational complexity compared with the Binary indicator sequence method and a 32% to 266% improvement over other methods in the literature (compared against the standard NCBI range). These results were achieved by efficiently denoising the target DNA signal using wavelets and the peculiar indicator sequence.
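To make the signal-processing vocabulary in this abstract concrete, the sketch below shows two generic building blocks: a one-level Haar wavelet denoising step and a sliding-window measure of spectral power at period 3, the frequency where coding regions typically show a peak. This is an illustration under simplifying assumptions, not the paper's method: the abstract does not specify the wavelet, the thresholding rule, or the indicator sequence, so the Haar wavelet, the hard threshold, the function names, and the default window width of 351 are all choices made here for illustration.

```python
import cmath
import math


def haar_denoise(signal, threshold):
    """One-level Haar transform, hard-threshold the detail coefficients, invert."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    out = []
    for i in range(0, len(signal), 2):
        approx = (signal[i] + signal[i + 1]) / root2   # low-pass coefficient
        detail = (signal[i] - signal[i + 1]) / root2   # high-pass coefficient
        if abs(detail) < threshold:
            detail = 0.0                               # drop small details as noise
        out.append((approx + detail) / root2)          # inverse Haar step
        out.append((approx - detail) / root2)
    return out


def period3_power(window):
    """Squared magnitude of the DFT of `window` at 1/3 cycles per base."""
    total = sum(x * cmath.exp(-2j * math.pi * n / 3) for n, x in enumerate(window))
    return abs(total) ** 2


def sliding_period3(signal, width=351, step=3):
    """Period-3 power in each window of `width` samples, advancing by `step`."""
    return [
        period3_power(signal[start:start + width])
        for start in range(0, len(signal) - width + 1, step)
    ]
```

Plotting the output of `sliding_period3` against window position gives the kind of power spectral density graph the abstract refers to; peaks are candidate exons, and denoising the numeric signal first is meant to make those peaks stand out from the intron background.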

Evolutionary Analysis of DNA-Protein-Coding Regions Based on a Genetic Code Cube Metric

Current Topics in Medicinal Chemistry ◽

10.2174/1568026613666131204110022 ◽

2014 ◽

Vol 14 (3) ◽

pp. 407-417

Author(s):

Robersy Sanchez

Keyword(s):

Genetic Code ◽

Evolutionary Analysis ◽

Protein Coding ◽

Coding Regions

The open targets post-GWAS analysis pipeline

Bioinformatics ◽

10.1093/bioinformatics/btaa020 ◽

2020 ◽

Vol 36 (9) ◽

pp. 2936-2937 ◽

Cited By ~ 4

Author(s):

Gareth Peat ◽

William Jones ◽

Michael Nuhn ◽

José Carlos Marugán ◽

William Newell ◽

...

Keyword(s):

Drug Targets ◽

Gene Expression Regulation ◽

Association Studies ◽

Genome Wide Association Studies ◽

Protein Coding ◽

Data Resource ◽

Coding Regions ◽

Genome Wide ◽

Causal Genes ◽

Interactive Data

Motivation: Genome-wide association studies (GWAS) are a powerful method to detect even weak associations between variants and phenotypes; however, many of the identified associated variants are in non-coding regions, and presumably influence gene expression regulation. Identifying potential drug targets, i.e. causal protein-coding genes, therefore, requires crossing the genetics results with functional data. Results: We present a novel data integration pipeline that analyses GWAS results in the light of experimental epigenetic and cis-regulatory datasets, such as ChIP-Seq, Promoter-Capture Hi-C or eQTL, and presents them in a single report, which can be used for inferring likely causal genes. This pipeline was then fed into an interactive data resource. Availability and implementation: The analysis code is available at www.github.com/Ensembl/postgap and the interactive data browser at postgwas.opentargets.io.

Novel exon 1 protein‐coding regions N‐terminally extend human KCNE3 and KCNE4

The FASEB Journal ◽

10.1096/fj.201600467r ◽

2016 ◽

Vol 30 (8) ◽

pp. 2959-2969 ◽

Cited By ~ 8

Author(s):

Geoffrey W. Abbott

Keyword(s):

Protein Coding ◽

Coding Regions ◽

Exon 1 ◽

Novel Exon

Protein-coding structured RNAs: A computational survey of conserved RNA secondary structures overlapping coding regions in drosophilids

Biochimie ◽

10.1016/j.biochi.2011.07.023 ◽

2011 ◽

Vol 93 (11) ◽

pp. 2019-2023 ◽

Cited By ~ 8

Author(s):

Sven Findeiß ◽

Jan Engelhardt ◽

Sonja J. Prohaska ◽

Peter F. Stadler

Keyword(s):

Secondary Structures ◽

Protein Coding ◽

Rna Secondary Structures ◽

Coding Regions

Structure and expression of canary myc family genes

Molecular and Cellular Biology ◽

10.1128/mcb.11.3.1770-1776.1991 ◽

1991 ◽

Vol 11 (3) ◽

pp. 1770-1776

Author(s):

R G Collum ◽

D F Clayton ◽

F W Alt

Keyword(s):

Untranslated Region ◽

Untranslated Regions ◽

Coding Region ◽

Protein Coding ◽

Coding Regions ◽

Neuronal Precursors ◽

Myc Gene ◽

Mature Neurons

We found that the canary N-myc gene is highly related to mammalian N-myc genes in both the protein-coding region and the long 3' untranslated region. Examined coding regions of the canary c-myc gene were also highly related to their mammalian counterparts, but in contrast to N-myc, the canary and mammalian c-myc genes were quite divergent in their 3' untranslated regions. We readily detected N-myc and c-myc expression in the adult canary brain and found N-myc expression both at sites of proliferating neuronal precursors and in mature neurons.

Sequence and phylogenetic analysis of the non-structural 3A and 3B protein-coding regions of foot-and-mouth disease virus subtype A Iran 05

Journal of Veterinary Science ◽

10.4142/jvs.2010.11.3.243 ◽

2010 ◽

Vol 11 (3) ◽

pp. 243

Author(s):

Saber Jelokhani-Niaraki ◽

Majid Esmaelizad ◽

Morteza Daliri ◽

Rasoul Vaez-Torshizi ◽

Morteza Kamalzadeh ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Disease Virus ◽

Foot And Mouth Disease ◽

Mouth Disease ◽

Protein Coding ◽

Coding Regions ◽

Mouth Disease Virus ◽

Foot And Mouth ◽

Subtype A ◽

Virus Subtype

Genome Sequence of Rheinheimera salexigens sp. nov. Isolated from a Fishing Hook off O‘ahu, Hawai‘i

Genome Announcements ◽

10.1128/genomea.01390-16 ◽

2016 ◽

Vol 4 (6) ◽

Cited By ~ 1

Author(s):

Xuehua Wan ◽

Shaobin Hou ◽

Kazukuni Hayashi ◽

James Anderson ◽

Stuart P. Donachie

Keyword(s):

Genome Sequence ◽

Draft Genome ◽

Draft Genome Sequence ◽

Protein Coding ◽

Coding Sequences ◽

Coding Regions ◽

Roche 454

Rheinheimera salexigens KH87ᵀ is an obligately halophilic gammaproteobacterium. The strain’s draft genome sequence, generated by the Roche 454 GS FLX+ platform, comprises two scaffolds of ~3.4 Mbp and ~3 kbp, with 3,030 protein-coding sequences and 58 tRNA coding regions. The G+C content is 42 mol%.
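The G+C content quoted above is the percentage of bases in the sequence that are guanine or cytosine. A minimal sketch of that calculation follows; the function name is hypothetical, and skipping ambiguous bases such as N is an assumption made here, not something the abstract specifies.

```python
def gc_content_percent(seq):
    """Percentage of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (counts["G"] + counts["C"]) / total
```

For a multi-scaffold draft genome such as this one, the scaffolds would be concatenated, or their counts summed, before taking the ratio.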

Gene prediction by multiple syntenic alignment

Journal of Integrative Bioinformatics ◽

10.1515/jib-2005-13 ◽

2005 ◽

Vol 2 (1) ◽

pp. 38-47

Author(s):

Said S. Adi ◽

Carlos E. Ferreira

Keyword(s):

Computer Program ◽

Genomic Dna ◽

Gene Prediction ◽

Genomic Sequences ◽

Prediction Problem ◽

Huge Amount ◽

Protein Coding ◽

Coding Regions ◽

Conserved Regions ◽

Similarity Information

Summary: Given the increasing number of available genomic sequences, one now faces the task of identifying their functional parts, such as the protein coding regions. The gene prediction problem can be addressed in several ways. One of the most promising methods makes use of similarity information between the genomic DNA and previously annotated sequences (proteins, cDNAs and ESTs). Recently, given the huge number of newly sequenced genomes, new similarity-based methods have been successfully applied to the task of gene prediction. The so-called comparative-based methods rely on the similarities shared by regions of two evolutionarily related genomic sequences. Despite the number of different gene prediction approaches in the literature, this problem remains challenging. In this paper we present a new comparative-based approach to the gene prediction problem. It is based on a syntenic alignment of three or more genomic sequences. By syntenic alignment we mean an alignment that is constructed taking into account the fact that the involved sequences include conserved regions separated by unconserved ones. We have implemented the proposed algorithm in a computer program and confirmed the validity of the approach on a benchmark including triples of human, mouse and rat genomic sequences.
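The idea of conserved regions separated by unconserved ones can be illustrated on sequences that are already aligned. The sketch below is not the authors' syntenic alignment algorithm, which builds the alignment itself; it only scans a given alignment of three or more equal-length rows and reports runs of columns where every row has the same base. The function name, the `-` gap character, and the `min_len` cutoff are assumptions made for this illustration.

```python
def conserved_blocks(aligned, min_len=3):
    """Return (start, end) column ranges where all aligned rows agree.

    `aligned` is a list of equal-length strings with '-' marking gaps.
    `end` is exclusive; runs shorter than `min_len` are ignored.
    """
    length = len(aligned[0])
    if any(len(row) != length for row in aligned):
        raise ValueError("all aligned rows must have the same length")
    blocks = []
    start = None
    for i in range(length + 1):
        # A column is conserved if it is not a gap and all rows match.
        same = (
            i < length
            and aligned[0][i] != "-"
            and all(row[i] == aligned[0][i] for row in aligned)
        )
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                blocks.append((start, i))
            start = None
    return blocks
```

In a comparative gene finder, blocks like these would be candidate exons, on the reasoning that coding sequence tends to be conserved across related species while the intervening sequence diverges.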
