Automated Genomic Signal Processing for Diseased Gene Identification
Genomic signal processing (GSP) is the engineering discipline concerned with the analysis, processing, and use of genomic signals to gain biological knowledge, and with the translation of that knowledge into systems-based applications that can be used to diagnose and treat genetic diseases. Statistical computation on DNA sequences is one of the key areas in which GSP can be applied. In this paper, we apply digital signal processing (DSP) tools to trinucleotide repeat disorders (disorders caused by too many copies of a certain nucleotide triplet in the DNA) to classify a gene sequence as diseased or non-diseased. First, we collected the gene sequences responsible for trinucleotide repeat disorders from NCBI. We then applied GSP techniques to convert each gene sequence into an indicator sequence, and applied fast Fourier transforms (FFTs) and discrete wavelet transforms (DWTs) to it. Finally, we extracted statistical features from the transformed sequences and fed them into an artificial neural network to predict the state of the input genomic sequence.
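The front end of this pipeline (indicator sequence, transform, statistical features) can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: the indicator mapping is taken to be the common binary (Voss-style) one with one 0/1 sequence per nucleotide, a naive DFT stands in for the FFT, the wavelet is a single-level Haar transform, and the function names and the particular features are hypothetical choices made for illustration. The neural network stage is not shown.

```python
import cmath
import math
import statistics


def indicator_sequences(seq):
    """Binary indicator sequences (assumed Voss-style mapping): one 0/1 list per base."""
    seq = seq.upper()
    return {base: [1 if ch == base else 0 for ch in seq] for base in "ACGT"}


def power_spectrum(x):
    """Power spectrum |X[k]|^2 via a naive O(N^2) DFT (stand-in for an FFT)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n)
    ]


def haar_dwt(x):
    """Single-level Haar DWT; returns (approximation, detail) coefficient lists."""
    if len(x) % 2:                      # pad odd-length input by repeating the last sample
        x = list(x) + [x[-1]]
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def extract_features(seq):
    """Hypothetical feature set: spectral statistics plus Haar detail energy."""
    indicators = indicator_sequences(seq)
    spectra = [power_spectrum(u) for u in indicators.values()]
    total = [sum(vals) for vals in zip(*spectra)]   # sum the four per-base spectra
    details = []
    for u in indicators.values():
        details.extend(haar_dwt(u)[1])
    return {
        "spectrum_mean": statistics.mean(total[1:]),    # skip the DC term at k = 0
        "spectrum_std": statistics.pstdev(total[1:]),
        "period3_power": total[round(len(seq) / 3)],    # bin nearest to k = N/3
        "detail_energy": sum(d * d for d in details),
    }


# A pure CAG repeat: its spectral power concentrates at k = N/3.
features = extract_features("CAG" * 10)
```

In a full pipeline, a feature vector like `features` would be computed for each sequence and passed to the classifier. For real gene sequences, a library FFT and wavelet implementation would be preferable to the naive loops used here.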