Linear Prediction of Nucleotides in a Genome Sequence
Nucleotides are organic molecules that serve as the monomer units of the nucleic acid polymers deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). In DNA, the four nucleotides A, T, G and C are joined by phosphodiester bonds to form strands. Strand formation depends on many factors related to inter- and intracellular parameters and functions, so one cannot state precisely the rules by which a particular strand is formed. The unbounded space of possible strands cannot be determined experimentally or within the framework of classical genetics. Alternatively, one can formulate the notion of a “Language of Genomes”, in which a finite specification describes infinitely many strands. This paper introduces a novel prediction algorithm that generates possible strands based on available nucleotide statistics.
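To make the idea of generating strands from nucleotide statistics concrete, the following is a minimal sketch and not the algorithm of this paper. It assumes a simple substitute technique: count which nucleotide follows each short context in an observed sequence, then sample new strands from those counts. The function names `transition_counts` and `generate_strand`, the context length of 2, and the example sequence are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def transition_counts(sequence, order=1):
    """Count which nucleotide follows each context of length `order` (order >= 1)."""
    counts = defaultdict(Counter)
    for i in range(len(sequence) - order):
        context = sequence[i:i + order]
        counts[context][sequence[i + order]] += 1
    return counts


def generate_strand(counts, seed_context, length, rng=None):
    """Extend `seed_context` one nucleotide at a time, sampling each new
    nucleotide in proportion to how often it followed the current context."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    order = len(seed_context)
    strand = seed_context
    while len(strand) < length:
        followers = counts.get(strand[-order:])
        if not followers:
            break  # context never seen with a follower; stop early
        bases = list(followers)
        weights = [followers[b] for b in bases]
        strand += rng.choices(bases, weights=weights, k=1)[0]
    return strand


# Made-up sequence for illustration only; it is not real genomic data.
observed = "ATGCGATACGCTTGAGATGCCGTA"
counts = transition_counts(observed, order=2)
candidate = generate_strand(counts, seed_context="AT", length=12)
print(candidate)
```

A finite table of counts such as `counts` can produce strands of any requested length, which loosely mirrors the idea that a finite specification can describe infinitely many strands.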