An alignment method for nucleic acid sequences against annotated genomes

2017 ◽  
Author(s):  
Koen Deforche

Abstract
Motivation: Alignment of biological sequences is fundamental to their further interpretation. Current alignment algorithms typically align either nucleic acid or amino acid sequences. Using only nucleic acid sequence similarity, divergent sequences cannot be aligned reliably because of the limited alphabet and genetic saturation. To align divergent coding nucleic acid sequences, one can align the translated amino acid sequences instead. This requires detecting the correct open reading frame, is prone to frame shift errors, and typically requires treating each gene separately. Our motivation was to design an algorithm that aligns a nucleic acid sequence against a (reference) genome sequence, works equally well for similar and divergent sequences, and produces an optimal alignment that simultaneously considers the alignment of all annotated coding sequences.
Results: We define a genome alignment score for evaluating the quality of an alignment of a nucleic acid query sequence against a reference genome sequence for which coding sequence features have been annotated (for example in a GenBank record). The genome alignment score combines the affine gap score for the nucleic acid sequence with an affine gap score for all amino acid alignments resulting from coding sequences in open reading frames contained within the query sequence. We present a dynamic programming algorithm to compute the optimal global or local alignment using this genome alignment score and provide a formal proof of correctness. This algorithm allows nucleic acid sequences from closely related and highly divergent sources to be aligned within the same software and with the same parameters. It automatically corrects any frame shift errors and at the same time produces the aligned translated amino acid sequences of all relevant coding sequence features.
Availability: The software is available as a web application at http://www.genomedetective.com/app/aga and as a command-line application at https://github.com/emweb/aga
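
As an illustration of the nucleic acid component mentioned in the abstract above, the following is a minimal sketch of a global alignment score under an affine gap penalty, using the standard three-state (Gotoh-style) dynamic programming recurrence. It is not the genome alignment algorithm of the paper: it ignores the amino acid term and the coding sequence annotations entirely. The function name and all scoring parameters are assumptions chosen for the example.

```python
NEG_INF = float("-inf")


def affine_gap_score(query, ref, match=2, mismatch=-2, gap_open=-10, gap_extend=-1):
    """Best global alignment score; a gap of length k costs
    gap_open + (k - 1) * gap_extend. All parameter values are illustrative."""
    n, m = len(query), len(ref)
    # M: last column pairs query[i-1] with ref[j-1]
    # X: last column pairs query[i-1] with a gap in the reference
    # Y: last column pairs ref[j-1] with a gap in the query
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if query[i - 1] == ref[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Opening a gap is charged once; extending it is cheaper.
            X[i][j] = max(M[i - 1][j] + gap_open,
                          Y[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend)
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          X[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])
```

Because opening a gap costs more than extending one, a single long gap scores better than several short ones. A combined score of the kind the abstract describes would add an analogous term for each translated coding sequence.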


2019 ◽  
Author(s):  
Veeren Chauhan ◽  
Mohamed M Elsutohy ◽  
C Patrick McClure ◽  
Will Irving ◽  
Neil Roddis ◽  
...  

Enteroviruses are ubiquitous mammalian pathogens that can produce mild to life-threatening disease. Bearing this in mind, we have developed a rapid, accurate and economical point-of-care biosensor that can detect a nucleic acid sequence conserved amongst 96% of all known enteroviruses. The biosensor harnesses the physicochemical properties of gold nanoparticles and aptamers to provide colourimetric, spectroscopic and lateral flow-based identification of an exclusive enteroviral RNA sequence (23 bases), which was identified through in silico screening. Aptamers were designed to demonstrate specific complementarity towards the target enteroviral RNA to produce aggregated gold-aptamer nanoconstructs. The conserved target enteroviral nucleic acid sequence (≥ 1 × 10⁻⁷ M, ≥ 1.4 × 10⁻¹⁴ g/mL) initiates disaggregation of the gold-aptamer nanoconstructs and a signal transduction mechanism, producing a colourimetric and spectroscopic blueshift from 544 nm (purple) to 524 nm (red). Furthermore, lateral flow assays that utilise gold-aptamer nanoconstructs were unaffected by contaminating human genomic DNA, demonstrated rapid detection of the conserved target enteroviral nucleic acid sequence (< 60 s) and could be interpreted with a bespoke software and hardware electronic interface. We anticipate that our methodology will translate in silico screening of nucleic acid databases into a tangible enteroviral desktop detector, which could be readily adapted to related organisms. This will pave the way forward in the clinical evaluation of disease and complement existing strategies for overcoming antimicrobial resistance.



1991 ◽  
Vol 266 (1) ◽  
pp. 536-539
Author(s):  
H C Kim ◽  
W W Idler ◽  
I G Kim ◽  
J H Han ◽  
S I Chung ◽  
...  


1986 ◽  
Vol 14 (1) ◽  
pp. 99-107 ◽  
Author(s):  
Hannu Peltola ◽  
Hans Söderlund ◽  
Esko Ukkonen


1985 ◽  
Vol 13 (5) ◽  
pp. 1493-1504 ◽  
Author(s):  
Salomé Prat ◽  
Jordi Cortadas ◽  
Pere Puigdomènech ◽  
Jaume Palau


2021 ◽  
Author(s):  
Lin Miao ◽  
Miaoxin Li

Abstract
The mechanism of ohnolog retention is a subject of concern in evolutionary biology. Natural selection on coding sequences and natural selection on gene dosage have both been proposed as determinants of ohnolog retention. However, the relationship between the two models is not widely agreed upon, and the role of regulatory sequences in ohnolog retention has long been neglected. In this study, based on a model of the genetic architecture of complex traits, we compared the strength of natural selection on corresponding sequences between ohnologs and non-ohnologs by comparing heritability enrichments of complex traits. We showed that, for ohnologs, the heritability enrichments in regulatory sequences (p = 1.1 × 10⁻⁵ in 5 kb flanking regions) and the expression-mediated heritability enrichments (p = 2.1 × 10⁻⁵) were significantly higher than for non-ohnologs. We then deduced that regulatory sequences of ohnologs were under substantial natural selection, which was also a determinant of ohnolog retention. Meanwhile, we showed that in coding sequences the heritability enrichments of ohnologs were significantly higher than those of non-ohnologs (p = 9.9 × 10⁻⁵), supporting the ohnolog retention model of natural selection on coding sequences. We also showed that the causal gene expression effect sizes of ohnologs on complex traits were significantly larger than those of non-ohnologs (p = 8.8 × 10⁻⁶), supporting the ohnolog retention model of natural selection on gene dosage. In conclusion, we provide the first unified framework to show that both the amino acid sequences and the expression levels of ohnologs are under substantial selection, which may end the long-standing debate on ohnolog retention models.



1982 ◽  
Vol 2 (11) ◽  
pp. 1362-1371
Author(s):  
J M Monson ◽  
J Friedman ◽  
B J McCarthy

In a 3.8-kilobase mouse DNA sequence encoding amino acid sequences for the pro alpha 1(I) chain of type I procollagen, 14 coding sequences were identified which specify a sequence 95% homologous to amino acid residues 568 to 963 of the bovine alpha 1(I) chain. All of these coding sequences were flanked by appropriate splice junctions following the GT/AG rule. These observations suggest, but do not prove, that this pro alpha 1(I) gene is transcriptionally active. Of the 14 coding sequences, 7 were 54 base pairs in length, whereas the remainder were higher multiples of 54 base pairs. Nonrandom codon usage was observed throughout all of the coding sequences, with a preference (56%) for U in the wobble position. Two of the intervening sequences encoded imperfect vestiges of coding sequences which exhibited a codon preference different from that of the pro alpha 1(I) gene proper and were not flanked by splice junctions. One intervening sequence encoded a member of the mouse B1 family of middle repetitive sequences. It was flanked by 8-base-pair direct repeats and had a truncated A-rich region, suggesting that it may be a mobile element. Within this element were sequences which could function as an RNA polymerase III split promoter.
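
The wobble-position preference reported above is a simple count over the third base of each codon. The sketch below shows one way such a figure could be computed for an in-frame coding sequence. The function name and the example sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter


def wobble_base_fractions(coding_seq):
    """Fraction of each base at the third (wobble) codon position.

    Assumes coding_seq starts in frame; trailing bases that do not
    form a complete codon are ignored. T is reported as U.
    """
    seq = coding_seq.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3
    thirds = Counter(seq[i + 2] for i in range(0, usable, 3))
    total = sum(thirds.values())
    if total == 0:
        return {}
    return {base: count / total for base, count in thirds.items()}


# Hypothetical nine-base example: codons GGU, CCU, GCA.
fractions = wobble_base_fractions("GGTCCTGCA")
```

Here two of the three codons end in U, so `fractions["U"]` is 2/3. The same count over the 14 coding sequences is what a figure such as the 56% above summarises.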




