Algorithms for String Comparison in DNA Sequences

Author(s):  
Dhiman Goswami ◽  
Nishat Sultana ◽  
Warda Ruheen Bristi
Algorithms ◽  
2020 ◽  
Vol 13 (2) ◽  
pp. 47
Author(s):  
Sarah Pilz ◽  
Florian Porrmann ◽  
Martin Kaiser ◽  
Jens Hagemeyer ◽  
James M. Hogan ◽  
...  

This paper is concerned with Field Programmable Gate Array (FPGA)-based systems for energy-efficient, high-throughput string comparison. Modern applications that involve comparisons across large data sets, such as large sequence sets in molecular biology, are by their nature computationally intensive. In this work, we present a scalable FPGA-based system architecture to accelerate the comparison of binary strings. The current architecture supports arbitrary lengths in the range of 16 to 2048 bits, covering a wide range of possible applications. In our example application, we consider DNA sequences embedded in a binary vector space through Locality Sensitive Hashing (LSH), one of several possible encodings that enable us to avoid more costly character-based operations. Here the resulting encoding is a 512-bit binary signature, with comparisons based on the Hamming distance. In this approach, most of the load arises from the calculation of the O(m·n) Hamming distances between the signatures, where m is the number of queries and n is the number of signatures contained in the database. Signature generation only needs to be performed once, and we do not consider it further, focusing instead on accelerating the signature comparisons. The proposed FPGA-based architecture is optimized for high throughput using hundreds of computing elements arranged in a systolic array. These core computing elements can be adapted to support other string comparison algorithms with little effort, while the other infrastructure stays the same. On a Xilinx Virtex UltraScale+ FPGA (XCVU9P-2), a peak throughput of 75.4 billion comparisons per second (of 512-bit signatures) was achieved, using a design with 384 parallel processing elements and a clock frequency of 200 MHz. This makes our FPGA design 86 times faster than a highly optimized CPU implementation. Compared to a GPU design executed on an NVIDIA GTX1060, it performs nearly five times faster.
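The comparison workload described in this abstract can be illustrated in software. The following is a minimal sketch, not the authors' FPGA design: it assumes each 512-bit signature is stored as a Python integer, and the function names `hamming_distance` and `all_pairs_distances` are hypothetical names chosen for illustration. It shows why the cost grows as O(m·n): every query is compared against every database signature.

```python
import random

SIGNATURE_BITS = 512  # signature width from the abstract's example application


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two signatures differ."""
    # XOR leaves a 1 wherever the two signatures disagree.
    return bin(a ^ b).count("1")


def all_pairs_distances(queries, database):
    """Return an m x n table of distances (m queries, n database signatures)."""
    return [[hamming_distance(q, s) for s in database] for q in queries]


if __name__ == "__main__":
    # Random signatures stand in for LSH output; this is illustrative data only.
    rng = random.Random(0)
    queries = [rng.getrandbits(SIGNATURE_BITS) for _ in range(3)]
    database = [rng.getrandbits(SIGNATURE_BITS) for _ in range(5)]

    table = all_pairs_distances(queries, database)
    for row in table:
        print(row)
```

Each table entry corresponds to one signature comparison, so m queries against n database signatures require m·n such operations; this is the step the abstract says the systolic array of processing elements is built to parallelize.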


Author(s):  
David P. Bazett-Jones ◽  
Mark L. Brown

A multisubunit RNA polymerase enzyme is ultimately responsible for transcription initiation and elongation of RNA, but recognition of the proper start site by the enzyme is regulated by general, temporal and gene-specific trans-factors interacting at promoter and enhancer DNA sequences. To understand the molecular mechanisms that precisely regulate the transcription initiation event, it is crucial to elucidate the structure of the transcription factor/DNA complexes involved. Electron spectroscopic imaging (ESI) provides the opportunity to visualize individual DNA molecules. Enhancement of DNA contrast with ESI is accomplished by imaging with electrons that have interacted with inner shell electrons of phosphorus in the DNA backbone. Phosphorus detection at this intermediately high level of resolution (≈1 nm) permits selective imaging of the DNA, to determine whether the protein factors compact, bend or wrap the DNA. Simultaneously, mass and phosphorus content can be measured quantitatively, using adjacent DNA or tobacco mosaic virus (TMV) as mass and phosphorus standards. These two parameters provide stoichiometric information on the ratio of protein to DNA content.


Author(s):  
Barbara Trask ◽  
Susan Allen ◽  
Anne Bergmann ◽  
Mari Christensen ◽  
Anne Fertitta ◽  
...  

Using fluorescence in situ hybridization (FISH), the positions of DNA sequences can be discretely marked with a fluorescent spot. The efficiency of marking DNA sequences of the size cloned in cosmids is 90-95%, and the fluorescent spots produced after FISH are ≈0.3 μm in diameter. Sites of two sequences can be distinguished using two-color FISH. Different reporter molecules, such as biotin or digoxigenin, are incorporated into DNA sequence probes by nick translation. These reporter molecules are labeled after hybridization with different fluorochromes, e.g., FITC and Texas Red. The development of dual band pass filters (Chromatechnology) allows these fluorochromes to be photographed simultaneously without registration shift.


Author(s):  
José L. Carrascosa ◽  
José M. Valpuesta ◽  
Hisao Fujisawa

The head-to-tail connector of bacteriophages plays a fundamental role in the assembly of viral heads and DNA packaging. In spite of the absence of sequence homology, the structures of connectors from different viruses (T4, Ø29, T3, P22, etc.) share common morphological features that are most clearly revealed in their three-dimensional structure. We have studied the three-dimensional reconstruction of the connector protein from phage T3 (gp8) from tilted views of two-dimensional crystals obtained from this protein after cloning and purification. DNA sequences including gene 8 from phage T3 were cloned into BamHI-EcoRI sites downstream of the lambda promoter PL, in the expression vector pNT45 under the control of cI857. E R204 (pNT89) cells were incubated at 42°C for 2 h, harvested and resuspended in 20 mM Tris-HCl (pH 7.4), 7 mM 2-mercaptoethanol, 1 mM EDTA. The cells were lysed by freezing and thawing in the presence of lysozyme (1 mg/ml) and lightly sonicated. The low-speed supernatant was precipitated by ammonium sulfate (60% saturated) and dissolved in the original buffer to be subjected to gel filtration through Sepharose 6B, followed by a phosphocellulose column (P11) and a DEAE cellulose column (DE52). Purified gp8 appeared at 0.3 M NaCl and formed crystals when its concentration increased above 1.5 mg/ml.


2019 ◽  
Vol 63 (6) ◽  
pp. 757-771 ◽  
Author(s):  
Claire Francastel ◽  
Frédérique Magdinier

Despite the tremendous progress made in recent years in assembling the human genome, tandemly repeated DNA elements remain poorly characterized. These sequences account for the vast majority of methylated sites in the human genome, and their methylated state is necessary for this repetitive DNA to function properly and to maintain genome integrity. Furthermore, recent advances highlight the emerging role of these sequences in regulating the functions of the human genome and its variability during evolution, among individuals, or in disease susceptibility. In addition, a number of inherited rare diseases are directly linked to the alteration of some of these repetitive DNA sequences, either through changes in the organization or size of the tandem repeat arrays or through mutations in genes encoding chromatin modifiers involved in the epigenetic regulation of these elements. Although largely overlooked so far in the functional annotation of the human genome, satellite elements play key roles in its architectural and topological organization. This includes functions as boundary elements delimiting functional domains or in the assembly of repressive nuclear compartments, with local or distal impact on gene expression. Thus, the consideration of satellite repeat organization and the associated epigenetic landmarks, including DNA methylation (DNAme), will become unavoidable in the near future to fully decipher human phenotypes and associated diseases.


2003 ◽  
Author(s):  
Hector Sabelli ◽  
Arthur Sugerman ◽  
Lazar Kovacevic ◽  
Louis Kauffman

Acta Naturae ◽  
2016 ◽  
Vol 8 (2) ◽  
pp. 79-86 ◽  
Author(s):  
P. V. Elizar’ev ◽  
D. V. Lomaev ◽  
D. A. Chetverina ◽  
P. G. Georgiev ◽  
M. M. Erokhin

Maintenance of the individual patterns of gene expression in different cell types is required for the differentiation and development of multicellular organisms. Expression of many genes is controlled by Polycomb (PcG) and Trithorax (TrxG) group proteins that act through association with chromatin. PcG/TrxG proteins are assembled on DNA sequences termed PREs (Polycomb Response Elements), the activity of which can be modulated and switched from repression to activation. In this study, we analyzed the influence of transcriptional read-through on the PRE activity switch mediated by the yeast activator GAL4. We show that a transcription terminator inserted between the promoter and the PRE does not prevent switching of PRE activity from repression to activation. We demonstrate that, independently of PRE orientation, high levels of transcription fail to dislodge PcG/TrxG proteins from the PRE in the absence of a terminator. Thus, transcription is not the main factor required for the PRE activity switch.

