From Space to Sequence and Back Again: Iterative DNA Proximity Ligation and its Applications to DNA-Based Imaging

2018 ◽  
Author(s):  
Alexander A. Boulgakov ◽  
Erhu Xiong ◽  
Sanchita Bhadra ◽  
Andrew D. Ellington ◽  
Edward M. Marcotte

Abstract: We extend the concept of DNA proximity ligation from a single readout per oligonucleotide pair to multiple reversible, iterative ligations re-using the same oligonucleotide molecules. Using iterative proximity ligation (IPL), we can in principle capture multiple ligation events between each oligonucleotide and its various neighbors and thus recover far richer knowledge about their relative positions than single, irreversible ligation events provide. IPL would thus act to sample and record local molecular neighborhoods. By integrating a unique DNA barcode into each participating oligonucleotide, we can catalog the individual ligation events and thus capture the positional information contained therein in a high-throughput manner using next-generation DNA sequencing. We propose that by interpreting IPL sequencing results in the context of graph theory and by applying spring layout algorithms, we can recover geometric patterns of objects labeled by DNA. Using simulations, we demonstrate that we can in principle recover letter patterns photolithographed onto slide surfaces using only IPL sequencing data, illustrating how our technique maps complex spatial configurations into DNA sequences and then, using only this sequence information, recovers them. We complement our theoretical work with an experimental proof-of-concept of iterative proximity ligation on an oligonucleotide population.
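The graph-based reconstruction described above can be illustrated with a small sketch. It is not the authors' implementation: the function names `ligation_graph` and `spring_layout`, the toy barcodes, and the Fruchterman-Reingold-style force rule are all assumptions chosen for illustration. Each sequenced ligation event is treated as an edge between two barcodes, and a simple spring layout places linked barcodes close together.

```python
import math
import random
from collections import Counter


def ligation_graph(ligation_reads):
    """Count how often each unordered barcode pair was observed ligated."""
    return Counter(tuple(sorted(pair)) for pair in ligation_reads)


def spring_layout(edges, iterations=500, seed=0):
    """Toy force-directed layout: all nodes repel, linked nodes also attract."""
    rng = random.Random(seed)
    nodes = sorted({n for edge in edges for n in edge})
    linked = {frozenset(edge) for edge in edges}
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # preferred edge length (assumed scale)
    for step in range(iterations):
        temp = 0.1 * (1 - step / iterations)  # maximum step size, cooled linearly
        disp = {n: [0.0, 0.0] for n in nodes}
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                force = k * k / dist  # repulsion between every pair
                if frozenset((u, v)) in linked:
                    force -= dist * dist / k  # attraction along ligation edges
                disp[u][0] += dx / dist * force
                disp[u][1] += dy / dist * force
                disp[v][0] -= dx / dist * force
                disp[v][1] -= dy / dist * force
        for n in nodes:
            length = math.hypot(*disp[n])
            if length > 0:
                scale = min(length, temp) / length
                pos[n][0] += disp[n][0] * scale
                pos[n][1] += disp[n][1] * scale
    return pos


# Hypothetical barcodes forming a chain AAC - CGT - GGA - TTC.
reads = [("AAC", "CGT"), ("CGT", "AAC"), ("CGT", "GGA"), ("GGA", "TTC")]
graph = ligation_graph(reads)
positions = spring_layout(graph)
```

The recovered coordinates are only defined up to rotation, reflection, and scale, so a layout of this kind recovers relative geometry rather than absolute positions. This sketch ignores the ligation counts; a fuller version might use them as edge weights.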

2017 ◽  
Vol 205 ◽  
pp. 517-536 ◽  
Author(s):  
S. Dick ◽  
S. E. J. Bell

To address the question of whether the SERS signals of ss-DNA are simply combinations of the signals from the individual bases that comprise the sequence, SERS spectra of unmodified ss-DNA sequences were obtained using a hydroxylamine-reduced Ag colloid aggregated with MgSO4. Initially, synthetic oligodeoxynucleotides with systematic structural variations were used to investigate the effect of adding single nucleobases to the 3′ terminus of 10-mer and 20-mer sequences. It was found that the resulting SERS difference spectra could be used to identify the added nucleobases since they closely matched reference spectra of the same nucleobase. Investigation of the variation in intensity of an adenine probe which was moved along a test sequence showed there was a small end effect where nucleobases near the 3′ terminus gave slightly larger signals but the effect was minor (30%). More significantly, in a sample set comprising 25-mer sequences where A, T or G nucleobases were substituted either near the centres of the sequences or the 5′ or 3′ ends, the SERS difference spectra only matched the expected form in approximately half the cases tested. This variation appeared to be due to changes in secondary structure induced by altering the sequences since uncoiling the sequences in a thermal pre-treatment step gave difference spectra which in all cases matched the expected form. Multivariate analysis of the set of substitution data showed that 99% of the variance could be accounted for in a model with just three factors whose loadings matched the spectra of the A, T, and G nucleobases and which contained no positional information. This suggests that aside from the differences in secondary structure which can be eliminated by thermal pre-treatment, the SERS spectra of the 25-mers studied here are simply the sum of their component parts. 
Although this means that SERS provides very little information on the primary sequence, it should be excellent for the detection of post-transcription modifications to DNA which can occur at multiple positions along a given sequence.


2014 ◽  
Author(s):  
Christopher W. Beitel ◽  
Lutz Froenicke ◽  
Jenna M. Lang ◽  
Ian F. Korf ◽  
Richard W. Michelmore ◽  
...  

Metagenomics is a valuable tool for the study of microbial communities but has been limited by the difficulty of “binning” the resulting sequences into groups corresponding to the individual species and strains that constitute the community. Moreover, there are presently no methods to track the flow of mobile DNA elements such as plasmids through communities or to determine which of these are co-localized within the same cell. We address these limitations by applying Hi-C, a technology originally designed for the study of three-dimensional genome structure in eukaryotes, to measure the cellular co-localization of DNA sequences. We leveraged Hi-C data generated from a synthetic metagenome sample to accurately cluster metagenome assembly contigs into groups that contain nearly complete genomes of each species. The Hi-C data also reliably associated plasmids with the chromosomes of their host and with each other. We further demonstrated that Hi-C data provides a long-range signal of strain-specific genotypes, indicating such data may be useful for high-resolution genotyping of microbial populations. Our work demonstrates that Hi-C sequencing data provide valuable information for metagenome analyses that are not currently obtainable by other methods. This metagenomic Hi-C method could facilitate future studies of the fine-scale population structure of microbes, as well as studies of how antibiotic resistance plasmids (or other genetic elements) mobilize in microbial communities. The method is not limited to microbiology; the genetic architecture of other heterogeneous populations of cells could also be studied with this technique.
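As a rough illustration of how co-localization counts could drive binning, the sketch below groups contigs into connected components whenever two contigs share at least a threshold number of Hi-C read pairs. This is a deliberately simplified stand-in, not the clustering procedure used in the study, which the abstract does not specify; the function name `cluster_contigs`, the `min_links` threshold, and the contig names are hypothetical.

```python
from collections import defaultdict


def cluster_contigs(link_counts, min_links=5):
    """Group contigs joined by at least `min_links` Hi-C read pairs (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), n in link_counts.items():
        root_a, root_b = find(a), find(b)  # registers both contigs
        if n >= min_links and root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for contig in list(parent):
        clusters[find(contig)].add(contig)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical link counts: two genomes, one plasmid, one spurious weak link.
links = {
    ("c1", "c2"): 40,
    ("c2", "plasmidA"): 12,
    ("c3", "c4"): 30,
    ("c2", "c3"): 1,
}
bins = cluster_contigs(links)
```

A fixed threshold is the crudest possible choice; in practice link counts would likely need normalization for contig length and abundance before any clustering.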


2019 ◽  
Vol 8 ◽  
pp. 57-61
Author(s):  
Sunil Bhandari ◽  
Jay Bhandari ◽  
Sanjay Lama

DNA barcoding is an emerging tool for species identification that uses internationally agreed protocols and regions of DNA to create a global database of living organisms. Initiatives are under way to generate DNA barcodes for all groups of living organisms and to make these genomic identities publicly available in order to understand, conserve, and utilize the world's biodiversity. Most terrestrial plants are characterized using two coding regions within the chloroplast genome: the more conserved rbcL gene and the more polymorphic matK gene. To create high-quality databases, each plant is characterized not only with the rbcL and matK DNA sequences but also, more efficiently, with additional sequence information from the internal transcribed spacer (ITS) region. The quality of a barcode depends on various factors, such as efficient primers, the purity of DNA templates, and the quality of the PCR amplicon from which the sequence data will derive. The protocol described here led to the generation of highly efficient PCR amplicons, which will help minimize erroneous DNA sequence information from which bioinformatics procedures will generate efficient barcodes. The primers used to amplify the matK, rbcL and ITS sequences (MatK-413f-1 and MatK-1227r-1, rbcl-1F and rbcl-724R, and ITS1 and ITS4) each showed a strong amplification success of 80% in the tested medicinal plants of Nepal. This study proposes that the primer sets and amplification conditions used will help, in part, the development of DNA barcodes for medicinally important plants of Nepal to conserve their identity and nativeness.


Acta Naturae ◽  
2016 ◽  
Vol 8 (2) ◽  
pp. 79-86 ◽  
Author(s):  
P. V. Elizar’ev ◽  
D. V. Lomaev ◽  
D. A. Chetverina ◽  
P. G. Georgiev ◽  
M. M. Erokhin

Maintenance of the individual patterns of gene expression in different cell types is required for the differentiation and development of multicellular organisms. Expression of many genes is controlled by Polycomb (PcG) and Trithorax (TrxG) group proteins that act through association with chromatin. PcG/TrxG are assembled on the DNA sequences termed PREs (Polycomb Response Elements), the activity of which can be modulated and switched from repression to activation. In this study, we analyzed the influence of transcriptional read-through on PRE activity switch mediated by the yeast activator GAL4. We show that a transcription terminator inserted between the promoter and PRE does not prevent switching of PRE activity from repression to activation. We demonstrate that, independently of PRE orientation, high levels of transcription fail to dislodge PcG/TrxG proteins from PRE in the absence of a terminator. Thus, transcription is not the main factor required for PRE activity switch.


2020 ◽  
Vol 15 ◽  
Author(s):  
Hongdong Li ◽  
Wenjing Zhang ◽  
Yuwen Luo ◽  
Jianxin Wang

Aims: Accurately detect isoforms from third generation sequencing data. Background: Transcriptome annotation is the basis for the analysis of gene expression and regulation. The transcriptome annotation of many organisms such as humans is far from complete, due partly to the challenge in the identification of isoforms that are produced from the same gene through alternative splicing. Third generation sequencing (TGS) reads provide an unprecedented opportunity for detecting isoforms due to their long length, which exceeds the length of most isoforms. One limitation of current TGS reads-based isoform detection methods is that they are exclusively based on sequence reads, without incorporating the sequence information of known isoforms. Objective: Develop an efficient method for isoform detection. Method: Based on annotated isoforms, we propose a splice isoform detection method called IsoDetect. First, the sequence at each exon-exon junction is extracted from annotated isoforms as the “short feature sequence”, which is used to distinguish different splice isoforms. Second, we aligned these feature sequences to long reads and divided long reads into groups that contain the same set of feature sequences, thereby avoiding the pair-wise comparison among the large number of long reads. Third, clustering and consensus generation are carried out based on sequence similarity. For the long reads that do not contain any short feature sequence, clustering analysis based on sequence similarity is performed to identify isoforms. Result: Tested on two datasets from Calypte anna and Zebra Finch, IsoDetect showed higher speed and compelling accuracy compared with four existing methods. Conclusion: IsoDetect is a promising method for isoform detection. Other: This paper was accepted by the CBC2019 conference.
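The first two steps of the method can be sketched in a few lines. This is not the IsoDetect code: the helper names `junction_features` and `group_reads`, the `flank` length, and the toy exon sequences are assumptions, and exact substring matching is swapped in for the alignment step the abstract describes. The point is only to show how grouping reads by their set of junction features avoids comparing every read with every other read.

```python
from collections import defaultdict


def junction_features(exons, flank=4):
    """Short feature sequences spanning each exon-exon junction of one isoform."""
    return {left[-flank:] + right[:flank] for left, right in zip(exons, exons[1:])}


def group_reads(reads, features):
    """Group reads by the exact set of feature sequences they contain."""
    groups = defaultdict(list)
    for read in reads:
        key = frozenset(f for f in features if f in read)
        groups[key].append(read)
    return dict(groups)


# Hypothetical exons; isoform 2 skips the middle exon.
e1, e2, e3 = "AAAACCCC", "GGGGTTTT", "ACGTACGT"
features = junction_features([e1, e2, e3]) | junction_features([e1, e3])

reads = [
    e1 + e2 + e3,          # full-length read of isoform 1
    "CCCCGGGGTTTTACGT",    # partial read of isoform 1
    e1 + e3,               # read of isoform 2
    "TTTTTTTT",            # read with no known junction
]
groups = group_reads(reads, features)
```

Reads that land in the empty-set group correspond to the abstract's fallback case, where clustering by sequence similarity would take over.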


Genetics ◽  
1986 ◽  
Vol 113 (4) ◽  
pp. 1077-1091
Author(s):  
John H Gillespie

ABSTRACT A statistical analysis of DNA sequences from four nuclear loci and five mitochondrial loci from different orders of mammals is described. A major aim of the study is to describe the variation in the rate of molecular evolution of proteins and DNA. A measure of rate variability is the statistic R, the ratio of the variance in the number of substitutions to the mean number. For proteins, R is found to be in the range 0.16 < R < 35.55, thus extending in both directions the values seen in previous studies. An analysis of codons shows that there is a highly significant excess of double substitutions in the first and second positions, but not in the second and third or first and third positions. The analysis of the dynamics of nucleotide evolution showed that the ergodic Markov chain models that are the basis of most published formulas for correcting for multiple substitutions are incompatible with the data. A bootstrap procedure was used to show that the evolution of the individual nucleotides, even the third positions, show the same variation in rates as seen in the proteins. It is argued that protein and silent DNA evolution are uncoupled, with the evolution at both levels showing patterns that are better explained by the action of natural selection than by neutrality. This conclusion is based primarily on a comparison of the nuclear and mitochondrial results.
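The statistic R defined above is straightforward to compute. The sketch below assumes substitution counts per lineage are already available and uses the unbiased sample variance; whether the original analysis used that estimator is not stated in the abstract, and the counts shown are invented for illustration.

```python
import statistics


def dispersion_ratio(substitution_counts):
    """R = variance in the number of substitutions divided by the mean number."""
    mean = statistics.mean(substitution_counts)
    variance = statistics.variance(substitution_counts)  # sample variance (n - 1)
    return variance / mean


# Hypothetical substitution counts along eight lineages.
counts = [2, 4, 4, 4, 5, 5, 7, 9]
r_value = dispersion_ratio(counts)
```

For a Poisson process the variance equals the mean, so R would be close to 1; values well above or below 1 indicate more or less rate variation than a simple Poisson clock would produce.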


2021 ◽  
Vol 54 (1) ◽  
pp. 1-22
Author(s):  
Rayan Chikhi ◽  
Jan Holub ◽  
Paul Medvedev

The analysis of biological sequencing data has been one of the biggest applications of string algorithms. The approaches used in many such applications are based on the analysis of k-mers, which are short fixed-length strings present in a dataset. While these approaches are rather diverse, storing and querying a k-mer set has emerged as a shared underlying component. A set of k-mers has unique features and applications that, over the past 10 years, have resulted in many specialized approaches for its representation. In this survey, we give a unified presentation and comparison of the data structures that have been proposed to store and query a k-mer set. We hope this survey will serve as a resource for researchers in the field as well as make the area more accessible to researchers outside the field.
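The simplest possible representation of a k-mer set is a hash set of substrings, which makes the store-and-query interface concrete. The sketch below is a baseline only, not one of the specialized structures the survey compares; the function names `kmer_set` and `fraction_present` are hypothetical.

```python
def kmer_set(sequences, k):
    """Collect every length-k substring occurring in the input sequences."""
    return {
        seq[i:i + k]
        for seq in sequences
        for i in range(len(seq) - k + 1)
    }


def fraction_present(query, kmers, k):
    """Fraction of the query's k-mers that are members of the stored set."""
    query_kmers = [query[i:i + k] for i in range(len(query) - k + 1)]
    if not query_kmers:
        return 0.0
    return sum(kmer in kmers for kmer in query_kmers) / len(query_kmers)


stored = kmer_set(["ACGTAC"], k=3)
hit = "CGT" in stored                          # membership query
shared = fraction_present("ACGTT", stored, k=3)
```

A plain set stores every k-mer as a separate string, which is exactly the memory cost that more compact representations aim to reduce.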


2019 ◽  
Vol 102 (5) ◽  
pp. 1263-1270 ◽  
Author(s):  
Weili Xiong ◽  
Melinda A McFarland ◽  
Cary Pirone ◽  
Christine H Parker

Abstract Background: To effectively safeguard the food-allergic population and support compliance with food-labeling regulations, the food industry and regulatory agencies require reliable methods for food allergen detection and quantification. MS-based detection of food allergens relies on the systematic identification of robust and selective target peptide markers. The selection of proteotypic peptide markers, however, relies on the availability of high-quality protein sequence information, a bottleneck for the analysis of many plant-based proteomes. Method: In this work, data were compiled for reference tree nut ingredients and evaluated using a parsimony-driven global proteomics workflow. Results: The utility of supplementing existing incomplete protein sequence databases with translated genomic sequencing data was evaluated for English walnut and provided enhanced selection of candidate peptide markers and differentiation between closely related species. Highlights: Future improvements of protein databases and release of genomics-derived sequences are expected to facilitate the development of robust and harmonized LC–tandem MS-based methods for food allergen detection.


1999 ◽  
Vol 81 (3) ◽  
pp. 1274-1283 ◽  
Author(s):  
F. K. Skinner ◽  
L. Zhang ◽  
J. L. Perez Velazquez ◽  
P. L. Carlen

Bursting in inhibitory interneuronal networks: a role for gap-junctional coupling. Much work now emphasizes the concept that interneuronal networks play critical roles in generating synchronized, oscillatory behavior. Experimental work has shown that functional inhibitory networks alone can produce synchronized activity, and theoretical work has demonstrated how synchrony could occur in mutually inhibitory networks. Even though gap junctions are known to exist between interneurons, their role is far from clear. We present a mechanism by which synchronized bursting can be produced in a minimal network of mutually inhibitory and gap-junctionally coupled neurons. The bursting relies on the presence of persistent sodium and slowly inactivating potassium currents in the individual neurons. Both GABAA inhibitory currents and gap-junctional coupling are required for stable bursting behavior to be obtained. Typically, the role of gap-junctional coupling is focused on synchronization mechanisms. However, these results suggest that a possible role of gap-junctional coupling may lie in the generation and stabilization of bursting oscillatory behavior.

