reverse complement
Recently Published Documents


TOTAL DOCUMENTS

55
(FIVE YEARS 26)

H-INDEX

8
(FIVE YEARS 3)

2021 ◽  
Author(s):  
Megha Mathur ◽  
Sumeet Patiyal ◽  
Anjali Dhall ◽  
Shipra Jain ◽  
Ritu Tomer ◽  
...  

Over the past few decades, public nucleotide repositories have grown at exponential rates. This poses a major challenge to researchers seeking to predict the structure and function of nucleotide sequences. To annotate the function of nucleotide sequences, it is important to compute features/attributes from which machine learning techniques can predict function. In the last two decades, several software packages and platforms have been developed to elicit a wide range of features from nucleotide sequences. To complement the existing methods, here we present a platform named Nfeature, developed for computing a wide range of features of DNA and RNA sequences. It comprises three major modules, namely Composition, Correlation, and Binary profiles. The Composition module computes different types of composition, including mono-/di-/tri-nucleotide composition, reverse complement composition, and pseudo composition. The Correlation module computes various types of correlation, including auto-correlation, cross-correlation, and pseudo-correlation. Similarly, the Binary profile module computes binary profiles based on nucleotides, di-nucleotides, and di-/tri-nucleotide properties. Nfeature also computes the entropy of sequences, repeats in sequences, and the distribution of nucleotides in sequences. In addition to computing features over a whole sequence, it can compute features from part of a sequence, such as split-composition, N-terminal, and C-terminal. In a nutshell, Nfeature amalgamates existing features as well as a number of novel features such as nucleotide repeat index, distance distribution, entropy, binary profile, and properties. The tool computes a total of 29217 and 14385 features for DNA and RNA sequences, respectively. To provide a highly efficient and user-friendly tool, we have developed a standalone package and a web-based platform (https://webs.iiitd.edu.in/raghava/nfeature).
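As an illustration of one feature family named in this abstract, the sketch below computes a plain k-mer composition and a reverse complement composition. It assumes that "reverse complement composition" means pooling each k-mer with its reverse complement, as some other feature toolkits do; the abstract does not define it, and the function names here are hypothetical rather than Nfeature's API.

```python
from collections import Counter
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_composition(seq: str, k: int) -> dict:
    """Frequency of every possible k-mer (4**k entries) in seq.

    Windows containing other symbols (e.g. N) still count toward the
    total but do not appear in the returned dictionary.
    """
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(total))
    return {
        "".join(p): counts["".join(p)] / total
        for p in product("ACGT", repeat=k)
    }


def rc_kmer_composition(seq: str, k: int) -> dict:
    """Composition with each k-mer pooled with its reverse complement.

    Assumption: the pooled entry is keyed by the lexicographically
    smaller of the two strings.
    """
    merged = {}
    for kmer, freq in kmer_composition(seq, k).items():
        key = min(kmer, reverse_complement(kmer))
        merged[key] = merged.get(key, 0.0) + freq
    return merged
```

For k = 2 this reduces the 16 dinucleotide frequencies to 10 entries, because four dinucleotides (AT, TA, CG, GC) are their own reverse complements.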


2021 ◽  
Vol 144 ◽  
pp. 104993
Author(s):  
Jordy P.M. Coolen ◽  
Femke Wolters ◽  
Alma Tostmann ◽  
Lenneke F.J. van Groningen ◽  
Chantal P. Bleeker-Rovers ◽  
...  

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Alon Kafri ◽  
Benny Chor ◽  
David Horn

Abstract. Background: Inversion symmetry is a generalization of the second Chargaff rule, stating that the count of a string of k nucleotides on a single chromosomal strand equals the count of its inverse (reverse-complement) k-mer. It holds for many species, both eukaryotes and prokaryotes, for ranges of k which may vary from 7 to 10 as chromosomal lengths vary from 2 Mbp to 200 Mbp. Building on this formalism we introduce the concept of k-mer distances between chromosomes. We formulate two k-mer distance measures, D1 and D2, which depend on k. D1 takes into account all k-mers (for a single k) appearing on single strands of the two compared chromosomes, whereas D2 takes into account both strands of each chromosome. Both measures reflect dissimilarities in global chromosomal structures. Results: After defining the various distance measures and summarizing their properties, we also define proximities that rely on the existence of synteny blocks between chromosomes of different bacterial strains. Comparing pairs of strains of bacteria, we find negative correlations between synteny proximities and k-mer distances, thus establishing the meaning of the latter as measures of evolutionary distances among bacterial strains. The synteny measures we use are appropriate for closely related bacterial strains, where considerable sections of chromosomes demonstrate high direct or reversed equality. These measures are not appropriate for comparing different bacteria or eukaryotes. K-mer structural distances can be defined for all species. Because of the arbitrariness of strand choices, we employ only the D2 measure when comparing chromosomes of different species. The results for comparisons of various eukaryotes display interesting behavior which is partially consistent with conventional understanding of evolutionary genomics. In particular, we define ratios of minimal k-mer distances (KDR) between unmasked and masked chromosomes of two species, which correlate with both short and long evolutionary scales. Conclusions: k-mer distances reflect dissimilarities among global chromosomal structures. They carry information which aggregates all mutations. As such they can complement traditional evolution studies, which mainly concentrate on coding regions.
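The inversion symmetry statement in this abstract can be checked on a single strand with a few lines of code. The sketch below counts k-mers and reports how far the counts are from matching their reverse-complement counts. The `inversion_asymmetry` score is a hypothetical illustration only; it is not the D1 or D2 measure, which the abstract does not define.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_kmers(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer on the given strand."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def inversion_asymmetry(seq: str, k: int) -> float:
    """Illustrative score in [0, 1]; 0 means perfect inversion symmetry.

    Each k-mer / reverse-complement pair is visited once. The score is
    the summed absolute count difference divided by the summed counts.
    """
    counts = count_kmers(seq, k)
    canonical_keys = {min(w, reverse_complement(w)) for w in counts}
    diff = 0
    total = 0
    for w in canonical_keys:
        rc = reverse_complement(w)
        if rc == w:
            # Self-complementary k-mer: trivially symmetric.
            total += counts[w]
        else:
            diff += abs(counts[w] - counts[rc])
            total += counts[w] + counts[rc]
    return diff / total if total else 0.0
```

For example, `inversion_asymmetry("AAAA", 2)` is 1.0 because AA occurs three times and TT never, whereas `inversion_asymmetry("AAAATTTT", 2)` is 0.0.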


BioTechniques ◽  
2021 ◽  
Author(s):  
Magdalena M Bus ◽  
Erik AC de Jong ◽  
Jonathan L King ◽  
Walter van der Vliet ◽  
Joop Theelen ◽  
...  

DNA analyses from challenging samples such as touch evidence, hairs and skeletal remains push the limits of the current forensic DNA typing technologies. Reverse complement PCR (RC-PCR) is a novel, single-step PCR target enrichment method adapted to amplify degraded DNA. The sample preparation process involves a limited number of steps, decreasing the labor required for library preparation and reducing the possibility of contamination due to less sample manipulation. These features of the RC-PCR make the technology a unique application to successfully target single nucleotide polymorphisms (SNPs) in fragmented and low copy number DNA and yield results from samples in which no or limited data are obtained with standard DNA typing methods. The developed RC-PCR short amplicon 85 SNP-plex panel is a substantial improvement over the previously reported 27-plex RC-PCR multiplex that will provide higher discrimination power for challenging DNA sample analyses.


2021 ◽  
Vol 10 (14) ◽  
pp. 3157
Author(s):  
Christian Ehrnthaller ◽  
Sonja Braumüller ◽  
Stephanie Kellermann ◽  
Florian Gebhard ◽  
Mario Perl ◽  
...  

Life-threatening polytrauma results in early activation of the complement and apoptotic system, as well as leukocytes, ultimately leading to the clearance of damaged cells. However, little is known about interactions between the complement and apoptotic systems in PMN (polymorphonuclear neutrophils) after multiple injuries. PMN from polytrauma patients and healthy volunteers were obtained and assessed for apoptotic events along the post-traumatic time course. In vitro studies simulated complement activation by the exposure of PMN to C3a or C5a and addressed both the intrinsic and extrinsic apoptotic pathway. Specific blockade of the C5a-receptor 1 (C5aR1) on PMN was evaluated for efficacy to reverse complement-driven alterations. PMN from polytrauma patients exhibited significantly reduced apoptotic rates up to 10 days post trauma compared to healthy controls. Polytrauma-induced resistance was associated with significantly reduced Fas-ligand (FasL) and Fas-receptor (FasR) on PMN and in contrast, significantly enhanced FasL and FasR in serum. Simulation of systemic complement activation revealed for C5a, but not for C3a, a dose-dependent abrogation of PMN apoptosis in both intrinsic and extrinsic pathways. Furthermore, specific blockade of the C5aR1 reversed C5a-induced PMN resistance to apoptosis. The data suggest an important regulatory and putative mechanistic and therapeutic role of the C5a/C5aR1 interaction on PMN apoptosis after polytrauma.


2021 ◽  
Author(s):  
Kristoffer Sahlin

Short-read genome alignment is a fundamental computational step used in many bioinformatic analyses. It is therefore desirable to align such data as fast as possible. Most alignment algorithms consider a seed-and-extend approach. Several popular programs perform the seeding step based on the Burrows-Wheeler Transform with a low memory footprint, but they are relatively slow compared to more recent approaches that use a minimizer-based seeding-and-chaining strategy. Recently, syncmers and strobemers were proposed for sequence comparison. Both protocols were designed for improved conservation of matches between sequences under mutations. Syncmers is a thinning protocol proposed as an alternative to minimizers, while strobemers is a linking protocol for gapped sequences and was proposed as an alternative to k-mers. The main contribution in this work is a new seeding approach that combines syncmers and strobemers. We use a strobemer protocol (randstrobes) to link together syncmers (i.e., in syncmer-space) instead of over the original sequence. Our protocol allows us to create longer seeds while preserving mapping accuracy. A longer seed length reduces the number of candidate regions which allows faster mapping and alignment. We also contribute the insight that speed-wise, this protocol is particularly effective when syncmers are canonical. Canonical syncmers can be created for specific parameter combinations and reduce the computational burden of computing the non-canonical randstrobes in reverse complement. We implement our idea in a proof-of-concept short-read aligner strobealign that aligns short reads 3-4x faster than minimap2 and 15-23x faster than BWA and Bowtie2. Many implementation versions of, e.g., BWA, achieve high speed on specific hardware. Our contribution is algorithmic and requires no hardware architecture or system-specific instructions. Strobealign is available at https://github.com/ksahlin/StrobeAlign.
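The idea of canonical syncmers can be sketched as follows. This is a minimal illustration under stated assumptions, not strobealign's implementation: s-mers are ranked by the lexicographic order of their canonical form rather than by a hash, the offset is fixed at the middle position (which requires k - s to be even), and a k-mer is kept when its middle s-mer attains the minimum. All function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Strand-independent representative: the smaller of kmer and its RC."""
    return min(kmer, reverse_complement(kmer))


def is_canonical_syncmer(kmer: str, s: int) -> bool:
    """True if the middle s-mer has the smallest canonical value.

    Reverse-complementing the k-mer mirrors the s-mer positions, so a
    condition on the middle position gives the same answer on both strands.
    """
    k = len(kmer)
    if (k - s) % 2 != 0:
        raise ValueError("k - s must be even for a middle offset")
    t = (k - s) // 2
    values = [canonical(kmer[i:i + s]) for i in range(k - s + 1)]
    return values[t] == min(values)


def canonical_syncmers(seq: str, k: int, s: int) -> set:
    """Set of canonical k-mers selected as syncmers along seq."""
    seq = seq.upper()
    return {
        canonical(seq[i:i + k])
        for i in range(len(seq) - k + 1)
        if is_canonical_syncmer(seq[i:i + k], s)
    }
```

Under these assumptions a read and its reverse complement yield the same set of selected seeds, so the seeds do not have to be computed separately for the reverse strand.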


2021 ◽  
Author(s):  
Vincent Mallet ◽  
Jean-Philippe Vert

As DNA sequencing technologies keep improving in scale and cost, there is a growing need to develop machine learning models to analyze DNA sequences, e.g., to decipher regulatory signals from DNA fragments bound by a particular protein of interest. As a double helix made of two complementary strands, a DNA fragment can be sequenced as two equivalent, so-called reverse complement (RC) sequences of nucleotides. Taking this inherent symmetry of the data into account in machine learning models can facilitate learning. In this sense, several authors have recently proposed particular RC-equivariant convolutional neural networks (CNNs). However, it remains unknown whether other RC-equivariant architectures exist, which could potentially increase the set of basic models adapted to DNA sequences for practitioners. Here, we close this gap by characterizing the set of all linear RC-equivariant layers, and show in particular that new architectures exist beyond the ones already explored. We further discuss RC-equivariant pointwise nonlinearities adapted to different architectures, as well as RC-equivariant embeddings of k-mers as an alternative to one-hot encoding of nucleotides. We show experimentally that the new architectures can outperform existing ones.
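To make RC-equivariance concrete, the sketch below builds one simple equivariant layer in pure Python by pairing a filter with its reverse-complemented copy. This is the familiar weight-sharing construction in its most basic form, shown only as an illustration; it is not one of the new layers characterized in the paper, and the function names are hypothetical. It assumes one-hot channels ordered A, C, G, T, so that reversing the channel order performs complementation.

```python
ALPHABET = "ACGT"  # with this order, the complement of channel i is 3 - i


def one_hot(seq: str) -> list:
    """Encode a DNA string as a list of length-4 one-hot rows."""
    return [[1.0 if base == b else 0.0 for b in ALPHABET] for base in seq.upper()]


def flip_both(x: list) -> list:
    """Reverse both positions and channels.

    On a one-hot input this is the reverse complement; on the layer
    output it reverses positions and swaps the two paired channels.
    """
    return [row[::-1] for row in x[::-1]]


def correlate(x: list, w: list) -> list:
    """Valid cross-correlation of input x (L x 4) with filter w (m x 4)."""
    m = len(w)
    return [
        sum(w[j][c] * x[i + j][c] for j in range(m) for c in range(4))
        for i in range(len(x) - m + 1)
    ]


def rc_equivariant_layer(x: list, w: list) -> list:
    """Two output channels: the filter and its reverse-complemented copy."""
    forward = correlate(x, w)
    reverse = correlate(x, flip_both(w))
    return [[f, r] for f, r in zip(forward, reverse)]
```

Feeding the reverse complement of a sequence into `rc_equivariant_layer` gives the original output with positions reversed and the two channels swapped, which is the equivariance property; a strand-invariant score can then be obtained, for instance, by taking the maximum over both positions and channels.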


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Yunhee Choi ◽  
Ha Pham ◽  
Mai Phuong Nguyen ◽  
Le Viet Ha Tran ◽  
Jueun Kim ◽  
...  

Abstract. The conjugative plasmid (pBV71) possibly confers a selective advantage to Bacillus velezensis strain GH1-13, although a selective marker gene is yet to be identified. Here we show that few non-mucoid wild-type GH1-13 cells are spontaneously converted to mucoid variants with or without the loss of pBV71. Mucoid phenotypes, which contain or lack the plasmid, become sensitive to bacitracin, gramicidin, selenite, and tellurite. Using the differences in antibiotic resistance and phenotype, we isolated a reverse complement (COM) and a transconjugant of strain FZB42 with the native pBV71. Transformed COM and FZB42p cells were similar to the wild-type strain GH1-13 with high antibiotic resistance and slow growth rates on lactose compared to those of mucoid phenotypes. RT-PCR analysis revealed that the expression of plasmid-encoded orphan aspartate phosphatase (pRapD) was coordinated with a new quorum-sensing (QS) cassette of RapF2–PhrF2 present in the chromosome of strain GH1-13, but not in strain FZB42. Multi-omics analysis on wild-type and plasmid-cured cells of strain GH1-13 suggested that the conjugative plasmid expression has a crucial role in induction of early envelope stress response that promotes cell morphogenesis, biofilm formation, catabolite repression, and biosynthesis of extracellular-matrix components and antibiotics for protection of host cell during exponential phase.


2021 ◽  
Author(s):  
Heli A. M. Mönttinen ◽  
Ari Löytynoja

The evolutionary origin of ribonucleic acid (RNA) stem structures (1, 2) and the preservation of their base-pairing under a spontaneous and random mutation process have puzzled theoretical evolutionary biologists (3, 4). DNA replication-related template switching (5, 6) is a mutation mechanism that creates reverse-complement copies of sequence regions within a genome by replicating briefly either along the complementary or nascent DNA strand. Depending on the relative positions and context of the four switch points, this process may produce a reverse-complement repeat capable of forming the stem of a perfect DNA hairpin, or fix the base-pairing of an existing stem (7). Template switching is typically thought to trigger large structural changes (8–10) and its possible role in the origin and evolution of RNA genes has not been studied. Here we show that the reconstructed ancestral history of ribosomal RNA sequences contains compensatory base substitutions that are linked with parallel sequence changes consistent with the DNA replication-related template switching. In addition to compensatory mutations, the mechanism can explain complex changes involving non-Watson-Crick pairing and appearances of novel stem structures, though mutations breaking the structure rarely get fixed in evolution. Our results suggest a solution for the longstanding dilemma of RNA gene evolution (1, 3, 4) and demonstrate how template switching can both create perfect stem structures with a single mutation event and maintain their base pairing over time with matching changes. The mechanism can also generate parallel sequence changes, many inexplicable under the point mutation model (11), and provides an explanation for the asymmetric base-pair frequencies in stem structures (12).
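A reverse-complement repeat of the kind described here can be located with a simple scan: a segment followed, after a short loop, by its own reverse complement can fold back and pair as a perfect stem. The sketch below is a naive search for such perfect stems in a DNA string. It only illustrates the sequence pattern; it does not model template switching or ancestral reconstruction, and the function name and parameters are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_perfect_stems(seq: str, stem_len: int, min_loop: int = 3,
                       max_loop: int = 10) -> list:
    """Return (i, j) pairs where seq[i:i+stem_len] pairs with seq[j:j+stem_len].

    The right arm starts at j and must equal the reverse complement of the
    left arm starting at i; the loop between the arms has a length between
    min_loop and max_loop inclusive.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - stem_len + 1):
        left_rc = reverse_complement(seq[i:i + stem_len])
        first_j = i + stem_len + min_loop
        last_j = min(i + stem_len + max_loop, len(seq) - stem_len)
        for j in range(first_j, last_j + 1):
            if seq[j:j + stem_len] == left_rc:
                hits.append((i, j))
    return hits
```

For instance, in `"AAGATTACATTTTTGTAATCAA"` the arm `GATTACA` at position 2 is followed, after a four-base loop, by its reverse complement `TGTAATC` at position 13, so the pair `(2, 13)` is reported for `stem_len=7`.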


2021 ◽  
Author(s):  
Christopher J Mattocks ◽  
Daniel Ward ◽  
Deborah JG Mackay

We describe a novel assay method: reverse-transcription reverse-complement polymerase chain reaction (RT-RC-PCR), which rationalises reverse transcription and NGS library preparation into a single closed tube reaction. By simplifying the analytical process and reducing cross-contamination risks, RT-RC-PCR presents disruptive scalability and economy while using NGS and LIMS infrastructure widely available across health service, institutional and commercial laboratories. We present a validation of RT-RC-PCR for the qualitative detection of SARS-CoV-2 RNA by NGS. The limit of detection is comparable to real-time RT-PCR, and no obvious difference in sensitivity was detected between extracted nasopharyngeal swab (NPS) RNA and native saliva samples. The end point measurement of RT-RC-PCR is NGS of amplified sequences within the SARS-CoV-2 genome; we demonstrated its capacity to detect different variants using amplicons containing delH69-V70 and N501Y, both of which emerged in the UK Variant of Concern B.1.1.7 in 2020. In summary, RT-RC-PCR has potential to facilitate accurate mass testing at disruptive scale and cost, with concurrent detection of variants of concern.

