A Computational Method to Quantify the Effects of Slipped Strand Mispairing on Bacterial Tetranucleotide Repeats

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Gregory P. Harhay ◽  
Dayna M. Harhay ◽  
James L. Bono ◽  
Sarah F. Capik ◽  
Keith D. DeDonder ◽  
...  

Abstract: The virulence and pathogenicity of bacterial pathogens are related to their adaptability to changing environments. One process enabling adaptation is based on minor changes in genome sequence, as small as a few base pairs, within segments of the genome called simple sequence repeats (SSRs) that consist of multiple copies of a short sequence (from one to several nucleotides), repeated in series. SSRs are found in eukaryotes as well as prokaryotes, and length variation in them occurs at frequencies up to a million-fold higher than bacterial point mutations through the process of slipped strand mispairing (SSM) by DNA polymerase during replication. The characterization of SSR length by standard sequencing methods is complicated by the appearance of length variation introduced during the sequencing process that obscures the lower abundance repeat number variants in a population. Here we report a computational approach to correct for sequencing process-induced artifacts, validated for tetranucleotide repeats by use of synthetic constructs of fixed, known length. We apply this method to a laboratory culture of Histophilus somni, prepared from a single colony, and demonstrate that the culture consists of populations of distinct sequence phase and length variants at individual tetranucleotide SSR loci.
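To make the idea of a tetranucleotide SSR and its repeat count concrete, here is a minimal Python sketch that locates perfect tandem runs of a 4-nt unit in a read and tallies copy numbers across reads. It is not the authors' method: the function names, the `min_copies` threshold, and the example reads are all hypothetical, and the sketch applies none of the artifact correction the abstract describes, so a tally built this way would still contain sequencing-induced length variants.

```python
import re
from collections import Counter


def tetranucleotide_runs(seq, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem run of a 4-nt unit.

    min_copies (assumed threshold, must be >= 2) is the smallest number of
    consecutive copies that counts as a repeat tract.
    """
    # A 4-nt unit followed by at least (min_copies - 1) further copies of itself.
    pattern = re.compile(r"([ACGT]{4})\1{%d,}" % (min_copies - 1))
    runs = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        # Skip units that are really mono- or dinucleotide repeats (AAAA, ACAC).
        if unit[:2] == unit[2:]:
            continue
        runs.append((m.start(), unit, len(m.group(0)) // 4))
    return runs


def copy_number_histogram(reads, unit, min_copies=3):
    """Count how many runs of `unit` have each copy number across all reads."""
    counts = Counter()
    for read in reads:
        for _, found_unit, copies in tetranucleotide_runs(read, min_copies):
            if found_unit == unit:
                counts[copies] += 1
    return counts
```

Note that the scan reports the repeat in whichever phase it first matches, so the flanking sequence can shift the reported unit (for example `TCAA` instead of `CAAT`); a real analysis would normalize the unit's rotation before comparing loci.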

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Gregory P. Harhay ◽  
Dayna M. Harhay ◽  
James L. Bono ◽  
Sarah F. Capik ◽  
Keith D. DeDonder ◽  
...  

2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Manyun Guo ◽  
Yucheng Ma ◽  
Wanyuan Liu ◽  
Zuyi Yuan

Abstract: Nucleocapsid protein (NC) in the group-specific antigen (gag) of retroviruses is essential in the interactions of most retroviral gag proteins with RNAs. A computational method to predict NCs would benefit subsequent structural analysis and functional study of them. However, no computational method to predict the exact locations of NCs in retroviruses has been proposed yet. The wide range of length variation of NCs also increases the difficulty. In this paper, a computational method to identify NCs in retroviruses is proposed. All available retrovirus sequences with NC annotations were collected from NCBI. Models based on random forest (RF) and weighted support vector machine (WSVM) were built to predict initiation and termination sites of NCs. Factor analysis scales of generalized amino acid information, along with a position weight matrix, were utilized to generate the feature space. Homology-based gene prediction methods were also compared and integrated to improve predictive performance. Predicted candidate initiation and termination sites were then combined and screened according to their intervals, decision values and alignment scores. All available gag sequences without NC annotations were scanned with the model to detect putative NCs. Geometric means of sensitivity and specificity from prediction of initiation and termination sites under fivefold cross-validation are 0.9900 and 0.9548, respectively. Of all the collected retrovirus sequences with NC annotations, 90.91% were predicted entirely correctly by the model combining WSVM, RF and simple alignment. The composite model performs better than the single-method ones. The model detected 235 putative NCs in unannotated gags. Our prediction method performs well on NC recognition and could also be extended to other gene prediction problems, especially those whose training samples have large length variations.
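The performance figure quoted above is a geometric mean of sensitivity and specificity. Assuming the authors use the standard definition, G-mean = sqrt(sensitivity × specificity), it can be computed from confusion-matrix counts as in the short Python sketch below. The function names and any counts passed to them are illustrative assumptions, not values from the paper.

```python
import math


def sensitivity(tp, fn):
    """True positive rate: fraction of real sites that were predicted."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: fraction of non-sites correctly rejected."""
    return tn / (tn + fp)


def g_mean(sens, spec):
    """Geometric mean of sensitivity and specificity (standard definition)."""
    return math.sqrt(sens * spec)
```

Unlike plain accuracy, this score stays low when either rate is poor, which suits site prediction, where true sites are far rarer than non-sites.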


2005 ◽  
Vol 272 (1568) ◽  
pp. 1153-1161 ◽  
Author(s):  
Denae Nash ◽  
Shalini Nair ◽  
Mayfong Mayxay ◽  
Paul N Newton ◽  
Jean-Paul Guthmann ◽  
...  

Neutral mutations may hitchhike to high frequency when they are situated close to sites under positive selection, generating local reductions in genetic diversity. This process is thought to be an important determinant of levels of genomic variation in natural populations. The size of genome regions affected by genetic hitchhiking is expected to be dependent on the strength of selection, but there is little empirical data supporting this prediction. Here, we compare microsatellite variation around two drug resistance genes (chloroquine resistance transporter (pfcrt), chromosome 7, and dihydrofolate reductase (dhfr), chromosome 4) in malaria parasite populations exposed to strong (Thailand) or weak selection (Laos) by anti-malarial drugs. In each population, we examined the point mutations underlying resistance and length variation at 22 (chromosome 4) or 25 (chromosome 7) microsatellite markers across these chromosomes. All parasites from Thailand carried the K76T mutation in pfcrt conferring resistance to chloroquine (CQ) and 2–4 mutations in dhfr conferring resistance to pyrimethamine. By contrast, we found both wild-type and resistant alleles at both genes in Laos. There were dramatic differences in the extent of hitchhiking in the two countries. The size of genome regions affected was smaller in Laos than in Thailand. We observed significant reduction in variation relative to sensitive parasites for 34–64 kb (2–4 cM) in Laos on chromosome 4, compared with 98–137 kb (6–8 cM) in Thailand. Similarly, on chromosome 7, we observed reduced variation for 34–69 kb (2–4 cM) around pfcrt in Laos, but for 195–268 kb (11–16 cM) in Thailand. Reduction in genetic variation was also less extreme in Laos than in Thailand. Most loci were monomorphic in a 12 kb region surrounding both genes on resistant chromosomes from Thailand, whereas in Laos, even loci immediately proximal to selective sites showed some variation on resistant chromosomes. Finally, linkage disequilibrium (LD) decayed more rapidly around resistant pfcrt and dhfr alleles from Laos than from Thailand. These results demonstrate that different realizations of the same selective sweeps may vary considerably in size and shape, in a manner broadly consistent with selection history. From a practical perspective, genomic regions containing resistance genes may be most effectively located by genome-wide association in populations exposed to strong drug selection. However, the lower levels of LD surrounding resistance alleles in populations under weak selection may simplify identification of functional mutations.


Author(s):  
Kalliopi Skamaki ◽  
Stephane Emond ◽  
Matthieu Chodorge ◽  
John Andrews ◽  
D. Gareth Rees ◽  
...  

Abstract: We report the first systematic combinatorial exploration of affinity enhancement of antibodies by insertions and deletions (InDels). Transposon-based introduction of InDels via the method TRIAD was used to generate large libraries with random in-frame InDels across the entire scFv gene that were further recombined and screened by ribosome display. Knowledge of potential insertion points from TRIAD libraries formed the basis of exploration of length and sequence diversity of novel insertions by insertional-scanning mutagenesis (ISM). An overall 256-fold affinity improvement of an anti-IL-13 antibody BAK1 as a result of InDel mutagenesis and combination with known point mutations validates this approach and suggests that the results of this InDel approach and conventional exploration of point mutations can synergize to generate antibodies with higher affinity.

Significance: Insertion/deletion (InDel) mutations play key roles in genome and protein evolution. Despite their prominence in evolutionary history, the potential of InDels for changing function in protein engineering by directed evolution remains unexplored. Instead, point mutagenesis is widely used. Here we create antibody libraries containing InDels and demonstrate that affinity maturation can be achieved in this way, establishing an alternative to the point mutation strategies employed in all previous in vitro selections. These InDels mirror the observation of considerable length variation in loops of natural antibodies originating from the same germline genes and can be combined with point mutations, making both natural sources of functional innovation available for artificial evolution in the test tube.


2014 ◽  
Author(s):  
Brad Gulko ◽  
Ilan Gronau ◽  
Melissa J Hubisz ◽  
Adam Siepel

We describe a novel computational method for estimating the probability that a point mutation at each position in a genome will influence fitness. These fitness consequence (fitCons) scores serve as evolution-based measures of potential genomic function. Our approach is to cluster genomic positions into groups exhibiting distinct "fingerprints" based on high-throughput functional genomic data, then to estimate a probability of fitness consequences for each group from associated patterns of genetic polymorphism and divergence. We have generated fitCons scores for three human cell types based on public data from ENCODE. Compared with conventional conservation scores, fitCons scores show considerably improved prediction power for cis-regulatory elements. In addition, fitCons scores indicate that 4.2-7.5% of nucleotides in the human genome have influenced fitness since the human-chimpanzee divergence, and, in contrast to several recent studies, they suggest that recent evolutionary turnover has had a limited impact on the functional content of the genome.


2021 ◽  
Vol 17 (8) ◽  
pp. e1009233
Author(s):  
Karoline Horgmo Jæger ◽  
Andrew G. Edwards ◽  
Wayne R. Giles ◽  
Aslak Tveito

Mutations are known to cause perturbations in essential functional features of integral membrane proteins, including ion channels. Even restricted or point mutations can result in substantially changed properties of ion currents. The additive effect of these alterations for a specific ion channel can result in significantly changed properties of the action potential (AP). Both AP shortening and AP prolongation can result from known mutations, and the consequences can be life-threatening. Here, we present a computational method for identifying new drugs utilizing combinations of existing drugs. Based on the knowledge of theoretical effects of existing drugs on individual ion currents, our aim is to compute optimal combinations that can ‘repair’ the mutant AP waveforms so that the baseline AP properties are restored. More specifically, we compute optimal, combined drug concentrations such that the waveforms of the transmembrane potential and the cytosolic calcium concentration of the mutant cardiomyocytes (CMs) become as similar as possible to their wild type counterparts after the drug has been applied. In order to demonstrate the utility of this method, we address the question of computing an optimal drug for the short QT syndrome type 1 (SQT1). For the SQT1 mutation N588K, there are available data sets that describe the effect of various drugs on the mutated K+ channel. These published findings are the basis for our computational analysis, which can identify optimal compounds in the sense that the AP of the mutant CMs resembles essential biomarkers of the wild type CMs. Using recently developed insights regarding electrophysiological properties among myocytes from different species, we compute optimal drug combinations for hiPSC-CMs, rabbit ventricular CMs and adult human ventricular CMs with the SQT1 mutation. Since the ‘composition’ of ion channels that form the AP is different for the three types of myocytes under consideration, so is the composition of the optimal drug.


2020 ◽  
Author(s):  
Espada Rocío ◽  
Zarevski Nikola ◽  
Dramé-Maigné Adèle ◽  
Rondelez Yannick

Abstract: Nanopore sequencing is a powerful single-molecule DNA sequencing technology which offers high throughput and long sequence reads. Nevertheless, its high native error rate limits the direct detection of point mutations in individual reads of amplicon libraries, as these mutations are difficult to distinguish from the sequencing noise. In this work, we developed SINGLe (SNPs In Nanopore reads of Gene Libraries), a computational method to reduce the noise in nanopore reads of amplicons containing point variations. Our approach uses the fact that all reads are very similar to a wild type sequence, for which we experimentally characterize the position-specific systematic sequencing error pattern. We then use this information to reweight the confidence given to nucleotides that do not match the wild type in individual variant reads. We tested this method in a set of variants of KlenTaq, where the true mutation rate was well below the sequencing noise. SINGLe improves the signal-to-noise ratio between 4- and 9-fold in comparison to the data returned by the basecaller guppy. Downstream, this approach improves variant clustering and consensus calling. SINGLe is simple to implement and requires only a few thousand reads of the wild type sequence of interest, which can be easily obtained by multiplexing in a single MinION run. It does not require any modification in the experimental protocol, it does not imply a large loss of sequencing throughput, and it can be incorporated downstream of standard basecalling.


2022 ◽  
Author(s):  
Etienne Sollier ◽  
Jack Kuipers ◽  
Niko Beerenwinkel ◽  
Koichi Takahashi ◽  
Katharina Jahn

Reconstructing the history of somatic DNA alterations that occurred in a tumour can help understand its evolution and predict its resistance to treatment. Single-cell DNA sequencing (scDNAseq) can be used to investigate clonal heterogeneity and to inform phylogeny reconstruction. However, existing phylogenetic methods for scDNAseq data are designed either for point mutations or for large copy number variations, but not for both types of events simultaneously. Here, we develop COMPASS, a computational method for inferring the joint phylogeny of mutations and copy number alterations from targeted scDNAseq data. We evaluate COMPASS on simulated data and show that it outperforms existing methods. We apply COMPASS to a large cohort of 123 patients with acute myeloid leukemia (AML) and detect copy number alterations, including subclonal ones, which are in agreement with current knowledge of AML development. We further used bulk SNP array data to orthogonally validate our findings.


Genome ◽  
1999 ◽  
Vol 42 (1) ◽  
pp. 158-161 ◽  
Author(s):  
N Soranzo ◽  
J Provan ◽  
W Powell

In the present study, the intergenic region between the mitochondrial genes encoding subunit 3 of NADH dehydrogenase (nad3) and ribosomal protein S12 (rps12) was shown to contain a Gn mononucleotide microsatellite repeat. This region was analysed in 15 species belonging to the genus Pinus and interspecific variation was detected in the form of repeat length polymorphism. Sequence analysis of a 576-bp region containing the microsatellite confirmed that the variability was due to expansion and contraction of the repeat motif and that no point mutations were present in the coding regions of the two genes. This is the first report of the occurrence of a microsatellite polymorphism in plant mitochondria.

Key words: mitochondrial, microsatellite, Pinus, hard pines, taxonomy.


2011 ◽  
Vol 308-310 ◽  
pp. 2191-2194
Author(s):  
Si Yi Li ◽  
Ya Ping Yang

A combined mechanism, in which a cycloid tooth profile planetary mechanism and a screw mechanism are connected in series, can convert the high-speed rotary motion of the input component into low-speed reciprocating linear motion of the output component. Based on that mechanism, a linear driver was developed. It is composed of an electric motor, eccentric shaft, cycloid plate, screw and nut, and so on. Because an intermediate transmission mode using rolling bodies is applied at the contact surfaces among the kinematic pairs, the driver's stability and transmission efficiency are enhanced by improving the friction state of the kinematic pairs and reducing abrasion of the parts. The paper focuses on the kinematics of the cycloid tooth profile planetary mechanism and the method for computing its main parameters.

