Accelerating in-silico saturation mutagenesis using compressed sensing

2021 ◽  
Author(s):  
Jacob Schreiber ◽  
Surag Nair ◽  
Akshay Balsubramani ◽  
Anshul Kundaje

In-silico saturation mutagenesis (ISM) is a popular approach in computational genomics for calculating feature attributions on biological sequences that proceeds by systematically perturbing each position in a sequence and recording the difference in model output. However, this method can be slow because systematically perturbing each position requires performing a number of forward passes proportional to the length of the sequence being examined. In this work, we propose a modification of ISM that leverages the principles of compressed sensing to require only a constant number of forward passes, regardless of sequence length, when applied to models that contain operations with a limited receptive field, such as convolutions. Our method, named Yuzu, can reduce the time that ISM spends in convolution operations by several orders of magnitude and, consequently, Yuzu can speed up ISM on several commonly used architectures in genomics by over an order of magnitude. Notably, we found that Yuzu provides speedups that increase with the complexity of the convolution operation and the length of the sequence being analyzed, suggesting that Yuzu provides large benefits in realistic settings. We have made this tool available at https://github.com/kundajelab/yuzu.
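To make the cost argument concrete, here is a minimal sketch of naive ISM, not of Yuzu itself. The `toy_model` scoring function, the motif it counts, and the example sequence are hypothetical stand-ins for a trained network; the point is only that the number of forward passes grows linearly with sequence length.

```python
ALPHABET = "ACGT"


def toy_model(seq):
    """Hypothetical stand-in for a trained model: counts 'GATA' motif hits."""
    return float(sum(seq[i:i + 4] == "GATA" for i in range(len(seq) - 3)))


def naive_ism(seq, model):
    """Mutate every position to every other base and record the output change.

    Returns (deltas, passes), where deltas[pos][base] is
    model(mutated) - model(reference) and passes counts forward passes.
    """
    reference = model(seq)
    passes = 1  # the reference forward pass
    deltas = []
    for pos, ref_base in enumerate(seq):
        row = {}
        for base in ALPHABET:
            if base == ref_base:
                row[base] = 0.0  # no mutation, so no change and no extra pass
                continue
            mutated = seq[:pos] + base + seq[pos + 1:]
            row[base] = model(mutated) - reference
            passes += 1
        deltas.append(row)
    return deltas, passes


seq = "ACGATAAC"
deltas, passes = naive_ism(seq, toy_model)
print(passes)          # 1 + 3 * len(seq) forward passes
print(deltas[2]["T"])  # effect of breaking the motif at position 2
```

For a sequence of length L over a four-letter alphabet this loop performs 1 + 3L forward passes, which is the linear cost that the abstract says Yuzu avoids for models with a limited receptive field.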

2021 ◽  
Vol 7 (2) ◽  
pp. 99
Author(s):  
Hamza Mbareche ◽  
Marc Veillette ◽  
Guillaume J. Bilodeau

This paper presents an in silico analysis to assess the current state of the fungal UNITE database in terms of the two eukaryote nuclear ribosomal regions, Internal Transcribed Spacers 1 and 2 (ITS1 and ITS2), used in describing fungal diversity. Microbial diversity is often evaluated with amplicon-based high-throughput sequencing, a target enrichment method that relies on the amplification of a specific target using particular primers before sequencing. Thus, the results are highly dependent on the quality of the primers used for amplification. The goal of this study is to determine whether primer mismatches at the binding sites of the targeted taxa could explain the differences observed when using either ITS1 or ITS2 to describe airborne fungal diversity. Hence, the primer pairs chosen for each barcode match those used in a study comparing the performance of ITS1 and ITS2 in three occupational environments. The sequence length varied between the amplicons retrieved from the UNITE database using the primer pairs targeting ITS1 and ITS2. However, the database contains an equal number of unidentified taxa from the ITS1 and ITS2 regions at the six taxonomic levels employed (phylum, class, order, family, genus, species). The chosen ITS primers showed differences in their ability to amplify fungal sequences from the UNITE database. Eleven taxa consisting of Trichocomaceae, Dothioraceae, Botryosphaeriaceae, Mucorales, Saccharomycetes, Pucciniomycetes, Ophiocordyceps, Microsporidia, Archaeorhizomycetes, Mycenaceae, and Tulasnellaceae showed large variations between the two regions. Note that members of the latter taxa are not all typical fungi found in the air.
As no universal method currently covers the entire fungal kingdom, continued work on primer design, and particularly on combining multiple primers targeting the ITS region, is the best way to compensate for the biases of each primer and to obtain a broader view of fungal diversity.


Nature ◽  
2021 ◽  
Author(s):  
Ferran Muiños ◽  
Francisco Martínez-Jiménez ◽  
Oriol Pich ◽  
Abel Gonzalez-Perez ◽  
Nuria Lopez-Bigas

Author(s):  
Christoph Öhlknecht ◽  
Sonja Katz ◽  
Christina Kröß ◽  
Bernhard Sprenger ◽  
Petra Engele ◽  
...  

The present paper describes an investigation of diffusion in the solid state. Previous experimental work has been confined to the case in which the free energy of a mixture is a minimum for the single-phase state, and diffusion decreases local differences of concentration. This may be called ‘diffusion downhill’. However, it is possible for the free energy to be a minimum for the two-phase state; diffusion may then increase differences of concentration, and so may be called ‘diffusion uphill’. Becker (1937) has proposed a simple theoretical treatment of these two types of diffusion in a binary alloy. The present paper describes an experimental test of this theory, using the unusual properties of the alloy Cu₄FeNi₃. This alloy is single-phase above 800 °C and two-phase at lower temperatures, both the phases being face-centred cubic; the essential difference between the two phases is their content of copper. On dissociating from one phase into two the alloy develops a series of intermediate structures showing striking X-ray patterns which are very sensitive to changes of structure. It was found possible to utilize these results for a quantitative study of diffusion ‘uphill’ and ‘downhill’ in the alloy. The experimental results, which can be expressed very simply, are in fair agreement with conclusions drawn from Becker’s theory. It was found that Fick’s equation, dc/dt = D d²c/dx², can, within the limits of error, be applied in all cases, with the modification that c denotes the difference of the measured copper concentration from its equilibrium value. The theory postulates that D is the product of two factors, of which one is D₀, the coefficient of diffusion that would be measured if the alloy were an ideal solid solution. The theory is able to calculate D/D₀, if only in first approximation, and the experiments confirm this calculation. It was found that in most cases the speed of diffusion, whether ‘uphill’ or ‘downhill’, has the order of magnitude of D₀.
* Now with British Electrical Research Association.
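As an illustration of the Fick's equation quoted in the abstract above, the following sketch integrates it numerically for a small sinusoidal deviation c from the equilibrium concentration. The grid size, time step, and the value of D are arbitrary illustrative assumptions, not values from the paper, and the explicit finite-difference scheme is simply one convenient choice.

```python
import math


def diffuse(c, d_coeff, dx, dt, steps):
    """Explicit finite-difference integration of dc/dt = D d2c/dx2.

    c is the deviation of concentration from its equilibrium value,
    sampled on a periodic grid. The scheme is stable for D*dt/dx**2 <= 0.5.
    """
    c = list(c)
    n = len(c)
    r = d_coeff * dt / dx ** 2
    for _ in range(steps):
        # c[i - 1] wraps around for i == 0, giving periodic boundaries.
        c = [c[i] + r * (c[(i + 1) % n] - 2.0 * c[i] + c[i - 1]) for i in range(n)]
    return c


def amplitude(c):
    return max(abs(v) for v in c)


# Hypothetical parameters chosen only for illustration.
n, dx, dt, steps, D = 32, 1.0, 1.0, 20, 0.2
initial = [0.01 * math.sin(2.0 * math.pi * i / n) for i in range(n)]
final = diffuse(initial, D, dx, dt, steps)

k = 2.0 * math.pi / (n * dx)                    # wavenumber of the deviation
predicted = math.exp(-D * k * k * steps * dt)   # continuum decay factor
measured = amplitude(final) / amplitude(initial)
print(measured, predicted)
```

Under these assumptions the deviation decays at the rate expected from the continuum solution exp(-D k² t), so measuring the decay of a known modulation gives an estimate of D.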


2014 ◽  
Vol 790-791 ◽  
pp. 97-102
Author(s):  
Zoltán Erdélyi ◽  
Zoltán Balogh ◽  
Gabor L. Katona ◽  
Dezső L. Beke

The critical nucleus size (above which nuclei grow and below which they dissolve) during diffusion-controlled nucleation in a binary solid-solid phase transformation is calculated using kinetic Monte Carlo (KMC). If atomic jumps are slower in an A-rich nucleus than in the embedding B-rich matrix, the nucleus traps the A atoms approaching its surface. It does not have enough time to eject A atoms before new ones arrive, even if ejection would be thermodynamically favourable. In this case the critical nucleus size can be as much as an order of magnitude smaller than expected from equilibrium thermodynamics or without trapping. These results were published in [Z. Erdélyi et al., Acta Mater. 58 (2010) 5639]. In a recent paper, M. Leitner [M. Leitner, Acta Mater. 60 (2012) 6709] questioned our results on the grounds that his simulations led to different results, but he could not identify the reason for the difference. In this paper we summarize our original results and, on the basis of recent KMC and kinetic mean field (KMF) simulations, show that Leitner’s conclusions are not valid and confirm our original results.


1966 ◽  
Vol 21 (9) ◽  
pp. 1377-1384
Author(s):  
A. V. Willi

Kinetic carbon-13 and deuterium isotope effects are calculated for the SN2 reaction of CH₃I with CN⁻. The normal vibrational frequencies of CH₃I, the transition state I · · · CH₃ · · · CN, and the corresponding isotope-substituted reactants and transition states are evaluated from the force constants by solving the secular equation on an IBM 7094 computer. Values for 7 force constants of the planar CH₃ moiety in the transition state (with an sp² C atom) are obtained by comparison with suitable stable molecules. The stretching force constants related to the bonds being broken or newly formed (fCI, fCC and the interaction between these two stretches, f12) are chosen in such a way that either a zero or imaginary value for νʟ≠ will result. Agreement between calculated and experimental methyl-C13 isotope effects (k12/k13) can be obtained only in sample calculations with sufficiently large values of f12 which lead to imaginary νʟ≠ values. Furthermore, the difference between fCI and fCC must be small (on the order of 1 mdyn/Å). The bending force constants, fHCI and fHCC, exert relatively little influence on k12/k13. They are important for the D isotope effect, however. As soon as experimental data on kH/kD are available it will be possible to derive a value for fHCC in the transition state if fHCI is kept constant at 0.205 mdyn Å, and if fCI, fCC and f12 are held in a reasonable order of magnitude. There is no agreement between experimental and calculated cyanide-C13 isotope effects. Possible explanations are discussed. Since fCI and fCC cannot differ much, it must be concluded that the transition state is relatively “symmetric”, with approximately equal amounts of bond making and bond breaking.


2021 ◽  
Vol 19 ◽  
Author(s):  
Preeya Negi ◽  
Lalita Das ◽  
Surya Prakash ◽  
Vaishali M. Patil

Introduction: Natural products or phytochemicals have always been useful as effective therapeutics and as leads for rational drug discovery approaches specific to anti-viral therapeutics. Methods: The ongoing pandemic caused by the novel coronavirus has created a demand for effective therapeutics. Thus, to achieve the primary objective of finding effective anti-viral therapeutics, in silico screening of phytochemicals present in Curcuma longa extract (e.g., curcumin) has been planned. Results: The present work involves the evaluation of ADME properties and molecular docking studies. Conclusion: The application of rationalized drug discovery approaches to screen diverse natural resources will speed up anti-COVID drug discovery efforts and benefit the global community.


2018 ◽  
Vol 15 (9) ◽  
pp. 2909-2930 ◽  
Author(s):  
Sebastian Lienert ◽  
Fortunat Joos

Abstract. A dynamic global vegetation model (DGVM) is applied in a probabilistic framework and benchmarking system to constrain uncertain model parameters by observations and to quantify carbon emissions from land-use and land-cover change (LULCC). Processes featured in DGVMs include parameters which are prone to substantial uncertainty. To cope with these uncertainties Latin hypercube sampling (LHS) is used to create a 1000-member perturbed parameter ensemble, which is then evaluated with a diverse set of global and spatiotemporally resolved observational constraints. We discuss the performance of the constrained ensemble and use it to formulate a new best-guess version of the model (LPX-Bern v1.4). The observationally constrained ensemble is used to investigate historical emissions due to LULCC (ELUC) and their sensitivity to model parametrization. We find a global ELUC estimate of 158 (108, 211) PgC (median and 90 % confidence interval) between 1800 and 2016. We compare ELUC to other estimates both globally and regionally. Spatial patterns are investigated and estimates of ELUC of the 10 countries with the largest contribution to the flux over the historical period are reported. We consider model versions with and without additional land-use processes (shifting cultivation and wood harvest) and find that the difference in global ELUC is on the same order of magnitude as parameter-induced uncertainty and in some cases could potentially even be offset with appropriate parameter choice.
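The Latin hypercube sampling step mentioned above can be sketched as follows. This is a generic illustration of LHS, not the LPX-Bern workflow: the three parameter ranges in `bounds` are hypothetical, and only the ensemble size of 1000 is taken from the abstract.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that each parameter range is split into
    n_samples equal-width strata and every stratum is sampled exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        width = (high - low) / n_samples
        # One uniformly placed point inside each stratum of this parameter.
        points = [low + width * (i + rng.random()) for i in range(n_samples)]
        # Shuffle so strata are paired at random across parameters.
        rng.shuffle(points)
        columns.append(points)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Hypothetical ranges for three uncertain model parameters.
bounds = [(0.0, 1.0), (10.0, 20.0), (-5.0, 5.0)]
ensemble = latin_hypercube(1000, bounds)
print(len(ensemble), ensemble[0])
```

Compared with plain random sampling, this stratification guarantees that each parameter's full range is covered even with a modest ensemble size, which is why it suits perturbed parameter ensembles where every member requires a full model run.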


2021 ◽  
Vol 28 (2) ◽  
pp. 163-182
Author(s):  
José L. Simancas-García ◽  
Kemel George-González

Shannon’s sampling theorem is one of the most important results of modern signal theory. It describes the reconstruction of any band-limited signal from a finite number of its samples. On the other hand, although less well known, there is the discrete sampling theorem, proved by Cooley while he was working on the development of an algorithm to speed up the calculations of the discrete Fourier transform. Cooley showed that a sampled signal can be resampled by selecting a smaller number of samples, which reduces computational cost. Then it is possible to reconstruct the original sampled signal using a reverse process. In principle, the two theorems are not related. However, in this paper we will show that in the context of Non Standard Mathematical Analysis (NSA) and Hyperreal Numerical System R, the two theorems are equivalent. The difference between them becomes a matter of scale. With the scale changes that the hyperreal number system allows, the discrete variables and functions become continuous, and Shannon’s sampling theorem emerges from the discrete sampling theorem.
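A small numerical sketch can illustrate the discrete resampling idea described above. It is one common reading of that idea, using a plain discrete Fourier transform, and is not necessarily the construction used in the paper: it assumes the original samples are band-limited so that only the lowest DFT frequencies are occupied. The test signal and the function names are illustrative assumptions.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def decimate(x, factor):
    """Keep every factor-th sample."""
    return x[::factor]


def reconstruct(y, factor):
    """Rebuild the original samples from the decimated ones by zero-padding
    the spectrum. Assumes len(y) is even and that the original signal had no
    energy at or above the Nyquist bin of the decimated signal."""
    m = len(y)
    n = m * factor
    small = dft(y)
    full = [0j] * n
    half = m // 2
    for k in range(half):          # non-negative frequencies
        full[k] = factor * small[k]
    for k in range(1, half):       # negative frequencies
        full[n - k] = factor * small[m - k]
    return [v.real for v in idft(full)]


signal = [1.0 + math.cos(2 * math.pi * t / 16) + 0.5 * math.sin(2 * math.pi * 3 * t / 16)
          for t in range(16)]
subsampled = decimate(signal, 2)
restored = reconstruct(subsampled, 2)
error = max(abs(a - b) for a, b in zip(signal, restored))
print(len(subsampled), error)
```

Because the test signal occupies only DFT frequencies 0, ±1, and ±3 out of 16 bins, keeping every second sample loses no information, and the reverse process recovers all 16 samples up to floating-point error.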


2020 ◽  
Author(s):  
Albert C. Aragonès ◽  
Katrin F. Domke

Abstract. Progress in molecular electronics (ME) is largely based on improved understanding of the properties of single molecules (SM) trapped for seconds or longer to enable their detailed characterization. We present a plasmon-supported break-junction (PBJ) platform to significantly increase the lifetime of SM junctions of 1,4-benzenedithiol (BDT) without the need for chemical modification of molecule or electrode. Moderate far-field power densities of ca. 11 mW/µm² lead to a >10-fold increase in minimum lifetime compared to laser-OFF conditions. The near-field trapping efficiency is twice as large for bridge-site contact as for hollow-site geometry, which can be attributed to the difference in polarizability. Current measurements and tip-enhanced Raman spectra confirm that the native structure and contact geometry of BDT are preserved during the PBJ experiment. By providing a non-invasive pathway to increase short lifetimes of SM junctions, PBJ is a valuable approach for ME, paving the way for improved SM sensing and recognition platforms.

