RELION-3: new tools for automated high-resolution cryo-EM structure determination


eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Jasenko Zivanov ◽  
Takanori Nakane ◽  
Björn O Forsberg ◽  
Dari Kimanius ◽  
Wim JH Hagen ◽  
...  

Here, we describe the third major release of RELION. CPU-based vector acceleration has been added in addition to GPU support, which provides flexibility in use of resources and avoids memory limitations. Reference-free autopicking with Laplacian-of-Gaussian filtering and execution of jobs from Python allows non-interactive processing during acquisition, including 2D-classification, de novo model generation and 3D-classification. Per-particle refinement of CTF parameters and correction of estimated beam tilt provide higher-resolution reconstructions when particles are at different heights in the ice, and/or coma-free alignment has not been optimal. Ewald sphere curvature correction improves resolution for large particles. We illustrate these developments with publicly available data sets: together with a Bayesian approach to beam-induced motion correction, this leads to resolution improvements of 0.2–0.7 Å compared to previous RELION versions.
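As an illustration of the reference-free autopicking idea mentioned above, the sketch below applies a Laplacian-of-Gaussian filter to a micrograph stored as a plain matrix and reports pixels whose filter response exceeds a high quantile as candidate particle coordinates. It is a minimal, generic sketch in base R, not RELION's implementation; the kernel size, sigma, threshold and the random stand-in image are assumptions.

# Minimal sketch of Laplacian-of-Gaussian (LoG) blob picking (not RELION's code).
log_kernel <- function(size, sigma) {
  half <- (size - 1) / 2
  x <- matrix(rep(-half:half, size), nrow = size)
  y <- t(x)
  r2 <- x^2 + y^2
  k <- (r2 - 2 * sigma^2) / sigma^4 * exp(-r2 / (2 * sigma^2))
  k - mean(k)                       # zero-sum kernel: flat background gives no response
}

conv2d <- function(img, k) {        # naive same-size 2-D convolution with zero padding
  n <- nrow(k); half <- (n - 1) / 2
  out <- matrix(0, nrow(img), ncol(img))
  pad <- matrix(0, nrow(img) + 2 * half, ncol(img) + 2 * half)
  pad[(half + 1):(half + nrow(img)), (half + 1):(half + ncol(img))] <- img
  for (i in seq_len(nrow(img)))
    for (j in seq_len(ncol(img)))
      out[i, j] <- sum(pad[i:(i + n - 1), j:(j + n - 1)] * k)
  out
}

micrograph <- matrix(rnorm(200 * 200), 200, 200)              # stand-in for a real image
response   <- -conv2d(micrograph, log_kernel(15, sigma = 3))  # bright blobs -> positive peaks
picks      <- which(response > quantile(response, 0.999), arr.ind = TRUE)
head(picks)                                                   # candidate (row, col) coordinates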



eLife ◽  
2014 ◽  
Vol 3 ◽  
Author(s):  
Sjors HW Scheres

In electron cryo-microscopy (cryo-EM), the electron beam that is used for imaging also causes the sample to move. This motion blurs the images and limits the resolution attainable by single-particle analysis. In a previous Research article (Bai et al., 2013) we showed that correcting for this motion by processing movies from fast direct-electron detectors allowed structure determination to near-atomic resolution from 35,000 ribosome particles. In this Research advance article, we show that an improved movie processing algorithm is applicable to a much wider range of specimens. The new algorithm estimates straight movement tracks by considering multiple particles that are close to each other in the field of view, and models the fall-off of high-resolution information content by radiation damage in a dose-dependent manner. Application of the new algorithm to four data sets illustrates its potential for significantly improving cryo-EM structures, even for particles that are smaller than 200 kDa.
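The dose-dependent fall-off of high-resolution signal can be pictured with a simple per-frame B-factor weighting scheme; the sketch below is only a generic illustration of that idea in R, not the algorithm of the paper, and the B-factor values and frequency grid are arbitrary assumptions.

# Illustrative per-frame weighting: later movie frames (higher accumulated dose) get
# larger B-factors, which damps their contribution at high spatial frequencies.
freq <- seq(0, 0.5, length.out = 100)        # spatial frequency in 1/Angstrom (assumed grid)
bfac <- c(20, 40, 80, 160, 320)              # assumed per-frame B-factors (A^2), growing with dose
w    <- sapply(bfac, function(B) exp(-B * freq^2 / 4))  # relative weight of each frame vs frequency
w    <- w / rowSums(w)                       # normalise across frames at each frequency
round(w[c(1, 50, 100), ], 2)                 # low frequencies: equal weights; high: early frames dominate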



Author(s):  
Thomas Blaschke ◽  
Jürgen Bajorath

Exploring the origin of multi-target activity of small molecules and designing new multi-target compounds are highly topical issues in pharmaceutical research. We have investigated the ability of a generative neural network to create multi-target compounds. Data sets of experimentally confirmed multi-target, single-target, and consistently inactive compounds were extracted from public screening data considering positive and negative assay results. These data sets were used to fine-tune the REINVENT generative model via transfer learning to systematically recognize multi-target compounds, distinguish them from single-target or inactive compounds, and construct new multi-target compounds. During fine-tuning, the model showed a clear tendency to increasingly generate multi-target compounds and structural analogs. Our findings indicate that generative models can be adopted for de novo multi-target compound design.



2021 ◽  
Vol 6 (166) ◽  
pp. 217-226
Author(s):  
O. Pavlenko ◽  
D. Velykodnyi

The article investigates the existing trends and prospects for the development of warehousing services in the system of production and processing of products, which allowed the purpose of the research to be formulated. The chosen theme is quite topical, because warehousing processes are an integral part of supply logistics in modern production. One of the ways to reduce the cost of goods and services is the efficient use of resources. Ukrainian and foreign scientists have addressed many questions concerning the development of the infrastructure component and the solution of optimization problems regarding the import and export of goods to and from the warehouse, but without determining the optimal values of the technological parameters of warehousing systems. The technological scheme of operation of the warehouse system of the enterprise LLC "MEGA CRISP" makes it possible to see the whole chain of operations from the arrival of a vehicle with cargo (containers and packaging) to the dispatch of the cargo (finished product) to the recipient; the necessary types of resources involved in these processes are also taken into account. Total costs were chosen as the evaluation indicator for selecting an efficient supply channel. The relevant influencing parameters are taken into account: the intensities of the corresponding cargo flows, the unit cost of the corresponding work and of one hour of work of one worker, the time required to perform each operation, the quantity of resources involved in each operation and the working time of the warehouse during the day. A simulation-based full-factorial experiment was performed, based on the results of which a linear regression model was determined, in which each non-zero coefficient indicates the degree of influence of the corresponding factor on the performance indicator. The results of determining the economic effect showed that "Variant 2" (increasing the number of workers) is the least expensive, and its level of costs is lower in all series of experiments compared to the basic variant, "Variant 1". The maximum difference, 12,217.8 hryvnias, is reached at the maximum loading of the warehouse. When comparing the third and first variants, only at the maximum level of output flow (170 t/h) is the third variant cheaper, by 852.6 hryvnias. The highest positive effect among the variants offered is achieved by "Variant 2", with savings of 12,217.8 hryvnias per shift.
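To make the regression step concrete, the sketch below fits a linear model to hypothetical results of a full-factorial simulation experiment; the factor names, data and cost figures are invented for illustration and do not come from the article.

# Hypothetical full-factorial experiment: two technological factors and the
# resulting total cost per shift; a linear model exposes each factor's influence.
experiment <- expand.grid(
  workers  = c(2, 4, 6),            # assumed number of warehouse workers
  flow_tph = c(90, 130, 170)        # assumed output cargo flow, t/h
)
set.seed(1)
experiment$total_cost <- 50000 - 1500 * experiment$workers +
                         220 * experiment$flow_tph + rnorm(nrow(experiment), sd = 300)

model <- lm(total_cost ~ workers + flow_tph, data = experiment)
coef(model)                                               # each coefficient = influence of that factor
predict(model, data.frame(workers = 4, flow_tph = 170))   # cost estimate for one variant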



2018 ◽  
Author(s):  
Adrian Fritz ◽  
Peter Hofmann ◽  
Stephan Majda ◽  
Eik Dahms ◽  
Johannes Dröge ◽  
...  

Shotgun metagenome data sets of microbial communities are highly diverse, not only due to the natural variation of the underlying biological systems, but also due to differences in laboratory protocols, replicate numbers, and sequencing technologies. Accordingly, to effectively assess the performance of metagenomic analysis software, a wide range of benchmark data sets are required. Here, we describe the CAMISIM microbial community and metagenome simulator. The software can model different microbial abundance profiles, multi-sample time series and differential abundance studies, includes real and simulated strain-level diversity, and generates second- and third-generation sequencing data from taxonomic profiles or de novo. Gold standards are created for sequence assembly, genome binning, taxonomic binning, and taxonomic profiling. CAMISIM generated the benchmark data sets of the first CAMI challenge. For two simulated multi-sample data sets of the human and mouse gut microbiomes we observed high functional congruence to the real data. As further applications, we investigated the effect of varying evolutionary genome divergence, sequencing depth, and read error profiles on two popular metagenome assemblers, MEGAHIT and metaSPAdes, on several thousand small data sets generated with CAMISIM. CAMISIM can simulate a wide variety of microbial communities and metagenome data sets together with truth standards for method evaluation. All data sets and the software are freely available at: https://github.com/CAMI-challenge/CAMISIM



2018 ◽  
Vol 1 (4) ◽  
pp. e00080
Author(s):  
A.V. Mikurova ◽  
V.S. Skvortsov

The modeling of complexes of three sets of steroidal and nonsteroidal progestins with the ligand-binding domain of the nuclear progesterone receptor was performed. A molecular docking procedure, long-term molecular dynamics simulation and subsequent analysis by MM-PBSA (MM-GBSA) were used to model the complexes. Using the characteristics obtained by the MM-PBSA method for two data sets of steroid compounds obtained by different scientific groups, a prediction equation for the value of relative binding activity (RBA) was constructed. The RBA value was adjusted so that in all samples the actual activity was compared with the activity of progesterone. The third data set, of nonsteroidal compounds, was used as a test set. The resulting equation showed that the predictions can be applied to both steroid molecules and nonsteroidal progestins.
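As an illustration of how such a prediction equation can be built, the sketch below regresses RBA on hypothetical MM-PBSA energy terms for a training set and applies the fit to a test compound; the column names, values and energy decomposition are assumptions for illustration, not data from the study.

# Hypothetical MM-PBSA descriptors (kcal/mol) and relative binding activity (RBA).
set.seed(2)
train <- data.frame(
  dG_vdw  = rnorm(40, -45, 5),      # van der Waals term, assumed
  dG_elec = rnorm(40, -20, 4),      # electrostatic term, assumed
  dG_solv = rnorm(40,  30, 4)       # solvation term, assumed
)
train$logRBA <- 2 - 0.05 * train$dG_vdw - 0.03 * train$dG_solv + rnorm(40, sd = 0.2)

fit <- lm(logRBA ~ dG_vdw + dG_elec + dG_solv, data = train)   # prediction equation
summary(fit)$r.squared                                         # quality of the fit

test <- data.frame(dG_vdw = -50, dG_elec = -18, dG_solv = 28)  # one test compound (assumed)
predict(fit, test)                                             # predicted log RBA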



Author(s):  
Ashish Ranjan Mishra ◽  
Neelendra Badal

This chapter explains an algorithm that can perform vertical partitioning of database tables dynamically on distributed database systems. After vertical partitioning, a new algorithm is developed to allocate those fragments to the proper sites. To accomplish this, three major tasks are performed in this chapter. The first task is to develop a partitioning algorithm that can partition the relation in such a way that it performs better than most of the existing algorithms. The second task is to allocate the fragments to the sites where they incur low communication cost with respect to the other sites. The third task is to monitor the change in the frequency of queries at different sites as well as at the same site. If the change in query frequency exceeds the threshold, re-partitioning and re-allocation are performed.
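A minimal sketch of the allocation idea: given a matrix of query access frequencies per fragment and site, each fragment is placed at the site that accesses it most, so that remote (communication-incurring) accesses are kept low, and re-allocation is triggered when frequencies drift beyond a threshold. The matrix, site names and threshold below are invented for illustration and are not the chapter's algorithm.

# Rows = fragments, columns = sites; values = query access frequencies (assumed numbers).
freq <- matrix(c(120, 10,  5,
                  15, 90, 30,
                   8, 12, 70),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("F", 1:3), paste0("S", 1:3)))

allocate <- function(f) colnames(f)[apply(f, 1, which.max)]   # site with most local accesses
alloc <- allocate(freq)
alloc                                    # F1 -> S1, F2 -> S2, F3 -> S3

remote <- sum(freq) - sum(freq[cbind(1:nrow(freq), apply(freq, 1, which.max))])
remote                                   # accesses that still cross the network

# Re-allocation trigger: if any fragment's frequency changes by more than an
# assumed threshold of 25 accesses, recompute the allocation.
new_freq <- freq; new_freq["F3", "S1"] <- 95
if (max(abs(new_freq - freq)) > 25) alloc <- allocate(new_freq)
alloc                                    # F3 now moves to S1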



Author(s):  
James B. Elsner ◽  
Thomas H. Jagger

Graphs and maps help you reason with data. They also help you communicate results. A good graph gives you the most information in the shortest time, with the least ink in the smallest space (Tufte, 1997). In this chapter, we show you how to make graphs and maps using R. A good strategy is to follow along with an open session, typing (or copying) the code as you read. Before you begin, make sure you have the following data sets available in your workspace. Do this by typing
> SOI = read.table("SOI.txt", header=TRUE)
> NAO = read.table("NAO.txt", header=TRUE)
> SST = read.table("SST.txt", header=TRUE)
> A = read.table("ATL.txt", header=TRUE)
> US = read.table("H.txt", header=TRUE)
Not all the code is shown but all is available on our Web site. It is easy to make a graph. Here we provide guidance to help you make informative graphs. It is a tutorial on how to create publishable figures from your data. In R you have several choices. With the standard (base) graphics environment, you can produce a variety of plots with fine details. Most of the figures in this book use the standard graphics environment. The grid graphics environment is even more flexible. It allows you to design complex layouts with nested graphs where scaling is maintained upon resizing. The lattice and ggplot2 packages use grid graphics to create more specialized graphing functions and methods. The spplot function, for example, is a plot method built with grid graphics that you will use to create maps. The ggplot2 package is an implementation of the grammar of graphics, combining advantages from the standard and lattice graphics environments. It is worth the effort to learn. We begin with the standard graphics environment. A box plot is a graph of the five-number summary. The summary function applied to data produces the sample mean along with five other statistics: the minimum, the first quartile, the median, the third quartile, and the maximum. The box plot graphs these numbers. This is done using the boxplot function.
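A short, self-contained illustration of the box plot step described above; the data are simulated here rather than read from the book's SOI.txt file, so the numbers are assumptions.

# Five-number summary and box plot of a simulated June SOI series (stand-in data).
set.seed(3)
soi_june <- rnorm(60, mean = 0, sd = 1)   # 60 years of assumed June SOI values
summary(soi_june)                         # min, quartiles, median, mean, max
boxplot(soi_june, ylab = "SOI (s.d.)", main = "June SOI (simulated)")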


