New tools for automated high-resolution cryo-EM structure determination in RELION-3

eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Jasenko Zivanov ◽  
Takanori Nakane ◽  
Björn O Forsberg ◽  
Dari Kimanius ◽  
Wim JH Hagen ◽  
...  

Here, we describe the third major release of RELION. CPU-based vector acceleration has been added in addition to GPU support, which provides flexibility in use of resources and avoids memory limitations. Reference-free autopicking with Laplacian-of-Gaussian filtering and execution of jobs from python allows non-interactive processing during acquisition, including 2D-classification, de novo model generation and 3D-classification. Per-particle refinement of CTF parameters and correction of estimated beam tilt provides higher resolution reconstructions when particles are at different heights in the ice, and/or coma-free alignment has not been optimal. Ewald sphere curvature correction improves resolution for large particles. We illustrate these developments with publicly available data sets: together with a Bayesian approach to beam-induced motion correction it leads to resolution improvements of 0.2–0.7 Å compared to previous RELION versions.


2018 ◽  
Author(s):  
Jasenko Zivanov ◽  
Takanori Nakane ◽  
Björn Forsberg ◽  
Dari Kimanius ◽  
Wim J.H. Hagen ◽  
...  

Abstract Here, we describe the third major release of RELION. CPU-based vector acceleration has been added in addition to GPU support, which provides flexibility in use of resources and avoids memory limitations. Reference-free autopicking with Laplacian-of-Gaussian filtering and execution of jobs from python allows non-interactive processing during acquisition, including 2D-classification, de novo model generation and 3D-classification. Per-particle refinement of CTF parameters and correction of estimated beam tilt provides higher-resolution reconstructions when particles are at different heights in the ice, and/or coma-free alignment has not been optimal. Ewald sphere curvature correction improves resolution for large particles. We illustrate these developments with publicly available data sets: together with a Bayesian approach to beam-induced motion correction it leads to resolution improvements of 0.2–0.7 Å compared to previous RELION versions.



eLife ◽  
2014 ◽  
Vol 3 ◽  
Author(s):  
Sjors HW Scheres

In electron cryo-microscopy (cryo-EM), the electron beam that is used for imaging also causes the sample to move. This motion blurs the images and limits the resolution attainable by single-particle analysis. In a previous Research article (Bai et al., 2013) we showed that correcting for this motion by processing movies from fast direct-electron detectors allowed structure determination to near-atomic resolution from 35,000 ribosome particles. In this Research advance article, we show that an improved movie processing algorithm is applicable to a much wider range of specimens. The new algorithm estimates straight movement tracks by considering multiple particles that are close to each other in the field of view, and models the fall-off of high-resolution information content by radiation damage in a dose-dependent manner. Application of the new algorithm to four data sets illustrates its potential for significantly improving cryo-EM structures, even for particles that are smaller than 200 kDa.



Author(s):  
Thomas Blaschke ◽  
Jürgen Bajorath

Abstract Exploring the origin of multi-target activity of small molecules and designing new multi-target compounds are highly topical issues in pharmaceutical research. We have investigated the ability of a generative neural network to create multi-target compounds. Data sets of experimentally confirmed multi-target, single-target, and consistently inactive compounds were extracted from public screening data considering positive and negative assay results. These data sets were used to fine-tune the REINVENT generative model via transfer learning to systematically recognize multi-target compounds, distinguish them from single-target or inactive compounds, and construct new multi-target compounds. During fine-tuning, the model showed a clear tendency to increasingly generate multi-target compounds and structural analogs. Our findings indicate that generative models can be adopted for de novo multi-target compound design.



2018 ◽  
Author(s):  
Adrian Fritz ◽  
Peter Hofmann ◽  
Stephan Majda ◽  
Eik Dahms ◽  
Johannes Dröge ◽  
...  

Shotgun metagenome data sets of microbial communities are highly diverse, not only due to the natural variation of the underlying biological systems, but also due to differences in laboratory protocols, replicate numbers, and sequencing technologies. Accordingly, to effectively assess the performance of metagenomic analysis software, a wide range of benchmark data sets are required. Here, we describe the CAMISIM microbial community and metagenome simulator. The software can model different microbial abundance profiles, multi-sample time series and differential abundance studies, includes real and simulated strain-level diversity, and generates second- and third-generation sequencing data from taxonomic profiles or de novo. Gold standards are created for sequence assembly, genome binning, taxonomic binning, and taxonomic profiling. CAMISIM generated the benchmark data sets of the first CAMI challenge. For two simulated multi-sample data sets of the human and mouse gut microbiomes we observed high functional congruence to the real data. As further applications, we investigated the effect of varying evolutionary genome divergence, sequencing depth, and read error profiles on two popular metagenome assemblers, MEGAHIT and metaSPAdes, on several thousand small data sets generated with CAMISIM. CAMISIM can simulate a wide variety of microbial communities and metagenome data sets together with truth standards for method evaluation. All data sets and the software are freely available at: https://github.com/CAMI-challenge/CAMISIM



2020 ◽  
Vol 37 (12) ◽  
pp. 3576-3600
Author(s):  
Di Chen ◽  
Marzia A Cremona ◽  
Zongtai Qi ◽  
Robi D Mitra ◽  
Francesca Chiaromonte ◽  
...  

Abstract Long INterspersed Elements-1 (L1s) constitute >17% of the human genome and still actively transpose in it. Characterizing L1 transposition across the genome is critical for understanding genome evolution and somatic mutations. However, to date, L1 insertion and fixation patterns have not been studied comprehensively. To fill this gap, we investigated three genome-wide data sets of L1s that integrated at different evolutionary times: 17,037 de novo L1s (from an L1 insertion cell-line experiment conducted in-house), and 1,212 polymorphic and 1,205 human-specific L1s (from public databases). We characterized 49 genomic features (proxying chromatin accessibility, transcriptional activity, replication, recombination, etc.) in the ±50 kb flanks of these elements. These features were contrasted between the three L1 data sets and L1-free regions using state-of-the-art Functional Data Analysis statistical methods, which treat high-resolution data as mathematical functions. Our results indicate that de novo, polymorphic, and human-specific L1s are surrounded by different genomic features acting at specific locations and scales. This led to an integrative model of L1 transposition, according to which L1s preferentially integrate into open-chromatin regions enriched in non-B DNA motifs, whereas they are fixed in regions largely free of purifying selection, that is, depleted of genes and noncoding most conserved elements. Intriguingly, our results suggest that L1 insertions modify local genomic landscape by extending CpG methylation and increasing mononucleotide microsatellite density. Altogether, our findings substantially facilitate understanding of L1 integration and fixation preferences, pave the way for uncovering their role in aging and cancer, and inform their use as mutagenesis tools in genetic studies.



2019 ◽  
Vol 11 (8) ◽  
pp. 2312-2329 ◽  
Author(s):  
Yu-Tian Tao ◽  
Fang Suo ◽  
Sergio Tusso ◽  
Yan-Kai Wang ◽  
Song Huang ◽  
...  

Abstract The fission yeast Schizosaccharomyces pombe is an important model organism, but its natural diversity and evolutionary history remain under-studied. In particular, the population genomics of the S. pombe mitochondrial genome (mitogenome) has not been thoroughly investigated. Here, we assembled the complete circular-mapping mitogenomes of 192 S. pombe isolates de novo, and found that these mitogenomes belong to 69 nonidentical sequence types ranging from 17,618 to 26,910 bp in length. Using the assembled mitogenomes, we identified 20 errors in the reference mitogenome and discovered two previously unknown mitochondrial introns. Analyzing sequence diversity of these 69 types of mitogenomes revealed two highly distinct clades, with only three mitogenomes exhibiting signs of inter-clade recombination. This diversity pattern suggests that currently available S. pombe isolates descend from two long-separated ancestral lineages. This conclusion is corroborated by the diversity pattern of the recombination-repressed K-region located between donor mating-type loci mat2 and mat3 in the nuclear genome. We estimated that the two ancestral S. pombe lineages diverged about 31 million generations ago. These findings shed new light on the evolution of S. pombe and the data sets generated in this study will facilitate future research on genome evolution.



Blood ◽  
2012 ◽  
Vol 120 (21) ◽  
pp. 4283-4283
Author(s):  
Marko Kavcic ◽  
Brian T. Fisher ◽  
Yimei Li ◽  
Alix E. Seif ◽  
Kari Torp ◽  
...  

Abstract 4283
Background The role of gemtuzumab ozogamicin (GO) for acute myeloid leukemia (AML) remains controversial. GO was removed from the U.S. market in 2010 due to concerns of increased induction mortality in adults. Other studies have shown a survival benefit without increased treatment-related mortality. Moreover, no data are available on the resources required to deliver GO-based chemotherapy. Since pediatric data are limited, we evaluated in-hospital mortality and resource utilization in pediatric AML patients treated with GO and standard chemotherapy.
Methods We used the Pediatric Health Information System (PHIS) to establish a cohort of children < 19 years old treated for de novo AML with GO and standard cytarabine, daunorubicin, and etoposide (ADE) induction. Cohort assembly was validated by local chart review and used ICD-9 diagnosis codes and manual review of chemotherapy. Case fatality was determined after induction (defined as the period from the start of therapy to the initiation of course 3), at 6 months and at 12 months. Resource utilization was determined for each patient based on daily billing data. Each resource variable was dichotomized (exposure or no exposure) for each inpatient day and then summarized during each study period to determine resource utilization days per 1,000 hospital days.
Results In total, 253 children who had billing data for GO during the first course of ADE induction were identified. Median age was 9.6 years; a slight male predominance was observed (54%) and most patients were white (69%). In-hospital case-fatality rates were 2.4% during induction, 6.7% at 6 months, and 13.0% at 12 months from start of therapy. PHIS billing data demonstrated that patients received opioids on almost one in four hospital days, that during the induction period 12% of patients received vasopressors on at least two consecutive days, and that 12% needed assisted ventilation. Mean inpatient stay and resource utilization rates are presented in Table 1.
Discussion In-hospital mortality rates at the three time points were low and concordant with published data on pediatric AML trials using an ADE induction (Gibson, BJH 2011) and ADE Induction + GO (Cooper, Cancer 2012) and lower than trials using intensively timed DCTER regimens (Woods, Blood 2001; Lange, Blood 2008). Resource utilization data demonstrated an extensive use of resources needed to manage infections (blood cultures, imaging, antimicrobials). While infections are the leading cause of non-relapse morbidity and mortality in pediatric AML, such extensive use of resources has not been previously quantified. In addition, PHIS billing data describe toxicities such as pain (opioid use), hypotension (vasopressor support), and respiratory failure (assisted ventilation) at rates higher than those previously reported in clinical trials. In conclusion, the in-hospital mortality of children treated with GO at PHIS centers appears comparable to previously published studies of ADE and ADE + GO. The resource utilization data provide a more comprehensive description of resources needed to treat pediatric AML than previously reported. In addition, the resource utilization data suggest that toxicities reported on clinical trials may underestimate the resources needed to administer AML induction therapy safely. Disclosures: No relevant conflicts of interest to declare.
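
The rate described in the Methods above (resource utilization days per 1,000 hospital days) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name, the input layout (one list of per-day exposure flags per patient), and the choice to pool days across all patients in a study period are assumptions made only for illustration.

```python
def utilization_days_per_1000(patients_daily_exposure):
    """Return resource utilization days per 1,000 hospital days.

    patients_daily_exposure: one list per patient (hypothetical layout);
    each list holds one boolean per inpatient day, True if the resource
    was billed on that day (the dichotomized exposure).
    Assumption: days are pooled across all patients in the study period.
    """
    exposed_days = 0
    hospital_days = 0
    for days in patients_daily_exposure:
        hospital_days += len(days)
        exposed_days += sum(1 for exposed in days if exposed)
    if hospital_days == 0:
        raise ValueError("no inpatient days in the study period")
    return 1000.0 * exposed_days / hospital_days


# Two hypothetical patients, four inpatient days each, one exposed day each.
example = [
    [True, False, False, False],
    [False, True, False, False],
]
rate = utilization_days_per_1000(example)
```

With the two hypothetical patients above, 2 exposed days out of 8 hospital days gives a rate of 250 per 1,000 hospital days.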



2012 ◽  
Vol 68 (11) ◽  
pp. 1522-1534 ◽  
Author(s):  
Rojan Shrestha ◽  
David Simoncini ◽  
Kam Y. J. Zhang

Recent advancements in computational methods for protein-structure prediction have made it possible to generate the high-quality de novo models required for ab initio phasing of crystallographic diffraction data using molecular replacement. Despite those encouraging achievements in ab initio phasing using de novo models, its success is limited only to those targets for which high-quality de novo models can be generated. In order to increase the scope of targets to which ab initio phasing with de novo models can be successfully applied, it is necessary to reduce the errors in the de novo models that are used as templates for molecular replacement. Here, an approach is introduced that can identify and rebuild the residues with larger errors, which subsequently reduces the overall Cα root-mean-square deviation (CA-RMSD) from the native protein structure. The error in a predicted model is estimated from the average pairwise geometric distance per residue computed among selected lowest energy coarse-grained models. This score is subsequently employed to guide a rebuilding process that focuses on more error-prone residues in the coarse-grained models. This rebuilding methodology has been tested on ten protein targets that were unsuccessful using previous methods. The average CA-RMSD of the coarse-grained models was improved from 4.93 to 4.06 Å. For those models with CA-RMSD less than 3.0 Å, the average CA-RMSD was improved from 3.38 to 2.60 Å. These rebuilt coarse-grained models were then converted into all-atom models and refined to produce improved de novo models for molecular replacement. Seven diffraction data sets were successfully phased using rebuilt de novo models, indicating the improved quality of these rebuilt de novo models and the effectiveness of the rebuilding process. Software implementing this method, called MORPHEUS, can be downloaded from http://www.riken.jp/zhangiru/software.html.
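
The per-residue error estimate described above (average pairwise geometric distance per residue among selected low-energy models) can be sketched in Python as follows. This is an illustrative reading of the abstract, not the MORPHEUS implementation: the function name is hypothetical, and it assumes the models are already superimposed in a common frame, that each model is a list of Cα coordinates in the same residue order, and that "geometric distance" means Euclidean distance.

```python
import math
from itertools import combinations


def per_residue_error(models):
    """Average pairwise distance per residue across a set of models.

    models: list of models; each model is a list of (x, y, z) tuples,
    one per residue (assumed: Calpha positions, pre-superimposed).
    Returns one score per residue; larger values suggest residues on
    which the models disagree more.
    """
    if len(models) < 2:
        raise ValueError("need at least two models")
    n_res = len(models[0])
    if any(len(model) != n_res for model in models):
        raise ValueError("all models must have the same number of residues")

    pairs = list(combinations(range(len(models)), 2))
    scores = []
    for i in range(n_res):
        # Sum the distance between residue i in every pair of models.
        total = sum(math.dist(models[a][i], models[b][i]) for a, b in pairs)
        scores.append(total / len(pairs))
    return scores


# Two toy models with two residues each; they agree on residue 0 only.
toy_models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0)],
]
toy_scores = per_residue_error(toy_models)
```

In the toy input the second residue differs by 5 Å between the two models, so it gets the higher score and would be the first candidate for rebuilding under this reading.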


