Fully automated deep supervised and unsupervised learning approaches for 3D protein cryo-EM density map reconstruction

2019 ◽  
Author(s):  
Adil Al-Azzawi

Proteins are among the most important components of the human body. They are used to build and repair tissues and to make enzymes and hormones, and they are essential building blocks of bones, muscles, cartilage, skin, and blood. A large quantity of protein is therefore always needed. Proteins are encoded as sequences of nucleotides that can be readily translated into sequences of amino acids, known as the protein primary structure. To perform its function, a protein must adopt its three-dimensional structure, also known as the protein tertiary structure. Several methods have been developed to determine this structure. The most important among them are X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and, more recently, electron microscopy (EM). These methods involve complicated procedures that are hard to implement, time-consuming, and labor-intensive, and they require well-trained specialists. An alternative approach that costs less time and money is therefore needed. Predicting and understanding molecular structure leads to major breakthroughs in medicine, enabling the design and production of better drugs with higher efficacy and fewer side effects. In biotechnology, new and more efficient enzymes can be designed, which affects many areas of daily life such as detergents, textiles, food and beverages, leather, and bioethanol. As EM gains popularity in structural biology, hundreds of thousands of single-particle images must be extracted from two-dimensional (2D) cryo-electron microscopy (cryo-EM) micrographs to build a reliable high-resolution three-dimensional (3D) model. Because high-energy electrons can severely damage the specimen during imaging, a limited electron dose is used to reduce radiation damage to the biomolecules of interest, which results in extremely noisy micrographs. 
Hence, single-particle picking still presents significant challenges, because the original 2D micrographs suffer from a very low signal-to-noise ratio (SNR), low contrast, heavy background noise, ice contamination, particle overlap, and amorphous carbon. Many computational methods have been proposed for automated and semi-automated single-particle picking over the past decades. Most of them are based on techniques such as template matching, edge detection, feature extraction, and conventional computer vision. These methods often need a large training dataset, which requires extensive manual labor. Other, reference-dependent methods rely on low-resolution templates for particle detection, matching, and picking, and are therefore not fully automated. To address this challenge, we develop several models: AutoCryoPicker, a fully automated particle picking approach based on image preprocessing, unsupervised clustering, and shape detection; SuperCryoEMPicker, a fully automated super particle clustering method for picking particles of complex and irregular shape in cryo-EM images; and DeepCryoPicker, a fully automated deep neural network for single-particle picking in cryo-EM. Our approach addresses fully automated single-particle picking in diverse cryo-EM images. We combine two fully automated particle picking approaches (AutoCryoPicker and SuperCryoEMPicker) to pick single particles without manual intervention. We also develop a fully automated approach for expanding the training dataset and increasing the number of training particle images. The fully automated training particle selection distinguishes between "good" and "bad" training examples and separates the selected particles into positive and negative detection examples. A deep neural network is then designed and trained on the generated training dataset. 
Finally, for each test micrograph, we use the developed preprocessing stage to improve the quality of the low-SNR micrograph. We then apply the trained deep neural network with a sliding window to test every sub-image and filter the resulting detections with non-maximum suppression (NMS). The results indicate that DeepCryoPicker performs as accurately as RELION, a semi-automated particle picking method, and DeepEM. Another essential step in determining and fully understanding protein structure is 3D density map reconstruction. The 3D density map of a single protein molecule gives significant insight into the relationship between the protein's function and its structural dynamics. Individual cryo-EM particles provide an opportunity to reconstruct a 3D density map from single protein particles. However, the low-dose imaging used to limit radiation damage yields particle images with very low contrast and high noise. This makes particle alignment during 3D reconstruction at intermediate resolution (1-3 nm) more challenging. To overcome this issue, we design DeepCryoMap, a fully automated cryo-EM particle alignment method for 3D density map reconstruction based on deep supervised and unsupervised learning approaches. In the first two steps, we use our previous model, DeepCryoPicker, to pick the particles from the micrographs fully automatically. The picked particles are then automatically classified and labeled by view (top or side view) using a deep classification network. Next, a perfect 2D particle mask is generated for every particle, and the original particle is aligned based on the binary mask. Finally, we use a 3D computer vision algorithm to reconstruct a localized 3D density map between every two single-particle images that share the most corresponding features. 
The localized 3D density maps are then averaged to reconstruct the final 3D cryo-EM protein density map.
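
The sliding-window test stage described above typically produces many overlapping detections around each particle, which is why a non-maximum suppression (NMS) step is applied. The following is a minimal sketch of greedy NMS, not the thesis implementation: the box format `(x_min, y_min, x_max, y_max)`, the function names, and the overlap threshold of 0.3 are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: List[Box], scores: List[float],
                        iou_threshold: float = 0.3) -> List[int]:
    """Return indices of the boxes kept by greedy NMS, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical sliding-window detections: two overlapping, one separate.
windows = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
window_scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(windows, window_scores)
```

In this toy input the second window overlaps the first heavily and has a lower score, so only the first and third windows survive.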

2021 ◽  
Author(s):  
Noor Ahmad ◽  
Muhammad Aminu ◽  
Mohd Halim Mohd Noor

Deep learning approaches have attracted a lot of attention in the automatic detection of Covid-19, and transfer learning is the most common approach. However, the majority of pre-trained models are trained on color images, which can cause inefficiencies when fine-tuning them on Covid-19 images, which are often grayscale. To address this issue, we propose a deep learning architecture called CovidNet, which requires a relatively small number of parameters. CovidNet accepts grayscale images as input and is suitable for training with a limited training dataset. Experimental results show that CovidNet outperforms other state-of-the-art deep learning models for Covid-19 detection.
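
One concrete place where the color versus grayscale mismatch shows up is the first convolutional layer, whose weight count scales with the number of input channels. The sketch below only illustrates that arithmetic; the layer size (64 filters of 3x3) is hypothetical and is not taken from CovidNet, and the abstract does not say where CovidNet's parameter savings come from.

```python
def conv2d_param_count(in_channels: int, out_channels: int,
                       kernel_size: int, bias: bool = True) -> int:
    """Number of trainable parameters in a standard 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


# Hypothetical first layer: 64 filters of size 3x3.
gray_params = conv2d_param_count(in_channels=1, out_channels=64, kernel_size=3)
rgb_params = conv2d_param_count(in_channels=3, out_channels=64, kernel_size=3)
print(gray_params, rgb_params)  # 640 1792
```

A model pre-trained on three-channel input therefore either needs its first layer adapted or needs the grayscale image replicated across three channels, which is one plausible reading of the inefficiency the abstract mentions.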


Molecules ◽  
2019 ◽  
Vol 24 (6) ◽  
pp. 1181 ◽  
Author(s):  
Todor Avramov ◽  
Dan Vyenielo ◽  
Josue Gomez-Blanco ◽  
Swathi Adinarayanan ◽  
Javier Vargas ◽  
...  

Cryo-electron microscopy (cryo-EM) is becoming the imaging method of choice for determining protein structures. Many atomic structures have been resolved based on an exponentially growing number of published three-dimensional (3D) high-resolution cryo-EM density maps. However, the resolution value claimed for the reconstructed 3D density map has been the topic of scientific debate for many years. The Fourier Shell Correlation (FSC) is the currently accepted cryo-EM resolution measure, but it can be subjective, can be manipulated, and has its own limitations. In this study, we first propose supervised deep learning methods to extract representative 3D features at high, medium, and low resolutions from simulated protein density maps and build classification models that objectively validate resolutions of experimental 3D cryo-EM maps. Specifically, we build classification models based on dense artificial neural network (DNN) and 3D convolutional neural network (3D CNN) architectures. The trained models can classify a given 3D cryo-EM density map into one of three resolution levels: high, medium, or low. The preliminary DNN and 3D CNN models achieved 92.73% accuracy and 99.75% accuracy on simulated test maps, respectively. Applying the DNN and 3D CNN models to thirty experimental cryo-EM maps achieved an agreement of 60.0% and 56.7%, respectively, with the author-published resolution values of the density maps. We further augment these previous techniques and present preliminary results of a 3D U-Net model for local resolution classification. The model was trained to perform voxel-wise classification of 3D cryo-EM density maps into one of ten resolution classes, instead of a single global resolution value. The U-Net model achieved 88.3% and 94.7% accuracy when evaluated on experimental maps with local resolutions determined by MonoRes and ResMap methods, respectively. 
Our results suggest deep learning can potentially improve the resolution evaluation process of experimental cryo-EM maps.
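
For readers unfamiliar with the Fourier Shell Correlation mentioned above, the following is a minimal NumPy sketch of the standard definition: the normalized cross-correlation of two volumes' Fourier coefficients, computed shell by shell. It assumes two cubic volumes of equal size (for example two independently reconstructed half maps, which is an assumption not stated in the abstract); the function name and the integer-radius binning are illustrative choices, not the authors' code.

```python
import numpy as np


def fourier_shell_correlation(vol_a: np.ndarray, vol_b: np.ndarray) -> np.ndarray:
    """FSC curve of two cubic volumes, one value per integer-radius shell."""
    if vol_a.shape != vol_b.shape or len(set(vol_a.shape)) != 1:
        raise ValueError("volumes must be cubic and have the same shape")
    n = vol_a.shape[0]
    # Centre the zero frequency so that shells are spheres around n // 2.
    fa = np.fft.fftshift(np.fft.fftn(vol_a))
    fb = np.fft.fftshift(np.fft.fftn(vol_b))
    grid = np.indices(vol_a.shape) - n // 2
    radius = np.sqrt((grid ** 2).sum(axis=0)).round().astype(int)

    n_shells = n // 2
    fsc = np.zeros(n_shells)
    for r in range(n_shells):
        shell = radius == r
        numerator = np.real(np.sum(fa[shell] * np.conj(fb[shell])))
        denominator = np.sqrt(np.sum(np.abs(fa[shell]) ** 2)
                              * np.sum(np.abs(fb[shell]) ** 2))
        fsc[r] = numerator / denominator if denominator > 0 else 0.0
    return fsc
```

A resolution estimate is commonly read off where this curve drops below a fixed threshold such as 0.143; the choice of threshold is one of the places where the subjectivity discussed in the abstract enters.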


2020 ◽  
Vol 21 (S21) ◽  
Author(s):  
Adil Al-Azzawi ◽  
Anes Ouadou ◽  
Ye Duan ◽  
Jianlin Cheng

Background: Cryo-EM data generated by electron tomography (ET) contains images for individual protein particles in different orientations and tilted angles. Individual cryo-EM particles can be aligned to reconstruct a 3D density map of a protein structure. However, low contrast and high noise in particle images make it challenging to build 3D density maps at intermediate to high resolution (1–3 Å). To overcome this problem, we propose a fully automated cryo-EM 3D density map reconstruction approach based on deep learning particle picking.
Results: A perfect 2D particle mask is fully automatically generated for every single particle. Then, the approach uses a computer vision image alignment algorithm (image registration) to fully automatically align the particle masks. It calculates the difference of the particle image orientation angles to align the original particle image. Finally, it reconstructs a localized 3D density map between every two single-particle images that have the largest number of corresponding features. The localized 3D density maps are then averaged to reconstruct a final 3D density map. The constructed 3D density map results illustrate the potential to determine the structures of the molecules using a few samples of good particles. Also, using the localized particle samples (with no background) to generate the localized 3D density maps can improve the process of resolution evaluation in experimental cryo-EM maps. Tested on two widely used datasets, Auto3DCryoMap is able to reconstruct good 3D density maps using only a few thousand protein particle images, far fewer than the hundreds of thousands of particles required by the existing methods.
Conclusions: We design a fully automated approach for cryo-EM 3D density map reconstruction (Auto3DCryoMap). Instead of increasing the signal-to-noise ratio by using 2D class averaging, our approach uses 2D particle masks to produce locally aligned particle images. Auto3DCryoMap is able to accurately align structural particle shapes. Also, it is able to construct a decent 3D density map from only a few thousand aligned particle images while the existing tools require hundreds of thousands of particle images. Finally, by using the pre-processed particle images, Auto3DCryoMap reconstructs a better 3D density map than using the original particle images.
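
The idea of aligning particles through the orientation angles of their binary masks can be illustrated with image moments. This is a swapped-in, textbook technique used here only as a sketch; the abstract does not specify which registration algorithm Auto3DCryoMap uses, and the function names below are hypothetical.

```python
import numpy as np


def mask_orientation(mask: np.ndarray) -> float:
    """Principal-axis angle (radians) of a binary mask from central moments.

    The angle is measured in image coordinates (x = column, y = row).
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        raise ValueError("mask is empty")
    x = xs - xs.mean()
    y = ys - ys.mean()
    mu20 = np.mean(x * x)
    mu02 = np.mean(y * y)
    mu11 = np.mean(x * y)
    return 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)


def relative_rotation(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Angle that rotates mask_a's principal axis onto mask_b's.

    A principal axis has no direction, so the result is wrapped
    to the interval [-pi/2, pi/2).
    """
    diff = mask_orientation(mask_b) - mask_orientation(mask_a)
    return (diff + np.pi / 2) % np.pi - np.pi / 2


# Toy masks: a horizontal bar and a diagonal bar.
horizontal = np.zeros((21, 21), dtype=bool)
horizontal[10, 2:19] = True
diagonal = np.zeros((21, 21), dtype=bool)
for i in range(2, 19):
    diagonal[i, i] = True
angle = relative_rotation(horizontal, diagonal)
```

For nearly circular top-view particles the principal axis is poorly defined, so a moment-based angle is only meaningful for elongated masks such as side views.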


Author(s):  
Ali Punjani ◽  
Haowei Zhang ◽  
David J. Fleet

Single particle cryo-EM is a powerful method for studying proteins and other biological macromolecules. Many of these molecules comprise regions with varying structural properties including disorder, flexibility, and partial occupancy. These traits make computational 3D reconstruction from 2D images challenging. Detergent micelles and lipid nanodiscs, used to keep membrane proteins in solution, are common examples of locally disordered structures that can negatively affect existing iterative refinement algorithms which assume rigidity (or spatial uniformity). We introduce a cross-validation approach to derive non-uniform refinement, an algorithm that automatically regularizes 3D density maps during iterative refinement to account for spatial variability, yielding dramatically improved resolution and 3D map quality. We find that in common iterative refinement methods, regularization using spatially uniform filtering operations can simultaneously over- and under-regularize local regions of a 3D map. In contrast, non-uniform refinement removes noise in disordered regions while retaining signal useful for aligning particle images. Our results include state-of-the-art resolution 3D reconstructions of multiple membrane proteins with molecular weight as low as 90 kDa. These results demonstrate that higher resolutions and improved 3D density map quality can be achieved even for small membrane proteins, an important use case for single particle cryo-EM, both in structural biology and drug discovery. Non-uniform refinement is implemented in the cryoSPARC software package and has already been used successfully in several notable structural studies.


Mathematics ◽  
2021 ◽  
Vol 9 (8) ◽  
pp. 807
Author(s):  
Carlos M. Castorena ◽  
Itzel M. Abundez ◽  
Roberto Alejo ◽  
Everardo E. Granda-Gutiérrez ◽  
Eréndira Rendón ◽  
...  

The problem of gender-based violence in Mexico has increased considerably. Many social associations and governmental institutions have addressed this problem in different ways. In computer science, some efforts have been made to deal with this problem by using machine learning approaches to strengthen strategic decision making. In this work, a deep learning neural network application to identify gender-based violence in Twitter messages is presented. A total of 1,857,450 messages (generated in Mexico) were downloaded from Twitter; 61,604 of them were manually tagged by human volunteers as negative, positive, or neutral messages to serve as training and test data sets. The results presented in this paper show the effectiveness of the deep neural network (an area under the receiver operating characteristic curve of about 80%) in detecting gender violence in Twitter messages. The main contribution of this investigation is that the data set was minimally pre-processed, in contrast to most state-of-the-art approaches. The original messages were converted into numerical vectors according to the frequency of each word's appearance, and only adverbs, conjunctions, and prepositions were deleted (these occur very frequently in text, and we think they do not contribute to discriminatory messages on Twitter). Finally, this work contributes to dealing with gender violence in Mexico, an issue that needs to be faced immediately.
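
The minimal pre-processing described above, word-frequency vectors with a small set of function words removed, can be sketched in a few lines. Everything specific here is an assumption for illustration: the English example messages, the tiny `REMOVED_WORDS` list (the paper removes adverbs, conjunctions, and prepositions, presumably in Spanish), and the function names.

```python
import re
from collections import Counter
from typing import Dict, List

# Hypothetical stand-in for the removed adverbs, conjunctions and prepositions.
REMOVED_WORDS = {"and", "or", "but", "in", "on", "of", "to", "very"}


def tokenize(message: str) -> List[str]:
    """Lower-case the message, split it into words, drop the removed words."""
    words = re.findall(r"[a-záéíóúüñ]+", message.lower())
    return [w for w in words if w not in REMOVED_WORDS]


def build_vocabulary(messages: List[str]) -> Dict[str, int]:
    """Map every remaining word to a fixed vector position."""
    vocab = sorted({w for m in messages for w in tokenize(m)})
    return {word: index for index, word in enumerate(vocab)}


def to_frequency_vector(message: str, vocabulary: Dict[str, int]) -> List[int]:
    """Count how often each vocabulary word appears in the message."""
    counts = Counter(tokenize(message))
    vector = [0] * len(vocabulary)
    for word, count in counts.items():
        if word in vocabulary:
            vector[vocabulary[word]] = count
    return vector


messages = ["the cat and the dog", "dog in the house"]
vocabulary = build_vocabulary(messages)
vectors = [to_frequency_vector(m, vocabulary) for m in messages]
```

Articles such as "the" are kept in this sketch because the abstract lists only adverbs, conjunctions, and prepositions as removed.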


2019 ◽  
Vol 36 (7) ◽  
pp. 2237-2243
Author(s):  
Cyril F Reboul ◽  
Simon Kiesewetter ◽  
Dominika Elmlund ◽  
Hans Elmlund

Motivation: No rigorous statistical tests for detecting point-group symmetry in three-dimensional (3D) charge density maps obtained by electron microscopy (EM) and related techniques have been developed.
Results: We propose a method for determining the point-group symmetry of 3D charge density maps obtained by EM and related techniques. Our ab initio algorithm does not depend on atomic coordinates but utilizes the density map directly. We validate the approach for a range of publicly available single-particle cryo-EM datasets. In straightforward cases, our method enables fully automated single-particle 3D reconstruction without having to input an arbitrarily selected point-group symmetry. When pseudo-symmetry is present, our method provides statistics quantifying the degree to which the 3D density agrees with the different point-groups tested.
Availability and implementation: The software is freely available at https://github.com/hael/SIMPLE3.0.


2001 ◽  
Vol 133 (2-3) ◽  
pp. 233-245 ◽  
Author(s):  
A Pascual-Montano ◽  
L.E Donate ◽  
M Valle ◽  
M Bárcena ◽  
R.D Pascual-Marqui ◽  
...  

2020 ◽  
Author(s):  
Brajesh Rai ◽  
Vishnu Sresht ◽  
Qingyi Yang ◽  
Rayomond J. Unwalla ◽  
Meihua Tu ◽  
...  

TorsionNet: A Deep Neural Network to Rapidly Predict Small Molecule Torsion Energy Profiles with the Accuracy of Quantum Mechanics

Brajesh K. Rai [*,1], Vishnu Sresht [1], Qingyi Yang [2], Ray Unwalla [2], Meihua Tu [2], Alan M. Mathiowetz [2], and Gregory A. Bakken [3]

[1] Simulation and Modeling Sciences and [2] Medicine Design, Pfizer Worldwide Research Development and Medical, 610 Main Street, Cambridge, Massachusetts 02139, United States
[3] Digital, Pfizer, Eastern Point Road, Groton, Connecticut 06340, United States

ABSTRACT

Fast and accurate assessment of small molecule dihedral energetics is crucial for molecular design and optimization in medicinal chemistry. Yet, accurate prediction of torsion energy profiles remains a challenging task as current molecular mechanics methods are limited by insufficient coverage of druglike chemical space and accurate quantum mechanical (QM) methods are too expensive. To address this limitation, we introduce TorsionNet, a deep neural network (DNN) model specifically developed to predict small molecule torsion energy profiles with QM-level accuracy. We applied active learning to identify nearly 50k fragments (with elements H, C, N, O, F, S, and Cl) that maximized the coverage of our corporate library and leveraged massively parallel cloud computing resources to perform DFT torsion scan of these fragments, generating a training dataset of 1.2 million DFT energies. By training TorsionNet on this dataset, we obtain a model that can rapidly predict the torsion energy profile of typical druglike fragments with DFT-level accuracy. Importantly, our method also provides a direct estimate of the uncertainty in the predicted profiles without any additional calculations. In this report, we show that TorsionNet can reliably identify the preferred dihedral geometries observed in crystal structures. We also present practical applications of TorsionNet that demonstrate how consideration of DNN-based strain energy leads to substantial improvement in existing lead discovery and design workflows. A benchmark dataset (TorsionNet500) comprising 500 chemically diverse fragments with DFT torsion profiles (12k DFT-optimized geometries and energies) has been created and is made freely available.


2021 ◽  
Vol 15 (58) ◽  
pp. 308-318
Author(s):  
Tran-Hieu Nguyen ◽  
Anh-Tuan Vu

In this paper, a machine learning-based framework is developed to quickly evaluate the structural safety of trusses. Three numerical examples, a 10-bar truss, a 25-bar truss, and a 47-bar truss, are used to illustrate the proposed framework. First, several truss cases with different cross-sectional areas are generated using the Latin Hypercube Sampling method. Stresses in the truss members and displacements of the nodes are determined through finite element analyses, and the obtained values are compared with the design constraints. According to this constraint verification, the safety state is assigned as safe or unsafe. The members' sectional areas and the safety state are stored as the inputs and outputs of the training dataset, respectively. Three popular machine learning classifiers, Support Vector Machine, Deep Neural Network, and Adaptive Boosting, are used to evaluate the safety of structures. The comparison is based on two metrics: accuracy and the area under the ROC curve. For the first two examples, all three classifiers achieve more than 90% accuracy. For the 47-bar truss, the accuracies of the Support Vector Machine and Deep Neural Network models are lower than 70%, but the Adaptive Boosting model retains a high accuracy of approximately 98%. In terms of the area under the ROC curve, the comparative results are similar. Overall, the Adaptive Boosting model outperforms the other models. In addition, an investigation is carried out to show the influence of the parameters on the performance of the Adaptive Boosting model.
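
The first step of the framework, generating truss cases by Latin Hypercube Sampling, can be sketched with NumPy. This is a generic illustration of the sampling method, not the authors' code: the function names, the number of samples, and the area bounds of 0.1 and 35.0 are hypothetical, and the finite element analysis that labels each case as safe or unsafe is outside the sketch.

```python
import numpy as np


def latin_hypercube_unit(n_samples: int, n_dims: int,
                         rng: np.random.Generator) -> np.ndarray:
    """Latin hypercube sample in the unit cube [0, 1)^n_dims.

    Each dimension is cut into n_samples equal strata and every stratum
    receives exactly one point; strata are shuffled per dimension.
    """
    strata = np.arange(n_samples)[:, None]
    u = (strata + rng.random((n_samples, n_dims))) / n_samples
    for d in range(n_dims):
        u[:, d] = rng.permutation(u[:, d])
    return u


def scale_samples(u: np.ndarray, low: float, high: float) -> np.ndarray:
    """Map unit-cube samples to the design range [low, high]."""
    return low + u * (high - low)


# Hypothetical setup: 10 truss cases, one cross-sectional area per bar
# of a 10-bar truss, with illustrative area bounds.
rng = np.random.default_rng(0)
unit_samples = latin_hypercube_unit(n_samples=10, n_dims=10, rng=rng)
areas = scale_samples(unit_samples, low=0.1, high=35.0)
```

Each row of `areas` is one candidate truss design; in the framework described above, each such row would then be analyzed and paired with its safe or unsafe label to form a training example.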

