CDAE: A Cascade of Denoising Autoencoders for Noise Reduction in the Clustering of Single-Particle Cryo-EM Images

2021 ◽  
Vol 11 ◽  
Author(s):  
Houchao Lei ◽  
Yang Yang

As an emerging technology, cryo-electron microscopy (cryo-EM) has attracted increasing research interest from both structural biology and computer science, because the processing of cryo-EM images involves many challenging computational tasks. An important image processing step is to cluster the 2D cryo-EM images according to their projection angles; the cluster mean images are then used for the subsequent 3D reconstruction. However, cryo-EM images are very noisy and difficult to denoise, because the noise is a complicated mixture arising from both samples and hardware. In this study, we design an effective cryo-EM image denoising model, CDAE, i.e., a cascade of denoising autoencoders. The new model comprises stacked blocks of deep neural networks that reduce noise in a progressive manner. Each block contains a convolutional autoencoder, pre-trained on simulated data of different SNRs and fine-tuned on the target data set. We assess the new model on three simulated test sets and a real data set. CDAE achieves very competitive PSNR (peak signal-to-noise ratio) in comparison with state-of-the-art image denoising methods. Moreover, the denoised images yield significantly better clustering results than original image features or the high-level abstraction features obtained by other deep neural networks. Both quantitative and visualized results demonstrate the good performance of CDAE for noise reduction in the clustering of single-particle cryo-EM images.
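As a rough illustration of the cascade idea, here is a minimal PyTorch sketch of stacked convolutional denoising autoencoders. The block count, channel widths, and image size are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal sketch of a cascade of convolutional denoising autoencoders
# (CDAE-style). Block count and channel widths are illustrative
# assumptions, not the authors' exact configuration.
import torch
import torch.nn as nn

class DenoisingBlock(nn.Module):
    """One convolutional autoencoder: encoder downsamples, decoder upsamples."""
    def __init__(self, channels=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(channels, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

class CascadeDenoiser(nn.Module):
    """Stack blocks so each one further cleans the previous block's output."""
    def __init__(self, n_blocks=2):
        super().__init__()
        self.blocks = nn.ModuleList(DenoisingBlock() for _ in range(n_blocks))

    def forward(self, x):
        for block in self.blocks:          # progressive noise reduction
            x = block(x)
        return x

model = CascadeDenoiser(n_blocks=2)
noisy = torch.randn(8, 1, 64, 64)          # batch of simulated noisy particles
denoised = model(noisy)                     # same shape as the input
```

In the paper's scheme, each block would be pre-trained on simulated data of a given SNR and fine-tuned on the target set before the next block is stacked on top.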

2019 ◽  
Vol 488 (4) ◽  
pp. 5232-5250 ◽  
Author(s):  
Alexander Chaushev ◽  
Liam Raynard ◽  
Michael R Goad ◽  
Philipp Eigmüller ◽  
David J Armstrong ◽  
...  

Vetting of exoplanet candidates in transit surveys is a manual process, which suffers from a large number of false positives and a lack of consistency. Previous work has shown that convolutional neural networks (CNNs) provide an efficient solution to these problems. Here, we apply a CNN to classify planet candidates from the Next Generation Transit Survey (NGTS). For the training data sets we compare real data with injected planetary transits against fully simulated data, and examine how their different compositions affect network performance. We show that fewer hand-labelled light curves can be utilized while still achieving competitive results. With our best model, we achieve an area under the curve (AUC) score of $(95.6\pm {0.2}){{\ \rm per\ cent}}$ and an accuracy of $(88.5\pm {0.3}){{\ \rm per\ cent}}$ on our unseen test data, as well as $(76.5\pm {0.4}){{\ \rm per\ cent}}$ and $(74.6\pm {1.1}){{\ \rm per\ cent}}$ in comparison to our existing manual classifications. The neural network recovers, with high probability, 13 out of 14 confirmed planets observed by NGTS. We use simulated data to show that the overall network performance is resilient to mislabelling of the training data set, a problem that might arise due to unidentified, low signal-to-noise transits. Using a CNN, the time required for vetting can be reduced by half, while still recovering the vast majority of manually flagged candidates. In addition, we identify many new high-probability candidates that were not flagged by human vetters.
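A minimal sketch of a 1D CNN light-curve classifier of this kind follows. The input length, layer sizes, and output interpretation are illustrative assumptions, not the network actually used for NGTS vetting.

```python
# Sketch of a 1D CNN that classifies phase-folded light curves as
# planet candidate vs. false positive. Sizes are assumptions.
import torch
import torch.nn as nn

class TransitCNN(nn.Module):
    def __init__(self, n_points=512):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * (n_points // 4), 64), nn.ReLU(),
            nn.Linear(64, 1),                    # logit: planet vs. false positive
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = TransitCNN()
flux = torch.randn(4, 1, 512)        # phase-folded, normalized light curves
prob = torch.sigmoid(model(flux))    # candidate probability per light curve
```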


Risks ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 33
Author(s):  
Łukasz Delong ◽  
Mario V. Wüthrich

The goal of this paper is to develop regression models and postulate distributions which can be used in practice to describe the joint development process of individual claim payments and claims incurred. We apply neural networks to estimate our regression models. As regressors we use the whole claim history of incremental payments and claims incurred, as well as any relevant feature information available to describe individual claims and their development characteristics. Our models are calibrated and tested on a real data set, and the results are benchmarked against the Chain-Ladder method. Our analysis focuses on the development of the so-called Reported But Not Settled (RBNS) claims. We show the benefits of using deep neural networks and the whole claim history in our prediction problem.
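A hedged sketch of this kind of regression set-up follows, assuming a simple feed-forward network over the claim history; the feature layout, link function, and layer sizes are illustrative, not the paper's specification.

```python
# Illustrative sketch: an MLP mapping a claim's payment/incurred history
# plus static features to the next period's expected payment. The feature
# layout and the log-link are assumptions, not the authors' model.
import torch
import torch.nn as nn

n_periods = 12          # development periods observed so far
n_static = 5            # e.g. line of business, claim code, reporting delay

model = nn.Sequential(
    nn.Linear(2 * n_periods + n_static, 64),   # payments + incurred + features
    nn.Tanh(),
    nn.Linear(64, 32), nn.Tanh(),
    nn.Linear(32, 1),
)

x = torch.randn(100, 2 * n_periods + n_static)   # 100 RBNS claims
expected_payment = torch.exp(model(x))           # log-link keeps predictions positive
```

Benchmarking against the Chain-Ladder method, as done in the paper, would be applied on top of predictions aggregated to the triangle level.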


2019 ◽  
Vol 154 ◽  
pp. 369-376 ◽  
Author(s):  
Mehmet Oguz Kelek ◽  
Nurullah Calik ◽  
Tulay Yildirim

2020 ◽  
Vol 10 (17) ◽  
pp. 6077
Author(s):  
Gyuseok Park ◽  
Woohyeong Cho ◽  
Kyu-Sung Kim ◽  
Sangmin Lee

Hearing aids are small electronic devices designed to improve hearing for persons with impaired hearing, using sophisticated audio signal processing algorithms and technologies. In general, the speech enhancement algorithms in hearing aids remove environmental noise and enhance speech while taking into account the wearer's hearing characteristics and the surrounding environment. In this study, a speech enhancement algorithm was proposed to improve speech quality in a hearing aid environment by applying noise reduction with deep neural networks trained on the basis of noise classification. To evaluate speech enhancement in an actual hearing aid environment, ten types of noise were self-recorded and classified using convolutional neural networks. Noise reduction for speech enhancement in the hearing aid was then applied by deep neural networks selected according to the classified noise. As a result, the speech quality achieved by the deep-neural-network enhancement with environmental noise classification showed a significant improvement over that of the conventional hearing aid algorithm. The improved speech quality was also evaluated objectively through the perceptual evaluation of speech quality (PESQ) score, the short-time objective intelligibility (STOI) score, the overall quality composite measure, and the log-likelihood ratio score.
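The two-stage idea (classify the noise type, then apply a class-specific denoiser) can be sketched as below; the class count, feature shapes, and network sizes are assumptions for illustration only.

```python
# Sketch of the two-stage pipeline: a CNN picks the noise class from a
# spectrogram patch, then a class-specific DNN estimates a spectral gain
# mask. All shapes and sizes are illustrative assumptions.
import torch
import torch.nn as nn

N_CLASSES, N_BINS = 10, 257    # 10 recorded noise types; STFT frequency bins

noise_classifier = nn.Sequential(     # CNN over a spectrogram excerpt
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
    nn.Flatten(), nn.Linear(8 * 16, N_CLASSES),
)

# one mask-estimation DNN per noise class
denoisers = nn.ModuleList(
    nn.Sequential(nn.Linear(N_BINS, 128), nn.ReLU(),
                  nn.Linear(128, N_BINS), nn.Sigmoid())
    for _ in range(N_CLASSES)
)

spec_patch = torch.randn(1, 1, 64, 64)     # spectrogram excerpt
noisy_frame = torch.rand(1, N_BINS)        # magnitude spectrum of one frame
cls = noise_classifier(spec_patch).argmax(1).item()
enhanced = denoisers[cls](noisy_frame) * noisy_frame   # apply estimated gain mask
```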


2005 ◽  
Vol 30 (4) ◽  
pp. 369-396 ◽  
Author(s):  
Eisuke Segawa

Multi-indicator growth models were formulated as special three-level hierarchical generalized linear models to analyze the growth of a trait latent variable measured by ordinal items. Items are nested within time-points, and time-points are nested within subjects. These models are special because they include a factor-analytic structure. The model can analyze not only data with item- and time-level missing observations, but also data whose time points are freely specified over subjects. Furthermore, features useful for longitudinal analyses were included: an autoregressive error structure of degree one for the trait residuals, and estimated time-scores. The approach is Bayesian, using Markov chain Monte Carlo, and the model is implemented in WinBUGS. The models are illustrated with two simulated data sets and one real data set with planned missing items within a scale.
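The structure of such a model can be sketched as follows; the notation is illustrative, not the paper's.

```latex
% Sketch of a three-level model of this kind (notation is illustrative):
% item i, occasion t, subject s; eta is the trait latent variable.
P(y_{its} \le k) = \operatorname{logit}^{-1}\!\left(\tau_{ik} - \lambda_i \eta_{ts}\right)
  \quad \text{(ordinal item with factor loading } \lambda_i\text{)}

\eta_{ts} = \beta_{0s} + \beta_{1s}\, a_t + \varepsilon_{ts},
  \qquad \varepsilon_{ts} = \rho\, \varepsilon_{t-1,s} + \nu_{ts}
  \quad \text{(AR(1) trait residuals, estimated time-scores } a_t\text{)}

(\beta_{0s}, \beta_{1s}) \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})
  \quad \text{(subject-level growth parameters)}
```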


Geophysics ◽  
1998 ◽  
Vol 63 (6) ◽  
pp. 2035-2041 ◽  
Author(s):  
Zhengping Liu ◽  
Jiaqi Liu

We present a data‐driven method of joint inversion of well‐log and seismic data, based on the power of adaptive mapping of artificial neural networks (ANNs). We use the ANN technique to find and approximate the inversion operator guided by the data set consisting of well data and seismic recordings near the wells. Then we directly map seismic recordings to well parameters, trace by trace, to extrapolate the wide‐band profiles of these parameters using the approximation operator. Compared to traditional inversions, which are based on a few prior theoretical operators, our inversion is novel because (1) it inverts for multiple parameters and (2) it is nonlinear with a high degree of complexity. We first test our algorithm with synthetic data and analyze its sensitivity and robustness. We then invert real data to obtain two extrapolation profiles of sonic log (DT) and shale content (SH), the latter a unique parameter of the inversion and significant for the detailed evaluation of stratigraphic traps. The high‐frequency components of the two profiles are significantly richer than those of the original seismic section.
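A minimal sketch of this kind of data-driven mapping follows, assuming a windowed trace-to-log regression with scikit-learn's MLPRegressor; the window size, targets, and layer sizes are illustrative, not the authors' configuration.

```python
# Sketch of the data-driven inversion idea: train an MLP on
# (seismic trace window -> well-log values) pairs near wells, then apply
# it trace by trace across the section. Shapes are assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

window = 21                                  # seismic samples around the target depth
X_train = np.random.randn(500, window)       # windows extracted near wells
y_train = np.random.randn(500, 2)            # targets: [DT, SH] from well logs

net = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000)
net.fit(X_train, y_train)                    # approximate the inversion operator

section = np.random.randn(1000, window)      # all traces, same windowing
dt_sh = net.predict(section)                 # extrapolated DT and SH profiles
```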


Energies ◽  
2021 ◽  
Vol 14 (19) ◽  
pp. 6156
Author(s):  
Stefan Hensel ◽  
Marin B. Marinov ◽  
Michael Koch ◽  
Dimitar Arnaudov

This paper presents a systematic approach for accurate short-time cloud coverage prediction based on a machine learning (ML) approach. Using a newly built omnidirectional ground-based sky camera system, local training and evaluation data sets were created. These were used to train several state-of-the-art deep neural networks for object detection and segmentation. For this purpose, the camera generated a full hemispherical image every 30 min, over two months, in daylight conditions, through a fish-eye lens. From this data set, a subset of images was selected for training and evaluation according to various criteria. Deep neural networks based on the two-stage R-CNN architecture were trained and compared with a U-Net segmentation approach implemented by CloudSegNet. All chosen deep networks were then evaluated and compared with respect to the local situation.
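As an illustration of running a two-stage R-CNN detector on sky imagery, the following sketch uses torchvision's pre-trained Mask R-CNN as a stand-in; the paper's own training data, cloud classes, and score threshold are not reproduced here.

```python
# Sketch: two-stage R-CNN inference on a sky image, using torchvision's
# pre-trained Mask R-CNN as a stand-in for the paper's trained networks.
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()
sky_image = torch.rand(3, 512, 512)          # placeholder fisheye frame, RGB in [0, 1]
with torch.no_grad():
    predictions = model([sky_image])[0]      # dict of boxes, labels, scores, masks
cloud_masks = predictions["masks"][predictions["scores"] > 0.5]   # assumed threshold
```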


Author(s):  
Aydin Ayanzadeh ◽  
Sahand Vahidnia

In this paper, we leverage state-of-the-art models pre-trained on the ImageNet data set. We use the pre-trained models and their learned weights to extract features from the dog breed identification data set. Afterwards, we apply fine-tuning and data augmentation to increase the test accuracy in the classification of dog breeds. The performance of the proposed approaches is compared across state-of-the-art ImageNet models, namely ResNet-50, DenseNet-121, DenseNet-169, and GoogleNet, with which we achieve 89.66%, 85.37%, 84.01%, and 82.08% test accuracy, respectively, demonstrating the superior performance of the proposed method over previous work on the Stanford Dogs data set.
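This transfer-learning recipe follows a standard pattern; a minimal sketch with an ImageNet-pre-trained ResNet-50 and a new 120-class head for Stanford Dogs is shown below. The freezing strategy and training details are assumptions, not the paper's exact procedure.

```python
# Sketch of the standard fine-tuning recipe: swap the ImageNet classifier
# head for a 120-class Stanford Dogs head and train it; details assumed.
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
for p in model.parameters():                 # optionally freeze the backbone first
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 120)   # new head, trained from scratch
# ...train model.fc, then unfreeze and fine-tune end to end at a low learning rate
```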


2020 ◽  
Author(s):  
Zhe Xu

Despite the fact that artificial intelligence boosted with data-driven methods (e.g., deep neural networks) has surpassed human-level performance in various tasks, its application to autonomous systems still faces fundamental challenges such as lack of interpretability, intensive need for data, and lack of verifiability. In this overview paper, I survey some attempts to address these fundamental challenges by explaining, guiding, and verifying autonomous systems, taking into account the limited availability of simulated and real data, the expressivity of high-level knowledge representations, and the uncertainties of the underlying model. Specifically, this paper covers learning high-level knowledge from data for interpretable autonomous systems, guiding autonomous systems with high-level knowledge, and verifying and controlling autonomous systems against high-level specifications.


2020 ◽  
Author(s):  
Edlin J. Guerra-Castro ◽  
Juan Carlos Cajas ◽  
Nuno Simões ◽  
Juan J Cruz-Motta ◽  
Maite Mascaró

SSP (simulation-based sampling protocol) is an R package that uses simulation of ecological data and the dissimilarity-based multivariate standard error (MultSE) as an estimator of precision to evaluate the adequacy of different sampling efforts for studies that will test hypotheses using permutational multivariate analysis of variance. The procedure consists of simulating several extensive data matrices that mimic some of the relevant ecological features of the community of interest, using a pilot data set. For each simulated data set, several sampling efforts are repeatedly executed and the MultSE is calculated. The mean value and the 0.025 and 0.975 quantiles of the MultSE for each sampling effort across all simulated data sets are then estimated and standardized with respect to the lowest sampling effort. The optimal sampling effort is identified as the one at which a further increase in sampling effort does not improve precision beyond a threshold value (e.g., 2.5%). The performance of SSP was validated using real data; in all examples the simulated data mimicked the real data well, allowing the MultSE–n relationship to be evaluated beyond the sample size of the pilot studies. SSP can be used to estimate sample size in a wide range of situations, from simple (e.g., a single site) to more complex (e.g., several sites across different habitats) experimental designs. The latter constitutes an important advantage, since it offers new possibilities for complex sampling designs, as has been advised for multi-scale studies in ecology.
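Although SSP itself is an R package, the core resampling loop can be sketched conceptually (here in Python, not the package's API); the community model, dissimilarity measure, and candidate efforts are placeholders, and the MultSE formula used is the standard dissimilarity-based pseudo-variance form.

```python
# Conceptual sketch of the MultSE-vs-effort loop: subsample a simulated
# community matrix at several efforts and track the multivariate standard
# error, MultSE = sqrt(V / n) with V = SS / (n - 1) and SS the sum of
# squared dissimilarities divided by n. Not the SSP package's R API.
import numpy as np
from scipy.spatial.distance import pdist

def mult_se(samples):
    d2 = pdist(samples, metric="braycurtis") ** 2
    n = len(samples)
    ss = d2.sum() / n                      # total sum of squares
    return np.sqrt(ss / (n - 1) / n)       # sqrt(pseudo-variance / n)

rng = np.random.default_rng(0)
community = rng.poisson(2.0, size=(500, 30))    # simulated site x species matrix
for n in (5, 10, 20, 40):                       # candidate sampling efforts
    reps = [mult_se(community[rng.choice(500, n, replace=False)])
            for _ in range(100)]
    print(n, np.mean(reps), np.quantile(reps, [0.025, 0.975]))
```

The optimal effort is then read off as the smallest n beyond which the standardized MultSE stops improving by more than the chosen threshold.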

