Training Restricted Boltzmann Machines With a D-Wave Quantum Annealer

A restricted Boltzmann machine (RBM) is an energy-based, undirected graphical model commonly used for unsupervised and supervised machine learning. An RBM is typically trained using contrastive divergence (CD). However, training with CD is slow and does not estimate the exact gradient of the log-likelihood cost function. In this work, the model expectation in the gradient-based learning of an RBM is calculated using a quantum annealer (D-Wave 2000Q), where obtaining samples is faster than with the Markov chain Monte Carlo (MCMC) used in CD. An RBM trained using quantum annealing is compared with one trained by CD with respect to classification accuracy, image reconstruction, and log-likelihood. The classification accuracies of the two methods are comparable, while the image reconstruction and log-likelihood results favor the CD-based method. The samples obtained from the quantum annealer can be used to train an RBM on a 64-bit “bars and stripes” dataset with classification performance similar to that of an RBM trained with CD. Although CD-based training showed better learning performance, training with a quantum annealer could be useful because it eliminates the computationally expensive MCMC steps of CD.
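
As a rough illustration of the gradient discussed in this abstract, the sketch below computes a CD-1 estimate of the weight gradient for a tiny binary RBM in plain Python. The network size, the zero initialization, and all function names are assumptions made for the example, not details from the paper. The second term of the returned difference is the "model expectation"; CD approximates it with one Gibbs step, and it is the term the abstract describes obtaining from annealer samples instead.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample_hidden(v, W, c, rng):
    """Sample binary hidden units given visible units v; also return probabilities."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return [1 if rng.random() < p else 0 for p in probs], probs

def sample_visible(h, W, b, rng):
    """Sample binary visible units given hidden units h; also return probabilities."""
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return [1 if rng.random() < p else 0 for p in probs], probs

def cd1_weight_gradient(v0, W, b, c, rng):
    """CD-1 estimate of d(log-likelihood)/dW for a single training vector v0."""
    h0, ph0 = sample_hidden(v0, W, c, rng)   # positive (data) phase
    v1, _ = sample_visible(h0, W, b, rng)    # one Gibbs step away from the data
    _, ph1 = sample_hidden(v1, W, c, rng)    # negative (model) phase, approximated
    return [[v0[i] * ph0[j] - v1[i] * ph1[j] for j in range(len(c))]
            for i in range(len(b))]

# Hypothetical toy setup: 4 visible units, 2 hidden units, all parameters zero.
rng = random.Random(0)
n_visible, n_hidden = 4, 2
W = [[0.0] * n_hidden for _ in range(n_visible)]
b = [0.0] * n_visible
c = [0.0] * n_hidden
grad = cd1_weight_gradient([1, 1, 0, 0], W, b, c, rng)
```

Replacing the Gibbs step with samples drawn from another source changes only how `v1` and `ph1` are produced; the data-phase term stays the same.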


Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers

Scientific Reports ◽ 10.1038/s41598-021-82197-1 ◽ 2021 ◽ Vol 11 (1)

Author(s): Guanglei Xu ◽ William S. Oates

Keyword(s): Neural Network ◽ Maximum Likelihood ◽ Image Reconstruction ◽ Image Recognition ◽ Shannon Entropy ◽ Reconstruction Error ◽ Likelihood Method ◽ Restricted Boltzmann Machines ◽ Boltzmann Machines ◽ D Wave

Restricted Boltzmann Machines (RBMs) have been proposed for developing neural networks for a variety of unsupervised machine learning applications such as image recognition, drug discovery, and materials design. The Boltzmann probability distribution is used as a model to identify network parameters by optimizing the likelihood of predicting an output given hidden states trained on available data. Training such networks often requires sampling over a large probability space that must be approximated during gradient-based optimization. Quantum annealing has been proposed as a means to search this space more efficiently, and this has been investigated experimentally on D-Wave hardware. A D-Wave implementation requires selecting an effective inverse temperature, or hyperparameter (β), within the Boltzmann distribution, which can strongly influence optimization. Here, we show how this parameter can be estimated as a hyperparameter applied to D-Wave hardware during neural network training by maximizing the likelihood or minimizing the Shannon entropy. Experimental validation on D-Wave hardware for an image recognition problem shows that both methods improve RBM training. Neural network image reconstruction errors are evaluated using Bayesian uncertainty analysis, which shows more than an order of magnitude lower image reconstruction error with the maximum likelihood method than with manual optimization of the hyperparameter. The maximum likelihood method is also shown to outperform minimizing the Shannon entropy for image reconstruction.
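
To make the role of β concrete, here is a toy sketch of choosing an inverse temperature by maximum likelihood. It is not the authors' procedure: it assumes a three-spin Ising model small enough to enumerate exactly, a hypothetical set of observed samples, and a simple grid search. The names `energy`, `log_likelihood`, and `estimate_beta` are invented for the example.

```python
import itertools
import math

def energy(state, h, J):
    """Ising energy: sum_i h_i s_i + sum_(i,j) J_ij s_i s_j, with spins in {-1, +1}."""
    e = sum(h[i] * s for i, s in enumerate(state))
    e += sum(J[(i, j)] * state[i] * state[j] for (i, j) in J)
    return e

def log_likelihood(beta, samples, h, J):
    """Log-likelihood of the samples under p(s) = exp(-beta * E(s)) / Z(beta)."""
    n = len(h)
    # Exact partition function by enumeration; only feasible for tiny systems.
    log_z = math.log(sum(math.exp(-beta * energy(s, h, J))
                         for s in itertools.product((-1, 1), repeat=n)))
    return sum(-beta * energy(s, h, J) - log_z for s in samples)

def estimate_beta(samples, h, J, grid):
    """Return the grid value of beta with the highest log-likelihood."""
    return max(grid, key=lambda beta: log_likelihood(beta, samples, h, J))

# Hypothetical problem: a ferromagnetic chain of three spins with no local fields.
h = [0.0, 0.0, 0.0]
J = {(0, 1): -1.0, (1, 2): -1.0}
grid = [i / 10 for i in range(31)]  # candidate beta values from 0.0 to 3.0

# Hypothetical observed samples: mostly aligned states with a few excitations.
samples = [(1, 1, 1)] * 6 + [(-1, -1, -1)] * 6 + [(1, 1, -1)] * 2 + [(1, -1, 1)]
beta_hat = estimate_beta(samples, h, J, grid)
```

Samples concentrated on low-energy states push the estimate toward large β, while samples spread evenly over all states push it toward zero. For realistic problem sizes the partition function cannot be enumerated and must itself be approximated.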


Hierarchical Phoneme Classification for Improved Speech Recognition

Applied Sciences ◽ 10.3390/app11010428 ◽ 2021 ◽ Vol 11 (1) ◽ pp. 428

Author(s): Donghoon Oh ◽ Jeong-Sik Park ◽ Ji-Hwan Kim ◽ Gil-Jin Jang

Keyword(s): Speech Recognition ◽ Language Processing ◽ Confusion Matrix ◽ Critical Factor ◽ Recognition System ◽ Classification Performance ◽ Language Models ◽ Successful Implementation ◽ Phoneme Classification ◽ Improved Performance

Speech recognition consists of converting input sound into a sequence of phonemes, then finding text for the input using language models. Phoneme classification performance is therefore a critical factor for the successful implementation of a speech recognition system. However, correctly distinguishing phonemes with similar characteristics is still a challenging problem even for state-of-the-art classification methods, and classification errors are hard to recover from in the subsequent language processing steps. This paper proposes a hierarchical phoneme clustering method that applies more suitable recognition models to different phonemes. The phonemes of the TIMIT database are carefully analyzed using a confusion matrix from a baseline speech recognition model. Using automatic phoneme clustering results, a set of phoneme classification models optimized for the generated phoneme groups is constructed and integrated into a hierarchical phoneme classification method. According to the results of a number of phoneme classification experiments, the proposed hierarchical phoneme group models improved performance over the baseline by 3%, 2.1%, 6.0%, and 2.2% for fricative, affricate, stop, and nasal sounds, respectively. The average accuracy was 69.5% for the baseline and 71.7% for the proposed hierarchical models, a 2.2% overall improvement.
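
The idea of grouping phonemes from a confusion matrix can be sketched as follows. This is a generic greedy agglomerative scheme, not the clustering algorithm used in the paper, and the four labels and the confusion counts are made up for the example.

```python
def confusability(conf, a, b):
    """Symmetric confusion rate between classes a and b (row-normalized counts)."""
    return conf[a][b] / sum(conf[a]) + conf[b][a] / sum(conf[b])

def cluster_by_confusion(labels, conf, n_groups):
    """Greedily merge the two groups with the highest average pairwise confusability."""
    groups = [[i] for i in range(len(labels))]
    while len(groups) > n_groups:
        best = None
        for x in range(len(groups)):
            for y in range(x + 1, len(groups)):
                pairs = [(a, b) for a in groups[x] for b in groups[y]]
                score = sum(confusability(conf, a, b) for a, b in pairs) / len(pairs)
                if best is None or score > best[0]:
                    best = (score, x, y)
        _, x, y = best
        groups[x] = groups[x] + groups[y]
        del groups[y]
    return [[labels[i] for i in g] for g in groups]

# Hypothetical confusion counts: rows are true classes, columns are predictions.
labels = ["s", "z", "m", "n"]
conf = [
    [80, 15, 3, 2],
    [20, 75, 2, 3],
    [2, 3, 70, 25],
    [1, 2, 30, 67],
]
groups = cluster_by_confusion(labels, conf, 2)
print(groups)  # [['s', 'z'], ['m', 'n']]
```

Each resulting group could then be given its own specialized classifier, which is the role the abstract assigns to the phoneme group models.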


Source allocation of per- and polyfluoroalkyl substances (PFAS) with supervised machine learning: Classification performance and the role of feature selection in an expanded dataset

Chemosphere ◽ 10.1016/j.chemosphere.2021.130124 ◽ 2021 ◽ Vol 275 ◽ pp. 130124

Author(s): Tohren C.G. Kibbey ◽ Rafal Jabrzemski ◽ Denis M. O’Carroll

Keyword(s): Machine Learning ◽ Feature Selection ◽ Classification Performance ◽ Supervised Machine Learning ◽ Machine Learning Classification ◽ Polyfluoroalkyl Substances ◽ Source Allocation


Development of a Pattern Recognition Tool for the Classification of Electronic Tongue Signals Using Machine Learning

Chemistry Proceedings ◽ 10.3390/csac2021-10447 ◽ 2021 ◽ Vol 5 (1) ◽ pp. 21

Author(s): Edgar G. Mendez-Lopez ◽ Jersson X. Leon-Medina ◽ Diego A. Tibaduiza

Keyword(s): Machine Learning ◽ Dimensionality Reduction ◽ Performance Measures ◽ Sensor Array ◽ Three Dimensional ◽ Electronic Tongue ◽ Sensor Arrays ◽ Classification Performance ◽ Supervised Machine Learning ◽ Electrochemical Tests

Electronic tongue type sensor arrays are made of different materials, with each sensor capturing signals independently. The signals captured during electrochemical tests often have high dimensionality, which increases further when the data are unfolded. The unfolding process arranges the data from different experiments, sensors, and sample times into a two-dimensional matrix. This work describes a tool for the analysis of electronic tongue signals. The tool is developed in Matlab® App Designer to process and classify the data from different substances analyzed by an electronic tongue type sensor array. The data processing is carried out in the following stages: (1) data unfolding, (2) normalization, (3) dimensionality reduction, (4) classification with a supervised machine learning model, and (5) a cross-validation procedure to calculate a set of classification performance measures. Important features of the tool include the ability to tune the parameters of the dimensionality reduction and classifier algorithms and to plot two- and three-dimensional scatter plots of the features after dimensionality reduction, which show the separability between classes and the compactness within each class. The interface is successfully tested on two electronic tongue sensor array datasets with multi-frequency large amplitude pulse voltammetry (MLAPV) signals. The graphical user interface allows different methods to be compared at each stage in order to find the combination that yields the highest classification performance measures.
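
The processing stages listed in this abstract can be sketched in a few lines. The example below is written in Python rather than Matlab and is only an assumed, simplified pipeline: it covers unfolding, normalization, and leave-one-out cross-validation with a 1-nearest-neighbour classifier, and it skips the dimensionality reduction stage. The data values and function names are hypothetical.

```python
import math

def unfold(data):
    """Flatten [experiment][sensor][time] into one row per experiment."""
    return [[x for sensor in experiment for x in sensor] for experiment in data]

def standardize(rows):
    """Scale each column to zero mean and unit variance (constant columns become 0)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
            for row in rows]

def loo_accuracy(rows, labels):
    """Leave-one-out cross-validation accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, row in enumerate(rows):
        nearest = min((j for j in range(len(rows)) if j != i),
                      key=lambda j: math.dist(row, rows[j]))
        correct += labels[nearest] == labels[i]
    return correct / len(rows)

# Hypothetical measurements: 4 experiments x 2 sensors x 3 sample times.
data = [
    [[1.0, 1.1, 1.2], [0.5, 0.6, 0.7]],
    [[1.1, 1.2, 1.3], [0.4, 0.6, 0.8]],
    [[3.0, 3.2, 3.1], [2.0, 2.1, 2.2]],
    [[3.1, 3.0, 3.3], [2.1, 2.0, 2.3]],
]
labels = ["A", "A", "B", "B"]

matrix = standardize(unfold(data))  # 4 rows, 6 columns
accuracy = loo_accuracy(matrix, labels)
```

For brevity the normalization here is fitted on all rows at once; in a careful evaluation it would be fitted inside each cross-validation fold so that the held-out sample does not influence the scaling.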


Learning Compositional Representations of Interacting Systems with Restricted Boltzmann Machines: Comparative Study of Lattice Proteins

Neural Computation ◽ 10.1162/neco_a_01210 ◽ 2019 ◽ Vol 31 (8) ◽ pp. 1671-1717 ◽ Cited By ~ 1

Author(s): Jérôme Tubiana ◽ Simona Cocco ◽ Rémi Monasson

Keyword(s): Graphical Model ◽ A Priori ◽ Protein Sequences ◽ Ground Truth ◽ Representation Learning ◽ Statistical Features ◽ Restricted Boltzmann Machines ◽ Interacting Systems ◽ Hidden Layer ◽ Stochastic Mapping

A restricted Boltzmann machine (RBM) is an unsupervised machine learning bipartite graphical model that jointly learns a probability distribution over data and extracts their relevant statistical features. RBMs were recently proposed for characterizing the patterns of coevolution between amino acids in protein sequences and for designing new sequences. Here, we study how the nature of the features learned by RBM changes with its defining parameters, such as the dimensionality of the representations (size of the hidden layer) and the sparsity of the features. We show that for adequate values of these parameters, RBMs operate in a so-called compositional phase in which visible configurations sampled from the RBM are obtained by recombining these features. We then compare the performance of RBM with other standard representation learning algorithms, including principal or independent component analysis (PCA, ICA), autoencoders (AE), variational autoencoders (VAE), and their sparse variants. We show that RBMs, due to the stochastic mapping between data configurations and representations, better capture the underlying interactions in the system and are significantly more robust with respect to sample size than deterministic methods such as PCA or ICA. In addition, this stochastic mapping is not prescribed a priori as in VAE, but learned from data, which allows RBMs to show good performance even with shallow architectures. All numerical results are illustrated on synthetic lattice protein data that share similar statistical features with real protein sequences and for which ground-truth interactions are known.


A Cross-Correlated Delay Shift Supervised Learning Method for Spiking Neurons with Application to Interictal Spike Detection in Epilepsy

International Journal of Neural Systems ◽ 10.1142/s0129065717500022 ◽ 2017 ◽ Vol 27 (03) ◽ pp. 1750002 ◽ Cited By ~ 24

Author(s): Lilin Guo ◽ Zhenzhong Wang ◽ Mercedes Cabrerizo ◽ Malek Adjouadi

Keyword(s): Adaptive Learning ◽ Learning Algorithm ◽ Recognition Performance ◽ Learning Rule ◽ Classification Performance ◽ Spiking Neurons ◽ Learning Performance ◽ Learning Method ◽ Neuron Models ◽ Interictal Spike

This study introduces a novel learning algorithm for spiking neurons, called CCDS, which is able to learn and reproduce arbitrary spike patterns in a supervised fashion allowing the processing of spatiotemporal information encoded in the precise timing of spikes. Unlike the Remote Supervised Method (ReSuMe), synapse delays and axonal delays in CCDS are variants which are modulated together with weights during learning. The CCDS rule is both biologically plausible and computationally efficient. The properties of this learning rule are investigated extensively through experimental evaluations in terms of reliability, adaptive learning performance, generality to different neuron models, learning in the presence of noise, effects of its learning parameters and classification performance. Results presented show that the CCDS learning method achieves learning accuracy and learning speed comparable with ReSuMe, but improves classification accuracy when compared to both the Spike Pattern Association Neuron (SPAN) learning rule and the Tempotron learning rule. The merit of CCDS rule is further validated on a practical example involving the automated detection of interictal spikes in EEG records of patients with epilepsy. Results again show that with proper encoding, the CCDS rule achieves good recognition performance.


Automatic recognition of self-acknowledged limitations in clinical research literature

Journal of the American Medical Informatics Association ◽ 10.1093/jamia/ocy038 ◽ 2018 ◽ Vol 25 (7) ◽ pp. 855-861 ◽ Cited By ~ 4

Author(s): Halil Kilicoglu ◽ Graciela Rosemblat ◽ Mario Malički ◽ Gerben ter Riet

Keyword(s): Machine Learning ◽ Clinical Research ◽ Binary Classification ◽ Classification Performance ◽ Research Literature ◽ Machine Learning Algorithms ◽ Supervised Machine Learning ◽ Support Vector ◽ Rule Based ◽ Research Transparency

Objective: To automatically recognize self-acknowledged limitations in clinical research publications to support efforts in improving research transparency.

Methods: To develop our recognition methods, we used a set of 8431 sentences from 1197 PubMed Central articles. A subset of these sentences was manually annotated for training/testing, and inter-annotator agreement was calculated. We cast the recognition problem as a binary classification task, in which we determine whether a given sentence from a publication discusses self-acknowledged limitations or not. We experimented with three methods: a rule-based approach based on document structure, supervised machine learning, and a semi-supervised method that uses self-training to expand the training set in order to improve classification performance. The machine learning algorithms used were logistic regression (LR) and support vector machines (SVM).

Results: Annotators had good agreement in labeling limitation sentences (Krippendorff’s α = 0.781). Of the three methods used, the rule-based method yielded the best performance with 91.5% accuracy (95% CI [90.1-92.9]), while self-training with SVM led to a small improvement over fully supervised learning (89.9%, 95% CI [88.4-91.4] vs 89.6%, 95% CI [88.1-91.1]).

Conclusions: The approach presented can be incorporated into the workflows of stakeholders focusing on research transparency to improve reporting of limitations in clinical studies.


The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks

10.1101/2020.06.29.176925 ◽ 2020

Author(s): Friedemann Zenke ◽ Tim P. Vogels

Keyword(s): Neural Networks ◽ Complex Function ◽ Spiking Neural Networks ◽ Learning Performance ◽ Design Parameters ◽ Classification Problems ◽ Systematic Account ◽ Practical Algorithms ◽ Spiking Networks ◽ Gradient Learning

Brains process information in spiking neural networks. Their intricate connections shape the diverse functions these networks perform. In comparison, the functional capabilities of models of spiking networks are still rudimentary. This shortcoming is mainly due to the lack of insight and practical algorithms to construct the necessary connectivity. Any such algorithm typically attempts to build networks by iteratively reducing the error compared to a desired output. But assigning credit to hidden units in multi-layered spiking networks has remained challenging due to the non-differentiable nonlinearity of spikes. To avoid this issue, one can employ surrogate gradients to discover the required connectivity in spiking network models. However, the choice of a surrogate is not unique, raising the question of how its implementation influences the effectiveness of the method. Here, we use numerical simulations to systematically study how essential design parameters of surrogate gradients impact learning performance on a range of classification problems. We show that surrogate gradient learning is robust to different shapes of underlying surrogate derivatives, but the choice of the derivative’s scale can substantially affect learning performance. When we combine surrogate gradients with a suitable activity regularization technique, robust information processing can be achieved in spiking networks even at the sparse activity limit. Our study provides a systematic account of the remarkable robustness of surrogate gradient learning and serves as a practical guide to model functional spiking neural networks.
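
A minimal sketch of what a surrogate gradient looks like, assuming a fast-sigmoid-shaped derivative, which is one commonly used choice. The abstract does not say which shapes were compared, so the functional form, the threshold, and the default `scale` value below are assumptions for illustration.

```python
def spike(u, threshold=1.0):
    """Forward pass: Heaviside step on the membrane potential u."""
    return 1.0 if u >= threshold else 0.0

def surrogate_grad(u, threshold=1.0, scale=10.0):
    """Backward-pass stand-in for the step's derivative.

    Uses the derivative of a fast sigmoid, 1 / (scale * |u - threshold| + 1)^2.
    A larger `scale` makes the surrogate more sharply peaked at the threshold.
    """
    return 1.0 / (scale * abs(u - threshold) + 1.0) ** 2

# The true derivative of the step is zero almost everywhere, so no error signal
# would reach hidden units; the surrogate is nonzero near the threshold.
for u in (0.5, 0.9, 1.0, 1.1):
    print(u, spike(u), round(surrogate_grad(u), 4))
```

The `scale` parameter corresponds to the derivative's scale that the abstract identifies as the design choice with a substantial effect on learning performance.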


Restricted Boltzmann Machines for Fundus Image Reconstruction and Classification of Hypertension Retinopathy

Journal of Computer Science ◽ 10.3844/jcssp.2021.156.166 ◽ 2021 ◽ Vol 17 (2) ◽ pp. 156-166

Author(s): Bambang Krismono Triwijoyo ◽ Boy Subirosa Sabarguna ◽ Widodo Budiharto ◽ Edi Abdurachman

Keyword(s): Image Reconstruction ◽ Fundus Image ◽ Restricted Boltzmann Machines ◽ Boltzmann Machines


Exploring Binary Relations for Ontology Extension and Improved Adaptation to Clinical Text

10.1101/2020.12.04.411751 ◽ 2020

Author(s): Luke T Slater ◽ Robert Hoehndorf ◽ Andreas Karwath ◽ Georgios V Gkoutos

Keyword(s): Text Mining ◽ Semantic Analysis ◽ Classification Performance ◽ Human Interaction ◽ Semantic Features ◽ Ontology Learning ◽ Disease Ontology ◽ Text Corpora ◽ Clinical Narrative ◽ Improved Performance

Background: The controlled domain vocabularies provided by ontologies make them an indispensable tool for text mining. Ontologies also include semantic features in the form of taxonomy and axioms, which make annotated entities in text corpora useful for semantic analysis. Extending those semantic features may improve performance for characterisation and analytic tasks. Ontology learning techniques have previously been explored for novel ontology construction from text, though most recent approaches have focused on literature, with applications in information retrieval or human interaction tasks. We hypothesise that extension of existing ontologies using information mined from clinical narrative text may help to adapt those ontologies such that they better characterise those texts, and lead to improved classification performance.

Results: We develop and present a framework for identifying new classes in text corpora, which can be integrated into existing ontology hierarchies. To do this, we employ the Stanford Open Information Extraction algorithm and integrate its implementation into the Komenti semantic text mining framework. To identify whether our approach leads to better characterisation of text, we present a case study, using the method to learn an adaptation to the Disease Ontology using text associated with a sample of 1,000 patient visits from the MIMIC-III critical care database. We use the adapted ontology to annotate and classify shared first diagnosis on patient visits with semantic similarity, revealing an improved performance over use of the base Disease Ontology on the set of visits the ontology was constructed from. Moreover, we show that the adapted ontology also improved performance for the same task over two additional unseen samples of 1,000 and 2,500 patient visits.

Conclusions: We report a promising new method for ontology learning and extension from text. We demonstrate that we can successfully use the method to adapt an existing ontology to a textual dataset, improving its ability to characterise the dataset, and leading to improved analytic performance, even on unseen portions of the dataset.
