scholarly journals Approximate Manifold Regularization: Scalable Algorithm and Generalization Analysis

Author(s):  
Jian Li ◽  
Yong Liu ◽  
Rong Yin ◽  
Weiping Wang

Graph-based semi-supervised learning is one of the most popular and successful semi-supervised learning approaches. Unfortunately, it suffers from high time and space complexity, at least quadratic with the number of training samples. In this paper, we propose an efficient graph-based semi-supervised algorithm with a sound theoretical guarantee. The proposed method combines Nystrom subsampling and preconditioned conjugate gradient descent, substantially improving computational efficiency and reducing memory requirements. Extensive empirical results reveal that our method achieves the state-of-the-art performance in a short time even with limited computing resources.

2019 ◽  
Vol 10 (1) ◽  
pp. 64
Author(s):  
Yi Lin ◽  
Honggang Zhang

In the era of Big Data, multi-instance learning, as a weakly supervised learning framework, has various applications since it is helpful to reduce the cost of the data-labeling process. Due to this weakly supervised setting, learning effective instance representation/embedding is challenging. To address this issue, we propose an instance-embedding regularizer that can boost the performance of both instance- and bag-embedding learning in a unified fashion. Specifically, the crux of the instance-embedding regularizer is to maximize correlation between instance-embedding and underlying instance-label similarities. The embedding-learning framework was implemented using a neural network and optimized in an end-to-end manner using stochastic gradient descent. In experiments, various applications were studied, and the results show that the proposed instance-embedding-regularization method is highly effective, having state-of-the-art performance.


Author(s):  
Ding Li ◽  
Scott Dick

AbstractGraph-based algorithms are known to be effective approaches to semi-supervised learning. However, there has been relatively little work on extending these algorithms to the multi-label classification case. We derive an extension of the Manifold Regularization algorithm to multi-label classification, which is significantly simpler than the general Vector Manifold Regularization approach. We then augment our algorithm with a weighting strategy to allow differential influence on a model between instances having ground-truth vs. induced labels. Experiments on four benchmark multi-label data sets show that the resulting algorithm performs better overall compared to the existing semi-supervised multi-label classification algorithms at various levels of label sparsity. Comparisons with state-of-the-art supervised multi-label approaches (which of course are fully labeled) also show that our algorithm outperforms all of them even with a substantial number of unlabeled examples.


Author(s):  
Michael Withnall ◽  
Edvard Lindelöf ◽  
Ola Engkvist ◽  
Hongming Chen

We introduce Attention and Edge Memory schemes to the existing Message Passing Neural Network framework for graph convolution, and benchmark our approaches against eight different physical-chemical and bioactivity datasets from the literature. We remove the need to introduce <i>a priori</i> knowledge of the task and chemical descriptor calculation by using only fundamental graph-derived properties. Our results consistently perform on-par with other state-of-the-art machine learning approaches, and set a new standard on sparse multi-task virtual screening targets. We also investigate model performance as a function of dataset preprocessing, and make some suggestions regarding hyperparameter selection.


2021 ◽  
Vol 11 (9) ◽  
pp. 4232
Author(s):  
Krishan Harkhoe ◽  
Guy Verschaffelt ◽  
Guy Van der Sande

Delay-based reservoir computing (RC), a neuromorphic computing technique, has gathered lots of interest, as it promises compact and high-speed RC implementations. To further boost the computing speeds, we introduce and study an RC setup based on spin-VCSELs, thereby exploiting the high polarization modulation speed inherent to these lasers. Based on numerical simulations, we benchmarked this setup against state-of-the-art delay-based RC systems and its parameter space was analyzed for optimal performance. The high modulation speed enabled us to have more virtual nodes in a shorter time interval. However, we found that at these short time scales, the delay time and feedback rate heavily influence the nonlinear dynamics. Therefore, and contrary to other laser-based RC systems, the delay time has to be optimized in order to obtain good RC performances. We achieved state-of-the-art performances on a benchmark timeseries prediction task. This spin-VCSEL-based RC system shows a ten-fold improvement in processing speed, which can further be enhanced in a straightforward way by increasing the birefringence of the VCSEL chip.


Electronics ◽  
2021 ◽  
Vol 10 (15) ◽  
pp. 1807
Author(s):  
Sascha Grollmisch ◽  
Estefanía Cano

Including unlabeled data in the training process of neural networks using Semi-Supervised Learning (SSL) has shown impressive results in the image domain, where state-of-the-art results were obtained with only a fraction of the labeled data. The commonality between recent SSL methods is that they strongly rely on the augmentation of unannotated data. This is vastly unexplored for audio data. In this work, SSL using the state-of-the-art FixMatch approach is evaluated on three audio classification tasks, including music, industrial sounds, and acoustic scenes. The performance of FixMatch is compared to Convolutional Neural Networks (CNN) trained from scratch, Transfer Learning, and SSL using the Mean Teacher approach. Additionally, a simple yet effective approach for selecting suitable augmentation methods for FixMatch is introduced. FixMatch with the proposed modifications always outperformed Mean Teacher and the CNNs trained from scratch. For the industrial sounds and music datasets, the CNN baseline performance using the full dataset was reached with less than 5% of the initial training data, demonstrating the potential of recent SSL methods for audio data. Transfer Learning outperformed FixMatch only for the most challenging dataset from acoustic scene classification, showing that there is still room for improvement.


Author(s):  
Hussein Mohammed ◽  
Volker Märgner ◽  
Giovanni Ciotti

AbstractAutomatic pattern detection has become increasingly important for scholars in the humanities as the number of manuscripts that have been digitised has grown. Most of the state-of-the-art methods used for pattern detection depend on the availability of a large number of training samples, which are typically not available in the humanities as they involve tedious manual annotation by researchers (e.g. marking the location and size of words, drawings, seals and so on). This makes the applicability of such methods very limited within the field of manuscript research. We propose a learning-free approach based on a state-of-the-art Naïve Bayes Nearest-Neighbour classifier for the task of pattern detection in manuscript images. The method has already been successfully applied to an actual research question from South Asian studies about palm-leaf manuscripts. Furthermore, state-of-the-art results have been achieved on two extremely challenging datasets, namely the AMADI_LontarSet dataset of handwriting on palm leaves for word-spotting and the DocExplore dataset of medieval manuscripts for pattern detection. A performance analysis is provided as well in order to facilitate later comparisons by other researchers. Finally, an easy-to-use implementation of the proposed method is developed as a software tool and made freely available.


2021 ◽  
Vol 11 (4) ◽  
pp. 1728
Author(s):  
Hua Zhong ◽  
Li Xu

The prediction interval (PI) is an important research topic in reliability analyses and decision support systems. Data size and computation costs are two of the issues which may hamper the construction of PIs. This paper proposes an all-batch (AB) loss function for constructing high quality PIs. Taking the full advantage of the likelihood principle, the proposed loss makes it possible to train PI generation models using the gradient descent (GD) method for both small and large batches of samples. With the structure of dual feedforward neural networks (FNNs), a high-quality PI generation framework is introduced, which can be adapted to a variety of problems including regression analysis. Numerical experiments were conducted on the benchmark datasets; the results show that higher-quality PIs were achieved using the proposed scheme. Its reliability and stability were also verified in comparison with various state-of-the-art PI construction methods.


2021 ◽  
Vol 13 (14) ◽  
pp. 2686
Author(s):  
Di Wei ◽  
Yuang Du ◽  
Lan Du ◽  
Lu Li

The existing Synthetic Aperture Radar (SAR) image target detection methods based on convolutional neural networks (CNNs) have achieved remarkable performance, but these methods require a large number of target-level labeled training samples to train the network. Moreover, some clutter is very similar to targets in SAR images with complex scenes, making the target detection task very difficult. Therefore, a SAR target detection network based on a semi-supervised learning and attention mechanism is proposed in this paper. Since the image-level label simply marks whether the image contains the target of interest or not, which is easier to be labeled than the target-level label, the proposed method uses a small number of target-level labeled training samples and a large number of image-level labeled training samples to train the network with a semi-supervised learning algorithm. The proposed network consists of a detection branch and a scene recognition branch with a feature extraction module and an attention module shared between these two branches. The feature extraction module can extract the deep features of the input SAR images, and the attention module can guide the network to focus on the target of interest while suppressing the clutter. During the semi-supervised learning process, the target-level labeled training samples will pass through the detection branch, while the image-level labeled training samples will pass through the scene recognition branch. During the test process, considering the help of global scene information in SAR images for detection, a novel coarse-to-fine detection procedure is proposed. After the coarse scene recognition determining whether the input SAR image contains the target of interest or not, the fine target detection is performed on the image that may contain the target. The experimental results based on the measured SAR dataset demonstrate that the proposed method can achieve better performance than the existing methods.


Sign in / Sign up

Export Citation Format

Share Document