training examples
Recently Published Documents


TOTAL DOCUMENTS: 337 (FIVE YEARS: 132)

H-INDEX: 26 (FIVE YEARS: 6)

2022 ◽ Vol 16 (4) ◽ pp. 1-18
Author(s): Min-Ling Zhang, Jing-Han Wu, Wei-Xuan Bao

As an emerging weakly supervised learning framework, partial label learning considers inaccurate supervision where each training example is associated with multiple candidate labels, among which only one is valid. In this article, a first attempt toward employing dimensionality reduction to improve the generalization performance of partial label learning systems is investigated. Specifically, the popular linear discriminant analysis (LDA) techniques are endowed with the ability to deal with partial label training examples. To tackle the challenge of unknown ground-truth labeling information, a novel learning approach named Delin is proposed, which alternates between LDA dimensionality reduction and candidate label disambiguation based on estimated labeling confidences over candidate labels. On the one hand, the (kernelized) projection matrix of LDA is optimized by utilizing disambiguation-guided labeling confidences. On the other hand, the labeling confidences are disambiguated by resorting to kNN aggregation in the LDA-induced feature space. Extensive experiments over a broad range of partial label datasets clearly validate the effectiveness of Delin in improving the generalization performance of well-established partial label learning algorithms.
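
To make the alternation concrete, here is a minimal Python sketch of a Delin-style loop, not the authors' implementation: the names (delin_sketch, confidence matrix F, candidate mask C), the hard pseudo-labels used for the LDA step, and the small epsilon guard are all illustrative assumptions.

    # Illustrative sketch of a Delin-style alternation (assumed names),
    # not the paper's code.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.neighbors import NearestNeighbors

    def delin_sketch(X, C, n_iters=5, n_neighbors=10):
        """X: (n, d) features; C: (n, q) 0/1 candidate-label mask."""
        # Initialize labeling confidences uniformly over each candidate set.
        F = C / C.sum(axis=1, keepdims=True)
        for _ in range(n_iters):
            # LDA step: fit the projection using the current disambiguation,
            # taking the most confident candidate as a hard pseudo-label.
            y_pseudo = F.argmax(axis=1)
            Z = LinearDiscriminantAnalysis().fit_transform(X, y_pseudo)
            # Disambiguation step: kNN aggregation in the LDA-induced space,
            # restricted to each example's candidate set.
            _, idx = NearestNeighbors(n_neighbors=n_neighbors).fit(Z).kneighbors(Z)
            agg = F[idx].sum(axis=1)       # sum neighbor confidences per example
            F = agg * C + 1e-12 * C        # keep mass inside candidate sets
            F = F / F.sum(axis=1, keepdims=True)
        return Z, F

The disambiguated confidences F and the reduced features Z could then be handed to any off-the-shelf partial label learning algorithm, which is how the abstract frames Delin's role.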


2022 ◽ Vol 5 (1) ◽ pp. 11-15
Author(s): John K. Hillier, Chris Unsworth, Luke De Clerk, Sergey Savel'ev

Abstract. Insights from a geoscience communication activity, verified using preliminary investigations with an artificial neural network, illustrate that observing humans' abilities can help design an effective artificial intelligence or “AI”. Even given only one set of “training” examples, survey participants could visually recognize which flow conditions created bedforms (e.g. sand dunes and riverbed ripples) from their shapes, but an interpreter's geoscience expertise did not help. Together, these observations were interpreted as indicating that a machine learning algorithm might be trained successfully from limited data, particularly if it is “helped” by pre-processing bedforms into a simple shape familiar from childhood play.


2022 ◽ Vol 23 (1)
Author(s): M. A. Hakim Newton, Fereshteh Mataeimoghadam, Rianon Zaman, Abdul Sattar

Abstract. Motivation: Protein backbone angle prediction has seen significant accuracy improvements with the development of deep learning methods. Usually the same deep learning model is used to make predictions for all residues, regardless of the categories of secondary structures they belong to. In this paper, we propose to train separate deep learning models for each category of secondary structure. Machine learning methods strive to achieve generality over the training examples and consequently lose accuracy. In this work, we explicitly exploit classification knowledge to restrict generalisation within the specific class of training examples, compensating for the accuracy lost to generalisation by exploiting specialisation knowledge in an informed way. Results: The new method, named SAP4SS, obtains mean absolute error (MAE) values of 15.59, 18.87, 6.03, and 21.71 for the four types of backbone angles ϕ, ψ, θ, and τ, respectively. Consequently, SAP4SS significantly outperforms the existing state-of-the-art methods SAP, OPUS-TASS, and SPOT-1D: the differences in MAE for all four angle types range from 1.5 to 4.1% relative to the best known results. Availability: SAP4SS along with its data is available from https://gitlab.com/mahnewton/sap4ss.
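
The core idea, one specialised predictor per secondary-structure category, can be sketched as follows. This is purely illustrative: the regressor choice, the three-class H/E/C scheme, and the function names are assumptions (the paper trains deep networks, and real ϕ/ψ predictors typically regress sin/cos of each angle to respect periodicity).

    # Illustrative per-class specialisation sketch, not the SAP4SS code.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    def train_per_class(X, angles, ss, classes=("H", "E", "C")):
        # One backbone-angle regressor per secondary-structure category.
        models = {}
        for c in classes:
            mask = ss == c
            models[c] = GradientBoostingRegressor().fit(X[mask], angles[mask])
        return models

    def predict_per_class(models, X, ss):
        # Route each residue to the model matching its secondary structure.
        y = np.empty(len(X))
        for c, model in models.items():
            mask = ss == c
            if mask.any():
                y[mask] = model.predict(X[mask])
        return y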


2021
Author(s): Hoifung Poon, Hai Wang, Hunter Lang

Deep learning has proven effective for various application tasks, but its applicability is limited by the reliance on annotated examples. Self-supervised learning has emerged as a promising direction to alleviate the supervision bottleneck, but existing work focuses on leveraging co-occurrences in unlabeled data for task-agnostic representation learning, as exemplified by masked language model pretraining. In this chapter, we explore task-specific self-supervision, which leverages domain knowledge to automatically annotate noisy training examples for end applications, either by introducing labeling functions for annotating individual instances or by imposing constraints over interdependent label decisions. We first present deep probabilistic logic (DPL), which offers a unifying framework for task-specific self-supervision by composing probabilistic logic with deep learning. DPL represents unknown labels as latent variables and incorporates diverse self-supervision using probabilistic logic to train a deep neural network end-to-end via variational EM. Next, we present self-supervised self-supervision (S4), which adds to DPL the capability to learn new self-supervision automatically. Starting from an initial seed of self-supervision, S4 iteratively uses the deep neural network to propose new self-supervision; these proposals are either added directly (a form of structured self-training) or verified by a human expert (as in feature-based active learning). Experiments on real-world applications such as biomedical machine reading and various text classification tasks show that task-specific self-supervision can effectively leverage domain expertise and often match the accuracy of supervised methods with a tiny fraction of the human effort.
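
A hedged sketch of the loop described here, seed labeling functions followed by promotion of confident model predictions to new self-supervision (the structured self-training variant), might look like the following. The classifier, the confidence threshold, and the abstain convention (-1) are illustrative assumptions; the real DPL/S4 systems resolve conflicting supervision with probabilistic logic and variational EM, not the majority vote used below.

    # Illustrative S4-style loop (assumed names and conventions),
    # not the authors' implementation.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def s4_sketch(X, labeling_functions, n_rounds=3, threshold=0.95):
        # Seed self-supervision: majority vote over the labeling functions,
        # where -1 denotes an abstain.
        votes = np.stack([lf(X) for lf in labeling_functions])  # (n_lfs, n)
        labels = np.full(X.shape[0], -1)
        for i in range(X.shape[0]):
            v = votes[:, i][votes[:, i] >= 0]
            if v.size:
                labels[i] = np.bincount(v).argmax()
        clf = None
        for _ in range(n_rounds):
            mask = labels >= 0
            clf = LogisticRegression(max_iter=1000).fit(X[mask], labels[mask])
            # Propose new self-supervision: high-confidence predictions on
            # still-unlabeled examples (a human could verify these instead).
            proba = clf.predict_proba(X)
            confident = (proba.max(axis=1) >= threshold) & ~mask
            labels[confident] = clf.classes_[proba[confident].argmax(axis=1)]
        return clf, labels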


2021 ◽ Vol 2021 (12) ◽ pp. 124002
Author(s): Stéphane d’Ascoli, Levent Sagun, Giulio Biroli

Abstract. A recent line of research has highlighted the existence of a ‘double descent’ phenomenon in deep learning, whereby increasing the number of training examples N causes the generalization error of neural networks (NNs) to peak when N is of the same order as the number of parameters P. In earlier works, a similar phenomenon was shown to exist in simpler models such as linear regression, where the peak instead occurs when N is equal to the input dimension D. Since both peaks coincide with the interpolation threshold, they are often conflated in the literature. In this paper, we show that despite their apparent similarity, these two scenarios are inherently different; in fact, both peaks can coexist when NNs are applied to noisy regression tasks. The relative size of the peaks is then governed by the degree of nonlinearity of the activation function. Building on recent developments in the analysis of random feature models, we provide a theoretical grounding for this sample-wise triple descent. As shown previously, the nonlinear peak at N = P is a true divergence caused by the extreme sensitivity of the output function to both the noise corrupting the labels and the initialization of the random features (or the weights in NNs). This peak survives in the absence of noise but can be suppressed by regularization. In contrast, the linear peak at N = D is solely due to overfitting the noise in the labels, and forms earlier during training. We show that this peak is implicitly regularized by the nonlinearity, which is why it only becomes salient at high noise and is weakly affected by explicit regularization. Throughout the paper, we compare analytical results obtained in the random feature model with the outcomes of numerical experiments involving deep NNs.
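
The setting can be probed numerically with ridgeless random-feature regression, as in the sketch below. All choices are illustrative (the dimensions D and P, the tanh activation, the linear teacher, and the noise level sigma are arbitrary), and the error bumps near N = D and N = P typically emerge clearly only after averaging over several random seeds.

    # Illustrative sample-wise sweep for a random-feature model
    # (assumed parameter choices), not the paper's experiments.
    import numpy as np

    D, P, sigma = 30, 100, 0.5                     # input dim, features, label noise
    rng = np.random.default_rng(0)
    W = rng.standard_normal((P, D)) / np.sqrt(D)   # fixed random first layer
    w_star = rng.standard_normal(D) / np.sqrt(D)   # linear teacher

    def features(X):
        return np.tanh(X @ W.T)                    # nonlinear random features

    def test_error(N, n_test=2000):
        Xtr = rng.standard_normal((N, D))
        ytr = Xtr @ w_star + sigma * rng.standard_normal(N)
        a, *_ = np.linalg.lstsq(features(Xtr), ytr, rcond=None)  # min-norm fit
        Xte = rng.standard_normal((n_test, D))
        return np.mean((features(Xte) @ a - Xte @ w_star) ** 2)

    for N in [10, D, 60, P, 200, 500]:             # sweep through both thresholds
        print(N, test_error(N))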


2021
Author(s): Andrew Cropper, Sebastijan Dumančić, Richard Evans, Stephen H. Muggleton

Abstract. Inductive logic programming (ILP) is a form of logic-based machine learning. The goal is to induce a hypothesis (a logic program) that generalises given training examples and background knowledge. As ILP turns 30, we review the last decade of research. We focus on (i) new meta-level search methods, (ii) techniques for learning recursive programs, (iii) new approaches for predicate invention, and (iv) the use of different technologies. We conclude by discussing current limitations of ILP and directions for future research.


2021 ◽ Vol 17 (11) ◽ pp. 155014772110523
Author(s): Mohammed Alarfaj, Zhenqiang Su, Raymond Liu, Abdulaziz Al-Humam, Huaping Liu

Image- or feature-matching-based indoor localization still faces many technical challenges. Image-tag-based schemes using pose estimation are accurate and robust, but they still cannot be deployed widely: their performance degrades significantly when the tag-camera distance is large, which requires densely distributed tags, and such systems are generally specific to particular tags and lenses. Also, lens distortion degrades performance appreciably and is difficult to correct, especially for wide-angle lenses. This article develops an image-tag-based indoor localization system using end-to-end learning to overcome these issues. It is a deep learning–based system that learns the mapping from the original tag image to the final 2D location directly from training examples through self-learned features. It achieves consistent performance even when the tag-camera distance is large or when the image has a low resolution. The mapping learned by the deep learning model factors in all kinds of distortion without requiring any distortion estimation. The tag design is based on shape features to make it robust to lighting changes. The system can be easily adapted to new lenses/cameras and/or new tags, facilitating easy and rapid deployment without requiring knowledge from domain experts. A drawback of general deep learning models is their high computational requirements; we discuss practical solutions to enable real-time applications of the proposed scheme even when it is running on a mobile or embedded device. The proposed scheme is evaluated via a set of experiments in a real setting, achieving positioning errors of less than 20 cm.
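
As a purely illustrative sketch of such an end-to-end regressor, the small PyTorch network below maps a grayscale tag image directly to a 2D location. The architecture, image size, and training step are assumptions, not the system described in the article.

    # Minimal image-to-location regressor sketch (assumed architecture),
    # not the paper's network.
    import torch
    import torch.nn as nn

    class TagLocalizer(nn.Module):
        def __init__(self):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            self.head = nn.Linear(64, 2)   # regress the 2D location (x, y)

        def forward(self, x):
            return self.head(self.backbone(x))

    model = TagLocalizer()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    # One training step on a dummy batch of tag images and 2D targets.
    imgs, targets = torch.randn(8, 1, 96, 96), torch.randn(8, 2)
    loss = nn.functional.mse_loss(model(imgs), targets)
    opt.zero_grad()
    loss.backward()
    opt.step()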


2021 ◽ Vol 40 (5) ◽ pp. 1-32
Author(s): Zhen Chen, Hsiao-Yu Chen, Danny M. Kaufman, Mélina Skouras, Etienne Vouga

We propose a new model and algorithm to capture the high-definition statics of thin shells via coarse meshes. This model predicts global, fine-scale wrinkling at frequencies much higher than the resolution of the coarse mesh; moreover, it is grounded in the geometric analysis of elasticity, and requires no manual guidance, no corpus of training examples, and no tuning of ad hoc parameters. We first approximate the coarse shape of the shell using tension field theory, in which material forces do not resist compression. We then augment this base mesh with wrinkles, parameterized by amplitude and phase fields that we solve for over the base mesh and that together characterize the geometry of the wrinkles. We validate our approach against both physical experiments and numerical simulations, and we show that our algorithm produces wrinkles qualitatively similar to those predicted by traditional shell solvers requiring orders of magnitude more degrees of freedom.


2021
Author(s): Alfonso Rojas-Domínguez, Ivvan Valdez, Manuel Ornelas-Rodríguez, Martín Carpio

Abstract. Fostered by technological and theoretical developments, deep neural networks have achieved great success in many applications, but their training by means of mini-batch stochastic gradient descent (SGD) can be very costly, due to the possibly tens of millions of parameters to be optimized and the large amounts of training examples that must be processed. This computational cost is exacerbated by the inefficiency of the uniform sampling method typically used by SGD to form the training mini-batches: since not all training examples are equally relevant for training, sampling them under a uniform distribution is far from optimal. A better strategy is to form the mini-batches by sampling the training examples under a distribution where the probability of being selected is proportional to the relevance of each individual example. This can be achieved through importance sampling (IS), which also minimizes the variance of the gradients w.r.t. the network parameters, further improving convergence. In this paper, an IS-based adaptive sampling method is studied that exploits side information to construct the required probability distribution. This method is modified to enable its application to deep neural networks, and the improved method is dubbed Regularized Adaptive Sampling (RAS). An experimental comparison of RAS against SGD and against another state-of-the-art sampling method, using deep convolutional networks for classification of the MNIST and CIFAR-10 datasets, shows that RAS improves the training process without incurring significant overhead or affecting the accuracy of the networks.
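
The core IS mechanics can be sketched in a few lines. This is generic loss-proportional importance sampling, not RAS itself (which additionally exploits side information and regularization); the helper name and the stand-in losses are illustrative.

    # Generic importance-sampled mini-batch sketch (assumed names),
    # not the RAS algorithm.
    import numpy as np

    def sample_minibatch(per_example_loss, batch_size, rng):
        # Selection probability proportional to each example's current loss.
        p = per_example_loss / per_example_loss.sum()
        idx = rng.choice(p.size, size=batch_size, replace=True, p=p)
        # Weights 1/(N * p_i) keep the mini-batch gradient estimate unbiased:
        # (1/B) * sum_b w_b * g_b estimates the full-data mean gradient.
        weights = 1.0 / (p.size * p[idx])
        return idx, weights

    rng = np.random.default_rng(0)
    losses = rng.random(1000)            # stand-in per-example losses
    idx, w = sample_minibatch(losses, 32, rng)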

