scholarly journals The structure of intrinsic complexity of learning

1997 ◽  
Vol 62 (4) ◽  
pp. 1187-1201 ◽  
Author(s):  
Sanjay Jain ◽  
Arun Sharma

AbstractLimiting identification of r.e. indexes for r.e. languages (from a presentation of elements of the language) and limiting identification of programs for computable functions (from a graph of the function) have served as models for investigating the boundaries of learnability. Recently, a new approach to the study of “intrinsic” complexity of identification in the limit has been proposed. This approach, instead of dealing with the resource requirements of the learning algorithm, uses the notion of reducibility from recursion theory to compare and to capture the intuitive difficulty of learning various classes of concepts. Freivalds, Kinber, and Smith have studied this approach for function identification and Jain and Sharma have studied it for language identification.The present paper explores the structure of these reducibilities in the context of language identification. It is shown that there is an infinite hierarchy of language classes that represent learning problems of increasing difficulty. It is also shown that the language classes in this hierarchy are incomparable, under the reductions introduced, to the collection of pattern languages.Richness of the structure of intrinsic complexity is demonstrated by proving that any finite, acyclic, directed graph can be embedded in the reducibility structure. However, it is also established that this structure is not dense. The question of embedding any infinite, acyclic, directed graph is open.

2021 ◽  
Author(s):  
Hui Wang ◽  
Lin Liu ◽  
Yan Song ◽  
Lei Fang ◽  
Ian McLoughlin ◽  
...  

2018 ◽  
Vol 29 (1) ◽  
pp. 1028-1042 ◽  
Author(s):  
Amina Dik ◽  
Khalid Jebari ◽  
Aziz Ettouhami

Abstract This paper presents a robust, dynamic, and unsupervised fuzzy learning algorithm (RDUFL) that aims to cluster a set of data samples with the ability to detect outliers and assign the numbers of clusters automatically. It consists of three main stages. The first (1) stage is a pre-processing method in which possible outliers are determined and quarantined using a concept of proximity degree. The second (2) stage is a learning method, which consists in auto-detecting the number of classes with their prototypes for a dynamic threshold. This threshold is automatically determined based on the similarity among the detected prototypes that are updated at the exploration of a new data. The last (3) stage treats quarantined samples detected from the first stage to determine whether they belong to some class defined in the second phase. The effectiveness of this method is assessed on eight real medical benchmark datasets in comparison to known unsupervised learning methods, namely, the fuzzy c-means (FCM), possibilistic c-means (PCM), and noise clustering (NC). The obtained accuracy of our scheme is very promising for unsupervised learning problems.


2014 ◽  
Vol 641-642 ◽  
pp. 1287-1290
Author(s):  
Lan Zhang ◽  
Yu Feng Nie ◽  
Zhen Hai Wang

Deep neural network as a part of deep learning algorithm is a state-of-the-art approach to find higher level representations of input data which has been introduced to many practical and challenging learning problems successfully. The primary goal of deep learning is to use large data to help solving a given task on machine learning. We propose an methodology for image de-noising project defined by this model and conduct training a large image database to get the experimental output. The result shows the robustness and efficient our our algorithm.


2019 ◽  
Vol 18 (1) ◽  
Author(s):  
Bin Li ◽  
Yuan Wen ◽  
Li Song ◽  
Weiguang Qu ◽  
Nianwen Xue

Abstract Meaning Representation (AMR) is a meaning representation framework in which the meaning of a full sentence is represented as a single-rooted, acyclic, directed graph. In this article, we describe an on-going project to build a Chinese AMR (CAMR) corpus, which currently includes 10,149 sentences from the newsgroup and weblog portion of the Chinese TreeBank (CTB). We describe the annotation specifications for the CAMR corpus, which follow the annotation principles of English AMR but make adaptations where needed to accommodate the linguistic facts of Chinese. The CAMR specifications also include a systematic treatment of sentence-internal discourse relations. One significant change we have made to the AMR annotation methodology is the inclusion of the alignment between word tokens in the sentence and the concepts/relations in the CAMR annotation to make it easier for automatic parsers to model the correspondence between a sentence and its meaning representation. We develop an annotation tool for CAMR, and the inter-agreement as measured by the Smatch score between the two annotators is 0.83, indicating reliable annotation. We also present some quantitative analysis of the CAMR corpus. 46.71% of the AMRs of the sentences are non-tree graphs. Moreover, the AMR of 88.95% of the sentences has concepts inferred from the context of the sentence but do not correspond to a specific word.


Author(s):  
Shuji Hao ◽  
Peilin Zhao ◽  
Yong Liu ◽  
Steven C. H. Hoi ◽  
Chunyan Miao

Relative similarity learning~(RSL) aims to learn similarity functions from data with relative constraints. Most previous algorithms developed for RSL are batch-based learning approaches which suffer from poor scalability when dealing with real-world data arriving sequentially. These methods are often designed to learn a single similarity function for a specific task. Therefore, they may be sub-optimal to solve multiple task learning problems. To overcome these limitations, we propose a scalable RSL framework named OMTRSL (Online Multi-Task Relative Similarity Learning). Specifically, we first develop a simple yet effective online learning algorithm for multi-task relative similarity learning. Then, we also propose an active learning algorithm to save the labeling cost. The proposed algorithms not only enjoy theoretical guarantee, but also show high efficacy and efficiency in extensive experiments on real-world datasets.


Author(s):  
Oghenejokpeme I. Orhobor ◽  
Larisa N. Soldatova ◽  
Ross D. King

Abstract Ensemble learning has been shown to significantly improve predictive accuracy in a variety of machine learning problems. For a given predictive task, the goal of ensemble learning is to improve predictive accuracy by combining the predictive power of multiple models. In this paper, we present an ensemble learning algorithm for regression problems which leverages the distribution of the samples in a learning set to achieve improved performance. We apply the proposed algorithm to a problem in precision medicine where the goal is to predict drug perturbation effects on genes in cancer cell lines. The proposed approach significantly outperforms the base case.


Sign in / Sign up

Export Citation Format

Share Document