Active Selection of Label Data for Semi-Supervised Learning Algorithm

2013 ◽  
Vol 17 (3) ◽  
pp. 254-259 ◽  
Author(s):  
Ji-Ho Han ◽  
Eun-Ae Park ◽  
Dong-Chul Park ◽  
Yunsik Lee ◽  
Soo-Young Min
Author(s):  
Mengchen Liu ◽  
Liu Jiang ◽  
Junlin Liu ◽  
Xiting Wang ◽  
Jun Zhu ◽  
...  

Although several effective learning-from-crowd methods have been developed to infer correct labels from noisy crowdsourced labels, a method for post-processed expert validation is still needed. This paper introduces a semi-supervised learning algorithm that is capable of selecting the most informative instances and maximizing the influence of expert labels. Specifically, we have developed a complete uncertainty assessment to facilitate the selection of the most informative instances. The expert labels are then propagated to similar instances via regularized Bayesian inference. Experiments on both real-world and simulated datasets indicate that, given a specific accuracy goal (e.g., 95%), our method reduces expert effort by 39% to 60% compared with the state-of-the-art method.
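The selection step described above can be illustrated with a minimal sketch. It assumes plain Shannon entropy as the uncertainty score, which is only a stand-in: the abstract does not specify what its "complete uncertainty assessment" consists of. The function names and the posterior values are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def most_uncertain(posteriors, k=1):
    """Return indices of the k instances whose label posterior has the highest entropy."""
    ranked = sorted(
        range(len(posteriors)),
        key=lambda i: entropy(posteriors[i]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical aggregated crowd posteriors over two classes, one row per instance.
posteriors = [
    [0.95, 0.05],  # crowd is nearly unanimous
    [0.55, 0.45],  # crowd is split, so an expert label is most valuable here
    [0.80, 0.20],
]

# Instances to send to the expert first (assumption: one instance per round).
to_validate = most_uncertain(posteriors, k=1)
```

Entropy is largest when the posterior is closest to uniform, so the split instance is ranked first. Propagating the expert label to similar instances is not shown here.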


Author(s):  
Dan Luo

Background: The semi-supervised algorithm is a classical algorithm in semi-supervised learning. Methods: This paper proposes an improved cooperative semi-supervised learning algorithm, presents the algorithm process in detail, and applies it to predict labels for unlabeled images of electronic components. Results: Experiments on the classification and recognition of electronic components show that the proposed algorithm significantly improves the accuracy of electronic device image recognition and can be used in the actual recognition process. Conclusion: With the continuous development of science and technology, machine vision and deep learning will play a more important role in people's lives in the future. Research on identifying the number of components is bound to develop towards high precision and multiple dimensions, which will greatly improve the production efficiency of the electronic components industry.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Weiwei Gu ◽  
Fei Gao ◽  
Xiaodan Lou ◽  
Jiang Zhang

Abstract: In this paper, we propose graph attention based network representation (GANR), which utilizes the graph attention architecture and takes graph structure as the supervised learning information. Compared with node classification based representations, GANR can be used to learn representations for any given graph. GANR is not only capable of learning high-quality node representations that achieve competitive performance on link prediction, network visualization and node classification, but it can also extract meaningful attention weights that can be applied to node centrality measurement tasks. GANR can identify the leading venture capital investors, discover highly cited papers and find the most influential nodes in the Susceptible-Infected-Recovered model. We conclude that link structures in graphs are not limited to predicting linkage itself; they are capable of revealing latent node information in an unsupervised way once an appropriate learning algorithm, like GANR, is provided.


2021 ◽  
Vol 11 (9) ◽  
pp. 3836
Author(s):  
Valeri Gitis ◽  
Alexander Derendyaev ◽  
Konstantin Petrov ◽  
Eugene Yurkov ◽  
Sergey Pirogov ◽  
...  

Prostate cancer (PCa) is the second most frequent malignancy (after lung cancer). Preoperative staging of PCa is the basis for the selection of adequate treatment tactics. In particular, an urgent problem is the classification of indolent and aggressive forms of PCa in patients with the initial stages of the tumor process. To solve this problem, we propose to use a new binary classification machine-learning method. The proposed method of monotonic functions uses a model in which the disease’s form is determined by the severity of the patient’s condition. It is assumed that the smaller the deviation of the indicators from the normal values inherent in healthy people, the milder the patient’s condition. This assumption means that the severity (form) of the disease can be represented by monotonic functions of the deviation of the patient’s indicators beyond the normal range. The method is used to solve the problem of classifying patients with indolent and aggressive forms of prostate cancer according to pretreatment data. The learning algorithm is nonparametric. At the same time, it allows an explanation of the classification results in the form of a logical function. To obtain this, the user specifies to the algorithm either the threshold value of the probability of successful classification of patients with an indolent form of PCa, or the threshold value of the probability of misclassification of patients with an aggressive form of PCa. The examples of logical rules given in the article show that they are quite simple and can be easily interpreted in terms of preoperative indicators of the form of the disease.
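The monotonicity assumption can be made concrete with a small sketch. This is not the article's algorithm: the rule form, the normal ranges, the thresholds and the indicator values below are all hypothetical and serve only to show what "monotonic in the deviation beyond the normal range" means.

```python
def deviation_beyond_range(value, low, high):
    """Distance by which value falls outside [low, high]; 0.0 inside the range."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0

def is_aggressive(deviations, thresholds):
    """Hypothetical monotone logical rule: flag the case if any deviation
    exceeds its threshold. Increasing a deviation can only switch the
    output from False to True, never the other way around."""
    return any(d > t for d, t in zip(deviations, thresholds))

# Hypothetical normal ranges and thresholds for two unnamed indicators.
normal_ranges = [(0.0, 4.0), (10.0, 20.0)]
thresholds = [2.0, 5.0]

def classify(indicators):
    """Map raw indicator values to deviations, then apply the monotone rule."""
    deviations = [
        deviation_beyond_range(v, low, high)
        for v, (low, high) in zip(indicators, normal_ranges)
    ]
    return is_aggressive(deviations, thresholds)

patient_a = classify([6.5, 15.0])  # deviations [2.5, 0.0]
patient_b = classify([4.5, 22.0])  # deviations [0.5, 2.0]
```

A rule of this shape reads directly as a logical statement over indicators, which is the kind of interpretability the abstract describes.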


2021 ◽  
Author(s):  
ChunMing Yang

BACKGROUND Extracting relations between entities from Chinese electronic medical records (EMRs) is the key to automatically constructing medical knowledge graphs. Because little labeled corpus data is available, most current research is based on shallow networks, which cannot fully capture the complex semantic features in the text of Chinese EMRs. OBJECTIVE In this study, a hybrid deep learning method based on semi-supervised learning is proposed to extract entity relations from small-scale, complex Chinese EMRs. METHODS The semantic features of sentences are extracted by a residual network (ResNet), and long-range dependency information is captured by a bidirectional gated recurrent unit (GRU). An attention mechanism is then applied to each set of extracted features to assign weights, and the outputs of the two attention mechanisms are integrated for relation prediction. We adjusted the training process using a small, manually annotated relational corpus and a bootstrapping semi-supervised learning algorithm, and continuously expanded the dataset during training. RESULTS The experimental results show that the best F1-score of the proposed method over all relation categories reaches 89.78%, which is 13.07% higher than the baseline CNN model. The F1-scores on the 11 relation categories DAP, SAP, SNAP, TeRD, TeAP, TeCP, TeRS, TeAS, TrAD, TrRD and TrAP reach 80.95%, 93.91%, 92.96%, 88.43%, 86.54%, 85.58%, 87.96%, 94.74%, 93.01%, 87.58% and 95.48%, respectively. CONCLUSIONS The hybrid neural network method strengthens feature transfer and reuse between different network layers and reduces the cost of manually tagging relations. The results demonstrate that our proposed method is effective for relation extraction in Chinese EMRs.


Algorithms ◽  
2018 ◽  
Vol 11 (9) ◽  
pp. 139 ◽  
Author(s):  
Ioannis Livieris ◽  
Andreas Kanavos ◽  
Vassilis Tampakas ◽  
Panagiotis Pintelas

Semi-supervised learning algorithms have become a topic of significant research as an alternative to traditional classification methods which exhibit remarkable performance over labeled data but lack the ability to be applied on large amounts of unlabeled data. In this work, we propose a new semi-supervised learning algorithm that dynamically selects the most promising learner for a classification problem from a pool of classifiers based on a self-training philosophy. Our experimental results illustrate that the proposed algorithm outperforms its component semi-supervised learning algorithms in terms of accuracy, leading to more efficient, stable and robust predictive models.
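The self-training philosophy mentioned above can be sketched as follows. This is a minimal illustration with a single one-dimensional nearest-centroid classifier. It does not show the dynamic selection among a pool of classifiers that the abstract proposes, and the confidence measure (`min_margin`) and the data are assumptions made for the example.

```python
def centroids(labeled):
    """Mean feature value per class, from (x, y) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict_with_margin(cents, x):
    """Return (label, margin): the nearest centroid's label and the gap
    between the nearest and second-nearest centroid distances."""
    dists = sorted((abs(x - c), y) for y, c in cents.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin

def self_train(labeled, unlabeled, min_margin=1.0, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled points and retrain."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)
        confident, rest = [], []
        for x in unlabeled:
            y, margin = predict_with_margin(cents, x)
            (confident if margin >= min_margin else rest).append((x, y))
        if not confident:
            break  # nothing confident enough to add; stop early
        labeled.extend(confident)
        unlabeled = [x for x, _ in rest]
    return centroids(labeled), unlabeled

# Hypothetical data: two labeled points and four unlabeled ones.
final_centroids, remaining = self_train(
    labeled=[(0.0, "a"), (10.0, "b")],
    unlabeled=[1.0, 9.0, 5.2, 2.0],
)
```

The point near the class boundary (5.2) never clears the confidence margin, so it stays unlabeled rather than risk reinforcing a wrong pseudo-label.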


2019 ◽  
Vol 46 (1) ◽  
pp. 1 ◽  
Author(s):  
Hiroyuki Shimono ◽  
Graham Farquhar ◽  
Matthew Brookhouse ◽  
Florian A. Busch ◽  
Anthony O'Grady ◽  
...  

Elevated atmospheric CO2 concentration (e[CO2]) can stimulate the photosynthesis and productivity of C3 species including food and forest crops. Intraspecific variation in responsiveness to e[CO2] can be exploited to increase productivity under e[CO2]. However, active selection of genotypes to increase productivity under e[CO2] is rarely performed across a wide range of germplasm, because of constraints of space and the cost of CO2 fumigation facilities. If we are to capitalise on recent advances in whole genome sequencing, approaches are required to help overcome these issues of space and cost. Here, we discuss the advantage of applying prescreening as a tool in large genome×e[CO2] experiments, where a surrogate for e[CO2] was used to select cultivars for more detailed analysis under e[CO2] conditions. We discuss why phenotypic prescreening in population-wide screening for e[CO2] responsiveness is necessary, what approaches could be used for prescreening for e[CO2] responsiveness, and how the data can be used to improve genetic selection of high-performing cultivars. We do this within the framework of understanding the strengths and limitations of genotype–phenotype mapping.


2010 ◽  
Vol 22 (12) ◽  
pp. 3221-3235 ◽  
Author(s):  
Hongzhi Tong ◽  
Di-Rong Chen ◽  
Fenghong Yang

The selection of the penalty functional is critical for the performance of a regularized learning algorithm, and thus it deserves special attention. In this article, we present a least squares regression algorithm based on lp-coefficient regularization. Compared with classical regularized least squares regression, the new algorithm differs in the regularization term. Our primary focus is on the error analysis of the algorithm. An explicit learning rate is derived under some ordinary assumptions.
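To make the difference in the regularization term concrete, one common way to write a coefficient-regularized least squares scheme is given below. The notation (sample $\{(x_i, y_i)\}_{i=1}^m$, kernel $K$, coefficients $\alpha$, parameter $\lambda > 0$) is an assumption made for illustration; the article's exact formulation and scaling may differ.

```latex
% Hypothesis expressed through its kernel expansion coefficients
f_{\alpha}(x) = \sum_{i=1}^{m} \alpha_i K(x, x_i)

% lp-coefficient regularization: the penalty acts on the coefficients themselves
\alpha^{z} = \arg\min_{\alpha \in \mathbb{R}^{m}}
  \left\{ \frac{1}{m} \sum_{i=1}^{m} \bigl( f_{\alpha}(x_i) - y_i \bigr)^{2}
  + \lambda \sum_{i=1}^{m} |\alpha_i|^{p} \right\}

% Classical regularized least squares, for comparison: the penalty is the RKHS norm
f^{z} = \arg\min_{f \in \mathcal{H}_K}
  \left\{ \frac{1}{m} \sum_{i=1}^{m} \bigl( f(x_i) - y_i \bigr)^{2}
  + \lambda \| f \|_{K}^{2} \right\}
```

The data-fit term is the same in both schemes; only the penalty changes, from the squared RKHS norm of $f$ to the $p$-th power sum of the expansion coefficients.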

