When Does Label Propagation Fail? A View from a Network Generative Model

What kinds of data does Label Propagation (LP) work best on? Can we justify the solution of LP from a theoretical standpoint? LP is a semi-supervised learning algorithm that is widely used to predict unobserved node labels on a network (e.g., a user's gender on an SNS). Despite its importance, its theoretical properties remain mostly unexplored. In this paper, we answer the above questions by interpreting LP from a statistical viewpoint. As our main result, we identify the network generative model behind the discretized version of LP (DLP), and we show that under specific conditions the solution of DLP is equal to the maximum a posteriori estimate of that generative model. Our main result reveals the critical limitations of LP. Specifically, we discover that LP would not work best on networks with (1) disassortative node labels, (2) clusters having different edge densities, (3) non-uniform label distributions, or (4) unreliable provided node labels. Our experiments under a variety of settings support our theoretical results.
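As a rough illustration of the kind of algorithm the abstract discusses, here is a minimal sketch of textbook label propagation: seed labels are clamped, every other node repeatedly averages its neighbors' label scores, and the result is discretized by taking the highest-scoring label. The function name, the `adj`/`seeds` input format, and the iteration count are assumptions made for this example; this is not necessarily the exact DLP formulation analyzed in the paper.

```python
def label_propagation(adj, seeds, num_labels, iters=200):
    """Textbook label propagation (illustrative sketch, not the paper's DLP).

    adj:   dict mapping each node to a list of its neighbors (undirected graph)
    seeds: dict mapping observed nodes to their label index
    """
    # Observed nodes start one-hot; unobserved nodes start uniform.
    scores = {}
    for v in adj:
        if v in seeds:
            scores[v] = [1.0 if k == seeds[v] else 0.0 for k in range(num_labels)]
        else:
            scores[v] = [1.0 / num_labels] * num_labels

    for _ in range(iters):
        updated = {}
        for v, nbrs in adj.items():
            if v in seeds or not nbrs:
                updated[v] = scores[v]  # clamp seeds; leave isolated nodes alone
            else:
                # Each unobserved node takes the mean of its neighbors' scores.
                updated[v] = [
                    sum(scores[u][k] for u in nbrs) / len(nbrs)
                    for k in range(num_labels)
                ]
        scores = updated

    # Discretize: assign each node its highest-scoring label.
    return {v: max(range(num_labels), key=lambda k: s[k]) for v, s in scores.items()}


# Assortative toy graph: two triangles joined by the edge 2-3.
triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
assortative = label_propagation(triangles, {0: 0, 5: 1}, num_labels=2)

# Disassortative toy graph: a path whose true labels alternate 0, 1, 0, 1.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
disassortative = label_propagation(path, {0: 0, 3: 1}, num_labels=2)
```

On the two-triangle graph each triangle inherits the label of its seed. On the path, nodes 1 and 2 receive the labels of their nearest seeds, which is the opposite of their true alternating labels; this is a small instance of limitation (1) above.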

Classification and Recognition of Electronic Components Based on Improved Cooperative Semi-supervised Learning Algorithm

Recent Advances in Electrical & Electronic Engineering (Formerly Recent Patents on Electrical & Electronic Engineering) ◽

10.2174/2352096514666201224125653 ◽

2020 ◽

Vol 14 ◽

Author(s):

Dan Luo

Keyword(s):

Deep Learning ◽

Machine Vision ◽

Supervised Learning ◽

Image Recognition ◽

Production Efficiency ◽

Learning Algorithm ◽

Electronic Components ◽

Electron Device ◽

Actual Recognition ◽

The Subject

Background: As is well known, the semi-supervised algorithm is a classical algorithm in semi-supervised learning. Methods: This paper proposes an improved cooperative semi-supervised learning algorithm, presents the algorithm's process in detail, and applies it to predict labels for unlabeled electronic component images. Results: Experiments on the classification and recognition of electronic components show that the proposed algorithm significantly improves accuracy in electronic device image recognition and can be used in the actual recognition process. Conclusion: With the continuous development of science and technology, machine vision and deep learning will play a more important role in people's lives in the future. Research on this subject, based on identifying the number of components, is bound to develop toward high precision and multiple dimensions, which will greatly improve the production efficiency of the electronic components industry.

Hyperspectral image classification using semi-supervised learning with label propagation

2020 IEEE India Geoscience and Remote Sensing Symposium (InGARSS) ◽

10.1109/ingarss48198.2020.9358921 ◽

2020 ◽

Author(s):

Usha Patel ◽

Hardik Dave ◽

Vibha Patel

Keyword(s):

Image Classification ◽

Supervised Learning ◽

Hyperspectral Image ◽

Label Propagation ◽

Hyperspectral Image Classification

Discovering latent node Information by graph attention network

Scientific Reports ◽

10.1038/s41598-021-85826-x ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Weiwei Gu ◽

Fei Gao ◽

Xiaodan Lou ◽

Jiang Zhang

Keyword(s):

Supervised Learning ◽

Link Prediction ◽

Learning Algorithm ◽

Network Visualization ◽

Graph Structure ◽

High Quality ◽

Node Centrality ◽

Influential Nodes ◽

Node Classification ◽

Highly Cited

Abstract: In this paper, we propose graph attention based network representation (GANR), which utilizes the graph attention architecture and takes graph structure as the supervised learning information. Compared with node classification based representations, GANR can be used to learn representations for any given graph. GANR is not only capable of learning high-quality node representations that achieve competitive performance on link prediction, network visualization and node classification, but it can also extract meaningful attention weights that can be applied to the task of measuring node centrality. GANR can identify the leading venture capital investors, discover highly cited papers and find the most influential nodes in the Susceptible Infected Recovered model. We conclude that link structures in graphs are not limited to predicting linkage itself; they are capable of revealing latent node information in an unsupervised way once an appropriate learning algorithm, like GANR, is provided.

A Hybrid Method Based on Semi-Supervised Learning for Relation Extraction in Chinese EMRs (Preprint)

10.2196/preprints.28220 ◽

2021 ◽

Author(s):

ChunMing Yang

Keyword(s):

Supervised Learning ◽

Learning Algorithm ◽

Medical Knowledge ◽

Relation Extraction ◽

Small Scale ◽

Semantic Features ◽

Training Process ◽

Network Layers ◽

Relation Prediction ◽

The Cost

BACKGROUND Extracting relations between entities from Chinese electronic medical records (EMRs) is the key to automatically constructing medical knowledge graphs. Because labeled corpora are scarce, most current research is based on shallow networks, which cannot fully capture the complex semantic features in the text of Chinese EMRs. OBJECTIVE In this study, a hybrid deep learning method based on semi-supervised learning is proposed to extract entity relations from small-scale, complex Chinese EMRs. METHODS The semantic features of sentences are extracted by a residual network (ResNet), and long-range dependency information is captured by a bidirectional GRU (Gated Recurrent Unit). An attention mechanism is then used to assign weights to each set of extracted features, and the outputs of the two attention mechanisms are integrated for relation prediction. We adjusted the training process with a manually annotated small-scale relational corpus and a bootstrapping semi-supervised learning algorithm, and continuously expanded the datasets during the training process. RESULTS The experimental results show that the best F1-score of the proposed method over all relation categories reaches 89.78%, which is 13.07% higher than the baseline CNN model. The F1-scores on the 11 relation categories DAP, SAP, SNAP, TeRD, TeAP, TeCP, TeRS, TeAS, TrAD, TrRD and TrAP reach 80.95%, 93.91%, 92.96%, 88.43%, 86.54%, 85.58%, 87.96%, 94.74%, 93.01%, 87.58% and 95.48%, respectively. CONCLUSIONS The hybrid neural network method strengthens feature transfer and reuse between different network layers and reduces the cost of manually tagging relations. The results demonstrate that our proposed method is effective for relation extraction in Chinese EMRs.

An Auto-Adjustable Semi-Supervised Self-Training Algorithm

Algorithms ◽

10.3390/a11090139 ◽

2018 ◽

Vol 11 (9) ◽

pp. 139 ◽

Cited By ~ 5

Author(s):

Ioannis Livieris ◽

Andreas Kanavos ◽

Vassilis Tampakas ◽

Panagiotis Pintelas

Keyword(s):

Supervised Learning ◽

Predictive Models ◽

Learning Algorithm ◽

Learning Algorithms ◽

Classification Problem ◽

Classification Methods ◽

Training Algorithm ◽

Traditional Classification ◽

Supervised Learning Algorithms ◽

Significant Research

Semi-supervised learning algorithms have become a topic of significant research as an alternative to traditional classification methods which exhibit remarkable performance over labeled data but lack the ability to be applied on large amounts of unlabeled data. In this work, we propose a new semi-supervised learning algorithm that dynamically selects the most promising learner for a classification problem from a pool of classifiers based on a self-training philosophy. Our experimental results illustrate that the proposed algorithm outperforms its component semi-supervised learning algorithms in terms of accuracy, leading to more efficient, stable and robust predictive models.
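The general idea of self-training with a pool of learners can be sketched as follows. This is only a toy illustration under stated assumptions, not the authors' algorithm: the two learners (`nearest_centroid`, `one_nn`), the leave-one-out selection rule, the confidence formula, and the `threshold` parameter are all hypothetical choices made for this example, and the data are scalar features with at least two labeled points.

```python
def nearest_centroid(train):
    """Fit on [(x, label)] with scalar x; return predict(x) -> (label, confidence)."""
    sums = {}
    for x, y in train:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    centroids = {y: total / count for y, (total, count) in sums.items()}

    def predict(x):
        ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
        best = abs(x - centroids[ranked[0]])
        if len(ranked) == 1:
            return ranked[0], 1.0
        second = abs(x - centroids[ranked[1]])
        # Confidence rises as the runner-up centroid gets relatively farther away.
        return ranked[0], second / (best + second + 1e-12)
    return predict


def one_nn(train):
    """1-nearest-neighbor with the same (label, confidence) interface."""
    def predict(x):
        nx, ny = min(train, key=lambda p: abs(x - p[0]))
        rivals = [abs(x - px) for px, py in train if py != ny]
        if not rivals:
            return ny, 1.0
        d = abs(x - nx)
        return ny, min(rivals) / (d + min(rivals) + 1e-12)
    return predict


def loo_accuracy(fit, data):
    """Leave-one-out accuracy of a learner on the current labeled set."""
    hits = 0
    for i, (x, y) in enumerate(data):
        model = fit(data[:i] + data[i + 1:])
        hits += model(x)[0] == y
    return hits / len(data)


def self_train(labeled, unlabeled, pool, threshold=0.8, max_rounds=10):
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        # Pick the learner that currently looks most promising.
        name = max(pool, key=lambda n: loo_accuracy(pool[n], labeled))
        model = pool[name](labeled)
        scored = [(x,) + model(x) for x in unlabeled]
        # Pseudo-label only the confident predictions.
        newly = [(x, y) for x, y, c in scored if c >= threshold]
        if not newly:
            break
        labeled += newly
        unlabeled = [x for x, _, c in scored if c < threshold]
    name = max(pool, key=lambda n: loo_accuracy(pool[n], labeled))
    return name, pool[name](labeled)


pool = {"centroid": nearest_centroid, "1nn": one_nn}
labeled = [(0.0, "a"), (1.0, "a"), (9.0, "b"), (10.0, "b")]
unlabeled = [0.5, 2.0, 8.0, 9.5, 5.2]
chosen, model = self_train(labeled, unlabeled, pool)
```

In this toy run the ambiguous point 5.2 never reaches the confidence threshold, so it is left unlabeled, while the four clear-cut points are absorbed into the training set before the final learner is chosen.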

Relation extraction using label propagation based semi-supervised learning

10.3115/1220175.1220192 ◽

2006 ◽

Cited By ~ 17

Author(s):

Jinxiu Chen ◽

Donghong Ji ◽

Chew Lim Tan ◽

Zhengyu Niu

Keyword(s):

Supervised Learning ◽

Relation Extraction ◽

Label Propagation

Bayesian Poroelastic Aquifer Characterization From InSAR Surface Deformation Data. Part I: Maximum A Posteriori Estimate

Water Resources Research ◽

10.1029/2020wr027391 ◽

2020 ◽

Vol 56 (10) ◽

Cited By ~ 1

Author(s):

Amal Alghamdi ◽

Marc A. Hesse ◽

Jingyi Chen ◽

Omar Ghattas

Keyword(s):

Surface Deformation ◽

Maximum A Posteriori ◽

A Posteriori ◽

Aquifer Characterization ◽

Posteriori Estimate ◽

A Posteriori Estimate ◽

Maximum A Posteriori Estimate

Design of Computer-Aided Course Teaching Control System Based on Supervised Learning Algorithm

Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering - e-Learning, e-Education, and Online Training ◽

10.1007/978-3-030-63955-6_11 ◽

2020 ◽

pp. 115-126

Author(s):

Chun-rong Zhou ◽

Ji-yin Zhou

Keyword(s):

Control System ◽

Supervised Learning ◽

Learning Algorithm ◽

Computer Aided

XSS Attack Detection Model Based on Semi-supervised Learning Algorithm with Weighted Neighbor Purity

Ad-Hoc, Mobile, and Wireless Networks - Lecture Notes in Computer Science ◽

10.1007/978-3-030-61746-2_15 ◽

2020 ◽

pp. 198-213

Author(s):

Xinran Li ◽

Wenxing Ma ◽

Zan Zhou ◽

Changqiao Xu

Keyword(s):

Supervised Learning ◽

Learning Algorithm ◽

Attack Detection ◽

Detection Model ◽

Model Based

Supervised Learning Applied to Graduation Forecast of Industrial Engineering Students

European Journal of Educational Research ◽

10.12973/eu-jer.11.1.325 ◽

2022 ◽

Vol 11 (1) ◽

pp. 325-337

Author(s):

Natalia Gil ◽

Marcelo Albuquerque ◽

Gabriela de

Keyword(s):

Machine Learning ◽

High School ◽

Logistic Regression ◽

Supervised Learning ◽

Grade Point Average ◽

Engineering Students ◽

Learning Algorithm ◽

Industrial Engineering ◽

Machine Learning Algorithm ◽

Grade Point

The article aims to develop a machine-learning algorithm that can predict students' graduation in the Industrial Engineering course at the Federal University of Amazonas based on their performance data. The methodology uses an information package of 364 students admitted between 2007 and 2019, considering characteristics that can directly or indirectly affect each student's graduation: type of high school, number of semesters taken, grade-point average, lockouts, dropouts and course terminations. The data treatment involved the manual removal of several characteristics that did not add value to the output of the algorithm, resulting in a package composed of 2184 instances. The logistic regression, MLP and XGBoost models that were developed and compared could thus predict a binary output of graduation or non-graduation for each student, using 30% of the dataset for testing and 70% for training. This made it possible to identify a relationship between the six attributes explored and to achieve, with the best model, 94.15% accuracy in its predictions.
