supervised learning Latest Research Papers

Combining Self-supervised Learning and Active Learning for Disfluency Detection

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3487290 ◽

2022 ◽

Vol 21 (3) ◽

pp. 1-25

Author(s):

Shaolei Wang ◽

Zhongyuan Wang ◽

Wanxiang Che ◽

Sendong Zhao ◽

Ting Liu

Keyword(s):

Neural Network ◽

Active Learning ◽

Supervised Learning ◽

Large Scale ◽

Training Data ◽

Fine Tuning ◽

Training Dataset ◽

Performance Gap ◽

Annotation Costs ◽

Trained Neural Network

Spoken language differs fundamentally from written language in that it contains frequent disfluencies, or parts of an utterance that are corrected by the speaker. Disfluency detection (removing these disfluencies) is desirable to clean the input for use in downstream NLP tasks. Most existing approaches to disfluency detection rely heavily on human-annotated data, which is scarce and expensive to obtain in practice. To tackle this training data bottleneck, we investigate methods for combining self-supervised learning and active learning for disfluency detection. First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled data and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added noisy words and (ii) a sentence classification task to distinguish original sentences from grammatically incorrect sentences. We then combine these two tasks to jointly pre-train a neural network, which is subsequently fine-tuned on human-annotated disfluency detection training data. The self-supervised learning method can capture task-specific knowledge for disfluency detection and achieves better performance than other supervised methods when fine-tuned on a small annotated dataset. However, because the pseudo training data are generated with simple heuristics and cannot fully cover all disfluency patterns, a performance gap remains relative to supervised models trained on the full training dataset. We further explore how to bridge this gap by integrating active learning into the fine-tuning process. Active learning strives to reduce annotation costs by choosing the most critical examples to label, and can address the weakness of self-supervised learning with a small annotated dataset. We show that by combining self-supervised learning with active learning, our model is able to match state-of-the-art performance with just about 10% of the original training data on both the commonly used English Switchboard test set and a set of in-house annotated Chinese data.
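The pseudo-data construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `make_pseudo_example`, the probabilities `p_add` and `p_del`, the filler vocabulary, and the choice to produce both labels from a single corruption pass are all assumptions made for illustration. The sketch inserts random words (tagged `"A"` for the tagging task) and drops random words, and reports whether the sentence was altered at all (the label for the sentence classification task).

```python
import random


def make_pseudo_example(tokens, vocab, rng, p_add=0.15, p_del=0.10):
    """Corrupt a fluent sentence and return (noisy_tokens, tags, changed).

    tags[i] is "A" if noisy_tokens[i] was inserted as noise, otherwise "O".
    changed is True if any word was added or deleted (hypothetical label
    for the original-vs-corrupted sentence classification task).
    """
    noisy, tags = [], []
    changed = False
    for tok in tokens:
        # Randomly insert a noise word before the current token.
        if rng.random() < p_add:
            noisy.append(rng.choice(vocab))
            tags.append("A")
            changed = True
        # Randomly delete the current token.
        if rng.random() < p_del:
            changed = True
            continue
        noisy.append(tok)
        tags.append("O")
    return noisy, tags, changed


if __name__ == "__main__":
    rng = random.Random(0)
    filler = ["uh", "well", "the", "to"]  # assumed noise vocabulary
    sentence = "i want a flight to boston".split()
    print(make_pseudo_example(sentence, filler, rng))
```

Because the labels come for free from the corruption step, any amount of unlabeled text can be turned into pre-training examples; only the later fine-tuning stage needs human annotation.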

Monitoring the propagation of mechanical discontinuity using data-driven causal discovery and supervised learning

Mechanical Systems and Signal Processing ◽

10.1016/j.ymssp.2021.108791 ◽

2022 ◽

Vol 170 ◽

pp. 108791

Author(s):

Rui Liu ◽

Siddharth Misra

Keyword(s):

Supervised Learning ◽

Data Driven ◽

Causal Discovery ◽

Using Data

S2OSC: A Holistic Semi-Supervised Approach for Open Set Classification

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3468675 ◽

2022 ◽

Vol 16 (2) ◽

pp. 1-27

Author(s):

Yang Yang ◽

Hongchen Wei ◽

Zhen-Qiang Sun ◽

Guang-Yu Li ◽

Yuanchun Zhou ◽

...

Keyword(s):

Supervised Learning ◽

Test Data ◽

State Of The Art ◽

Generative Models ◽

Streaming Data ◽

Classification Model ◽

Training Time ◽

Open Set ◽

Incremental Update ◽

Knowledge Memory

Open set classification (OSC) tackles the problem of determining whether data are in-class or out-of-class during inference, when only a set of in-class examples is provided at training time. Traditional OSC methods usually train discriminative or generative models with the available in-class data, and then use the pre-trained models to classify test data directly. However, these methods always suffer from the embedding confusion problem, i.e., some out-of-class instances are mixed with in-class ones of similar semantics, making them difficult to classify. To solve this problem, we incorporate semi-supervised learning to develop a novel OSC algorithm, S2OSC, which combines out-of-class instance filtering and model re-training in a transductive manner. In detail, given a pool of newly arriving test data, S2OSC first filters the most distinct out-of-class instances using the pre-trained model, and annotates them with a super-class. Then, S2OSC trains a holistic classification model by combining in-class and out-of-class labeled data with the remaining unlabeled test data in a semi-supervised paradigm. Furthermore, considering that data usually arrive in streaming form in real applications, we extend S2OSC into an incremental update framework (I-S2OSC), and adopt a knowledge memory regularization to mitigate the catastrophic forgetting problem in incremental update. Despite the simplicity of the proposed models, the experimental results show that S2OSC achieves state-of-the-art performance across a variety of OSC tasks, including an F1 of 85.4% on CIFAR-10 with only 300 pseudo-labels. We also demonstrate how S2OSC can be extended effectively to the incremental OSC setting with streaming data.
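The filtering step in this abstract can be sketched roughly as follows. The abstract does not state how S2OSC scores instances, so this example swaps in a simple stand-in: it assumes the pre-trained model outputs class probabilities and treats a low maximum probability as a sign of an out-of-class instance. The function names and the use of index `num_known_classes` as the super-class label are hypothetical.

```python
def filter_out_of_class(probs, k):
    """Return indices of the k instances with the lowest max class probability.

    probs: list of per-instance probability lists over the known classes.
    Assumption: low confidence on every known class suggests out-of-class.
    """
    scores = [max(p) for p in probs]
    order = sorted(range(len(probs)), key=lambda i: scores[i])
    return order[:k]


def build_training_pool(probs, k, num_known_classes):
    """Split test instances into pseudo-labeled and unlabeled sets.

    The k filtered instances receive one extra label (the hypothetical
    super-class, index num_known_classes); the rest stay unlabeled for
    the semi-supervised re-training stage.
    """
    picked = set(filter_out_of_class(probs, k))
    pseudo_labeled = [(i, num_known_classes) for i in sorted(picked)]
    unlabeled = [i for i in range(len(probs)) if i not in picked]
    return pseudo_labeled, unlabeled


if __name__ == "__main__":
    probs = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4], [0.99, 0.01]]
    print(build_training_pool(probs, k=2, num_known_classes=2))
```

The pseudo-labeled instances, the original in-class training data, and the unlabeled remainder would then feed a semi-supervised learner, which this sketch leaves out.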

Electrical Pulsed Infrared Thermography and supervised learning for PV cells defects detection

Solar Energy Materials and Solar Cells ◽

10.1016/j.solmat.2021.111561 ◽

2022 ◽

Vol 237 ◽

pp. 111561

Author(s):

Chiwu Bu ◽

Tao Liu ◽

Rui Li ◽

Runhong Shen ◽

Bo Zhao ◽

...

Keyword(s):

Supervised Learning ◽

Infrared Thermography ◽

Pulsed Infrared Thermography

Perception model of surrounding rock geological conditions based on TBM operational big data and combined unsupervised-supervised learning

Tunnelling and Underground Space Technology ◽

10.1016/j.tust.2021.104285 ◽

2022 ◽

Vol 120 ◽

pp. 104285

Author(s):

Xin Yin ◽

Quansheng Liu ◽

Xing Huang ◽

Yucong Pan

Keyword(s):

Big Data ◽

Supervised Learning ◽

Surrounding Rock ◽

Geological Conditions ◽

Perception Model

Using supervised learning techniques to automatically classify vortex-induced vibration in long-span bridges

Journal of Wind Engineering and Industrial Aerodynamics ◽

10.1016/j.jweia.2022.104904 ◽

2022 ◽

Vol 221 ◽

pp. 104904

Author(s):

Jaeyeong Lim ◽

Sunjoong Kim ◽

Ho-Kyung Kim

Keyword(s):

Supervised Learning ◽

Vortex Induced Vibration ◽

Long Span Bridges ◽

Long Span ◽

Learning Techniques

Predicting bend-induced heterogeneity in sediment microbial communities by integrating bacteria-based index of biotic integrity and supervised learning algorithms

Journal of Environmental Management ◽

10.1016/j.jenvman.2021.114267 ◽

2022 ◽

Vol 304 ◽

pp. 114267

Author(s):

Wenlong Zhang ◽

Gang Yang ◽

Haolan Wang ◽

Yi Li ◽

Lihua Niu ◽

...

Keyword(s):

Microbial Communities ◽

Supervised Learning ◽

Learning Algorithms ◽

Index Of Biotic Integrity ◽

Biotic Integrity ◽

Supervised Learning Algorithms

Supervised Learning Applied to Graduation Forecast of Industrial Engineering Students

European Journal of Educational Research ◽

10.12973/eu-jer.11.1.325 ◽

2022 ◽

Vol 11 (1) ◽

pp. 325-337

Author(s):

Natalia Gil ◽

Marcelo Albuquerque ◽

Gabriela de

Keyword(s):

Machine Learning ◽

High School ◽

Logistic Regression ◽

Supervised Learning ◽

Grade Point Average ◽

Engineering Students ◽

Learning Algorithm ◽

Industrial Engineering ◽

Machine Learning Algorithm ◽

Grade Point

The article aims to develop a machine-learning algorithm that can predict students' graduation in the Industrial Engineering course at the Federal University of Amazonas based on their performance data. The methodology uses a dataset of 364 students admitted between 2007 and 2019, considering characteristics that can directly or indirectly affect each student's graduation, namely: type of high school, number of semesters taken, grade-point average, lockouts, dropouts and course terminations. The data treatment involved the manual removal of several characteristics that did not add value to the output of the algorithm, resulting in a dataset of 2184 instances. The logistic regression, MLP and XGBoost models developed and compared could then predict a binary output of graduation or non-graduation for each student, using 30% of the dataset for testing and 70% for training. This made it possible to identify a relationship between the six attributes explored and to achieve, with the best model, 94.15% accuracy in its predictions.
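The 70%/30% evaluation protocol mentioned in this abstract can be sketched with the standard library alone. This is only an illustration of the split and the accuracy metric, not the authors' pipeline: the helper names, the fixed seed, and the use of a plain shuffled split are assumptions, and no model is trained here.

```python
import random


def split_indices(n, test_fraction=0.3, seed=0):
    """Shuffle indices 0..n-1 and return (train_indices, test_indices)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(n * test_fraction)  # rounds down
    return idx[n_test:], idx[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true binary labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


if __name__ == "__main__":
    train, test = split_indices(2184)  # instance count reported in the abstract
    print(len(train), len(test))
```

Each candidate model would be fitted on the training indices and scored with `accuracy` on the held-out test indices, so that all models are compared on the same unseen data.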

Improving Semi-Supervised Learning for Remaining Useful Lifetime Estimation Through Self-Supervision

International Journal of Prognostics and Health Management ◽

10.36001/ijphm.2022.v13i1.3096 ◽

2022 ◽

Vol 13 (1) ◽

Author(s):

Tilman Krokotsch ◽

Mirko Knaak ◽

Clemens Gühmann

Keyword(s):

Supervised Learning ◽

End Of Life ◽

System Simulation ◽

Vital Role ◽

Data Driven ◽

Lifetime Estimation ◽

Realistic Evaluation ◽

Data Imbalance ◽

Using Data ◽

Useful Lifetime

Remaining useful lifetime (RUL) estimation plays a vital role in effectively scheduling maintenance operations. Unfortunately, it suffers from a severe data imbalance in which data from machines near their end of life is rare. Additionally, the data produced by a machine can only be labeled after the machine has failed. Both of these points make it difficult to use data-driven methods for RUL estimation. Semi-Supervised Learning (SSL) can incorporate the unlabeled data produced by machines that have not yet failed into data-driven methods. Previous work on SSL evaluated approaches under unrealistic conditions in which the data near failure was still available. Even so, only moderate improvements were made. This paper defines more realistic evaluation conditions and proposes a novel SSL approach based on self-supervised pre-training. The method can outperform two competing approaches from the literature and the supervised baseline on the NASA Commercial Modular Aero-Propulsion System Simulation dataset.

High-efficient low-cost characterization of materials properties using domain-knowledge-guided self-supervised learning

10.21203/rs.3.rs-1241474/v1 ◽

2022 ◽

Author(s):

Binglin Xie ◽

Xianhua Yao ◽

Weining Mao ◽

Mohammad Rafiei ◽

Nan Hu

Keyword(s):

Supervised Learning ◽

Domain Knowledge ◽

Low Cost ◽

Testing Procedure ◽

High Efficient ◽

Science Community ◽

Properties Of Materials ◽

Material Science Community ◽

Characterization Of Materials

Modern AI-assisted approaches have helped material scientists revolutionize their ability to understand the properties of materials. However, current machine learning (ML) models perform poorly for materials with a lengthy production window and a complex testing procedure, because only a limited amount of data can be produced to feed the model. Here, we introduce self-supervised learning (SSL) to address the lack of labeled data in material characterization. We propose a generalized SSL-based framework with domain knowledge and demonstrate its robustness in predicting the properties of a candidate material with the fewest data. Our numerical results show that the proposed SSL model can match the performance of the commonly used supervised learning (SL) model with only 5% of the data, and that the SSL model is also easy to implement. Our study paves the way to further expand the usability of ML tools for a broader material science community.
