Using Supervised Learning to Guide the Selection of Software Inspectors in Industry

<p align="justify">This article proposes the architecture for a system that uses previously learned weights to sort query results from unstructured databases when building specialized dictionaries. A common resource in dictionary construction, unstructured databases have been especially useful for providing lexical item frequencies and examples of use. However, when building specialized dictionaries, whose selection of lexical items does not rely on frequency, these databases serve only as a source of examples. Even for this task, the information they provide may not be very useful when the search targets specialized uses of lexical items that have several meanings and return very long lists of results. To address this problem, long lists of hits can be rescored with a supervised learning model that relies on previously helpful results. The allocation of a vast set of high-quality training data for this rescoring system is reported here. Finally, the architecture of such a system, an unprecedented tool in specialized lexicography, is proposed.</p>
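The rescoring idea in this abstract can be illustrated with a minimal sketch. It assumes a linear scoring model; the feature names, the weight values, and the `rescore` helper are hypothetical and do not come from the article, which does not specify its model or features here.

```python
from typing import Dict, List, Tuple

# Hypothetical weights, standing in for values a supervised model learned
# from hits that lexicographers previously marked as helpful.
LEARNED_WEIGHTS: Dict[str, float] = {
    "domain_term_nearby": 2.0,
    "from_specialized_source": 1.5,
    "sentence_too_long": -0.5,
}


def score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear score: weighted sum of the feature values of one hit."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rescore(hits: List[Tuple[str, Dict[str, float]]],
            weights: Dict[str, float]) -> List[str]:
    """Return hit texts ordered from highest to lowest score."""
    ranked = sorted(hits, key=lambda hit: score(hit[1], weights), reverse=True)
    return [text for text, _ in ranked]


# Toy query results: (hit text, feature values). All values are made up.
hits = [
    ("general use of the word", {"sentence_too_long": 1.0}),
    ("specialized use in a manual",
     {"domain_term_nearby": 1.0, "from_specialized_source": 1.0}),
    ("borderline example",
     {"domain_term_nearby": 1.0, "sentence_too_long": 1.0}),
]
ranked = rescore(hits, LEARNED_WEIGHTS)
```

With these made-up weights, the hit from a specialized source moves to the top of the list and the generic hit drops to the bottom, which is the behavior the abstract asks of a rescoring step.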


THE S2-ENSEMBLE FUSION ALGORITHM

International Journal of Neural Systems ◽

10.1142/s0129065711003012 ◽

2011 ◽

Vol 21 (06) ◽

pp. 505-525 ◽

Cited By ~ 11

Author(s):

BRUNO BARUQUE ◽

EMILIO CORCHADO ◽

HUJUN YIN

Keyword(s):

Supervised Learning ◽

Complete Analysis ◽

Fusion Algorithm ◽

Self Organizing Maps ◽

The Family ◽

Real World Datasets ◽

Different Characteristics ◽

Novel Model ◽

New Algorithms ◽

Selection Of

This paper presents a novel model for classifying and visualizing high-dimensional data by combining two enhancing techniques. The first is semi-supervised learning, an extension of supervised learning that incorporates unlabeled information into the learning process. The second is ensemble learning, which replicates the analysis and then applies a fusion mechanism that combines the previously performed analyses to improve on the result of a single model. The proposed learning schema, termed S2-Ensemble, is applied to several unsupervised learning algorithms within the family of topology maps, such as Self-Organizing Maps and Neural Gas. This study also thoroughly examines the characteristics of these novel schemes by means of quality measures, which allow a complete analysis of the resulting classifiers from the perspective of the different ways in which they are used. The study conducts empirical evaluations and comparisons on various real-world datasets from the UCI repository, which exhibit different characteristics, so as to cover an extensive selection of situations in which the new algorithms can be applied.
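As a rough illustration of what combining the results of several models can look like, the sketch below fuses the class labels predicted by an ensemble through plain majority voting. This is a deliberately simpler technique than the map-level fusion the paper describes, and the function name and toy labels are assumptions made for the example.

```python
from collections import Counter
from typing import List, Sequence


def fuse_by_majority(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model label lists into one label per sample by majority vote.

    predictions[m][i] is the label model m assigned to sample i.
    Ties go to the label that the earliest model in the list voted for.
    """
    fused = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        best = max(counts.values())
        # Scan votes in model order so tie-breaking is deterministic.
        fused.append(next(label for label in votes if counts[label] == best))
    return fused


# Toy predictions from three ensemble members on three samples.
ensemble_predictions = [
    ["a", "b", "b"],  # model 1
    ["a", "a", "b"],  # model 2
    ["b", "a", "b"],  # model 3
]
fused = fuse_by_majority(ensemble_predictions)
```

Each sample ends up with the label most members agree on, so a single model's mistake on a sample is outvoted when the other members are correct.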


A supervised learning approach for optimal selection of bidding strategies in reservoir hydro

Electric Power Systems Research ◽

10.1016/j.epsr.2020.106496 ◽

2020 ◽

Vol 187 ◽

pp. 106496 ◽

Cited By ~ 1

Author(s):

Hans Ole Riddervold ◽

Signe Riemer-Sørensen ◽

Peter Szederjesi ◽

Magnus Korpås

Keyword(s):

Supervised Learning ◽

Learning Approach ◽

Optimal Selection ◽

Bidding Strategies ◽

Selection Of


Identifying Benchmarks for Failure Prediction in Industry 4.0

Informatics ◽

10.3390/informatics8040068 ◽

2021 ◽

Vol 8 (4) ◽

pp. 68

Author(s):

Mouhamadou Saliou Diallo ◽

Sid Ahmed Mokeddem ◽

Agnès Braud ◽

Gabriel Frey ◽

Nicolas Lachiche

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Prediction Model ◽

Supervised Learning ◽

Industry 4.0 ◽

Learning Algorithms ◽

Failure Prediction ◽

Machine Learning Algorithms ◽

Predictive Maintenance ◽

Selection Of

Industry 4.0 is characterized by the availability of sensors to operate the so-called intelligent factory. Predictive maintenance, and failure prediction in particular, is important for cutting the costs associated with production breaks. We studied more than 40 publications on predictive maintenance. We point out that they focus on various machine learning algorithms rather than on the selection of suitable datasets. In fact, most publications consider a single, usually non-public, benchmark. More benchmarks are needed to design and test the generality of the proposed approaches. This paper is the first to define the requirements for these benchmarks. It highlights that only two of the six publicly available benchmarks we found in the literature can be used for supervised learning. We also illustrate how such a benchmark can be used with deep learning to successfully train and evaluate a failure prediction model. We raise several perspectives for research.


Automatic segmentation and supervised learning-based selection of nuclei in cancer tissue images

Cytometry Part A ◽

10.1002/cyto.a.22097 ◽

2012 ◽

Vol 81A (9) ◽

pp. 743-754 ◽

Cited By ~ 20

Author(s):

Kaustav Nandy ◽

Prabhakar R. Gudla ◽

Ryan Amundsen ◽

Karen J. Meaburn ◽

Tom Misteli ◽

...

Keyword(s):

Supervised Learning ◽

Automatic Segmentation ◽

Cancer Tissue ◽

Selection Of


Active Selection of Label Data for Semi-Supervised Learning Algorithm

Journal of IKEEE ◽

10.7471/ikeee.2013.17.3.254 ◽

2013 ◽

Vol 17 (3) ◽

pp. 254-259 ◽

Cited By ~ 1

Author(s):

Ji-Ho Han ◽

Eun-Ae Park ◽

Dong-Chul Park ◽

Yunsik Lee ◽

Soo-Young Min

Keyword(s):

Supervised Learning ◽

Learning Algorithm ◽

Label Data ◽

Active Selection ◽

Selection Of


Offshore Software Maintenance Outsourcing: Predicting Client’s Proposal using Supervised Learning

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/151012021 ◽

2021 ◽

Vol 10 (1) ◽

pp. 106-113

Keyword(s):

Supervised Learning ◽

Software Maintenance ◽

Engineering Software ◽

Software Products ◽

Maintenance Outsourcing ◽

Critical Problems ◽

Testing Accuracy ◽

Suitable Technique ◽

Multiple Clients ◽

Selection Of

In software engineering, software maintenance is the process of correcting, updating, and improving software products after they are handed over to the customer. Through offshore software maintenance outsourcing (OSMO), clients can gain advantages such as reduced cost, saved time, and improved quality. In most cases, the OSMO vendor generates considerable revenue. However, selecting an appropriate proposal among multiple clients is one of the critical problems for OSMO vendors. The purpose of this paper is to suggest an effective machine learning technique that OSMO vendors can use to assess or predict the OSMO client’s proposal. The dataset is generated through a survey of OSMO vendors working in a developing country. The results showed that supervised learning-based classifiers such as Naïve Bayesian, SMO, and Logistic achieved 69.75%, 81.81%, and 87.27% testing accuracy, respectively. This study concludes that supervised learning is the most suitable technique to predict the OSMO client’s proposal.


A Genetic Algorithm for Finding a Small and Diverse Set of Recent News Stories on a Given Subject: How We Generate AAAI’s AI-Alert

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33019357 ◽

2019 ◽

Vol 33 ◽

pp. 9357-9364

Author(s):

Joshua Eckroth ◽

Eric Schoen

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Genetic Algorithm ◽

Supervised Learning ◽

A Priori ◽

News Stories ◽

Scoring Algorithm ◽

Representative Selection ◽

Story Selection ◽

Selection Of

This paper describes the genetic algorithm used to select news stories about artificial intelligence for AAAI’s weekly AI-Alert, emailed to nearly 11,000 subscribers. Each week, about 1,500 news stories covering various aspects of artificial intelligence and machine learning are discovered by i2k Connect’s NewsFinder agent. Our challenge is to select just 10 stories from this collection that represent the important news about AI. Since stories and topics do not necessarily repeat in later weeks, we cannot use click tracking and supervised learning to predict which stories or topics are most preferred by readers. Instead, we must build a representative selection of stories a priori, using information about each story’s topics, content, publisher, date of publication, and other features. This paper describes a genetic algorithm that achieves this task. We demonstrate its effectiveness by comparing several engagement metrics from six months of “A/B testing” experiments that compare random story selection vs. a simple scoring algorithm vs. our new genetic algorithm.
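To make the idea of selecting a small, diverse subset with a genetic algorithm concrete, here is a minimal sketch. The fitness function (summed relevance plus one point per distinct topic), the operators, and all parameter values are assumptions chosen for illustration; the paper's actual encoding, features, and fitness are not given in this abstract.

```python
import random
from typing import List, Sequence, Tuple

Story = Tuple[float, str]  # (relevance score, topic); both hypothetical features


def fitness(subset: Sequence[int], stories: Sequence[Story]) -> float:
    """Total relevance plus a bonus of 1 for each distinct topic covered."""
    relevance = sum(stories[i][0] for i in subset)
    distinct_topics = len({stories[i][1] for i in subset})
    return relevance + distinct_topics


def mutate(subset: List[int], n: int, rng: random.Random) -> List[int]:
    """Swap one selected story for a story that is not yet selected."""
    child = list(subset)
    outside = [i for i in range(n) if i not in child]
    child[rng.randrange(len(child))] = rng.choice(outside)
    return child


def crossover(a: List[int], b: List[int], k: int, rng: random.Random) -> List[int]:
    """Draw k distinct stories from the union of two parents."""
    return rng.sample(sorted(set(a) | set(b)), k)


def select_stories(stories: Sequence[Story], k: int, generations: int = 200,
                   pop_size: int = 30, seed: int = 0) -> List[int]:
    """Evolve index subsets of size k; requires len(stories) > k."""
    rng = random.Random(seed)
    n = len(stories)
    population = [rng.sample(range(n), k) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda s: fitness(s, stories), reverse=True)
        parents = population[: pop_size // 2]  # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, k, rng), n, rng))
        population = parents + children
    return sorted(max(population, key=lambda s: fitness(s, stories)))


# Toy candidate pool; scores and topics are made up.
stories = [(0.9, "robotics"), (0.8, "robotics"), (0.7, "ethics"),
           (0.6, "vision"), (0.2, "ethics"), (0.1, "vision")]
chosen = select_stories(stories, k=3)
```

Because the fitness rewards topic coverage as well as relevance, the search tends to prefer one strong story per topic over several strong stories on the same topic. A genetic algorithm is a heuristic, so the returned subset is not guaranteed to be optimal.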


TrainSel: An R Package for Selection of Training Populations

Frontiers in Genetics ◽

10.3389/fgene.2021.655287 ◽

2021 ◽

Vol 12 ◽

Author(s):

Deniz Akdemir ◽

Simon Rio ◽

Julio Isidro y Sánchez

Keyword(s):

Supervised Learning ◽

Prediction Models ◽

R Package ◽

Training Data ◽

Major Barrier ◽

Predictive Learning ◽

Learning Tasks ◽

Training Examples ◽

Selection Of

A major barrier to the wider use of supervised learning in emerging applications, such as genomic selection, is the lack of sufficient and representative labeled data to train prediction models. The amount and quality of labeled training data are usually limited in many applications, so careful selection of the training examples to be labeled can help improve accuracy in predictive learning tasks. In this paper, we present an R package, TrainSel, which provides flexible, efficient, and easy-to-use tools for the selection of training populations (STP). We illustrate its use, performance, and potential in four different supervised learning applications within and outside the plant breeding area.


Improving Learning-from-Crowds through Expert Validation

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/324 ◽

2017 ◽

Cited By ~ 8

Author(s):

Mengchen Liu ◽

Liu Jiang ◽

Junlin Liu ◽

Xiting Wang ◽

Jun Zhu ◽

...

Keyword(s):

Bayesian Inference ◽

Supervised Learning ◽

Real World ◽

Learning Algorithm ◽

State Of The Art ◽

Uncertainty Assessment ◽

Effective Learning ◽

Complete Uncertainty ◽

Expert Validation ◽

Selection Of

Although several effective learning-from-crowd methods have been developed to infer correct labels from noisy crowdsourced labels, a method for post-processed expert validation is still needed. This paper introduces a semi-supervised learning algorithm that is capable of selecting the most informative instances and maximizing the influence of expert labels. Specifically, we have developed a complete uncertainty assessment to facilitate the selection of the most informative instances. The expert labels are then propagated to similar instances via regularized Bayesian inference. Experiments on both real-world and simulated datasets indicate that given a specific accuracy goal (e.g., 95%) our method reduces expert effort from 39% to 60% compared with the state-of-the-art method.
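One common way to pick the most informative instances for expert review is to rank them by the uncertainty of their current label estimates. The sketch below uses plain Shannon entropy for that ranking; this is a stand-in, not the paper's complete uncertainty assessment, and the instance ids, probabilities, and `budget` parameter are hypothetical.

```python
import math
from typing import Dict, List


def entropy(probs: List[float]) -> float:
    """Shannon entropy (in bits) of a label distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def most_uncertain(label_probs: Dict[str, List[float]], budget: int) -> List[str]:
    """Return the ids of the `budget` instances with the highest entropy."""
    ranked = sorted(label_probs, key=lambda i: entropy(label_probs[i]), reverse=True)
    return ranked[:budget]


# Toy label distributions inferred from crowd labels; values are made up.
crowd_estimates = {
    "img_1": [0.98, 0.02],
    "img_2": [0.55, 0.45],
    "img_3": [0.80, 0.20],
}
to_validate = most_uncertain(crowd_estimates, budget=2)
```

Instances whose crowd labels nearly split evenly are sent to the expert first, while instances the crowd already agrees on are left alone, which is how a fixed expert budget is concentrated where it changes the most labels.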
