Using Supervised Learning to Guide the Selection of Software Inspectors in Industry

Author(s):  
Maninder Singh ◽  
Gursimran Singh Walia ◽  
Anurag Goswami
2014 ◽  
Vol 24 (38) ◽  
pp. 97
Author(s):  
Antonio Rico-Sulayes

<p align="justify">This article proposes the architecture for a system that uses previously learned weights to sort query results from unstructured databases when building specialized dictionaries. A common resource in dictionary construction, unstructured databases have been especially useful in providing information about lexical item frequencies and examples of use. However, when building specialized dictionaries, whose selection of lexical items does not rely on frequency, these databases are reduced to a simple source of examples. Even in this task, the information they provide may not be very useful when looking for specialized uses of lexical items that have various meanings and return very long lists of results. To address this problem, long lists of hits can be rescored with a supervised learning model that relies on previously helpful results. The allocation of a vast set of high-quality training data for this rescoring system is reported here. Finally, the architecture of such a system, an unprecedented tool in specialized lexicography, is proposed.</p>
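The rescoring idea in this abstract can be illustrated with a minimal sketch. It assumes a linear model whose weights were learned beforehand; the feature names (`domain_term_nearby`, `general_sense_cue`), the weight values, and the function `rescore_hits` are hypothetical and do not come from the article, which does not specify its model or features here.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, Dict[str, float]]  # (hit text, feature values)


def rescore_hits(hits: List[Hit], weights: Dict[str, float]) -> List[Hit]:
    """Sort hits by a weighted sum of their features, highest score first."""
    def score(features: Dict[str, float]) -> float:
        # Features without a learned weight contribute nothing to the score.
        return sum(weights.get(name, 0.0) * value
                   for name, value in features.items())

    return sorted(hits, key=lambda hit: score(hit[1]), reverse=True)


# Hypothetical weights, standing in for values learned from helpful results.
weights = {"domain_term_nearby": 2.0, "general_sense_cue": -1.5}

# Hypothetical query results with hand-assigned feature values.
hits = [
    ("hit A", {"domain_term_nearby": 0.0, "general_sense_cue": 1.0}),
    ("hit B", {"domain_term_nearby": 1.0, "general_sense_cue": 0.0}),
    ("hit C", {"domain_term_nearby": 1.0, "general_sense_cue": 1.0}),
]

ranked = rescore_hits(hits, weights)
print([text for text, _ in ranked])
```

In this toy setup, hits that look like specialized uses rise to the top of a long result list, so a lexicographer would read them first.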


2011 ◽  
Vol 21 (06) ◽  
pp. 505-525 ◽  
Author(s):  
BRUNO BARUQUE ◽  
EMILIO CORCHADO ◽  
HUJUN YIN

This paper presents a novel model for classifying and visualizing high-dimensional data by combining two enhancing techniques. The first is semi-supervised learning, an extension of supervised learning that incorporates unlabeled information into the learning process. The second is ensemble learning, which replicates the analysis and then applies a fusion mechanism that combines the results of the previously performed analyses in order to improve on the result of a single model. The proposed learning schema, termed S²-Ensemble, is applied to several unsupervised learning algorithms within the family of topology maps, such as Self-Organizing Maps and Neural Gas. This study also thoroughly examines the characteristics of these novel schemes by means of quality measures, which allow a complete analysis of the resulting classifiers from the perspective of the different ways in which these classifiers are used. The study conducts empirical evaluations and comparisons on various real-world datasets from the UCI repository, which exhibit different characteristics, so as to cover an extensive selection of situations in which the presented new algorithms can be applied.


2020 ◽  
Vol 187 ◽  
pp. 106496 ◽  
Author(s):  
Hans Ole Riddervold ◽  
Signe Riemer-Sørensen ◽  
Peter Szederjesi ◽  
Magnus Korpås

Informatics ◽  
2021 ◽  
Vol 8 (4) ◽  
pp. 68
Author(s):  
Mouhamadou Saliou Diallo ◽  
Sid Ahmed Mokeddem ◽  
Agnès Braud ◽  
Gabriel Frey ◽  
Nicolas Lachiche

Industry 4.0 is characterized by the availability of sensors to operate the so-called intelligent factory. Predictive maintenance, and failure prediction in particular, is important for cutting the costs associated with production breaks. We studied more than 40 publications on predictive maintenance. We point out that they focus on various machine learning algorithms rather than on the selection of suitable datasets. In fact, most publications consider a single, usually non-public, benchmark. More benchmarks are needed to design the proposed approaches and test their generality. This paper is the first to define the requirements for these benchmarks. It highlights that only two of the six publicly available benchmarks we found in the literature can be used for supervised learning. We also illustrate how such a benchmark can be used with deep learning to successfully train and evaluate a failure prediction model. We raise several perspectives for research.


2012 ◽  
Vol 81A (9) ◽  
pp. 743-754 ◽  
Author(s):  
Kaustav Nandy ◽  
Prabhakar R. Gudla ◽  
Ryan Amundsen ◽  
Karen J. Meaburn ◽  
Tom Misteli ◽  
...  

2013 ◽  
Vol 17 (3) ◽  
pp. 254-259 ◽  
Author(s):  
Ji-Ho Han ◽  
Eun-Ae Park ◽  
Dong-Chul Park ◽  
Yunsik Lee ◽  
Soo-Young Min

In software engineering, software maintenance is the process of correcting, updating, and improving software products after they are handed over to the customer. Through offshore software maintenance outsourcing (OSMO), clients can gain advantages such as reduced cost, time savings, and improved quality. In most cases, the OSMO vendor generates considerable revenue. However, selecting an appropriate proposal among multiple clients is one of the critical problems for OSMO vendors. The purpose of this paper is to suggest an effective machine learning technique that OSMO vendors can use to assess or predict the OSMO client's proposal. The dataset is generated through a survey of OSMO vendors working in a developing country. The results showed that supervised learning-based classifiers such as Naïve Bayesian, SMO, and Logistic achieved testing accuracies of 69.75%, 81.81%, and 87.27%, respectively. This study concludes that supervised learning is the most suitable technique to predict the OSMO client's proposal.


Author(s):  
Joshua Eckroth ◽  
Eric Schoen

This paper describes the genetic algorithm used to select news stories about artificial intelligence for AAAI’s weekly AIAlert, emailed to nearly 11,000 subscribers. Each week, about 1,500 news stories covering various aspects of artificial intelligence and machine learning are discovered by i2k Connect’s NewsFinder agent. Our challenge is to select just 10 stories from this collection that represent the important news about AI. Since stories and topics do not necessarily repeat in later weeks, we cannot use click tracking and supervised learning to predict which stories or topics are most preferred by readers. Instead, we must build a representative selection of stories a priori, using information about each story’s topics, content, publisher, date of publication, and other features. This paper describes a genetic algorithm that achieves this task. We demonstrate its effectiveness by comparing several engagement metrics from six months of “A/B testing” experiments that compare random story selection vs. a simple scoring algorithm vs. our new genetic algorithm.
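As a rough illustration of the approach described above, the following sketch uses a generic genetic algorithm to pick `k` items from a larger pool. It is not the authors' algorithm: the fitness function (summed relevance plus a bonus for distinct topics), the mutation and crossover operators, and all names and parameter values are assumptions made for this example.

```python
import random


def fitness(selection, stories):
    """Hypothetical objective: total relevance plus one point per distinct topic."""
    relevance = sum(stories[i]["score"] for i in selection)
    topics = {stories[i]["topic"] for i in selection}
    return relevance + len(topics)


def crossover(a, b, k, rng):
    """Draw k story indices from the union of two parent selections."""
    return frozenset(rng.sample(sorted(a | b), k))


def mutate(selection, n_stories, rng):
    """Swap one selected story for a randomly chosen unselected one."""
    child = set(selection)
    child.remove(rng.choice(sorted(child)))
    candidates = [i for i in range(n_stories) if i not in child]
    child.add(rng.choice(candidates))
    return frozenset(child)


def select_stories(stories, k, generations=50, pop_size=20, seed=0):
    """Evolve a population of k-story selections and return the fittest one."""
    rng = random.Random(seed)
    n = len(stories)
    population = [frozenset(rng.sample(range(n), k)) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half unchanged (elitism) and refill with children.
        population.sort(key=lambda s: fitness(s, stories), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, k, rng), n, rng))
        population = parents + children
    return max(population, key=lambda s: fitness(s, stories))


# Synthetic stories; scores and topics are made up for the example.
stories = [{"score": (i * 7) % 10, "topic": i % 5} for i in range(30)]
picked = select_stories(stories, k=5)
print(sorted(picked))
```

Because the fitter half of each generation is carried over unchanged, the best fitness in the population never decreases from one generation to the next.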


2021 ◽  
Vol 12 ◽  
Author(s):  
Deniz Akdemir ◽  
Simon Rio ◽  
Julio Isidro y Sánchez

A major barrier to the wider use of supervised learning in emerging applications, such as genomic selection, is the lack of sufficient and representative labeled data to train prediction models. The amount and quality of labeled training data in many applications are usually limited, so careful selection of the training examples to be labeled can improve accuracy in predictive learning tasks. In this paper, we present an R package, TrainSel, which provides flexible, efficient, and easy-to-use tools for the selection of training populations (STP). We illustrate its use, performance, and potential in four different supervised learning applications within and outside of the plant breeding area.


Author(s):  
Mengchen Liu ◽  
Liu Jiang ◽  
Junlin Liu ◽  
Xiting Wang ◽  
Jun Zhu ◽  
...  

Although several effective learning-from-crowd methods have been developed to infer correct labels from noisy crowdsourced labels, a method for post-processed expert validation is still needed. This paper introduces a semi-supervised learning algorithm that is capable of selecting the most informative instances and maximizing the influence of expert labels. Specifically, we have developed a complete uncertainty assessment to facilitate the selection of the most informative instances. The expert labels are then propagated to similar instances via regularized Bayesian inference. Experiments on both real-world and simulated datasets indicate that, given a specific accuracy goal (e.g., 95%), our method reduces expert effort by 39% to 60% compared with the state-of-the-art method.
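The idea of sending the most uncertain instances to an expert can be sketched with a much simpler stand-in than the paper's uncertainty assessment: the entropy of each instance's crowd labels. The functions `label_entropy` and `most_uncertain` and the sample data are hypothetical, and the label propagation step is not covered.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the crowd labels given to one instance."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def most_uncertain(crowd_labels, m):
    """Return the ids of the m instances whose crowd labels disagree the most."""
    ranked = sorted(crowd_labels,
                    key=lambda key: label_entropy(crowd_labels[key]),
                    reverse=True)
    return ranked[:m]


# Made-up crowd labels for three instances.
crowd_labels = {
    "img1": ["cat", "cat", "cat", "cat"],   # full agreement
    "img2": ["cat", "dog", "cat", "dog"],   # even split
    "img3": ["cat", "cat", "cat", "dog"],   # mild disagreement
}

to_validate = most_uncertain(crowd_labels, 2)
print(to_validate)
```

Instances on which workers agree completely have zero entropy and are left for last, so expert time goes to the contested labels first.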

