Mutation Testing Framework for Machine Learning

Author(s):  
Raju Singh

This article or technical note is intended to provide insight into the testing of Machine Learning Systems (MLS): its evolution, current paradigm, and future work. Machine Learning models are used in critical applications such as healthcare [1], automobiles [2], [3], air traffic control, and share trading, and the failure of an ML model can lead to severe consequences in terms of loss of life or property. To remediate this, developers, scientists, and the ML community around the world must build a highly reliable test architecture for critical ML applications. At the foundation layer, any test model must satisfy core testing attributes such as test properties and their components. These attributes come from software engineering [5], [6], but they cannot be applied as-is to ML testing, and we explain why.

2020 ◽  
Vol 13 (5) ◽  
pp. 1020-1030
Author(s):  
Pradeep S. ◽  
Jagadish S. Kallimani

Background: With the advent of data analysis and machine learning, there is a growing impetus for analyzing and generating models on historic data. The data comes in numerous forms and shapes with an abundance of challenges. The most sought-after form of data for analysis is numerical data. With the plethora of algorithms and tools available, it is quite manageable to deal with such data. Another form of data is categorical in nature, which is subdivided into ordinal (order wise) and nominal (name wise). This data can be broadly classified as Sequential and Non-Sequential. Sequential data is easier to preprocess using algorithms. Objective: This paper addresses the challenge of applying machine learning algorithms to categorical data of non-sequential nature. Methods: Implementing several data analysis algorithms on such data yields a biased result, which makes it impossible to generate a reliable predictive model. In this paper, we address this problem by walking through a handful of techniques which, during our research, helped us in dealing with large categorical data of non-sequential nature. In subsequent sections, we discuss the possible implementable solutions and the shortfalls of these techniques. Results: The methods are applied to sample datasets available in the public domain, and the results with respect to accuracy of classification are satisfactory. Conclusion: The best pre-processing technique we observed in our research is one hot encoding, which breaks down the categorical features into binary features and feeds them into an algorithm to predict the outcome. The example that we took is not abstract but a real-time production services dataset, which had many complex variations of categorical features. Our future work includes creating a robust model on such data and deploying it into industry-standard applications.
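
As a rough illustration of the one hot encoding step named in the conclusion, the following minimal sketch turns a single nominal feature into binary indicator columns. The function name `one_hot_encode` and the sample values are hypothetical and are not taken from the paper's dataset; the paper's actual pipeline is not described in the abstract.

```python
def one_hot_encode(values):
    """Map each categorical value to a binary indicator vector.

    Returns the sorted list of categories and one row per input value,
    where exactly one position in each row is set to 1.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded


# Hypothetical nominal feature (illustrative values only).
categories, encoded = one_hot_encode(["disk", "network", "disk", "cpu"])
```

Each column corresponds to one category, so no artificial ordering is imposed on the nominal values, at the cost of one extra feature per distinct category.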


2021 ◽  
Vol 11 (11) ◽  
pp. 5057
Author(s):  
Wan-Yu Yu ◽  
Xiao-Qiang Huang ◽  
Hung-Yi Luo ◽  
Von-Wun Soo ◽  
Yung-Lung Lee

Autonomous vehicle technology has recently developed rapidly across a wide variety of applications. However, coordinating a team of autonomous vehicles to complete missions in an unknown and changing environment remains a challenging and complicated task. We modify the consensus-based auction algorithm (CBAA) so that it can dynamically reallocate tasks among autonomous vehicles, which can flexibly find paths to reach multiple dynamic targets while avoiding unexpected obstacles and staying as close together as a group as possible. We propose the core algorithms and simulate many scenarios empirically to illustrate how the proposed framework works. Specifically, we show how autonomous vehicles can reallocate tasks among each other while finding dynamically changing paths as certain targets appear and disappear during the mission. We also discuss some challenging problems as future work.


Author(s):  
Dhiraj J. Pangal ◽  
Guillaume Kugener ◽  
Shane Shahrestani ◽  
Frank Attenello ◽  
Gabriel Zada ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3616
Author(s):  
Jan Ubbo van Baardewijk ◽  
Sarthak Agarwal ◽  
Alex S. Cornelissen ◽  
Marloes J. A. Joosen ◽  
Jiska Kentrop ◽  
...  

Early detection of exposure to a toxic chemical, e.g., in a military context, can be life-saving. We propose to use machine learning techniques and multiple continuously measured physiological signals to detect exposure, and to identify the chemical agent. Such detection and identification could be used to alert individuals to take appropriate medical counter measures in time. As a first step, we evaluated whether exposure to an opioid (fentanyl) or a nerve agent (VX) could be detected in freely moving guinea pigs using features from respiration, electrocardiography (ECG) and electroencephalography (EEG), where machine learning models were trained and tested on different sets (across subject classification). Results showed this to be possible with close to perfect accuracy, where respiratory features were most relevant. Exposure detection accuracy rose steeply to over 95% correct during the first five minutes after exposure. Additional models were trained to correctly classify an exposed state as being induced either by fentanyl or VX. This was possible with an accuracy of almost 95%, where EEG features proved to be most relevant. Exposure detection models that were trained on subsets of animals generalized to subsets of animals that were exposed to other dosages of different chemicals. While future work is required to validate the principle in other species and to assess the robustness of the approach under different, realistic circumstances, our results indicate that utilizing different continuously measured physiological signals for early detection and identification of toxic agents is promising.
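
The across-subject evaluation mentioned above (training and testing on different sets of animals) can be sketched as a split keyed on subject identity, so that no animal contributes samples to both sets. This is only an illustrative sketch: the function name `split_by_subject`, the record layout, the subject identifiers, and the feature values are all hypothetical and do not come from the study.

```python
def split_by_subject(samples, test_subjects):
    """Split samples so that held-out subjects appear only in the test set."""
    held_out = set(test_subjects)
    train = [s for s in samples if s["subject"] not in held_out]
    test = [s for s in samples if s["subject"] in held_out]
    return train, test


# Hypothetical records: one dict per time window of physiological features.
samples = [
    {"subject": "gp01", "features": [0.2, 1.1], "exposed": 0},
    {"subject": "gp01", "features": [0.9, 0.4], "exposed": 1},
    {"subject": "gp02", "features": [0.3, 1.0], "exposed": 0},
    {"subject": "gp03", "features": [0.8, 0.5], "exposed": 1},
]

train, test = split_by_subject(samples, test_subjects=["gp03"])
```

Splitting by subject rather than by sample avoids a model scoring well simply because it has already seen other windows from the same animal.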


Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 39
Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the following three problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved by enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
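
To make the idea of a similarity graph over a batch of intermediate representations more concrete, here is a minimal sketch. The abstract does not specify the similarity measure or how edges are selected, so the use of cosine similarity and the k-nearest-neighbor sparsification below are assumptions, as are the function names and the toy batch.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_graph(batch, k):
    """Build a symmetric weighted adjacency matrix over a batch.

    Each node is one intermediate representation; each node keeps edges to
    its k most similar neighbors (assumed sparsification, not from the paper).
    """
    n = len(batch)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbors = [
            (cosine_similarity(batch[i], batch[j]), j)
            for j in range(n)
            if j != i
        ]
        neighbors.sort(reverse=True)
        for similarity, j in neighbors[:k]:
            adjacency[i][j] = similarity
            adjacency[j][i] = similarity  # keep the graph undirected
    return adjacency


# Toy batch of three 2-dimensional latent vectors (illustrative values only).
batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
graph = similarity_graph(batch, k=1)
```

A graph like this could be built at each layer for the same batch, which is what would allow geometries to be compared between a teacher and a student, or between consecutive latent spaces.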


2001 ◽  
Vol 10 (04) ◽  
pp. 613-637 ◽  
Author(s):  
M. M. WEST ◽  
T. L. McCLUSKEY

In this paper we describe a project (IMPRESS) in which machine learning (ML) tools were created and utilised for the validation of an Air Traffic Control domain theory written in first order logic. During the project, novel techniques were devised for the automated revision of general clause form theories using training examples. These techniques were combined in an algorithm which focused in on the parts of a theory which involve ordinal sorts, and applied geometrical revision operators to repair faulty component parts. While we illustrate the feasibility of applying ML to this area, we conclude that to be effective it must be focused to the application at hand, and used in mixed-initiative mode within a tools environment. The method is illustrated with experimental results obtained during the project.


2021 ◽  
Vol 2083 (4) ◽  
pp. 042086
Author(s):  
Yuqi Qin

Machine learning algorithms are the core of artificial intelligence and the fundamental way to make computers intelligent, with applications in all fields of artificial intelligence. Aiming at the problems of existing algorithms in the discrete manufacturing industry, this paper proposes a new 0-1 coding method to optimize the learning algorithm, and finally proposes a learning algorithm of “IG type learning only from the best”.

