Mining Pareto-Optimal Counterfactual Antecedents With A Branch-And-Bound Model-Agnostic Algorithm

Author(s):  
Marcos M. Raimundo ◽  
Luis Gustavo Nonato ◽  
Jorge Poco

Abstract: Mining counterfactual antecedents has become a valuable tool for discovering knowledge and explaining machine learning models. It consists of generating synthetic samples from an original sample to achieve the desired outcome in a machine learning model, thus helping to understand the prediction. An insightful methodology would explore a broader set of counterfactual antecedents to reveal multiple possibilities while operating on any classifier. Thus, we create a tree-based search that requires monotonicity from the objective functions (a.k.a. cost functions); this allows pruning branches that will not improve the objective functions. Since monotonicity is only required for the objective function, this method can be used for any family of classifiers (e.g., linear models, neural networks, decision trees). However, additional classifier properties speed up the tree search when it foresees branches that will not result in feasible actions. Moreover, the proposed optimization generates a diverse set of Pareto-optimal counterfactual antecedents by relying on multi-objective concepts. The results show an algorithm with working guarantees that enumerates a wide range of counterfactual antecedents. It helps the decision-maker understand the machine learning decision and find alternatives to achieve the desired outcome. The user can inspect these multiple counterfactual antecedents to find the most suitable one and gain a broader understanding of the prediction.
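The multi-objective idea mentioned in the abstract can be illustrated with a small sketch of Pareto dominance. This is not the authors' branch-and-bound algorithm; it only shows how a set of candidate counterfactuals might be reduced to its non-dominated subset. The two cost objectives (number of features changed, total change magnitude) and the candidate values are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if cost vector a is no worse than b in every objective
    and strictly better in at least one (lower cost is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(costs: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the cost vectors that no other vector dominates."""
    return [c for c in costs if not any(dominates(o, c) for o in costs)]


# Hypothetical candidates: (number of features changed, total change magnitude)
candidates = [(1, 5.0), (2, 3.0), (3, 1.0), (2, 4.0), (3, 2.5)]
front = pareto_front(candidates)
print(front)
```

Here `(2, 4.0)` is dropped because `(2, 3.0)` changes the same number of features at lower magnitude, and `(3, 2.5)` is dropped because `(3, 1.0)` does the same. The remaining candidates represent different trade-offs that a user could inspect.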

2018 ◽  
Vol 211 ◽  
pp. 17009
Author(s):  
Natalia Espinoza Sepulveda ◽  
Jyoti Sinha

The development of technologies for the maintenance industry has taken on an important role in meeting demanding challenges. One important challenge is to predict defects, if any, in machines as early as possible in order to manage machine downtime. Vibration-based condition monitoring (VCM) is well known for this purpose but requires human experience and expertise. Machine learning models using intelligent systems and pattern recognition seem to be the future avenue for machine fault detection without human expertise. Several such studies are published in the literature. This paper also concerns a machine learning model for the classification and detection of different machine faults. Here, the time-domain and frequency-domain features derived from the measured machine vibration data are used separately in the development of machine learning models using the artificial neural network method. The effectiveness of the time-domain and frequency-domain feature-based models is compared when they are applied to an experimental rig. The paper presents the proposed machine learning models and their performance in terms of the observations and results.


2022 ◽  
Vol 9 (1) ◽  
Author(s):  
Marcos Fabietti ◽  
Mufti Mahmud ◽  
Ahmad Lotfi

Abstract: Acquisition of neuronal signals involves a wide range of devices with specific electrical properties. Combined with other physiological sources within the body, the signals sensed by the devices are often distorted. Sometimes these distortions are visually identifiable; other times, they overlap with the signal characteristics, making them very difficult to detect. To remove these distortions, the recordings are visually inspected and manually processed. However, this manual annotation process is time-consuming, and automatic computational methods are needed to identify and remove these artefacts. Most of the existing artefact removal approaches rely on additional information from other recorded channels and fail when global artefacts are present or the affected channels constitute the majority of the recording system. Addressing this issue, this paper reports a novel channel-independent machine learning model to accurately identify and replace the artefactual segments present in the signals. Discarding these artefactual segments, as existing approaches do, causes discontinuities in the reproduced signals, which may introduce errors in subsequent analyses. To avoid this, the proposed method predicts multiple values of the artefactual region using a long short-term memory network to recreate the temporal and spectral properties of the recorded signal. The method has been tested on two open-access data sets and incorporated into the open-access SANTIA (SigMate Advanced: a Novel Tool for Identification of Artefacts in Neuronal Signals) toolbox for community use.


Data is the most crucial component of a successful ML system. Once a machine learning model is developed, it becomes obsolete over time because new input data is generated every second. To keep our predictions accurate, we need a way to keep our models up to date. Our research involves finding a mechanism that can retrain the model with new data automatically. It also involves exploring the possibilities of automating machine learning processes. We started this project by training and testing our model using conventional machine learning methods. The outcome was then compared with the outcome of experiments conducted using AutoML methods such as TPOT. This helped us find an efficient technique to retrain our models. These techniques can be used in areas where people do not deal with the actual workings of an ML model but only require the outputs of ML processes.


An individual's way of living directly influences overall health. Stress is a significant affliction of the human body and is associated with conditions such as depression, heart attack, and mental illness. The WHO says, “Globally, more than 264 million people of all ages suffer from depression.” [8]. The report also says that most of the time people are stressed because of their work, and that 10.7% of people suffer from disorders involving stress, anxiety, and depression [8]. There are different methods for detecting stress, e.g., smart watches, chest belts, and specialized machines. Our principal objective is to detect stress in real time using smart watches through their sensors. Different kinds of sensor signals are available for detecting stress, such as PPG, GSR, HRV, ECG, and temperature. Smart watches collect a wide range of data through various sensors. The gathered data are fed to various machine learning methods, such as linear regression, SVM, KNN, and decision trees. Each technique yields a distinct accuracy; these are compared and the best machine learning model is chosen. This paper presents analyses that find and compare the accuracy obtained from the data of various sensors. It also checks whether using one sensor or multiple sensors, such as HRV, ECG, or GSR and PPG, gives a better accuracy score for stress detection.


2022 ◽  
pp. 220-249
Author(s):  
Md Ariful Haque ◽  
Sachin Shetty

Financial sectors are lucrative cyber-attack targets because of the immediate financial gain they offer. As a result, financial institutions face challenges in developing systems that can automatically identify security breaches and separate fraudulent transactions from legitimate transactions. Today, organizations widely use machine learning techniques to identify any fraudulent behavior in customers' transactions. However, applying machine learning techniques is often challenging because financial institutions' confidentiality policies prevent the sharing of customer transaction data. This chapter discusses some crucial challenges of handling cybersecurity and fraud in the financial industry and building machine learning-based models to address those challenges. The authors utilize an open-source e-commerce transaction dataset to illustrate the forensic processes by creating a machine learning model to classify fraudulent transactions. Overall, the chapter focuses on how machine learning models can help detect and prevent fraudulent activities in the financial sector in the age of cybersecurity.


2016 ◽  
Vol 7 (2) ◽  
pp. 43-71 ◽  
Author(s):  
Sangeeta Lal ◽  
Neetu Sardana ◽  
Ashish Sureka

Logging is an important yet tough decision for OSS developers. Machine-learning models are useful in improving several steps of OSS development, including logging. Several recent studies propose machine-learning models to predict logged code constructs. The prediction performance of these models is limited by the class-imbalance problem, since the number of logged code constructs is small compared to the number of non-logged code constructs. No previous study analyzes the class-imbalance problem for logged code construct prediction. The authors first analyze the performance of J48, RF, and SVM classifiers for predicting logged catch-block and if-block code constructs on imbalanced datasets. Second, the authors propose LogIm, an ensemble and threshold-based machine-learning model. Third, the authors evaluate the performance of LogIm on three open-source projects. On average, the LogIm model improves the performance of the baseline classifiers, J48, RF, and SVM, by 7.38%, 9.24%, and 4.6% for catch-block logging prediction, and by 12.11%, 14.95%, and 19.13% for if-block logging prediction.
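To make the phrase "ensemble and threshold-based" concrete, the following is a minimal sketch of one common way to combine the two ideas: average the minority-class probabilities from several classifiers, then lower the decision threshold so the rare class is predicted more often. It is not the LogIm implementation, whose details the abstract does not give. The function name, the probabilities, and the threshold of 0.35 are all hypothetical.

```python
from typing import List


def ensemble_predict(prob_lists: List[List[float]], threshold: float = 0.5) -> List[int]:
    """Average per-classifier probabilities of the rare class for each sample,
    then label a sample 1 when the averaged probability reaches the threshold."""
    n_models = len(prob_lists)
    averaged = [sum(sample_probs) / n_models for sample_probs in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]


# Hypothetical probabilities that each of four code blocks is "logged",
# one row per classifier (for example J48, RF, and SVM).
probs = [
    [0.10, 0.40, 0.70, 0.30],
    [0.20, 0.30, 0.90, 0.50],
    [0.00, 0.50, 0.80, 0.40],
]

default_labels = ensemble_predict(probs)        # conventional 0.5 threshold
lowered_labels = ensemble_predict(probs, 0.35)  # lower threshold favours the rare class
print(default_labels, lowered_labels)
```

With the default threshold only the third block is labelled as logged. Lowering the threshold also flags the second and fourth blocks, trading some precision for recall on the rare class.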


2020 ◽  
Vol 117 (19) ◽  
pp. 10492-10499 ◽  
Author(s):  
Zhan Ban ◽  
Peng Yuan ◽  
Fubo Yu ◽  
Ting Peng ◽  
Qixing Zhou ◽  
...  

Protein corona formation is critical for the design of ideal and safe nanoparticles (NPs) for nanomedicine, biosensing, organ targeting, and other applications, but methods to quantitatively predict the formation of the protein corona, especially for functional compositions, remain unavailable. The traditional linear regression model performs poorly for the protein corona, as measured by R² (less than 0.40). Here, the performance with R² over 0.75 in the prediction of the protein corona was achieved by integrating a machine learning model and meta-analysis. NPs without modification and surface modification were identified as the two most important factors determining protein corona formation. According to experimental verification, the functional protein compositions (e.g., immune proteins, complement proteins, and apolipoproteins) in complex coronas were precisely predicted with good R² (most over 0.80). Moreover, the method successfully predicted the cellular recognition (e.g., cellular uptake by macrophages and cytokine release) mediated by functional corona proteins. This workflow provides a method to accurately and quantitatively predict the functional composition of the protein corona that determines cellular recognition and nanotoxicity to guide the synthesis and applications of a wide range of NPs by overcoming limitations and uncertainty.
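For readers unfamiliar with the metric quoted above, the coefficient of determination can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses that standard definition; the function name and the sample values are illustrative and are not taken from the study.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total variation
    return 1.0 - ss_res / ss_tot


# Illustrative values only
observed = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(observed, [1.0, 2.0, 3.0, 4.0])   # predictions match exactly
baseline = r_squared(observed, [2.5, 2.5, 2.5, 2.5])  # always predict the mean
print(perfect, baseline)
```

A model that always predicts the mean scores 0 and a perfect model scores 1, so the reported values of about 0.40 and above 0.75 sit between those two reference points.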


Significance It required arguably the single largest computational effort for a machine learning model to date, and it is capable of producing text at times indistinguishable from the work of a human author. This has generated considerable excitement about potentially transformative business applications -- and concerns about the system's weaknesses and possible misuse. Impacts Stereotypes and biases in machine learning models will become increasingly problematic as they are adopted by businesses and governments. The use of flawed AI tools that result in embarrassing failures risks cuts to public funding for AI research. Academia and industry face pressure to advance research into explainable AI, but progress is slow.


Aerospace ◽  
2021 ◽  
Vol 8 (9) ◽  
pp. 236
Author(s):  
Junghyun Kim ◽  
Kyuman Lee

Obtaining reliable wind information is critical for efficiently managing air traffic and airport operations. Wind forecasting has been considered one of the most challenging tasks in the aviation industry. Recently, with the advent of artificial intelligence, many machine learning techniques have been widely used to address a variety of complex phenomena in wind predictions. In this paper, we propose a hybrid framework that combines a machine learning model with Kalman filtering for a wind nowcasting problem in the aviation industry. More specifically, this study has three objectives as follows: (1) compare the performance of the machine learning models (i.e., Gaussian process, multi-layer perceptron, and long short-term memory (LSTM) network) to identify the most appropriate model for wind predictions, (2) combine the machine learning model selected in step (1) with an unscented Kalman filter (UKF) to improve the fidelity of the model, and (3) perform Monte Carlo simulations to quantify uncertainties arising from the modeling process. Results show that short-term time-series wind datasets are best predicted by the LSTM network compared to the other machine learning models and the UKF-aided LSTM (UKF-LSTM) approach outperforms the LSTM network only, especially when long-term wind forecasting needs to be considered.
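The idea of correcting a learned forecast with a Kalman filter can be sketched with a single scalar measurement update. This is a plain linear Kalman update, not the unscented Kalman filter (UKF) used in the paper, and the forecast value stands in for the output of a model such as an LSTM. All numbers and names are hypothetical.

```python
from typing import Tuple


def kalman_update(x_pred: float, p_pred: float, z: float, r: float) -> Tuple[float, float]:
    """One scalar Kalman measurement update.

    x_pred: predicted state, e.g. a model's wind-speed forecast
    p_pred: variance (uncertainty) of that prediction
    z:      new measurement of the same quantity
    r:      variance of the measurement noise
    """
    gain = p_pred / (p_pred + r)             # how much to trust the measurement
    x_new = x_pred + gain * (z - x_pred)     # blend forecast and measurement
    p_new = (1.0 - gain) * p_pred            # uncertainty shrinks after the update
    return x_new, p_new


# Hypothetical forecast of 10 m/s (variance 4) and a sensor reading of 12 m/s (variance 4)
x_new, p_new = kalman_update(x_pred=10.0, p_pred=4.0, z=12.0, r=4.0)
print(x_new, p_new)
```

Because the forecast and the measurement are given equal variance here, the update lands halfway between them and halves the uncertainty. A UKF follows the same predict-and-correct pattern but propagates sample points through nonlinear models.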

