SELM: Software Engineering of Machine Learning Models

2021 ◽  
Author(s):  
Nafiseh Jafari ◽  
Mohammad Reza Besharati ◽  
Maryam Hourali

One of the pillars of any machine learning model is its concepts. Using software engineering, we can engineer these concepts and then develop and expand them. In this article, we present the SELM framework for Software Engineering of machine Learning Models and evaluate it through a case study. Using the SELM framework, we can improve the efficiency of a machine learning process and achieve higher learning accuracy with fewer hardware processing resources and a smaller training dataset. This result highlights the importance of an interdisciplinary approach to machine learning. Therefore, in this article, we also offer proposals for interdisciplinary machine learning teams.

2021 ◽  
Vol 13 (23) ◽  
pp. 4844
Author(s):  
Jisun Shin ◽  
Jong-Seok Lee ◽  
Lee-Hyun Jang ◽  
Jinwook Lim ◽  
Boo-Keun Khim ◽  
...  

A record-breaking agglomeration of Sargassum accumulated along the northern Jeju coast in Korea in 2021, and laborers struggled to remove it from the beach. If remote sensing can detect the locations at which Sargassum accumulates in a timely and accurate manner, it could be removed before it arrives, reducing the damage it causes. This study aims to detect the Sargassum distribution on the coast of Jeju Island using imagery from the Geostationary Ocean Color Imager-II (GOCI-II) aboard Geostationary KOMPSAT 2B (GK2B), which was launched in February 2020, with measurements available since October 2020. For this, we used GOCI-II imagery from the first 6 months and machine learning models including Fine Tree, a Fine Gaussian support vector machine (SVM), and Gentle adaptive boosting (GentleBoost). We trained the models with the GOCI-II Rayleigh-corrected reflectance (RhoC) image as input and a ground truth map extracted from high-resolution images as output. Qualitative and quantitative assessments were carried out using the three machine learning models and traditional methods such as Sargassum indexes. We found that GentleBoost showed fewer false positives (6.2%), a high F-measure (0.82), and a more appropriate Sargassum distribution compared to the other methods. The application of the machine learning model to GOCI-II images in various atmospheric conditions is therefore considered successful for mapping Sargassum extent quickly, reducing the effort laborers need to remove it.
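The quantitative assessment mentioned above rests on pixel-wise agreement between a predicted Sargassum mask and the ground truth map. The following is a minimal sketch of how an F-measure and a false positive rate can be computed from binary labels. The function names are hypothetical, and the abstract does not state how its 6.2% false positive figure is defined, so the rate used here (false positives over all true negatives plus false positives) is an assumption.

```python
def confusion_counts(truth, predicted):
    """Count (TP, FP, FN, TN) for two equal-length sequences of 0/1 pixel labels."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    tp = fp = fn = tn = 0
    for t, p in zip(truth, predicted):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (the F1 score)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def false_positive_rate(fp, tn):
    """Share of non-Sargassum pixels that were wrongly flagged (assumed definition)."""
    return fp / (fp + tn) if fp + tn else 0.0


# Toy example: 8 pixels, 1 = Sargassum, 0 = water.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
tp, fp, fn, tn = confusion_counts(truth, predicted)
print(f_measure(tp, fp, fn), false_positive_rate(fp, tn))
```

Reporting both numbers is useful because Sargassum pixels are typically rare: a detector can have a low false positive rate and still miss most of the target, which the F-measure exposes.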


2017 ◽  
Vol 218 ◽  
pp. 213-222 ◽  
Author(s):  
Xing Zhu ◽  
Qiang Xu ◽  
Minggao Tang ◽  
Wen Nie ◽  
Shuqi Ma ◽  
...  

2018 ◽  
Vol 211 ◽  
pp. 17009
Author(s):  
Natalia Espinoza Sepulveda ◽  
Jyoti Sinha

The development of technologies for the maintenance industry plays an important role in meeting its demanding challenges. One important challenge is to predict defects in machines, if any, as early as possible in order to manage machine downtime. Vibration-based condition monitoring (VCM) is well known for this purpose but requires human experience and expertise. Machine learning models using intelligent systems and pattern recognition seem to be the future avenue for machine fault detection without human expertise. Several such studies have been published in the literature. This paper also presents a machine learning model for the classification and detection of different machine faults. Here, time domain and frequency domain features derived from measured machine vibration data are used separately to develop machine learning models with the artificial neural network method. The effectiveness of the time domain and frequency domain feature-based models is compared when they are applied to an experimental rig. The paper presents the proposed machine learning models and their performance in terms of the observations and results.
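To make the distinction between time domain and frequency domain features concrete, here is a minimal sketch that derives both kinds from a vibration signal. The abstract does not list which features the authors used, so the choices below (RMS, peak, crest factor, kurtosis, and the dominant spectral frequency) are assumptions picked because they are common in condition monitoring, and all function names are hypothetical.

```python
import cmath
import math


def time_domain_features(signal):
    """Summary statistics computed directly on the sampled waveform."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    var = sum((x - mean) ** 2 for x in signal) / n
    fourth = sum((x - mean) ** 4 for x in signal) / n
    return {
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms if rms else 0.0,
        "kurtosis": fourth / var ** 2 if var else 0.0,
    }


def amplitude_spectrum(signal, sample_rate):
    """Naive O(n^2) DFT; returns (frequency_hz, amplitude) pairs up to Nyquist."""
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples")
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)
        )
        # Double the one-sided bins, except DC and (for even n) the Nyquist bin.
        scale = 1 if k == 0 or (n % 2 == 0 and k == n // 2) else 2
        spectrum.append((k * sample_rate / n, scale * abs(coeff) / n))
    return spectrum


def dominant_frequency(signal, sample_rate):
    """Frequency of the largest non-DC spectral line."""
    spectrum = amplitude_spectrum(signal, sample_rate)
    return max(spectrum[1:], key=lambda pair: pair[1])[0]


# Toy example: a 5 Hz sine sampled at 64 Hz for one second.
signal = [math.sin(2 * math.pi * 5 * i / 64) for i in range(64)]
print(time_domain_features(signal)["rms"], dominant_frequency(signal, 64))
```

Either feature set could then be fed to a classifier on its own, which mirrors the comparison described in the abstract. A real pipeline would use an FFT routine rather than this quadratic-time transform.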


Diagnostics ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 40
Author(s):  
Meike Nauta ◽  
Ricky Walsh ◽  
Adam Dubowski ◽  
Christin Seifert

Machine learning models have been successfully applied for analysis of skin images. However, due to the black box nature of such deep learning models, it is difficult to understand their underlying reasoning. This prevents a human from validating whether the model is right for the right reasons. Spurious correlations and other biases in data can cause a model to base its predictions on such artefacts rather than on the true relevant information. These learned shortcuts can in turn cause incorrect performance estimates and can result in unexpected outcomes when the model is applied in clinical practice. This study presents a method to detect and quantify this shortcut learning in trained classifiers for skin cancer diagnosis, since it is known that dermoscopy images can contain artefacts. Specifically, we train a standard VGG16-based skin cancer classifier on the public ISIC dataset, for which colour calibration charts (elliptical, coloured patches) occur only in benign images and not in malignant ones. Our methodology artificially inserts those patches and uses inpainting to automatically remove patches from images to assess the changes in predictions. We find that our standard classifier partly bases its predictions of benign images on the presence of such a coloured patch. More importantly, by artificially inserting coloured patches into malignant images, we show that shortcut learning results in a significant increase in misdiagnoses, making the classifier unreliable when used in clinical practice. With our results, we, therefore, want to increase awareness of the risks of using black box machine learning models trained on potentially biased datasets. Finally, we present a model-agnostic method to neutralise shortcut learning by removing the bias in the training dataset by exchanging coloured patches with benign skin tissue using image inpainting and re-training the classifier on this de-biased dataset.
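The core measurement described above, inserting a coloured patch and checking how often the prediction changes, can be sketched as follows. This is an illustration only: `toy_classifier` is a hypothetical stand-in that deliberately uses a shortcut, images are plain nested lists rather than dermoscopy photographs, and the abstract does not specify the exact metric the authors report, so `flip_rate` is an assumed one.

```python
def insert_patch(image, value=1.0, size=2):
    """Return a copy of a 2D image (list of rows) with a bright square in the top-left corner."""
    patched = [row[:] for row in image]
    for r in range(min(size, len(patched))):
        for c in range(min(size, len(patched[r]))):
            patched[r][c] = value
    return patched


def flip_rate(predict, images, perturb, from_label, to_label):
    """Fraction of images predicted as from_label that switch to to_label after perturb."""
    selected = [img for img in images if predict(img) == from_label]
    if not selected:
        return 0.0
    flipped = sum(1 for img in selected if predict(perturb(img)) == to_label)
    return flipped / len(selected)


def toy_classifier(image):
    """Hypothetical classifier that has learned 'bright corner means benign'."""
    if image[0][0] >= 0.9:
        return "benign"  # the shortcut: decided by the patch alone
    pixels = [p for row in image for p in row]
    return "malignant" if sum(pixels) / len(pixels) > 0.5 else "benign"


# Toy example: two 'malignant' images and one 'benign' image without patches.
images = [
    [[0.6, 0.6, 0.6] for _ in range(3)],
    [[0.7, 0.8, 0.7] for _ in range(3)],
    [[0.1, 0.1, 0.1] for _ in range(3)],
]
print(flip_rate(toy_classifier, images, insert_patch, "malignant", "benign"))
```

A flip rate near zero would suggest the classifier ignores the patch, while a high rate indicates that the patch, not the lesion, drives the prediction. The same function works with a patch-removal perturbation (for example, inpainting) by passing a different `perturb` callable.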


Data is the most crucial component of a successful ML system. Once a machine learning model is developed, it becomes obsolete over time because new input data is generated every second. To keep our predictions accurate, we need a way to keep our models up to date. Our research involves finding a mechanism that can retrain the model with new data automatically, and exploring the possibilities of automating machine learning processes. We started this project by training and testing our model using conventional machine learning methods. The outcome was then compared with the outcome of experiments conducted using AutoML methods such as TPOT. This helped us find an efficient technique to retrain our models. These techniques can be used in areas where people do not deal with the inner workings of an ML model but only require the outputs of ML processes.


2021 ◽  
Vol 263 (3) ◽  
pp. 3223-3234
Author(s):  
Merten Stender ◽  
Mathies Wedler ◽  
Norbert Hoffmann ◽  
Christian Adams

Machine learning (ML) techniques allow for finding hidden patterns and signatures in data. Currently, these methods are gaining increased interest in engineering in general and in vibroacoustics in particular. Although ML methods are successfully applied, it is hardly understood how these black box-type methods make their decisions. Explainable machine learning aims at overcoming this issue by deepening the understanding of the decision-making process through perturbation-based model diagnosis. This paper introduces machine learning methods and reviews recent techniques for explainability and interpretability. These methods are exemplified on sound absorption coefficient spectra of one sound absorbing foam material measured in an impedance tube. Variances of the absorption coefficient measurements as a function of the specimen thickness and the operator are modeled by univariate and multivariate machine learning models. In order to identify the driving patterns, i.e. how and in which frequency regime the measurements are affected by the setup specifications, Shapley additive explanations are derived for the ML models. It is demonstrated how explaining machine learning models can be used to discover and express complicated relations in experimental data, thereby paving the way to novel knowledge discovery strategies in evidence-based modeling.
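As background for the Shapley additive explanations mentioned above, the sketch below computes exact Shapley values for a small model by enumerating every feature coalition. It is not the approximation scheme of any particular library and not the authors' code: the function name is hypothetical, and "removing" a feature is modelled by replacing it with a single baseline value, which is one assumed convention among several.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each feature.

    Cost grows as 2^n in the number of features, so this suits only small n.
    """
    n = len(x)

    def value(present):
        # Features outside the coalition are set to their baseline value.
        mixed = [x[i] if i in present else baseline[i] for i in range(n)]
        return model(mixed)

    attributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Probability weight of a coalition of this size preceding feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {i}) - value(coalition))
        attributions.append(total)
    return attributions


# Toy example: two additive terms plus an interaction between features 0 and 2.
def toy_model(f):
    return 2 * f[0] + 3 * f[1] + f[0] * f[2]


print(shapley_values(toy_model, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))
```

In the toy model the interaction term is split evenly between the two features involved, and the attributions sum to the difference between the prediction and the baseline prediction. That additivity is what makes per-frequency or per-parameter contributions comparable across measurements.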

