Deployment of a Machine Learning System for Predicting Lawsuits Against Power Companies: Lessons Learned from an Agile Testing Experience for Improving Software Quality

Context: Code smells are symptoms, that something may be wrong in software systems that can cause complications in maintaining software quality. In literature, there exists many code smells and their identification is far from trivial. Thus, several techniques have also been proposed to automate code smell detection in order to improve software quality. Objective: This paper presents an up-to-date review of simple and hybrid machine learning based code smell detection techniques and tools. Methods: We collected all the relevant research published in this field till 2020. We extracted the data from those articles and classified them into two major categories. In addition, we compared the selected studies based on several aspects like, code smells, machine learning techniques, datasets, programming languages used by datasets, dataset size, evaluation approach, and statistical testing. Results: Majority of empirical studies have proposed machine- learning based code smell detection tools. Support vector machine and decision tree algorithms are frequently used by the researchers. Along with this, a major proportion of research is conducted on Open Source Softwares (OSS) such as, Xerces, Gantt Project and ArgoUml. Furthermore, researchers paid more attention towards Feature Envy and Long Method code smells. Conclusion: We identified several areas of open research like, need of code smell detection techniques using hybrid approaches, need of validation employing industrial datasets, etc.

Download Full-text

A Robust Automated Machine Learning System with Pseudoinverse Learning

Cognitive Computation ◽

10.1007/s12559-021-09853-6 ◽

2021 ◽

Author(s):

Ke Wang ◽

Ping Guo

Keyword(s):

Machine Learning ◽

Learning System ◽

Automated Machine Learning

Download Full-text

Paper2Wire – A Case Study of User-Centred Development of Machine Learning Tools for UX Designers

i-com ◽

10.1515/icom-2021-0002 ◽

2021 ◽

Vol 20 (1) ◽

pp. 19-32

Author(s):

Daniel Buschek ◽

Charlotte Anlauff ◽

Florian Lachner

Keyword(s):

Machine Learning ◽

Development Process ◽

User Study ◽

Concept Development ◽

Lessons Learned ◽

Design Tool ◽

Learning Tools ◽

Interface Elements ◽

Industry Partner

Abstract This paper reflects on a case study of a user-centred concept development process for a Machine Learning (ML) based design tool, conducted at an industry partner. The resulting concept uses ML to match graphical user interface elements in sketches on paper to their digital counterparts to create consistent wireframes. A user study (N=20) with a working prototype shows that this concept is preferred by designers, compared to the previous manual procedure. Reflecting on our process and findings we discuss lessons learned for developing ML tools that respect practitioners’ needs and practices.

Download Full-text

The graph neural networking challenge

ACM SIGCOMM Computer Communication Review ◽

10.1145/3477482.3477485 ◽

2021 ◽

Vol 51 (3) ◽

pp. 9-16

Author(s):

José Suárez-Varela ◽

Miquel Ferriol-Galmés ◽

Albert López ◽

Paul Almasan ◽

Guillermo Bernárdez ◽

...

Keyword(s):

Machine Learning ◽

Computer Networks ◽

Real World ◽

Large Scale ◽

Lessons Learned ◽

Educational Resources ◽

Global Competition ◽

International Telecommunication Union ◽

International Telecommunication ◽

Broad Audience

During the last decade, Machine Learning (ML) has increasingly become a hot topic in the field of Computer Networks and is expected to be gradually adopted for a plethora of control, monitoring and management tasks in real-world deployments. This poses the need to count on new generations of students, researchers and practitioners with a solid background in ML applied to networks. During 2020, the International Telecommunication Union (ITU) has organized the "ITU AI/ML in 5G challenge", an open global competition that has introduced to a broad audience some of the current main challenges in ML for networks. This large-scale initiative has gathered 23 different challenges proposed by network operators, equipment manufacturers and academia, and has attracted a total of 1300+ participants from 60+ countries. This paper narrates our experience organizing one of the proposed challenges: the "Graph Neural Networking Challenge 2020". We describe the problem presented to participants, the tools and resources provided, some organization aspects and participation statistics, an outline of the top-3 awarded solutions, and a summary with some lessons learned during all this journey. As a result, this challenge leaves a curated set of educational resources openly available to anyone interested in the topic.

Download Full-text

Evaluation of an integrated multi-task machine learning system with humans in the loop

Proceedings of the 2007 Workshop on Performance Metrics for Intelligent Systems - PerMIS '07 ◽

10.1145/1660877.1660901 ◽

2007 ◽

Cited By ~ 3

Author(s):

Aaron Steinfeld ◽

Julie Fitzgerald ◽

Othar Hansson ◽

Mike Pool ◽

Mark Drummond ◽

...

Keyword(s):

Machine Learning ◽

Learning System

Download Full-text

Artificial Intelligence and Behavioral Science Through the Looking Glass: Challenges for Real-World Application

Annals of Behavioral Medicine ◽

10.1093/abm/kaaa095 ◽

2020 ◽

Vol 54 (12) ◽

pp. 942-947

Author(s):

Pol Mac Aonghusa ◽

Susan Michie

Keyword(s):

Climate Change ◽

Artificial Intelligence ◽

Machine Learning ◽

Behavior Change ◽

Behavioral Science ◽

Lessons Learned ◽

Learning Approaches ◽

Intervention Evaluation ◽

Research Activities ◽

Behavior Change Interventions

Abstract Background Artificial Intelligence (AI) is transforming the process of scientific research. AI, coupled with availability of large datasets and increasing computational power, is accelerating progress in areas such as genetics, climate change and astronomy [NeurIPS 2019 Workshop Tackling Climate Change with Machine Learning, Vancouver, Canada; Hausen R, Robertson BE. Morpheus: A deep learning framework for the pixel-level analysis of astronomical image data. Astrophys J Suppl Ser. 2020;248:20; Dias R, Torkamani A. AI in clinical and genomic diagnostics. Genome Med. 2019;11:70.]. The application of AI in behavioral science is still in its infancy and realizing the promise of AI requires adapting current practices. Purposes By using AI to synthesize and interpret behavior change intervention evaluation report findings at a scale beyond human capability, the HBCP seeks to improve the efficiency and effectiveness of research activities. We explore challenges facing AI adoption in behavioral science through the lens of lessons learned during the Human Behaviour-Change Project (HBCP). Methods The project used an iterative cycle of development and testing of AI algorithms. Using a corpus of published research reports of randomized controlled trials of behavioral interventions, behavioral science experts annotated occurrences of interventions and outcomes. AI algorithms were trained to recognize natural language patterns associated with interventions and outcomes from the expert human annotations. Once trained, the AI algorithms were used to predict outcomes for interventions that were checked by behavioral scientists. Results Intervention reports contain many items of information needing to be extracted and these are expressed in hugely variable and idiosyncratic language used in research reports to convey information makes developing algorithms to extract all the information with near perfect accuracy impractical. However, statistical matching algorithms combined with advanced machine learning approaches created reasonably accurate outcome predictions from incomplete data. Conclusions AI holds promise for achieving the goal of predicting outcomes of behavior change interventions, based on information that is automatically extracted from intervention evaluation reports. This information can be used to train knowledge systems using machine learning and reasoning algorithms.

Download Full-text

Machine-learning algorithm as a prognostic tool in non-obstructive acute-on-chronic kidney disease in the cat

Journal of Feline Medicine and Surgery ◽

10.1177/1098612x211001273 ◽

2021 ◽

pp. 1098612X2110012

Author(s):

Jade Renard ◽

Mathieu R Faucher ◽

Anaïs Combes ◽

Didier Concordet ◽

Brice S Reynolds

Keyword(s):

Machine Learning ◽

Chronic Kidney Disease ◽

Kidney Disease ◽

Decision Tree ◽

Diagnostic Performance ◽

Clinical Signs ◽

Learning System ◽

Clinical Severity ◽

Medium Term ◽

Term Survival

Objectives The aim of this study was to develop an algorithm capable of predicting short- and medium-term survival in cases of intrinsic acute-on-chronic kidney disease (ACKD) in cats. Methods The medical record database was searched to identify cats hospitalised for acute clinical signs and azotaemia of at least 48 h duration and diagnosed to have underlying chronic kidney disease based on ultrasonographic renal abnormalities or previously documented azotaemia. Cases with postrenal azotaemia, exposure to nephrotoxicants, feline infectious peritonitis or neoplasia were excluded. Clinical variables were combined in a clinical severity score (CSS). Clinicopathological and ultrasonographic variables were also collected. The following variables were tested as inputs in a machine learning system: age, body weight (BW), CSS, identification of small kidneys or nephroliths by ultrasonography, serum creatinine at 48 h (Crea48), spontaneous feeding at 48 h (SpF48) and aetiology. Outputs were outcomes at 7, 30, 90 and 180 days. The machine-learning system was trained to develop decision tree algorithms capable of predicting outputs from inputs. Finally, the diagnostic performance of the algorithms was calculated. Results Crea48 was the best predictor of survival at 7 days (threshold 1043 µmol/l, sensitivity 0.96, specificity 0.53), 30 days (threshold 566 µmol/l, sensitivity 0.70, specificity 0.89) and 90 days (threshold 566 µmol/l, sensitivity 0.76, specificity 0.80), with fewer cats still alive when their Crea48 was above these thresholds. A short decision tree, including age and Crea48, predicted the 180-day outcome best. When Crea48 was excluded from the analysis, the generated decision trees included CSS, age, BW, SpF48 and identification of small kidneys with an overall diagnostic performance similar to that using Crea48. Conclusions and relevance Crea48 helps predict short- and medium-term survival in cats with ACKD. Secondary variables that helped predict outcomes were age, CSS, BW, SpF48 and identification of small kidneys.

Download Full-text

Individualized embryo selection strategy developed by stacking machine learning model for better in vitro fertilization outcomes: an application study

Reproductive Biology and Endocrinology ◽

10.1186/s12958-021-00734-z ◽

2021 ◽

Vol 19 (1) ◽

Author(s):

Qingsong Xi ◽

Qiyu Yang ◽

Meng Wang ◽

Bo Huang ◽

Bo Zhang ◽

...

Keyword(s):

Machine Learning ◽

In Vitro Fertilization ◽

Endometrial Thickness ◽

Learning System ◽

Embryo Selection ◽

Selection Strategy ◽

Application Study ◽

Vitro Fertilization ◽

The Impact

Abstract Background To minimize the rate of in vitro fertilization (IVF)- associated multiple-embryo gestation, significant efforts have been made. Previous studies related to machine learning in IVF mainly focused on selecting the top-quality embryos to improve outcomes, however, in patients with sub-optimal prognosis or with medium- or inferior-quality embryos, the selection between SET and DET could be perplexing. Methods This was an application study including 9211 patients with 10,076 embryos treated during 2016 to 2018, in Tongji Hospital, Wuhan, China. A hierarchical model was established using the machine learning system XGBoost, to learn embryo implantation potential and the impact of double embryos transfer (DET) simultaneously. The performance of the model was evaluated with the AUC of the ROC curve. Multiple regression analyses were also conducted on the 19 selected features to demonstrate the differences between feature importance for prediction and statistical relationship with outcomes. Results For a single embryo transfer (SET) pregnancy, the following variables remained significant: age, attempts at IVF, estradiol level on hCG day, and endometrial thickness. For DET pregnancy, age, attempts at IVF, endometrial thickness, and the newly added P1 + P2 remained significant. For DET twin risk, age, attempts at IVF, 2PN/ MII, and P1 × P2 remained significant. The algorithm was repeated 30 times, and averaged AUC of 0.7945, 0.8385, and 0.7229 were achieved for SET pregnancy, DET pregnancy, and DET twin risk, respectively. The trend of predictive and observed rates both in pregnancy and twin risk was basically identical. XGBoost outperformed the other two algorithms: logistic regression and classification and regression tree. Conclusion Artificial intelligence based on determinant-weighting analysis could offer an individualized embryo selection strategy for any given patient, and predict clinical pregnancy rate and twin risk, therefore optimizing clinical outcomes.

Download Full-text

Testing the Suitability of Automated Machine Learning for Weeds Identification

AI ◽

10.3390/ai2010004 ◽

2021 ◽

Vol 2 (1) ◽

pp. 34-47

Author(s):

Borja Espejo-Garcia ◽

Ioannis Malounas ◽

Eleanna Vali ◽

Spyros Fountas

Keyword(s):

Machine Learning ◽

Plant Protection ◽

Crop Protection ◽

Identification Problem ◽

Learning System ◽

Classifier Ensembles ◽

Automated Machine Learning ◽

A New Technique ◽

Plant Seedlings ◽

And Training

In the past years, several machine-learning-based techniques have arisen for providing effective crop protection. For instance, deep neural networks have been used to identify different types of weeds under different real-world conditions. However, these techniques usually require extensive involvement of experts working iteratively in the development of the most suitable machine learning system. To support this task and save resources, a new technique called Automated Machine Learning has started being studied. In this work, a complete open-source Automated Machine Learning system was evaluated with two different datasets, (i) The Early Crop Weeds dataset and (ii) the Plant Seedlings dataset, covering the weeds identification problem. Different configurations, such as the use of plant segmentation, the use of classifier ensembles instead of Softmax and training with noisy data, have been compared. The results showed promising performances of 93.8% and 90.74% F1 score depending on the dataset used. These performances were aligned with other related works in AutoML, but they are far from machine-learning-based systems manually fine-tuned by human experts. From these results, it can be concluded that finding a balance between manual expert work and Automated Machine Learning will be an interesting path to work in order to increase the efficiency in plant protection.

Download Full-text

Application of a Rough Set-Based Inductive Learning System

Fundamenta Informaticae ◽

10.3233/fi-1993-182-409 ◽

1993 ◽

Vol 18 (2-4) ◽

pp. 209-220

Author(s):

Michael Hadjimichael ◽

Anita Wasilewska

Keyword(s):

Machine Learning ◽

Rough Set ◽

Presidential Election ◽

Predictive Accuracy ◽

Learning Algorithm ◽

Inductive Learning ◽

Real Data ◽

Semantic Content ◽

Learning System ◽

Voter Preferences

We present here an application of Rough Set formalism to Machine Learning. The resulting Inductive Learning algorithm is described, and its application to a set of real data is examined. The data consists of a survey of voter preferences taken during the 1988 presidential election in the U.S.A. Results include an analysis of the predictive accuracy of the generated rules, and an analysis of the semantic content of the rules.

Download Full-text