Prediction of Change-Prone Classes Using Machine Learning and Statistical Techniques

Author(s):  
Ruchika Malhotra ◽  
Ankita Jain Bansal

Resources for software development are limited, so they must be used efficiently and effectively. One way to achieve this is to predict key attributes that affect software quality, such as fault proneness, change proneness, effort, and maintainability. The primary aim of this chapter is to investigate the relationship between object-oriented metrics and change proneness. Predicting which classes are prone to change can help in maintenance and testing: developers can allocate more resources to the classes that are more change prone, which helps reduce the cost of software maintenance activities. The authors construct models to predict change proneness using several machine-learning methods and one statistical method, then evaluate and compare the performance of these methods. The proposed models are validated on the open source software Frinika, and the results are evaluated using Receiver Operating Characteristic (ROC) analysis. The study shows that machine-learning methods are more efficient than regression techniques. Among the machine-learning methods, the boosting technique (LogitBoost) outperformed all the other models. The authors conclude that the developed models can be used to predict the change proneness of classes, leading to improved software quality.
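As an illustration of the ROC-based comparison mentioned in the abstract, the following minimal sketch computes the area under the ROC curve (AUC) for two sets of predicted scores. It is not the authors' implementation: the function name `roc_auc`, the labels, and both score lists are hypothetical and chosen only to show the calculation. The AUC is computed with the rank-sum identity, i.e. the probability that a randomly chosen change-prone class receives a higher score than a randomly chosen class that is not change prone.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    labels: 1 for a change-prone class, 0 otherwise.
    scores: predicted probability (or any ranking score) of change proneness.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 8 classes, 4 of which changed between releases.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical scores from two models being compared (illustrative only).
model_a_scores = [0.9, 0.2, 0.8, 0.6, 0.4, 0.1, 0.7, 0.3]
model_b_scores = [0.6, 0.5, 0.4, 0.7, 0.3, 0.6, 0.8, 0.2]

auc_a = roc_auc(labels, model_a_scores)
auc_b = roc_auc(labels, model_b_scores)
print(f"model A AUC = {auc_a:.3f}")
print(f"model B AUC = {auc_b:.3f}")
```

A higher AUC means the model ranks change-prone classes above the others more consistently, which is the basis on which models are typically compared in an ROC analysis.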


2020 ◽  
Author(s):  
Hanxue Wang ◽  
Wenjuan Cui ◽  
Yunchang Guo ◽  
Yi Du ◽  
Yuanchun Zhou

BACKGROUND Foodborne diseases have a high global incidence and place a heavy burden on public health and the economy. Foodborne pathogens are the main cause of these diseases, and identifying them is important for treatment and prevention. However, foodborne diseases caused by different pathogens lack specific clinical features, so the proportion of cases in which the pathogen is actually identified in clinical practice is low. OBJECTIVE To analyze data on foodborne disease cases, select appropriate features based on the analysis, and use machine learning methods to classify foodborne disease pathogens, so as to predict the pathogens in cases that have not been tested. METHODS We extracted features such as space, time, and food exposure from foodborne disease case data, analyzed the relationship between these features and the pathogens, applied a variety of machine learning methods to classify the pathogens, and compared the results to find the prediction model with the highest accuracy. RESULTS Of the four models compared, the GBDT model achieved the highest accuracy, almost 69%, in identifying four pathogens: Salmonella, Norovirus, Escherichia coli, and Vibrio parahaemolyticus. Evaluating feature importance showed that the time of illness, geographical longitude and latitude, and diarrhea frequency, among others, play important roles in classifying foodborne disease pathogens. CONCLUSIONS Analysis of case data can reveal the distribution of some features of foodborne diseases and the relationships among them. Classifying pathogens based on this analysis and machine learning methods can support clinical auxiliary diagnosis and treatment of foodborne diseases.

