Low-Cost Machine Learning for Effective and Efficient Bad Smells Detection

2021 ◽  
Author(s):  
J. S. L. Figuerêdo ◽  
V. T. Sarinho ◽  
R. T. Calumby

Bad smells are characteristics of software that indicate a code or design problem and can make an information system hard to understand, evolve, and maintain. To address this problem, different approaches, both manual and automated, have been proposed over the years, including, more recently, machine learning alternatives. However, despite the advances achieved, some machine learning techniques, such as feature selection, have not yet been effectively explored. Moreover, it is not clear to what extent the use of numerous source-code features is necessary for reasonable bad smell detection success. Therefore, in this work we propose an approach using low-cost machine learning for effective and efficient detection of bad smells through explicit feature selection. Our results showed that feature selection yielded statistically significant improvements in the effectiveness of the models. In some cases, the models achieved statistical equivalence while relying on a highly reduced set of features. Indeed, by using explicit feature selection, simpler models such as Naive Bayes became statistically equivalent to robust models such as Random Forest. Therefore, feature selection allowed keeping competitive or even superior effectiveness while also improving the efficiency of the models, demanding fewer computational resources for source-code preprocessing, model training, and bad smell detection.
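To make the idea of explicit feature selection concrete, the following is a minimal sketch of a filter-style selector that ranks features by their absolute Pearson correlation with the label and keeps the top k. It is not the authors' method, which the abstract does not specify; the function names, the toy metrics, and the labels are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


def select_top_k(rows, labels, k):
    """Return the sorted indices of the k features most correlated with the label."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Highest score first; ties broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(j for _, j in scores[:k])


# Hypothetical per-class metrics: [lines_of_code, num_methods, constant_flag]
rows = [[120, 4, 1], [950, 30, 1], [80, 25, 1], [1100, 6, 1]]
labels = [0, 1, 0, 1]  # 1 = smelly (made-up labels for illustration)
selected = select_top_k(rows, labels, k=1)
```

A classifier trained only on the columns in `selected` needs fewer metrics to be computed per class, which is the efficiency gain the abstract refers to.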

2018 ◽  
Vol 7 (4.5) ◽  
pp. 654
Author(s):  
M. S. Satyanarayana ◽  
Aruna T.M ◽  
Divyaraj G.N

Road accidents have become a major issue in developing countries such as India. According to surveys, 60% of accidents are caused by overspeeding. Although the government has taken many initiatives, such as Traffic Awareness and Driving Awareness Week, the percentage of accidents has not decreased. This paper introduces a new technique to reduce the percentage of accidents, implemented using machine learning [1]. Machine learning based systems can be installed in all vehicles to avoid accidents at low cost [1]. The main objective of the system is to calculate the speed of the vehicle at three different locations, chosen according to where the vehicle's speed must be controlled. If the speed is greater than the designated speed for that road, the vehicle automatically detects the problem and notifies the driver to control the speed. If the speed is less than or equal to the designated speed for that road, the vehicle passes without any disturbance. The system gives a beep along with a color indication to the driver in every scenario. The system also handles drowsiness: if the driver feels drowsy while driving at night, the system detects it immediately and sounds an alarm to wake the driver. Although this system will not prevent 100% of accidents, it will at least reduce their percentage. Beyond avoiding accidents, it also intelligently controls the speed of vehicles and creates awareness among drivers.
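The speed rule described in this abstract can be sketched as a simple comparison against the designated limit. This is an illustrative sketch only: the function name, the returned fields, the color values, and the sample readings are assumptions, and the abstract does not say how the speed itself is measured.

```python
def check_speed(speed_kmh, limit_kmh):
    """Classify one speed reading against the designated limit for the road."""
    if speed_kmh > limit_kmh:
        # Over the limit: the driver is told to slow down.
        return {"status": "over_limit", "colour": "red", "warn_driver": True}
    # At or below the limit: the vehicle passes without disturbance.
    return {"status": "ok", "colour": "green", "warn_driver": False}


readings_kmh = [48.0, 63.5, 59.0]  # hypothetical readings at three locations
alerts = [check_speed(s, limit_kmh=60.0) for s in readings_kmh]
```

Note that a reading exactly equal to the limit is treated as acceptable, matching the "less than or equal" condition in the abstract.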


2021 ◽  
Author(s):  
Cao Truong Tran

Classification is a major task in machine learning and data mining. Many real-world datasets suffer from the unavoidable issue of missing values. Classification with incomplete data has to be handled carefully because inadequate treatment of missing values will cause large classification errors.
Most existing research on classification with incomplete data has focused on improving effectiveness but has not adequately addressed the efficiency of applying classifiers to unseen instances, which is much more important than the act of creating classifiers. A common approach to classification with incomplete data is to use imputation methods to replace missing values with plausible values before building classifiers and classifying unseen instances. This approach provides complete data that can then be used by any classification algorithm, but sophisticated imputation methods are usually computationally intensive, especially in the application phase of classification. Another approach is to build a classifier that can work directly with missing values. This approach does not require time for estimating missing values, but it often generates inaccurate and complex classifiers when faced with numerous missing values. A recent approach, which also avoids estimating missing values, is to build a set of classifiers from which applicable classifiers are selected for classifying unseen instances. However, this approach is also often inaccurate and takes a long time to find applicable classifiers when faced with numerous missing values.
The overall goal of the thesis is to simultaneously improve the effectiveness and efficiency of classification with incomplete data by using evolutionary machine learning techniques for feature selection, clustering, ensemble learning, feature construction, and constructing classifiers.
The thesis develops approaches for improving imputation for classification with incomplete data by integrating clustering and feature selection with imputation. The approaches improve both the effectiveness and the efficiency of using imputation for classification with incomplete data.
The thesis develops wrapper-based feature selection methods to improve the input space for classification algorithms that are able to work directly with incomplete data. The methods not only improve classification accuracy but also reduce the complexity of such classifiers.
The thesis develops a feature construction method to improve the input space for classification algorithms with incomplete data by proposing interval genetic programming, that is, genetic programming with a set of interval functions. The method improves classification accuracy and reduces the complexity of classifiers.
The thesis develops an ensemble approach to classification with incomplete data by integrating imputation, feature selection, and ensemble learning. The results show that the approach is more accurate and faster than previous common methods for classification with incomplete data.
The thesis develops interval genetic programming to directly evolve classifiers for incomplete data. The results show that classifiers generated by interval genetic programming can be more effective and efficient than classifiers generated by the combination of imputation and traditional genetic programming. Interval genetic programming is also more effective than common classification algorithms able to work directly with incomplete data.
In summary, the thesis develops a range of approaches for simultaneously improving the effectiveness and efficiency of classification with incomplete data by using a range of evolutionary machine learning techniques.
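As an illustration of the imputation-before-classification approach described above, the following minimal sketch replaces each missing value with the mean of the observed values in the same column. Mean imputation is a simple baseline chosen here for brevity, not one of the methods developed in the thesis; the function name, the use of `None` for missing values, and the toy data are assumptions.

```python
def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    n_features = len(rows[0])
    means = []
    for j in range(n_features):
        observed = [row[j] for row in rows if row[j] is not None]
        # Fall back to 0.0 when a column has no observed values at all.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in rows]


data = [[1.0, None], [3.0, 4.0], [None, 8.0]]  # hypothetical incomplete dataset
completed = impute_column_means(data)
```

The completed rows can be passed to any classification algorithm; more sophisticated imputers follow the same interface but cost more at application time, which is the efficiency concern the abstract raises.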


2020 ◽  
pp. 1423-1439
Author(s):  
Zhiming Wu ◽  
Tao Lin ◽  
Ningjiu Tang

Mental workload is considered one of the most important factors in interaction design, and how to detect a user's mental workload during tasks is still an open research question. Psychological evidence has already attributed a certain amount of variability and “drift” in an individual's handwriting pattern to mental stress, but this phenomenon has not been explored adequately. This paper explores the possibility of evaluating mental workload from handwriting information using machine learning techniques. Decision trees, support vector machines (SVMs), and artificial neural networks were used to predict mental workload levels in the authors' research. Results showed that mental workload levels can be predicted automatically from handwriting patterns with relatively high accuracy, especially for children's handwriting patterns. In addition, the proposed approach is attractive because it requires no additional hardware, is unobtrusive, is adaptable to individual users, and is of very low cost.


Inventions ◽  
2020 ◽  
Vol 5 (4) ◽  
pp. 57
Author(s):  
Attique Ur Rehman ◽  
Tek Tjing Lie ◽  
Brice Vallès ◽  
Shafiqur Rahman Tito

Recent advances in computational capabilities and the deployment of smart meters have revived non-intrusive load monitoring as one of the promising techniques for energy monitoring. Toward effective energy monitoring, this paper presents a non-invasive load inference approach assisted by feature selection and ensemble machine learning techniques. To evaluate and validate the proposed approach, one of the major residential load elements with solid potential for energy efficiency applications, namely water heating, is considered. Moreover, to reflect real-life deployment, digital simulations are carried out on low-sampling-rate real-world load measurements from the New Zealand GREEN Grid Database. MATLAB and Python (Scikit-Learn) are used as simulation tools. The employed learning models, both standalone and ensemble, are trained on a single household’s load data and later tested rigorously on a set of diverse households’ load data to validate their generalization capability. This paper presents a comprehensive performance evaluation of the presented approach in the context of event detection, feature selection, and learning models. Based on the presented study and the corresponding analysis of the results, it is concluded that the proposed approach generalizes well to unseen testing data and yields promising results in terms of non-invasive load inference.
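The event detection step mentioned in this abstract can be illustrated with a generic threshold-based sketch: an event is flagged whenever consecutive power readings differ by at least a chosen threshold. The abstract does not describe the detector actually used in the paper, so the function name, the threshold, and the sample readings below are all assumptions.

```python
def detect_events(power_w, threshold_w):
    """Return (index, delta) pairs where consecutive readings differ by at least threshold_w."""
    events = []
    for i in range(1, len(power_w)):
        delta = power_w[i] - power_w[i - 1]
        if abs(delta) >= threshold_w:
            # Positive delta suggests a switch-on, negative a switch-off.
            events.append((i, delta))
    return events


readings = [150, 160, 3150, 3140, 3160, 170, 165]  # hypothetical samples in watts
events = detect_events(readings, threshold_w=1000)
```

In this toy series the large step up and the matching step down would be consistent with a high-power appliance such as a water heater switching on and off; features extracted around such events could then be passed to a classifier.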


2020 ◽  
Vol 22 (3) ◽  
pp. 27-29 ◽  
Author(s):  
Paula Ramos-Giraldo ◽  
Chris Reberg-Horton ◽  
Anna M. Locke ◽  
Steven Mirsky ◽  
Edgar Lobaton

Author(s):  
Amalu Michael ◽  
Deepa S S

Diabetic retinopathy (DR) is one of the common forms of diabetic eye disease. DR occurs due to a high level of glucose in the blood, which causes alterations in the retinal vessels. Machine learning (ML) is a broad multidisciplinary field that has its roots in statistics, algebra, data processing, and information analytics. It is used to discover patterns in medical data and provides an efficient way to predict diseases. ML is an application of artificial intelligence that learns from training data. Several machine learning techniques are used for the diagnosis of diabetic retinopathy. This paper mainly surveys such techniques as well as various feature selection mechanisms. The study provides a basic categorization of feature selection techniques and discusses their use.

