A Comparative Evaluation of Supervised Machine Learning Classification Techniques for Engineering Design Applications

Supervised machine learning techniques have proven to be effective tools for engineering design exploration and optimization applications, in which they are especially useful for mapping promising or feasible regions of the design space. The design space mappings can be used to inform early-stage design exploration, provide reliability assessments, and aid convergence in multiobjective or multilevel problems that require collaborative design teams. However, the accuracy of the mappings can vary based on problem factors such as the number of design variables, presence of discrete variables, multimodality of the underlying response function, and amount of training data available. Additionally, there are several useful machine learning algorithms available, and each has its own set of algorithmic hyperparameters that significantly affect accuracy and computational expense. This work elucidates the use of machine learning for engineering design exploration and optimization problems by investigating the performance of popular classification algorithms on a variety of example engineering optimization problems. The results are synthesized into a set of observations to provide engineers with intuition for applying these techniques to their own problems in the future, as well as recommendations based on problem type to aid engineers in algorithm selection and utilization.

Structure label prediction using similarity-based retrieval and weakly supervised label mapping

Geophysics ◽

10.1190/geo2018-0028.1 ◽

2019 ◽

Vol 84 (1) ◽

pp. V67-V79 ◽

Cited By ~ 9

Author(s):

Yazeed Alaudah ◽

Motaz Alfarraj ◽

Ghassan AlRegib

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Seismic Interpretation ◽

Machine Learning Algorithms ◽

Training Data ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Subsurface Structures ◽

Weakly Supervised ◽

Seismic Volumes

Recently, there has been significant interest in various supervised machine learning techniques that can help reduce the time and effort consumed by manual interpretation workflows. However, most successful supervised machine learning algorithms require huge amounts of annotated training data. Obtaining these labels for large seismic volumes is a very time-consuming and laborious task. We have addressed this problem by presenting a weakly supervised approach for predicting the labels of various seismic structures. By having an interpreter select a very small number of exemplar images for every class of subsurface structures, we use a novel similarity-based retrieval technique to extract thousands of images that contain similar subsurface structures from the seismic volume. By assuming that similar images belong to the same class, we obtain thousands of image-level labels for these images; we validate this assumption. We have evaluated a novel weakly supervised algorithm for mapping these rough image-level labels into more accurate pixel-level labels that localize the different subsurface structures within the image. This approach dramatically simplifies the process of obtaining labeled data for training supervised machine learning algorithms on seismic interpretation tasks. Using our method, we generate thousands of automatically labeled images from the Netherlands Offshore F3 block with reasonably accurate pixel-level labels. We believe that this work will allow for more advances in machine learning-enabled seismic interpretation.
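The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not the authors' method: it assumes each image has already been reduced to a numeric feature vector, and it uses cosine similarity as a stand-in because the abstract does not specify the similarity measure. The names `cosine_similarity`, `retrieve_weak_labels`, and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_weak_labels(exemplars, candidates, top_k):
    """Assign image-level labels to the candidates most similar to each class.

    exemplars:  dict mapping class label -> list of exemplar feature vectors
    candidates: list of unlabeled feature vectors
    top_k:      number of candidates retrieved per class (assumed parameter)
    Returns a dict mapping candidate index -> weak label.
    """
    best = {}  # candidate index -> (score, label)
    for label, vectors in exemplars.items():
        # Score each candidate by its closest exemplar of this class.
        scored = [
            (max(cosine_similarity(c, v) for v in vectors), i)
            for i, c in enumerate(candidates)
        ]
        scored.sort(reverse=True)
        for score, i in scored[:top_k]:
            # If two classes retrieve the same candidate, keep the stronger match.
            if i not in best or score > best[i][0]:
                best[i] = (score, label)
    return {i: label for i, (score, label) in best.items()}
```

Candidates that no class retrieves stay unlabeled, which matches the idea of collecting labels only for images that resemble an exemplar.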

Review on Various Machine Learning and Deep Learning Techniques for Prediction and Classification of Quotidian Datasets

Recent Advances in 3D Imaging, Modeling, and Reconstruction - Advances in Multimedia and Interactive Technologies ◽

10.4018/978-1-5225-5294-9.ch014 ◽

2020 ◽

pp. 296-323

Author(s):

Anisha M. Lal ◽

B. Koushik Reddy ◽

Aju D.

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Training Data ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Supervised Methods ◽

Regression Techniques

Machine learning can be defined as the ability of a computer to learn to solve a problem without being explicitly programmed. The program's performance on the specified task improves with experience. In traditional programming, the program and the input are specified to obtain the output; in machine learning, the targets and predictors are provided to the algorithm in order to train it. This chapter focuses on various machine learning techniques and their performance with commonly used datasets. A supervised learning algorithm involves a target variable that is to be predicted from a given set of predictors. From these known targets, the algorithm learns a function that maps the predictors to the targets. The training process prepares the system to handle unseen data and continues until the model achieves a desired level of accuracy on the training data. Supervised methods are usually categorized as classification or regression. This chapter discusses some of the popular supervised machine learning algorithms and their performance on quotidian datasets. It also discusses some non-linear regression techniques and offers some insights on deep learning with respect to object recognition.

Toward Palmprint Recognition Methodology Based Machine Learning Techniques

European Journal of Electrical Engineering and Computer Science ◽

10.24018/ejece.2020.4.4.225 ◽

2020 ◽

Vol 4 (4) ◽

Author(s):

M. M. Ata ◽

K. M. Elgamily ◽

M. A. Mohamed

Keyword(s):

Machine Learning ◽

Hough Transform ◽

Recognition Accuracy ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Palmprint Recognition ◽

Feature Vectors ◽

Learning Techniques

This paper proposes an algorithm for palmprint recognition using seven different machine learning algorithms. First, we propose a region of interest (ROI) extraction methodology based on a two key points technique. Second, we apply image enhancement techniques such as edge detection and morphological operations to make the ROI image more suitable for the Hough transform. We then apply the Hough transform to extract all the possible principal lines in the ROI images, and we extract the most salient morphological features of those lines: slope and length. Furthermore, we apply the invariant moments algorithm in order to produce 7 appropriate hues of interest. Finally, after forming the complete hybrid feature vectors, we apply different machine learning algorithms in order to recognize palmprints effectively. Recognition performance has been tested by calculating precision, sensitivity, specificity, accuracy, Dice and Jaccard coefficients, correlation coefficients, and training time. Seven different supervised machine learning algorithms have been implemented and utilized. The effect of forming the proposed hybrid feature vectors from the Hough transform and invariant moments has been tested. Experimental results show that the feed forward neural network with back propagation has achieved about 99.99% recognition accuracy among all tested machine learning techniques.
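Several of the evaluation measures listed above follow directly from the counts in a confusion matrix. The sketch below shows the standard binary definitions only; it is not the authors' evaluation code. Palmprint recognition is a multiclass task, so the assumption here is that each class would be scored one-vs-rest. The function name `binary_metrics` and the helper `_ratio` are hypothetical.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from true/false positive and negative counts."""
    return {
        "precision": _ratio(tp, tp + fp),
        "sensitivity": _ratio(tp, tp + fn),   # also called recall
        "specificity": _ratio(tn, tn + fp),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "dice": _ratio(2 * tp, 2 * tp + fp + fn),
        "jaccard": _ratio(tp, tp + fp + fn),
    }


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    for name, value in binary_metrics(tp=8, fp=2, tn=9, fn=1).items():
        print(f"{name}: {value:.3f}")
```

The Dice and Jaccard coefficients are related by dice = 2 * jaccard / (1 + jaccard), so they always rank classifiers in the same order.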

Artificially Generated Training Data-sets for Supervised Machine Learning Techniques in Magnetic Resonance Imaging: An Example in Myocardial Segmentation

2019 Computing in Cardiology Conference (CinC) ◽

10.22489/cinc.2019.220 ◽

2019 ◽

Author(s):

Christos Xanthis ◽

Kostas Haris ◽

Dimitrios Filos ◽

Anthony Aletras

Keyword(s):

Magnetic Resonance Imaging ◽

Machine Learning ◽

Magnetic Resonance ◽

Training Data ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Data Sets ◽

Resonance Imaging ◽

Learning Techniques ◽

Myocardial Segmentation

Android Malware Detection using Machine Learning

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1011.0982s1219 ◽

2020 ◽

Vol 8 (2S12) ◽

pp. 65-70

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Machine Learning Algorithms ◽

Training Data ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbor ◽

User Interest ◽

Android Malware ◽

Android Malware Detection

Machine learning is empowering many aspects of day-to-day life, from filtering content on social networks to suggesting products that we may be looking for. This technology focuses on taking objects as image input to find new observations or show items based on user interest. The main topic here is supervised machine learning, in which the computer learns from the input (training) data and predicts results based on experience. We also discuss the following machine learning algorithms: Naïve Bayes Classifier, K-Nearest Neighbor, Random Forest, Decision Trees, Boosted Trees, and Support Vector Machine, and we apply these classifiers to the Malgenome and Drebin Android malware datasets. Android is an operating system that is gaining popularity, and the rising demand for Android devices has been accompanied by a rise in Android malware. The traditional techniques used to detect malware were unable to detect unknown applications. We have run these datasets through different machine learning classifiers and recorded the results. The experimental results provide a comparative analysis based on performance, accuracy, and cost.
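As an illustration of one of the listed classifiers, the sketch below implements a bare-bones K-Nearest Neighbor vote. It is not the paper's experimental setup: the feature vectors are made-up binary flags (for example, whether an app requests a given permission), and the name `knn_predict` is hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Pair every training sample's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    train_x = [[1, 1, 0], [1, 1, 1], [1, 0, 0], [0, 0, 1], [0, 0, 0]]
    train_y = ["malware", "malware", "malware", "benign", "benign"]
    print(knn_predict(train_x, train_y, [1, 0, 1], k=3))
```

An odd `k` avoids tied votes in a two-class problem. On real datasets the choice of `k` and of the distance measure would need tuning.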

PSIX-15 Assessment of machine learning algorithms for prediction of Aleutian disease in American mink

Journal of Animal Science ◽

10.1093/jas/skab235.484 ◽

2021 ◽

Vol 99 (Supplement_3) ◽

pp. 264-265

Author(s):

Duy Ngoc Do ◽

Guoyu Hu ◽

Younes Miar

Keyword(s):

Machine Learning ◽

Random Forest ◽

Linear Models ◽

American Mink ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Training Data ◽

Enzyme Linked Immunosorbent Assay ◽

Linear Discriminant ◽

Machine Learning Classification

American mink (Neovison vison) is the major source of fur for the fur industry worldwide, and Aleutian disease (AD) is causing severe financial losses to the mink industry. Different methods have been used to diagnose AD in mink, but a combination of several methods may be the most appropriate approach for selecting AD-resilient mink. The iodine agglutination test (IAT) and counterimmunoelectrophoresis (CIEP) are commonly employed in the test-and-remove strategy, while enzyme-linked immunosorbent assay (ELISA) and packed-cell volume (PCV) are complementary methods. However, using multiple methods is expensive, which hinders the correct use of AD tests in selection. This research presents an assessment of AD classification based on machine learning algorithms. A total of 1,830 individuals were tested for Aleutian disease using these tests in an AD-positive mink farm (Canadian Centre for Fur Animal Research, NS, Canada). The accuracy of classification for CIEP was evaluated based on sex information and the IAT, ELISA, and PCV test results, using seven machine learning classification algorithms (Random Forest, Artificial Neural Networks, C50Tree, Naive Bayes, Generalized Linear Models, Boost, and Linear Discriminant Analysis) implemented with the Caret package in R. The accuracy of prediction varied among the methods. Overall, Random Forest was the best-performing algorithm for the current dataset, with an accuracy of 0.89 in the training data and 0.94 in the testing data. Our work demonstrates the utility and relative ease of using machine learning algorithms to assess the CIEP information, consequently reducing the cost of AD tests. However, further work should include production and reproduction information in the models and extend phenotypic collection to increase the accuracy of current methods.

Classification of Sleep Apnea Using ECG Signals With Machine Learning Techniques

Advances in Medical Technologies and Clinical Practice - Advancing the Investigation and Treatment of Sleep Disorders Using AI ◽

10.4018/978-1-7998-8018-9.ch010 ◽

2021 ◽

pp. 184-203

Author(s):

Karthik R. ◽

Ifrah Alam ◽

Bandaru Umamadhuri ◽

Bharath K. P. ◽

Rajesh Kumar M.

Keyword(s):

Machine Learning ◽

Sleep Apnea ◽

Soft Tissues ◽

Amplitude Distribution ◽

Feature Reduction ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Feed Forward Neural Network ◽

Ecg Signals

In this chapter, the authors use various signal processing techniques to analyze how ECG signals of patients suffering from sleep apnea differ from a normal person's ECG. Sleep apnea, or obstructive sleep apnea, occurs when the muscles that support the soft tissues in the throat, such as the tongue and soft palate, relax temporarily. The work has three stages: first, identifying the waves, complexes, and morphology in an ECG that reflect the presence of the disease; second, applying feature extraction techniques to extract ECG features such as wave duration, amplitude distribution, and morphology classes; and third, a detailed analysis of clustering (unsupervised) algorithms on the extracted features, with efficient feature reduction methodologies such as PCA and LDA. Finally, the authors use supervised machine learning algorithms (SVM, naive Bayes classifier, feed forward neural network, and decision tree) to distinguish between ECG signals with sleep apnea and normal ECG signals.

Design-Oriented Multifidelity Fluid Simulation Using Machine Learned Fidelity Mapping

ASME 2019 Conference on Smart Materials, Adaptive Structures and Intelligent Systems ◽

10.1115/smasis2019-5515 ◽

2019 ◽

Cited By ~ 1

Author(s):

Kazuko Fuchi ◽

Eric M. Wolf ◽

David S. Makhija ◽

Nathan A. Wukie ◽

Christopher R. Schrock ◽

...

Keyword(s):

Machine Learning ◽

Learning Algorithm ◽

Fluid Simulation ◽

Machine Learning Algorithms ◽

Training Data ◽

Supervised Machine Learning ◽

High Fidelity ◽

Computational Domain ◽

Symmetry Properties ◽

High Fidelity Simulations

A machine learning algorithm that performs multifidelity domain decomposition is introduced. While the design of complex systems can be facilitated by numerical simulations, the determination of appropriate physics couplings and levels of model fidelity can be challenging. The proposed method automatically divides the computational domain into subregions and assigns the required fidelity level, using a small number of high fidelity simulations to generate training data and low fidelity solutions as input data. Unsupervised and supervised machine learning algorithms are used to correlate features from low fidelity solutions to fidelity assignment. The effectiveness of the method is demonstrated in a problem of viscous fluid flow around a cylinder at Re ≈ 20. Ling et al. built physics-informed invariance and symmetry properties into machine learning models and demonstrated improved model generalizability. Along these lines, we avoid using problem-dependent features such as coordinates of sample points, object geometry, or flow conditions as explicit inputs to the machine learning model. Use of pointwise flow features generates large data sets from only one or two high fidelity simulations, and the fidelity predictor model achieved 99.5% accuracy at training points. The trained model was shown to be capable of predicting a fidelity map for a problem with an altered cylinder radius. A significant improvement in prediction performance was seen when the inputs were expanded to include multiscale features that incorporate neighborhood information.

Insider Threat Detection Using Supervised Machine Learning Algorithms on an Extremely Imbalanced Dataset

International Journal of Cyber Warfare and Terrorism ◽

10.4018/ijcwt.2020040101 ◽

2020 ◽

Vol 10 (2) ◽

pp. 1-26

Author(s):

Naghmeh Moradpoor Sheykhkanloo ◽

Adam Hall

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Machine Learning Algorithms ◽

Third Party ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Insider Threat ◽

Threat Detection ◽

Imbalanced Dataset ◽

The Impact

An insider threat can take on many forms and fall under different categories. These include the malicious insider, the careless/unaware/uneducated/naïve employee, and the third-party contractor. Machine learning techniques have been studied in published literature as a promising solution for such threats. However, they can be biased and/or inaccurate when the associated dataset is hugely imbalanced. Therefore, this article addresses insider threat detection on an extremely imbalanced dataset, which includes employing a popular balancing technique known as spread subsample. The results show that although balancing the dataset using this technique did not improve performance metrics, it did reduce the time taken to build the model and the time taken to test the model. Additionally, the authors found that running the chosen classifiers with parameters other than the default ones has an impact in both balanced and imbalanced scenarios, but the impact is significantly stronger when using the imbalanced dataset.
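The idea behind spread subsampling can be sketched as random undersampling that caps how much larger any class may be than the rarest class. The code below is an illustrative approximation under that assumption, not the implementation used in the article. The function name `spread_subsample` and the parameters `max_spread` and `seed` are hypothetical.

```python
import random
from collections import defaultdict


def spread_subsample(rows, labels, max_spread=1.0, seed=0):
    """Randomly undersample so no class exceeds max_spread times the rarest class.

    max_spread=1.0 yields a uniform class distribution; values below 1.0
    are not meaningful and are rejected.
    """
    if max_spread < 1.0:
        raise ValueError("max_spread must be at least 1.0")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible

    # Group the rows by class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    rarest = min(len(members) for members in by_class.values())
    cap = int(rarest * max_spread)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Small classes are kept whole; large ones are sampled down to the cap.
        kept = members if len(members) <= cap else rng.sample(members, cap)
        out_rows.extend(kept)
        out_labels.extend([label] * len(kept))
    return out_rows, out_labels
```

Because the majority class is cut down rather than the minority class being duplicated, the resulting training set is smaller, which is consistent with the shorter model build and test times reported above.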

Automated Tongue Feature Extraction for ZHENG Classification in Traditional Chinese Medicine

Evidence-based Complementary and Alternative Medicine ◽

10.1155/2012/912852 ◽

2012 ◽

Vol 2012 ◽

pp. 1-14 ◽

Cited By ~ 33

Author(s):

Ratchadaporn Kanawong ◽

Tayo Obafemi-Ajayi ◽

Tao Ma ◽

Dong Xu ◽

Shao Li ◽

...

Keyword(s):

Machine Learning ◽

Chinese Medicine ◽

Traditional Chinese Medicine ◽

Color Space ◽

Image Features ◽

Machine Learning Algorithms ◽

Disease Classification ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Control Group

ZHENG, the Traditional Chinese Medicine syndrome, is an integral and essential part of Traditional Chinese Medicine theory. It defines the theoretical abstraction of the symptom profiles of individual patients and is thus used as a guideline for disease classification in Chinese medicine. For example, patients suffering from gastritis may be classified as Cold or Hot ZHENG, whereas patients with different diseases may be classified under the same ZHENG. Tongue appearance is a valuable diagnostic tool for determining ZHENG in patients. In this paper, we explore new modalities for the clinical characterization of ZHENG using various supervised machine learning algorithms. We propose a novel color-space-based feature set, which can be extracted from tongue images of clinical patients to build an automated ZHENG classification system. Given that Chinese medical practitioners usually observe the tongue color and coating to determine a ZHENG type and to diagnose different stomach disorders including gastritis, we propose using machine learning techniques to establish the relationship between the tongue image features and ZHENG by learning through examples. The experimental results obtained over a set of 263 gastritis patients, most of whom had Cold ZHENG or Hot ZHENG, and a control group of 48 healthy volunteers demonstrate an excellent performance of our proposed system.
