Machine Learning-Enriched Lamb Wave Approaches for Automated Damage Detection

Lamb wave approaches have been accepted as efficient non-destructive evaluation techniques in structural health monitoring for identifying damage in different states. Despite significant efforts in Lamb wave signal processing, physics-based prediction remains a major challenge because of the complex nature of Lamb waves as they propagate, scatter, and disperse. In recent years, machine learning has created transformative opportunities for accelerating knowledge discovery and accurately disseminating information where conventional Lamb wave approaches cannot work. Therefore, a learning framework was proposed with a workflow running from dataset generation, through sensitive feature extraction, to a prediction model for Lamb-wave-based damage detection. A total of 17 damage states, covering different damage types, sizes, and orientations, were designed to train the feature extraction and sensitive feature selection. A machine learning method, the support vector machine (SVM), was employed for the learning model. A grid search (GS) technique was adopted to optimize the parameters of the SVM model. The results show that the machine learning-enriched Lamb wave-based damage detection method is an efficient and accurate way to identify damage severity and orientation. Results demonstrated that features generated from different domains had certain levels of sensitivity to damage, while the feature selection method revealed that time-frequency features and wavelet coefficients exhibited the highest damage sensitivity. These features were also much more robust to noise. As the noise level increased, the classification accuracy dropped dramatically.
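
The grid search over SVM parameters mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the synthetic features, the parameter grid, the RBF kernel, and the noise level are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for features extracted from Lamb wave signals
# (assumption: the real study used its own damage-state dataset).
X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Scale the features, then fit an SVM; the grid values are illustrative only.
pipeline = Pipeline([("scale", StandardScaler()), ("svm", SVC(kernel="rbf"))])
param_grid = {"svm__C": [0.1, 1, 10, 100],
              "svm__gamma": [0.001, 0.01, 0.1, 1]}

# Exhaustive search over the grid with 5-fold cross-validation.
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)


def accuracy_under_noise(noise_std):
    """Score the tuned model on test features with added Gaussian noise."""
    rng = np.random.default_rng(0)
    noisy = X_test + rng.normal(0.0, noise_std, size=X_test.shape)
    return search.score(noisy, y_test)


clean_accuracy = accuracy_under_noise(0.0)
noisy_accuracy = accuracy_under_noise(5.0)
print(search.best_params_, clean_accuracy, noisy_accuracy)
```

Comparing `clean_accuracy` with `noisy_accuracy` gives a simple way to probe the noise sensitivity the abstract describes.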

Design of Text Categorization System Based on SVM

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.532-533.1191 ◽

2012 ◽

Vol 532-533 ◽

pp. 1191-1195 ◽

Cited By ~ 1

Author(s):

Zhen Yan Liu ◽

Wei Ping Wang ◽

Yong Wang

Keyword(s):

Feature Extraction ◽

Feature Selection ◽

Text Categorization ◽

Feature Selection Method ◽

Extraction Methods ◽

Support Vector ◽

Text Representation ◽

Text Feature ◽

Categorization System ◽

Classifier Training

This paper introduces the design of a text categorization system based on the Support Vector Machine (SVM). It analyzes the high-dimensional character of text data and the reasons why SVM is suitable for text categorization. The system is constructed according to its data flow and consists of three subsystems: text representation, classifier training, and text classification. The core of the system is classifier training, but text representation directly influences the accuracy of the classifier and the performance of the system. The text feature vector space can be built with different kinds of feature selection and feature extraction methods. No research indicates which method is best, so many feature selection and feature extraction methods are implemented in this system. For a specific classification task, every feature selection method and every feature extraction method is tested, and the set of best-performing methods is then adopted.
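
The three subsystems described above (text representation, classifier training, text classification) map naturally onto a pipeline. The sketch below assumes scikit-learn, TF-IDF as the representation, and chi-square as the feature selection method; the corpus, labels, and `k` are hypothetical and not taken from the paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Tiny hypothetical corpus, for illustration only.
train_texts = [
    "the team won the football match with a late goal",
    "the striker scored twice in the cup final",
    "the coach praised the players after the game",
    "the tennis champion won the tournament in straight sets",
    "the central bank raised interest rates again",
    "shares fell as investors sold bank stocks",
    "the company reported higher quarterly profit",
    "inflation and interest rates worry the bond market",
]
train_labels = ["sport"] * 4 + ["finance"] * 4

model = Pipeline([
    ("represent", TfidfVectorizer()),     # text representation
    ("select", SelectKBest(chi2, k=10)),  # keep the 10 highest-scoring terms
    ("classify", LinearSVC()),            # linear SVM classifier
])
model.fit(train_texts, train_labels)      # classifier training

# Text classification on unseen documents.
predicted = model.predict([
    "the goalkeeper saved a penalty in the match",
    "the bank cut interest rates",
])
print(predicted)
```

Swapping the `select` step for another scorer and comparing results is one way to run the per-task method comparison the abstract describes.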

Diagnostic Performance of 2D and 3D T2WI-Based Radiomics Features With Machine Learning Algorithms to Distinguish Solid Solitary Pulmonary Lesion

Frontiers in Oncology ◽

10.3389/fonc.2021.683587 ◽

2021 ◽

Vol 11 ◽

Author(s):

Qi Wan ◽

Jiaxuan Zhou ◽

Xiaoying Xia ◽

Jianfeng Hu ◽

Peng Wang ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Diagnostic Performance ◽

Feature Selection Method ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Approaches ◽

Selection Methods ◽

Linear Discriminant ◽

2D And 3D

Objective: To evaluate the performance of 2D and 3D radiomics features with different machine learning approaches to classify solid solitary pulmonary lesions (SPLs) based on magnetic resonance (MR) T2-weighted imaging (T2WI). Material and Methods: A total of 132 patients with pathologically confirmed SPLs were examined and randomly divided into training (n = 92) and test datasets (n = 40). A total of 1692 3D and 1231 2D radiomics features per patient were extracted. Both radiomics features and clinical data were evaluated. A total of 1260 classification models, comprising 3 normalization methods, 2 dimension reduction algorithms, 3 feature selection methods, and 10 classifiers with 7 different feature numbers (confined to 3–9), were compared. Ten-fold cross-validation on the training dataset was applied to choose the candidate final model. The area under the receiver operating characteristic curve (AUC), precision-recall plot, and Matthews Correlation Coefficient (MCC) were used to evaluate the performance of the machine learning approaches. Results: The 3D features were significantly superior to 2D features, showing many more machine learning combinations with AUC greater than 0.7 in both validation and test groups (129 vs. 11). The feature selection methods Analysis of Variance (ANOVA) and Recursive Feature Elimination (RFE) and the classifiers Logistic Regression (LR), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), and Gaussian Process (GP) had relatively better performance. The best performance of 3D radiomics features in the test dataset (AUC = 0.824, AUC-PR = 0.927, MCC = 0.514) was higher than that of 2D features (AUC = 0.740, AUC-PR = 0.846, MCC = 0.404). The joint 3D and 2D features (AUC = 0.813, AUC-PR = 0.926, MCC = 0.563) showed similar results to the 3D features. Incorporating clinical features with 3D and 2D radiomics features slightly improved the AUC to 0.836 (AUC-PR = 0.918, MCC = 0.620) and 0.780 (AUC-PR = 0.900, MCC = 0.574), respectively. Conclusions: After algorithm optimization, 2D feature-based radiomics models yield favorable results in differentiating malignant and benign SPLs, but 3D features are still preferred because of the availability of more machine learning algorithmic combinations with better performance. The feature selection methods ANOVA and RFE and the classifiers LR, LDA, SVM, and GP are more likely to demonstrate better diagnostic performance for 3D features in the current study.
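
The figure of 1260 models follows from multiplying the options at each pipeline stage (3 × 2 × 3 × 10 × 7). A short sketch that enumerates the grid makes the arithmetic explicit; the option names are placeholders, since the abstract does not list the normalization or dimension reduction methods.

```python
from itertools import product

# Placeholder names: only the counts come from the abstract.
normalizers = [f"normalizer_{i}" for i in range(3)]
reducers = [f"reducer_{i}" for i in range(2)]
selectors = [f"selector_{i}" for i in range(3)]
classifiers = [f"classifier_{i}" for i in range(10)]
feature_counts = list(range(3, 10))  # 3 to 9 inclusive, i.e. 7 values

# Every combination of one option per stage is one candidate model.
grid = list(product(normalizers, reducers, selectors,
                    classifiers, feature_counts))
total_models = len(grid)
print(total_models)
```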

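Implementasi teknik seleksi fitur pada klasifikasi malware Android menggunakan support vector machine (SVM) [Implementation of Feature Selection Techniques for Android Malware Classification Using Support Vector Machine (SVM)]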

Repositor ◽

10.22219/repositor.v1i1.1 ◽

2019 ◽

Vol 1 (1) ◽

pp. 1

Author(s):

Hendra Saputra ◽

Setio Basuki ◽

Mahar Faiqurahman

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Feature Selection ◽

Feature Selection Method ◽

Selection Method ◽

Support Vector ◽

Chi Square ◽

Android Malware ◽

Correlation Based Feature Selection ◽

Selection Of

Abstract: Android malware has grown significantly with the passage of time and the increasing variety of techniques used in Android development. Machine learning can currently be used to model the patterns of static and dynamic features of Android malware. To assess the accuracy of malware type classification, the researchers relate application features to the features required by each malware category. The malware categories used are types that are widely circulating today. To classify the malware types, this study uses a Support Vector Machine (SVM), specifically a multi-class one-against-one SVM with an RBF kernel. The features used in the classification are Permission and Broadcast Receiver. To improve classification accuracy, feature selection methods are applied: Correlation-based Feature Selection (CFS), Gain Ratio (GR), and Chi-Square (CHI). The results with feature selection are evaluated alongside the results without feature selection. Classification accuracy was 90.83% with CFS, 91.25% with GR and with CHI, and 91.67% without feature selection. The tests indicate that Permission and Broadcast Receiver can be used to classify malware types, but the feature selection methods used yield accuracy slightly below that obtained without feature selection. Keywords: Android malware classification, feature selection, SVM, multi-class one-against-one SVM
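
A rough sketch of the setup described here, chi-square feature selection followed by a one-against-one RBF-kernel SVM, is given below using scikit-learn. The binary feature matrix is randomly generated as a hypothetical stand-in for Permission and Broadcast Receiver indicators; the class count, feature count, and `k` are assumptions.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, per_class, n_features = 4, 40, 30

# Hypothetical binary indicators (1 = the app declares that permission or
# receiver). Each malware class favours its own block of five features.
X_parts, y_parts = [], []
for label in range(n_classes):
    probs = np.full(n_features, 0.1)
    probs[label * 5:(label + 1) * 5] = 0.9
    X_parts.append((rng.random((per_class, n_features)) < probs).astype(int))
    y_parts.append(np.full(per_class, label))
X = np.vstack(X_parts)
y = np.concatenate(y_parts)

model = Pipeline([
    ("select", SelectKBest(chi2, k=20)),  # chi-square feature selection
    ("svm", SVC(kernel="rbf", decision_function_shape="ovo")),
])
model.fit(X, y)

# One-against-one trains a classifier per class pair: 4 * 3 / 2 = 6 scores.
pairwise_scores = model.decision_function(X[:1])
train_accuracy = model.score(X, y)
print(pairwise_scores.shape, train_accuracy)
```

Replacing the `select` step with `"passthrough"` gives the no-selection baseline against which the abstract compares.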

Classification of Crop Residue Cover in High-Resolution RGB Images Using Machine Learning

Journal of the ASABE ◽

10.13031/ja.14572 ◽

2022 ◽

Vol 65 (1) ◽

pp. 75-86

Author(s):

Parth C. Upadhyay ◽

John A. Lory ◽

Guilherme N. DeSouza ◽

Timotius A. P. Lagaunne ◽

Christine M. Spinka

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Feature Selection Method ◽

Texture Features ◽

Ground Truth ◽

Selection Method ◽

Support Vector ◽

Svm Classifier ◽

Aerial Vehicle ◽

Rgb Images

Highlights: A machine learning framework estimated residue cover in RGB images taken at three resolutions from 88 locations. The best results primarily used texture features, the RFE-SVM feature selection method, and the SVM classifier. Accounting for shadows and plants plus modifying and optimizing the texture features may improve performance. An automated system developed using machine learning is a viable strategy to estimate residue cover from RGB images obtained with handheld or UAV platforms.

Abstract. Maintaining plant residue on the soil surface contributes to sustainable cultivation of arable land. Applying machine learning methods to RGB images of residue could overcome the subjectivity of manual methods. The objectives of this study were to use supervised machine learning while identifying the best feature selection method, the best classifier, and the most effective image feature types for classifying residue levels in RGB imagery. Imagery was collected from 88 locations in 40 row-crop fields in five Missouri counties between early May and late June in 2018 and 2019 using a tripod-mounted camera (0.014 cm pixel⁻¹ ground sampling distance, GSD) and an unmanned aerial vehicle (UAV, 0.05 and 0.14 GSD). At each field location, 50 contiguous 0.3 × 0.2 m region of interest (ROI) images were extracted from the imagery, resulting in a dataset of 4,400 ROI images at each GSD. Residue percentages for ground truth were estimated using a bullseye grid method (n = 100 points) based on the 0.014 GSD images. Representative color, texture, and shape features were extracted and evaluated using four feature selection methods and two classifiers. Recursive feature elimination using support vector machine (RFE-SVM) was the best feature selection method, and the SVM classifier performed best for classifying the amount of residue as a three-class problem. The best features for this application were associated with texture, with local binary pattern (LBP) features being the most prevalent for all three GSDs. Shape features were irrelevant. The three residue classes were correctly identified with 88%, 84%, and 81% 10-fold cross-validation scores for the 2018 training data and 81%, 69%, and 65% accuracy for the 2019 testing data in decreasing resolution order. Converting image-wise data (0.014 GSD) to location residue estimates using a Bayesian model showed good agreement with the location-based ground truth (r² = 0.90). This initial assessment documents the use of RGB images to match other methods of estimating residue, with potential to replace or be used as a quality control for line-transect assessments. Keywords: Feature selection, Soil erosion, Support vector machine, Texture features, Unmanned aerial vehicle.
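
RFE-SVM, as named in this abstract, repeatedly fits a linear SVM and discards the lowest-weighted features. A minimal sketch with scikit-learn follows; the synthetic three-class data stands in for the color, texture, and shape features, and the number of retained features is an arbitrary assumption.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic three-class data as a stand-in for image features
# (assumption: not the study's residue imagery).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_redundant=2, n_classes=3,
                           n_clusters_per_class=1, random_state=0)

model = Pipeline([
    ("scale", StandardScaler()),
    # A linear kernel exposes coef_, which RFE uses to rank features;
    # one feature is dropped per iteration until five remain.
    ("rfe", RFE(estimator=SVC(kernel="linear"),
                n_features_to_select=5, step=1)),
    ("svm", SVC(kernel="linear")),
])

# 10-fold cross-validation, with feature selection redone inside each fold.
scores = cross_val_score(model, X, y, cv=10)

model.fit(X, y)
selected = model.named_steps["rfe"].support_  # boolean mask of kept features
print(scores.mean(), selected.nonzero()[0])
```

Keeping the selector inside the pipeline means selection is repeated on each training fold, which avoids leaking held-out data into the feature ranking.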

A Hybrid Approach for Feature Selection Based on Genetic Algorithm and Recursive Feature Elimination

International Journal of Information System Modeling and Design ◽

10.4018/ijismd.2021040102 ◽

2021 ◽

Vol 12 (2) ◽

pp. 17-38

Author(s):

Pooja Rani ◽

Rajneesh Kumar ◽

Anurag Jain ◽

Sunil Kumar Chawla

Keyword(s):

Machine Learning ◽

Genetic Algorithm ◽

Feature Selection ◽

Hybrid Approach ◽

Feature Selection Method ◽

Classification Systems ◽

Recursive Feature Elimination ◽

Support Vector ◽

Sensitivity Specificity ◽

Selection Of

Machine learning has become an integral part of life in today's world. When applied to real-world problems, machine learning suffers from the problem of high-dimensional data. Data can contain unnecessary and redundant features, and these features degrade the performance of the classification systems used in prediction. Selecting important features is the first step in developing any decision support system. In this paper, the authors propose a hybrid feature selection method, GARFE, which integrates the GA (genetic algorithm) and RFE (recursive feature elimination) algorithms. The efficiency of the proposed method is analyzed using a support vector machine classifier in terms of accuracy, sensitivity, specificity, precision, F-measure, and execution time. The proposed GARFE method is also compared with eight other feature selection methods. Results demonstrate that the proposed GARFE method increases the performance of classification systems by removing irrelevant and redundant features.
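
The evaluation measures listed above all derive from the four cells of a binary confusion matrix. The helper below is a small illustrative sketch with made-up counts; it is not the authors' evaluation code.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard measures from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    # F-measure: harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```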

Commercial Video Evaluation via Low-Level Feature Extraction and Selection

Advances in Multimedia ◽

10.1155/2018/2056381 ◽

2018 ◽

Vol 2018 ◽

pp. 1-20

Author(s):

Xiangmin Lun ◽

Mingxuan Wang ◽

Zhenglin Yu ◽

Yimin Hou

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Feature Selection ◽

Feature Selection Method ◽

Selection Method ◽

Experiment Data ◽

Low Level ◽

Feature Extraction And Selection ◽

Source Data ◽

Correlation Based Feature Selection

To discover how the low-level features of commercial videos influence their popularity, a feature selection method is needed to identify the video features that most influence the videos' evaluation, after analyzing the source data and the audiences' evaluations. After extracting the low-level features of the videos, this paper improves the widely used Correlation-Based Feature Selection (CFS) method and proposes an algorithm named CFS-Spearmen, which combines the Spearman correlation coefficient with classical CFS to select features. Four datasets from the UCI machine learning repository were employed as the experiment data. The experiment results were compared with those obtained using traditional CFS and Minimum Redundancy Maximum Relevance (mRMR). An SVM was used to test the proposed method. Finally, the proposed method was applied to feature selection for commercial videos, and the most influential feature set was obtained.
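
Classical CFS scores a subset of k features by the merit k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-target correlation and r_ff the mean feature-feature correlation. The sketch below plugs a Spearman (rank) correlation into that merit with a greedy forward search. It illustrates the general idea only; the paper's exact CFS-Spearmen procedure is not given in the abstract, and the data, the search strategy, and the no-ties ranking are assumptions.

```python
import numpy as np


def ordinal_ranks(values):
    """Rank values from 0..n-1 (assumes continuous data without ties)."""
    return np.argsort(np.argsort(values)).astype(float)


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return float(np.corrcoef(ordinal_ranks(a), ordinal_ranks(b))[0, 1])


def cfs_merit(X, y, subset):
    """Classical CFS merit with Spearman as the correlation measure."""
    k = len(subset)
    r_cf = np.mean([abs(spearman(X[:, j], y)) for j in subset])
    if k == 1:
        r_ff = 0.0
    else:
        pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
        r_ff = np.mean([abs(spearman(X[:, a], X[:, b])) for a, b in pairs])
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))


def greedy_forward_cfs(X, y):
    """Add the feature that most raises the merit until nothing improves."""
    remaining, chosen, best = list(range(X.shape[1])), [], 0.0
    while remaining:
        merit, j = max((cfs_merit(X, y, chosen + [j]), j) for j in remaining)
        if merit <= best:
            break
        chosen.append(j)
        remaining.remove(j)
        best = merit
    return chosen, best


# Hypothetical data: feature 0 is monotonically related to the target,
# feature 1 is pure noise, feature 2 is a near-duplicate of feature 0.
rng = np.random.default_rng(0)
x0 = rng.normal(size=200)
X = np.column_stack([x0, rng.normal(size=200),
                     x0 + 0.05 * rng.normal(size=200)])
y = x0 ** 3

chosen, best_merit = greedy_forward_cfs(X, y)
print(chosen, best_merit)
```

Because the target is a monotonic function of feature 0, the rank correlation captures the relationship fully, and the redundant near-duplicate adds nothing to the merit.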

Analysis of Expression Pattern of snoRNAs in Different Cancer Types with Machine Learning Algorithms

International Journal of Molecular Sciences ◽

10.3390/ijms20092185 ◽

2019 ◽

Vol 20 (9) ◽

pp. 2185 ◽

Cited By ~ 18

Author(s):

Xiaoyong Pan ◽

Lei Chen ◽

Kai-Yan Feng ◽

Xiao-Hua Hu ◽

Yu-Hang Zhang ◽

...

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Expression Pattern ◽

Learning Algorithms ◽

Expression Patterns ◽

Feature Selection Method ◽

Machine Learning Algorithms ◽

Support Vector ◽

Feature List ◽

Cancer Types

Small nucleolar RNAs (snoRNAs) are a new type of functional small RNAs involved in the chemical modifications of rRNAs, tRNAs, and small nuclear RNAs. It is reported that they play important roles in tumorigenesis via various regulatory modes. snoRNAs can both participate in the regulation of methylation and pseudouridylation and regulate the expression pattern of their host genes. This research investigated the expression pattern of snoRNAs in eight major cancer types in TCGA via several machine learning algorithms. The expression levels of snoRNAs were first analyzed by a powerful feature selection method, Monte Carlo feature selection (MCFS), which yielded a feature list and some informative features. Then, incremental feature selection (IFS) was applied to the feature list to extract the optimal features/snoRNAs with which the support vector machine (SVM) yields its best performance. The discriminative snoRNAs included HBII-52-14, HBII-336, SNORD123, HBII-85-29, HBII-420, U3, HBI-43, SNORD116, SNORA73B, SCARNA4, HBII-85-20, etc., on which the SVM achieved a Matthews correlation coefficient (MCC) of 0.881 for predicting these eight cancer types. In addition, the informative features were fed into the Johnson reducer and repeated incremental pruning to produce error reduction (RIPPER) algorithms to generate classification rules, which clearly show different snoRNA expression patterns in different cancer types. The analysis results indicated that the extracted discriminative snoRNAs can be important for identifying cancer samples of different types, and that the expression pattern of snoRNAs in different cancer types can be partly uncovered by quantitative recognition rules.
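
Incremental feature selection, as outlined above, walks down a ranked feature list, evaluates the classifier on the top 1, 2, 3, ... features, and keeps the size with the best score. The sketch below assumes scikit-learn and substitutes an ANOVA F-score ranking for MCFS, which scikit-learn does not provide; the data is synthetic and the cross-validation setup is an assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic multi-class data standing in for snoRNA expression levels.
X, y = make_classification(n_samples=240, n_features=15, n_informative=5,
                           n_redundant=0, n_classes=3,
                           n_clusters_per_class=1, random_state=0)

# Stand-in ranking: ANOVA F-score, highest first (NOT the MCFS method).
f_scores, _ = f_classif(X, y)
ranked = np.argsort(f_scores)[::-1]

# Evaluate an SVM on each nested top-k feature set using MCC.
mcc_scorer = make_scorer(matthews_corrcoef)
mcc_by_size = []
for size in range(1, len(ranked) + 1):
    top = ranked[:size]
    mcc = cross_val_score(SVC(kernel="rbf"), X[:, top], y,
                          cv=5, scoring=mcc_scorer).mean()
    mcc_by_size.append(mcc)

# The optimal feature set is the prefix with the highest mean MCC.
best_size = int(np.argmax(mcc_by_size)) + 1
optimal_features = ranked[:best_size]
print(best_size, max(mcc_by_size))
```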

Machine Learning and Feature Selection Methods for EGFR Mutation Status Prediction in Lung Cancer

Applied Sciences ◽

10.3390/app11073273 ◽

2021 ◽

Vol 11 (7) ◽

pp. 3273

Author(s):

Joana Morgado ◽

Tania Pereira ◽

Francisco Silva ◽

Cláudia Freitas ◽

Eduardo Negrão ◽

...

Keyword(s):

Machine Learning ◽

Lung Cancer ◽

Feature Selection ◽

Egfr Mutation ◽

Feature Selection Method ◽

Principal Component ◽

Image Features ◽

Support Vector ◽

Selection Methods ◽

Mutation Status

The evolution of personalized medicine has changed the therapeutic strategy from classical chemotherapy and radiotherapy to a genetic modification targeted therapy, and although biopsy is the traditional method to genetically characterize a lung cancer tumor, it is an invasive and painful procedure for the patient. Nodule image features extracted from computed tomography (CT) scans have been used to create machine learning models that predict gene mutation status in a noninvasive, fast, and easy-to-use manner. However, recent studies have shown that radiomic features extracted from an extended region of interest (ROI) beyond the tumor might be more relevant for predicting the mutation status in lung cancer, and consequently may be used to significantly decrease the mortality rate of patients battling this condition. In this work, we investigated the relation between image phenotypes and the mutation status of Epidermal Growth Factor Receptor (EGFR), the most frequently mutated gene in lung cancer with several approved targeted therapies, using radiomic features extracted from the lung containing the nodule. A variety of linear, nonlinear, and ensemble predictive classification models, along with several feature selection methods, were used to classify the binary outcome of wild-type or mutant EGFR mutation status. The results show that a comprehensive approach using an ROI that included the lung with the nodule can capture relevant information and successfully predict the EGFR mutation status with increased performance compared to local nodule analyses. Linear Support Vector Machine, Elastic Net, and Logistic Regression, combined with the Principal Component Analysis feature selection method implemented to retain 70% of the variance in the feature set, were the best-performing classifiers, reaching Area Under the Curve (AUC) values ranging from 0.725 to 0.737. This holistic approach indicates that information from more extensive regions of the lung containing the nodule allows a more complete lung cancer characterization and should be considered in future radiogenomic studies.
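
Retaining a fixed share of variance with PCA before a linear classifier can be sketched as below. The example assumes scikit-learn, where a fractional `n_components` selects the smallest number of components reaching that share; the data is synthetic rather than radiomic, and Elastic Net is left out to keep the sketch short.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic binary data standing in for radiomic features
# (assumption: wild-type vs. mutant labels are simulated).
X, y = make_classification(n_samples=200, n_features=40, n_informative=8,
                           n_redundant=10, random_state=0)

classifiers = {
    "linear_svm": LinearSVC(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

auc_by_model = {}
for name, classifier in classifiers.items():
    model = Pipeline([
        ("scale", StandardScaler()),
        # Keep the fewest components explaining at least 70% of variance.
        ("pca", PCA(n_components=0.70, svd_solver="full")),
        ("clf", classifier),
    ])
    auc_by_model[name] = cross_val_score(model, X, y, cv=5,
                                         scoring="roc_auc").mean()

# Inspect how many components the 70% threshold keeps on the full data.
pca = PCA(n_components=0.70, svd_solver="full")
pca.fit(StandardScaler().fit_transform(X))
retained_variance = pca.explained_variance_ratio_.sum()
print(auc_by_model, pca.n_components_, retained_variance)
```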

Damage detection of bridges based on combining efficient cepstral coefficients

Journal of Vibration and Control ◽

10.1177/1077546320958348 ◽

2020 ◽

pp. 107754632095834

Author(s):

Hossein Babajanian Bisheh ◽

Gholamreza Ghodrati Amiri ◽

Masoud Nekooei ◽

Ehsan Darvishan

Keyword(s):

Feature Extraction ◽

Feature Selection ◽

Damage Detection ◽

Structural Damage ◽

Damage Index ◽

Support Vector ◽

Selection Methods ◽

Significant Information ◽

Effective Coefficients ◽

Cepstral Coefficients

In this article, a novel vibration-based damage detection approach is proposed based on selecting effective cepstral coefficients, consisting of three main stages: (1) signal processing and feature extraction, (2) damage detection by combining effective cepstral coefficients through feature selection methods, and (3) performance evaluation. First, two feature extraction techniques are used in damage identification systems: linear prediction cepstral coefficients and mel frequency cepstral coefficients. Second, to improve the performance of damage detection, the combination of the effective cepstral coefficients is proposed as a damage index. By applying several feature selection methods, the most effective coefficients are found and then combined to create a subset that carries the most significant information about the structural damage. Finally, a support vector machine classifier is applied to evaluate the proposed approach in detecting the structural damage. The proposed technique is verified using a suite of numerical and full-scale studies. Results confirm that the proposed method achieves high accuracy and reduces false alarms.
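
Mel frequency cepstral coefficients for a single signal frame can be computed roughly as follows: window, power spectrum, mel filterbank, logarithm, then a discrete cosine transform. This NumPy sketch is a simplified illustration, not the authors' processing chain; the sampling rate, filter count, coefficient count, and the two test signals are all hypothetical.

```python
import numpy as np


def hz_to_mel(freq_hz):
    return 2595.0 * np.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with edges spaced evenly on the mel scale."""
    edges_hz = mel_to_hz(
        np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    bins = np.floor((n_fft + 1) * edges_hz / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, centre, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, centre):        # rising edge
            bank[i, k] = (k - left) / (centre - left)
        for k in range(centre, right):       # falling edge
            bank[i, k] = (right - k) / (right - centre)
    return bank


def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=12):
    """MFCCs of one frame: window, power spectrum, mel, log, DCT-II."""
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2 / len(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_energy = np.log(bank @ power + 1e-12)  # small offset avoids log(0)
    m = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), m + 0.5) / n_filters)
    return basis @ log_energy


# Hypothetical vibration records: the second has slightly lower
# frequencies, mimicking a stiffness loss.
sample_rate, n = 200, 512
t = np.arange(n) / sample_rate
healthy = np.sin(2 * np.pi * 5.0 * t) + 0.5 * np.sin(2 * np.pi * 12.0 * t)
damaged = np.sin(2 * np.pi * 4.5 * t) + 0.5 * np.sin(2 * np.pi * 11.0 * t)

coeffs_healthy = mfcc_frame(healthy, sample_rate)
coeffs_damaged = mfcc_frame(damaged, sample_rate)
print(np.round(coeffs_healthy - coeffs_damaged, 3))
```

Coefficient vectors like these could then be passed to a feature selection step and an SVM, in the spirit of the three stages listed in the abstract.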

Breast Cancer Prediction using SVM with PCA Feature Selection Method

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit1952277 ◽

2019 ◽

pp. 969-978

Author(s):

Akshya Yadav ◽

Imlikumla Jamir ◽

Raj Rajeshwari Jain ◽

Mayank Sohani

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Support Vector Machine ◽

Feature Selection ◽

Learning Algorithm ◽

Feature Selection Method ◽

Selection Method ◽

Training Dataset ◽

Support Vector ◽

Improved Accuracy

Cancer has been characterized as one of the leading diseases that cause death in humans. Breast cancer, being a subtype of cancer, causes death in one out of every eight women worldwide. The way to counter this is early and accurate diagnosis for faster treatment. Achieving such accuracy in a short span of time proves difficult with existing techniques. Also, the medical tests conducted in hospitals for detecting cancer are expensive and difficult for the average person to afford. To counter these problems, in this paper we apply the Support Vector Machine, a machine learning algorithm, to predict whether a person is prone to breast cancer. We evaluate the performance of this algorithm by calculating its accuracy and apply a min-max scaling method to counter the problems of overfitting and outliers. After scaling the dataset, we apply a feature selection method called Principal Component Analysis to improve the algorithm's accuracy by decreasing the number of parameters. The final algorithm has improved accuracy without overfitting and outliers; thus, it can be used to develop and build systems that can be deployed in clinics, hospitals, and medical centers for early and quick diagnosis of breast cancer. The training dataset, originating from the University of Wisconsin, is taken from the UCI Machine Learning Repository and is used to evaluate the performance of the Support Vector Machine by calculating its accuracy.
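
The sequence described here, min-max scaling, then PCA, then an SVM, can be sketched as a scikit-learn pipeline. The example uses the Wisconsin diagnostic breast cancer data bundled with scikit-learn, on the assumption that it resembles the dataset the abstract refers to; the train/test split, component count, and kernel are illustrative choices, not the authors' settings.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Bundled Wisconsin diagnostic dataset (assumed to match the paper's data).
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3,
    stratify=data.target, random_state=0)

model = Pipeline([
    ("scale", MinMaxScaler()),      # rescale every feature to [0, 1]
    ("pca", PCA(n_components=5)),   # project 30 features onto 5 components
    ("svm", SVC(kernel="rbf")),
])
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```

Fitting the scaler and PCA inside the pipeline ensures both are learned from the training split only. Strictly speaking, PCA produces new combined components rather than selecting original features, so it acts as dimension reduction here.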
