Amino Acid k-mer Feature Extraction for Quantitative Antimicrobial Resistance (AMR) Prediction by Machine Learning and Model Interpretation for Biological Insights

Biology ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 365
Author(s):  
Taha ValizadehAslani ◽  
Zhengqiao Zhao ◽  
Bahrad A. Sokhansanj ◽  
Gail L. Rosen

Machine learning algorithms can learn mechanisms of antimicrobial resistance from DNA sequence data without any a priori information. Interpreting a trained machine learning model can help validate the model and yield new information about resistance mechanisms. Different feature extraction methods, such as SNP calling and counting nucleotide k-mers, have been proposed for presenting DNA sequences to the model. However, these methods trade off interpretability, computational complexity and accuracy. In this study, we propose a new feature extraction method, counting amino acid k-mers (oligopeptides), which provides easier model interpretation than counting nucleotide k-mers and reaches the same or even better accuracy than the other methods. Additionally, we trained machine learning algorithms using different feature extraction methods and compared the results in terms of accuracy, model interpretability and computational complexity. We built a new feature selection pipeline for extracting important features so that new AMR determinants can be discovered by analyzing these features. This pipeline allows the construction of models that use only a small number of features and still predict resistance accurately.
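
As a rough illustration of what counting amino acid k-mers involves, the sketch below turns a set of protein sequences into a count vector. It assumes the sequences are already translated from DNA. The function names, the toy sequences and the fixed vocabulary are hypothetical, not taken from the paper.

```python
from collections import Counter


def count_kmers(protein: str, k: int) -> Counter:
    """Count overlapping amino acid k-mers in one protein sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = protein.upper()
    # A sequence shorter than k yields an empty range and thus no k-mers.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def feature_vector(proteins, k, vocabulary):
    """Sum k-mer counts over all proteins and report them in vocabulary order."""
    total = Counter()
    for protein in proteins:
        total.update(count_kmers(protein, k))
    # Counter returns 0 for k-mers that never occur.
    return [total[kmer] for kmer in vocabulary]


# Toy input: made-up sequences, not real AMR data.
proteins = ["MKTAYIAK", "MKTA"]
vocabulary = ["MKT", "KTA", "TAY", "AAA"]
print(feature_vector(proteins, 3, vocabulary))  # [2, 2, 1, 0]
```

Each entry of the vector is one candidate feature for a downstream model. Because every feature names a short peptide, a feature the model weights heavily can be traced back to the proteins that contain it.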

2021 ◽  
pp. 1-15
Author(s):  
Mohammed Ayub ◽  
El-Sayed M. El-Alfy

Web technology has become an indispensable part of human life for almost all activities. At the same time, cyberattacks are on the rise in today’s Web-driven world. Effective countermeasures for the analysis and detection of malicious websites are therefore crucial to combat the rising threats to cyber security. In this paper, we systematically reviewed the state-of-the-art techniques and identified a total of about 230 features of malicious websites, which are classified as internal and external features. We also developed a toolkit for the analysis and modeling of malicious websites. The toolkit implements several types of feature extraction methods and machine learning algorithms, which can be used to analyze and compare different approaches to detecting malicious URLs. It also incorporates options such as feature selection and imbalanced learning, and it can be extended with more functionality and generalization capabilities. Finally, some use cases are demonstrated for different datasets.
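
The abstract does not list the roughly 230 features or describe the toolkit's interface. The sketch below therefore only illustrates the general idea of extracting features from a URL string. The feature names and the function `lexical_url_features` are illustrative assumptions, not the toolkit's API.

```python
import ipaddress
from urllib.parse import urlparse


def lexical_url_features(url: str) -> dict:
    """Compute a few simple, string-level features of a URL (illustrative only)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    try:
        # Succeeds only when the host is a literal IPv4/IPv6 address.
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "host_is_ip": host_is_ip,
        "uses_https": parsed.scheme == "https",
        "has_at_symbol": "@" in url,
    }


print(lexical_url_features("http://192.168.0.1/login"))
```

Dictionaries like this one can be stacked into a feature matrix and passed to any classifier. Whether a given feature is informative depends on the dataset.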


Sensors ◽  
2021 ◽  
Vol 21 (17) ◽  
pp. 5779
Author(s):  
Runqiong Wang ◽  
Qinghua Song ◽  
Zhanqiang Liu ◽  
Haifeng Ma ◽  
Munish Kumar Gupta ◽  
...  

Data-driven chatter detection techniques avoid complex physical modeling and provide the basis for industrial applications of cutting process monitoring. Feature extraction is the key step of chatter detection: if the extracted features are highly correlated with the milling condition, they can compensate to some extent for the accuracy disadvantage of machine learning algorithms. However, the classification accuracy of current feature extraction methods is not satisfactory, and a combination of multiple features is required to identify chatter. This limits the development of unsupervised machine learning algorithms for chatter detection, which in turn limits their application in practical processing. In this paper, the fractal feature of the signal is extracted by the structure function method (SFM) for the first time, which solves the problem that the features are easily affected by process parameters. Milling chatter is identified with the k-means algorithm, which avoids the complex process of training a model, and the method for judging milling chatter is also discussed. The proposed method achieves 94.4% identification accuracy using only a single signal feature, which is better than other feature extraction methods and even better than some supervised machine learning algorithms. Moreover, experiments show that chatter affects the distribution of the cutting bending moment, and that it is not reliable to monitor tool wear through the polar plot of the bending moment. This provides a theoretical basis for the application of unsupervised machine learning algorithms in chatter detection.
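
A minimal sketch of the two ingredients named above follows. It assumes the common textbook form of the structure function method, in which the mean squared increment scales as S(τ) ∝ τ^(4 − 2D), so the fractal dimension D follows from the slope of log S against log τ. The paper's exact formulation, lag choices and signals are not given in the abstract. The function names and all numeric values are hypothetical.

```python
import math


def structure_function(signal, lag):
    """Mean squared increment of the signal at a given lag (in samples)."""
    diffs = [(signal[i + lag] - signal[i]) ** 2 for i in range(len(signal) - lag)]
    return sum(diffs) / len(diffs)


def fractal_dimension_sfm(signal, lags):
    """Estimate D from the least-squares slope of log S(lag) versus log lag."""
    xs = [math.log(lag) for lag in lags]
    ys = [math.log(structure_function(signal, lag)) for lag in lags]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    # Assumed scaling law: S ~ lag^(4 - 2D), hence D = (4 - slope) / 2.
    return (4.0 - slope) / 2.0


def kmeans_1d(values, iterations=50):
    """Two-cluster k-means on scalar features; returns (labels, centers)."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        for c in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers


# Sanity check: a straight ramp is a smooth curve, so D should be close to 1.
ramp = [0.5 * i for i in range(200)]
print(fractal_dimension_sfm(ramp, [1, 2, 4, 8, 16]))

# Made-up per-segment fractal dimensions, split into two groups without training.
labels, centers = kmeans_1d([1.10, 1.20, 1.15, 1.80, 1.85, 1.90])
print(labels, centers)
```

Deciding which of the two clusters corresponds to chatter requires the judgment method discussed in the paper, which this sketch does not attempt.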


Entropy ◽  
2020 ◽  
Vol 22 (11) ◽  
pp. 1310
Author(s):  
Ioannis Triantafyllou ◽  
Ioannis C. Drivas ◽  
Georgios Giannakopoulos

Acquiring knowledge about users’ opinions and what they say regarding specific features within an app constitutes a solid stepping stone for understanding their needs and concerns. App review utilization helps project management teams identify threats and opportunities for app software maintenance, optimization and strategic marketing purposes. Nevertheless, classifying app user reviews to identify valuable gems of information for app software improvement is a complex and multidimensional issue. It requires foresight and multiple combinations of sophisticated text pre-processing, feature extraction and machine learning methods to efficiently classify app reviews into specific topics. Against this backdrop, we propose a novel feature engineering classification schema that is capable of identifying, more efficiently and earlier, terms within reviews that could be classified into specific topics. To this end, we present a novel feature extraction method, DEVMAX.DF, combined with different machine learning algorithms as a solution to app review classification problems. Going one step further, a simulation of a real case scenario is carried out to validate the effectiveness of the proposed classification schema on different apps. After multiple experiments, results indicate that the proposed schema outperforms other term extraction methods such as TF.IDF and χ2 in classifying app reviews into topics. The paper thereby contributes to the knowledge of researchers and practitioners, with the purpose of reinforcing their decision-making process within the realm of app review utilization.
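
The abstract does not define DEVMAX.DF, so it is not reproduced here. For orientation, the sketch below shows one common variant of the TF.IDF baseline it is compared against: term frequency normalized by document length, multiplied by an unsmoothed natural-log inverse document frequency. The tokenized toy reviews and the function name are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Made-up reviews, already lowercased and tokenized.
reviews = [["app", "crashes", "often"], ["app", "is", "great"]]
for weights in tf_idf(reviews):
    print(weights)
```

A term that appears in every review, such as "app" here, gets weight zero, while terms specific to one review are ranked higher. Term extraction methods differ mainly in how they score that specificity.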

