Malware Variant Identification Using Incremental Clustering

Electronics ◽  
2021 ◽  
Vol 10 (14) ◽  
pp. 1628
Author(s):  
Paul Black ◽  
Iqbal Gondal ◽  
Adil Bagirov ◽  
Md Moniruzzaman

Dynamic analysis and pattern matching techniques are widely used in industry, and they provide a straightforward method for the identification of malware samples. Yara is a pattern matching technique that can use sandbox memory dumps for the identification of malware families. However, pattern matching techniques fail silently due to minor code variations, leading to unidentified malware samples. This paper presents a two-layered Malware Variant Identification using Incremental Clustering (MVIIC) process and proposes clustering of unidentified malware samples to enable the identification of malware variants and new malware families. The novel incremental clustering algorithm is used in the identification of new malware variants from the unidentified malware samples. This research shows that clustering can provide a higher level of performance than Yara rules, and that clustering is resistant to small changes introduced by malware variants. This paper proposes a hybrid approach, using Yara scanning to eliminate known malware, followed by clustering, acting in concert, to allow the identification of new malware variants. F1 score and V-Measure clustering metrics are used to evaluate our results.
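The two-layer idea in this abstract (signature scanning first, then clustering of whatever remains unidentified) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the MVIIC implementation: plain byte-substring matching stands in for Yara rules, byte bigrams with Jaccard similarity stand in for the paper's features, and a simple leader-style pass stands in for the authors' incremental clustering algorithm. The names `match_signature`, `incremental_cluster`, and `two_layer_identify` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


def match_signature(sample: bytes, signatures: Dict[str, bytes]) -> Optional[str]:
    """Layer 1 (stand-in for Yara): first family whose byte pattern occurs in the sample."""
    for family, pattern in signatures.items():
        if pattern in sample:
            return family
    return None


def byte_ngrams(data: bytes, n: int = 2) -> Set[bytes]:
    """Set of all length-n byte substrings; an assumed, deliberately simple feature."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: Set[bytes], b: Set[bytes]) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def incremental_cluster(samples: List[bytes], threshold: float = 0.5) -> List[List[int]]:
    """Layer 2 (assumed leader-style clustering): each sample joins the most
    similar existing cluster leader if similarity >= threshold, else starts a new cluster."""
    leaders: List[Set[bytes]] = []
    clusters: List[List[int]] = []
    for index, sample in enumerate(samples):
        features = byte_ngrams(sample)
        best, best_sim = -1, 0.0
        for cluster_id, leader in enumerate(leaders):
            sim = jaccard(features, leader)
            if sim > best_sim:
                best, best_sim = cluster_id, sim
        if best >= 0 and best_sim >= threshold:
            clusters[best].append(index)
        else:
            leaders.append(features)
            clusters.append([index])
    return clusters


def two_layer_identify(
    samples: List[bytes], signatures: Dict[str, bytes], threshold: float = 0.5
) -> Tuple[Dict[int, str], List[List[int]]]:
    """Return (known: sample index -> family, clusters of unidentified sample indices)."""
    known: Dict[int, str] = {}
    unknown_ids: List[int] = []
    for i, sample in enumerate(samples):
        family = match_signature(sample, signatures)
        if family is None:
            unknown_ids.append(i)
        else:
            known[i] = family
    local = incremental_cluster([samples[i] for i in unknown_ids], threshold)
    # Map cluster members back to indices in the original sample list.
    clusters = [[unknown_ids[j] for j in members] for members in local]
    return known, clusters
```

Because samples are processed one at a time and only compared with existing cluster leaders, new samples can be added without re-clustering earlier ones. That is the property the sketch is meant to show; the similarity threshold of 0.5 is an arbitrary illustrative value.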

2021 ◽  
Vol 13 (9) ◽  
pp. 4648
Author(s):  
Rana Muhammad Adnan ◽  
Kulwinder Singh Parmar ◽  
Salim Heddam ◽  
Shamsuddin Shahid ◽  
Ozgur Kisi

The accurate estimation of suspended sediments (SSs) is important for determining the volume of dam storage, river carrying capacity, pollution susceptibility, soil erosion potential, and aquatic ecological impacts, and for the design and operation of hydraulic structures. The presented study proposes a new method for accurately estimating daily SSs using antecedent discharge and sediment information. The novel method is developed by hybridizing the multivariate adaptive regression spline (MARS) and the K-means clustering algorithm (MARS–KM). The proposed method's efficacy is established by comparing its performance with the adaptive neuro-fuzzy system (ANFIS), MARS, and M5 tree (M5Tree) models in predicting SSs at two stations situated on the Yangtze River of China, according to three assessment measures: root mean square error (RMSE), mean absolute error (MAE), and Nash–Sutcliffe efficiency (NSE). Two modeling scenarios are employed: data are divided 50–50% for model training and testing in the first scenario, and the training and test data sets are swapped in the second scenario. At Guangyuan Station, the MARS–KM improved on the ANFIS, MARS, and M5Tree methods in terms of RMSE by 39%, 30%, and 18% in the first scenario and by 24%, 22%, and 8% in the second scenario, respectively. At Beibei Station, the corresponding improvement in RMSE over ANFIS, MARS, and M5Tree was 34%, 26%, and 27% in the first scenario and 7%, 16%, and 6% in the second scenario, respectively. Additionally, the MARS–KM models provided much more satisfactory estimates using only discharge values as inputs.
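For readers unfamiliar with the three assessment measures named above, the following minimal sketch gives their standard textbook definitions. The function names are hypothetical. The `relative_improvement` helper reflects an assumption that the reported percentages are relative reductions in RMSE against each baseline model; the abstract does not state the formula.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error: sqrt(mean((o - p)^2))."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def mae(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: mean(|o - p|)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def nse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - sum((o - p)^2) / sum((o - mean(o))^2).
    Equals 1 for a perfect fit; 0 means no better than predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Assumed definition: percentage reduction in error relative to the baseline."""
    return 100.0 * (baseline_error - new_error) / baseline_error
```

Lower RMSE and MAE indicate better estimates, whereas NSE improves as it approaches 1, so the three measures are read in opposite directions when comparing models.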


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Monika Jurkeviciute ◽  
Amia Enam ◽  
Johanna Torres-Bonilla ◽  
Henrik Eriksson

Background: Summative eHealth evaluations frequently lack quality, which affects the generalizability of the evidence and its use in practice and further research. To guarantee quality, a number of activities are recommended in the guidelines for evaluation planning. This study aimed to examine a case of eHealth evaluation planning in a multi-national and interdisciplinary setting and to provide recommendations for eHealth evaluation planning guidelines. Methods: An empirical eHealth evaluation process was developed through a case study. The empirical process was compared with selected guidelines for eHealth evaluation planning using a pattern-matching technique. Results: Planning in the interdisciplinary and multi-national team demanded extensive negotiation and alignment to support the future use of the evidence created. The evaluation planning guidelines did not provide specific strategies for different set-ups of the evaluation teams. Further, they did not address important aspects of quality evaluation, such as feasibility analysis of the outcome measures and data collection, monitoring of data quality, and consideration of the methods and measures employed in similar evaluations. Conclusions: Activities to prevent quality problems need to be incorporated in the guidelines for evaluation planning. Additionally, evaluators could benefit from guidance in evaluation planning related to the different set-ups of the evaluation teams.


2019 ◽  
Author(s):  
Suhas Srinivasan ◽  
Nathan T. Johnson ◽  
Dmitry Korkin

Single-cell RNA sequencing (scRNA-seq) is a recent technology that enables fine-grained discovery of cellular subtypes and specific cell states. It routinely uses machine learning methods, such as feature learning, clustering, and classification, to assist in uncovering novel information from scRNA-seq data. However, current methods are not well suited to deal with the substantial amount of noise created by the experiments or with the variation that occurs due to differences among cells of the same type. Here, we develop a new hybrid approach, Deep Unsupervised Single-cell Clustering (DUSC), that integrates feature generation based on a deep learning architecture with a model-based clustering algorithm to find a compact and informative representation of the single-cell transcriptomic data that generates robust clusters. We also include a technique to estimate an efficient number of latent features in the deep learning model. Our method outperforms both classical and state-of-the-art feature learning and clustering methods, approaching the accuracy of supervised learning. The method is freely available to the community and will hopefully facilitate our understanding of the cellular atlas of living organisms as well as provide the means to improve patient diagnostics and treatment.

