Heterogeneous Graph Matching Networks for Unknown Malware Detection

Author(s):  
Shen Wang ◽  
Zhengzhang Chen ◽  
Xiao Yu ◽  
Ding Li ◽  
Jingchao Ni ◽  
...  

Information systems are widely targeted by malware attacks. Traditional signature-based malicious program detection algorithms can only detect known malware and are prone to evasion techniques such as binary obfuscation, while behavior-based approaches rely heavily on malware training samples and incur prohibitively high training costs. To address the limitations of existing techniques, we propose MatchGNet, a heterogeneous Graph Matching Network model that learns the graph representation and similarity metric simultaneously, based on invariant graph modeling of the program's execution behaviors. We conduct a systematic evaluation of our model and show that it is accurate in detecting malicious program behavior and can help detect malware attacks with fewer false positives. MatchGNet outperforms the state-of-the-art algorithms in malware detection by generating 50% fewer false positives while keeping zero false negatives.

Entropy ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. 395
Author(s):  
Héctor D. Menéndez ◽  
David Clark ◽  
Earl T. Barr

Malware detection is in a coevolutionary arms race where the attackers and defenders are constantly seeking advantage. This arms race is asymmetric: detection is harder and more expensive than evasion. White hats must be conservative to avoid false positives when searching for malicious behaviour. We seek to redress this imbalance. Most of the time, black hats need only make incremental changes to evade them. On occasion, white hats make a disruptive move and find a new technique that forces black hats to work harder. Examples include system calls, signatures and machine learning. We present a method, called Hothouse, that combines simulation and search to accelerate the white hat’s ability to counter the black hat’s incremental moves, thereby forcing black hats to perform disruptive moves more often. To realise Hothouse, we evolve EEE, an entropy-based polymorphic packer for Windows executables. Playing the role of a black hat, EEE uses evolutionary computation to disrupt the creation of malware signatures. We enter EEE into the detection arms race with VirusTotal, the most prominent cloud service for running anti-virus tools on software. During our six-month study, we continually improved EEE in response to VirusTotal, eventually learning a packer that produces packed malware whose evasiveness goes from an initial 51.8% median to 19.6%. We report both how well VirusTotal learns to detect EEE-packed binaries and how well VirusTotal forgets in order to reduce false positives. VirusTotal’s tools learn and forget fast, in about 3 days. We also show where VirusTotal focuses its detection efforts, by analysing EEE’s variants.


2021 ◽  
Vol 11 (19) ◽  
pp. 9243
Author(s):  
Jože Rožanec ◽  
Elena Trajkova ◽  
Klemen Kenda ◽  
Blaž Fortuna ◽  
Dunja Mladenić

While increasing empirical evidence suggests that global time series forecasting models can achieve better forecasting performance than local ones, there is a research void regarding when and why the global models fail to provide a good forecast. This paper uses anomaly detection algorithms and explainable artificial intelligence (XAI) to answer when and why a forecast should not be trusted. To address this issue, a dashboard was built to inform the user regarding (i) the relevance of the features for that particular forecast, (ii) which training samples most likely influenced the forecast outcome, (iii) why the forecast is considered an outlier, and (iv) a range of counterfactual examples that show how value changes in the feature vector can lead to a different outcome. Moreover, a modular architecture and a methodology were developed to iteratively remove noisy data instances from the training set, to enhance the overall global time series forecasting model performance. Finally, to test the effectiveness of the proposed approach, it was validated on two publicly available real-world datasets.


2020 ◽  
Vol 14 (3) ◽  
pp. 95-114
Author(s):  
Ravi Kiran Varma Penmatsa ◽  
Akhila Kalidindi ◽  
S. Kumar Reddy Mallidi

Malware is a malicious program that can cause a security breach of a system. Malware detection and classification is one of the burning topics of research in information security. Executable files are the major source of input for static malware detection. Machine learning techniques are very efficient in behavior-based malware detection and need a dataset of malware with different features. In Windows, malware can be detected by analyzing the portable executable (PE) files. This work contributes to identifying the minimum feature set for malware detection by employing a rough-set-dependent feature significance combined with Ant Colony Optimization (ACO) as the heuristic search technique. A malware dataset named claMP, with both integrated features and raw features, was considered as the benchmark dataset for this work. The analytical results show that 97.15% and 92.8% data size optimization has been achieved with a minimum loss of accuracy for the claMP integrated and raw datasets, respectively.
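
The rough-set notion of feature dependency mentioned above can be illustrated with a small sketch. This is not the authors' implementation: it only computes the standard dependency degree (the fraction of samples whose feature values uniquely determine the class label) on a made-up toy table. The function name `dependency_degree` and the data are hypothetical, and the ACO search is not shown.

```python
from collections import defaultdict

def dependency_degree(rows, feature_idx, label_idx):
    """Fraction of rows whose values on feature_idx determine the label."""
    # Group the labels of rows that are indistinguishable on the chosen features.
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[i] for i in feature_idx)
        classes[key].append(row[label_idx])
    # Positive region: groups in which every row carries the same label.
    positive = sum(len(labels) for labels in classes.values()
                   if len(set(labels)) == 1)
    return positive / len(rows)

# Toy table (hypothetical): columns are (f0, f1, label).
rows = [
    (0, 0, "benign"),
    (0, 1, "malware"),
    (1, 0, "malware"),
    (1, 0, "benign"),   # conflicts with the previous row
]

full = dependency_degree(rows, [0, 1], 2)   # both features
only_f0 = dependency_degree(rows, [0], 2)   # f1 dropped
significance_f1 = full - only_f0            # loss in dependency without f1
```

A feature whose removal lowers the dependency degree is significant; a search heuristic can then look for the smallest feature subset that keeps the degree close to that of the full set.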


2020 ◽  
Vol 34 (06) ◽  
pp. 10369-10376
Author(s):  
Peng Gao ◽  
Hao Zhang

Loop closure detection is a fundamental problem for simultaneous localization and mapping (SLAM) in robotics. Most previous methods consider only one type of information, based on either visual appearances or spatial relationships of landmarks. In this paper, we introduce a novel visual-spatial information preserving multi-order graph matching approach for long-term loop closure detection. Our approach constructs a graph representation of a place from an input image to integrate visual-spatial information, including visual appearances of the landmarks and the background environment, as well as the second and third-order spatial relationships between two and three landmarks, respectively. Furthermore, we formulate loop closure detection as a multi-order graph matching problem to compute a similarity score directly from the graph representations of the query and template images, instead of performing conventional vector-based image matching. We evaluate the proposed multi-order graph matching approach on two public long-term loop closure detection benchmark datasets, the St. Lucia and CMU-VL datasets. Experimental results show that our approach is effective for long-term loop closure detection and that it outperforms the previous state-of-the-art methods.


2018 ◽  
Vol 12 (3) ◽  
pp. 599-607
Author(s):  
Daniel P. Howsmon ◽  
Nihat Baysal ◽  
Bruce A. Buckingham ◽  
Gregory P. Forlenza ◽  
Trang T. Ly ◽  
...  

Background: As evidence emerges that artificial pancreas systems improve clinical outcomes for patients with type 1 diabetes, the burden of this disease will hopefully begin to be alleviated for many patients and caregivers. However, reliance on automated insulin delivery potentially means patients will be slower to act when devices stop functioning appropriately. One such scenario involves an insulin infusion site failure, where the insulin that is recorded as delivered fails to affect the patient’s glucose as expected. Alerting patients to these events in real time would potentially reduce hyperglycemia and ketosis associated with infusion site failures. Methods: An infusion site failure detection algorithm was deployed in a randomized crossover study with artificial pancreas and sensor-augmented pump (SAP) arms in an outpatient setting. Each arm lasted two weeks. Nineteen participants wore infusion sets for up to 7 days. Clinicians contacted patients to confirm infusion site failures detected by the algorithm and instructed them on set replacement if failure was confirmed. Results: In real time and under zone model predictive control, the infusion site failure detection algorithm achieved a sensitivity of 88.0% (n = 25) while issuing only 0.22 false positives per day, compared with a sensitivity of 73.3% (n = 15) and 0.27 false positives per day in the SAP arm (as indicated by retrospective analysis). No association between intervention strategy and duration of infusion sets was observed (P = .58). Conclusions: As patient burden is reduced by each generation of advanced diabetes technology, fault detection algorithms will help ensure that patients are alerted when they need to manually intervene. Clinical Trial Identifier: www.clinicaltrials.gov, NCT02773875
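
The reported sensitivities can be checked with simple arithmetic. The sketch below is illustrative only: the detected-event counts (22 of 25 and 11 of 15) are not stated in the abstract and are inferred here from the reported percentages and sample sizes, and the function name `sensitivity` is hypothetical.

```python
def sensitivity(true_positives, false_negatives):
    """Sensitivity (recall): detected failures / all actual failures."""
    return true_positives / (true_positives + false_negatives)

# Counts inferred from the reported figures (assumption, not from the source):
# 88.0% of n = 25 failures -> 22 detected, 3 missed.
# 73.3% of n = 15 failures -> 11 detected, 4 missed.
ap_arm = sensitivity(22, 3)
sap_arm = sensitivity(11, 4)

print(f"first reported result (n = 25):  {ap_arm:.1%}")
print(f"SAP arm (n = 15):                {sap_arm:.1%}")
```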


Author(s):  
Adam M. Pike ◽  
Jordan Whitney ◽  
Thomas Hedblom ◽  
Susannah Clear

This study is a preliminary investigation of the effects of levels of wet retroreflectivity of pavement markings on factors that determine robust feature detection in machine vision and light detection and ranging (LiDAR) systems in continuously wet road conditions. Luminance and Weber contrast of a range of pavement markings were characterized as functions of wet retroreflectivity and distance based on calibrated charge-coupled device (CCD) camera measurements. Both were found to trend with wet retroreflectivity over the range of distances considered in this study. Artifacts arising from glare sources in wet conditions and their intensities relative to pavement markings of different wet retroreflectivity levels were demonstrated. Image data suggest that markings with high wet retroreflectivity may help to mitigate identification of these artifacts as false positives in lane awareness/lane detection algorithms. As LiDAR presents a viable sensor fusion approach to identifying and avoiding these false positives and artifacts in both nighttime wet and daytime wet road conditions, LiDAR return was characterized on pavement markings comprising both optics designed only for dry retroreflectivity and optics designed to be retroreflective in both dry and wet conditions. Preliminary results suggest that for common pavement marking constructions based on exposed beaded optics that might be completely immersed by a rainstorm or puddling, incorporation of high index (n~2.4) wet retroreflective beaded optics is likely to be advantageous to both visible machine vision systems and LiDAR for detection of those retroreflective markings in both night and day.
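
Weber contrast is conventionally defined as the difference between target and background luminance divided by the background luminance. A minimal sketch, assuming the pavement marking is the target and the surrounding wet pavement is the background; the function name and the luminance values are made up for illustration and are not measurements from the study.

```python
def weber_contrast(target_luminance, background_luminance):
    """Weber contrast: (L_target - L_background) / L_background."""
    if background_luminance <= 0:
        raise ValueError("background luminance must be positive")
    return (target_luminance - background_luminance) / background_luminance

# Hypothetical luminances in cd/m^2 (illustrative values only).
marking = 30.0
pavement = 10.0
contrast = weber_contrast(marking, pavement)
print(contrast)
```

A marking that is brighter than its background gives a positive contrast, and a value near zero means the marking is hard to separate from the pavement.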


2020 ◽  
Vol 17 (4A) ◽  
pp. 607-614
Author(s):  
Mohammad Abuthawabeh ◽  
Khaled Mahmoud

Signature-based malware detection algorithms struggle to cope with the massive number of threats in the Android environment. In this paper, conversation-level network traffic features are extracted and used in a supervised model. This model was used to enhance the process of Android malware detection, categorization, and family classification. The model employs the ensemble learning technique in order to select the most useful features among the extracted features. A real-world dataset called CICAndMal2017 was used in this paper. The results show that the Extra-trees classifier achieved the highest weighted accuracy among the classifiers: 87.75%, 79.97%, and 66.71% for malware detection, malware categorization, and malware family classification, respectively. A comparison with another study that uses the same dataset was made. This study achieved a significant enhancement in malware family classification and malware categorization. For malware family classification, the enhancement was 39.71% for precision and 41.09% for recall. The rate of enhancement for Android malware categorization was 30.2% and 31.14% for precision and recall, respectively.


2014 ◽  
Vol 2014 ◽  
pp. 1-11
Author(s):  
Chao Wang ◽  
Zhizhong Wu ◽  
Xi Li ◽  
Xuehai Zhou ◽  
Aili Wang ◽  
...  

This paper presents SmartMal, a novel service-oriented behavioral malware detection framework for vehicular and mobile devices. The highlight of SmartMal is the introduction of service-oriented architecture (SOA) concepts and behavior analysis into malware detection paradigms. The proposed framework relies on a client-server architecture: the client continuously extracts various features and transfers them to the server, and the server’s main task is to detect anomalies using state-of-the-art detection algorithms. Multiple distributed servers simultaneously analyze the feature vector using various detectors, and information fusion is used to concatenate the results of the detectors. We also propose a cycle-based statistical approach for mobile device anomaly detection. We accomplish this by analyzing the users’ regular usage patterns. Empirical results suggest that the proposed framework and novel anomaly detection algorithm are highly effective in detecting malware on Android devices.


2021 ◽  
Vol 50 (3) ◽  
pp. 27-28
Author(s):  
Immanuel Trummer

Introduction. We have seen significant advances in the state of the art in natural language processing (NLP) over the past few years [20]. These advances have been driven by new neural network architectures, in particular the Transformer model [19], as well as the successful application of transfer learning approaches to NLP [13]. Typically, training for specific NLP tasks starts from large language models that have been pre-trained on generic tasks (e.g., predicting obfuscated words in text [5]) for which large amounts of training data are available. Using such models as a starting point reduces task-specific training cost as well as the number of required training samples by orders of magnitude [7]. These advances motivate new use cases for NLP methods in the context of databases.

