A Deep Learning Approach for Detection of Application Layer Attacks in Internet

Handling Priority Inversion in Time-Constrained Distributed Databases - Advances in Data Mining and Database Management ◽

10.4018/978-1-7998-2491-6.ch010 ◽

2020 ◽

pp. 175-188

Author(s):

V. Punitha ◽

C. Mala

Keyword(s):

Deep Learning ◽

Transport Layer ◽

Classification Model ◽

Learning Approach ◽

Ddos Attacks ◽

Application Layer ◽

Learning Models ◽

Technological Transformation ◽

Application Deployment ◽

Machine Learning Models

The recent technological transformation in application deployment, together with the growing availability of applications, induces attackers to shift their target to the services provided by the application layer. Application layer DoS or DDoS attacks are launched only after a connection to the server has been established, which makes them stealthier than network or transport layer attacks. Existing defence mechanisms are ineffective at detecting application layer DoS or DDoS attacks. Hence, this chapter proposes a novel deep learning classification model that uses an autoencoder to detect application layer DDoS attacks by measuring deviations in the incoming network traffic. The experimental results show that the proposed deep autoencoder model detects application layer attacks in HTTP traffic more proficiently than existing machine learning models.
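The idea of detecting attacks by measuring deviations from normal traffic can be sketched as follows. This is a minimal illustration, not the chapter's model: the autoencoder is replaced by a trivial stand-in that "reconstructs" every sample as the per-feature mean of benign traffic, and the feature names, sample values, and the mean-plus-k-standard-deviations threshold are all assumptions made for the example.

```python
from statistics import mean, pstdev

def fit_baseline(benign):
    """Per-feature mean of benign traffic (stand-in for a trained autoencoder)."""
    n_features = len(benign[0])
    return [mean(row[j] for row in benign) for j in range(n_features)]

def deviation(sample, baseline):
    """Mean squared difference between a sample and its stand-in reconstruction."""
    return sum((x - b) ** 2 for x, b in zip(sample, baseline)) / len(sample)

def fit_threshold(benign, baseline, k=3.0):
    """Assumed rule: mean benign deviation plus k standard deviations."""
    errors = [deviation(row, baseline) for row in benign]
    return mean(errors) + k * pstdev(errors)

def is_attack(sample, baseline, threshold):
    """Flag a sample whose deviation exceeds the benign threshold."""
    return deviation(sample, baseline) > threshold

# Hypothetical features: [requests per second, mean request size in units of 100 bytes]
benign = [[10.0, 5.0], [12.0, 5.5], [11.0, 4.5], [9.0, 5.0], [10.0, 5.0]]
baseline = fit_baseline(benign)
threshold = fit_threshold(benign, baseline)

print(is_attack([200.0, 1.0], baseline, threshold))  # flood of small requests
print(is_attack([11.0, 5.0], baseline, threshold))   # ordinary traffic
```

With a real autoencoder, `deviation` would compare each sample with the network's reconstruction of it rather than with a fixed mean; the thresholding step would stay the same.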


Understanding Natural Disaster Scenes from Mobile Images Using Deep Learning

Applied Sciences ◽

10.3390/app11093952 ◽

2021 ◽

Vol 11 (9) ◽

pp. 3952

Author(s):

Shimin Tang ◽

Zhiqiang Chen

Keyword(s):

Deep Learning ◽

Natural Disaster ◽

Scene Understanding ◽

Computing Methods ◽

Classification Model ◽

Learning Approach ◽

Learning Models ◽

Damage Level ◽

Feature Extractor ◽

Mobile Imaging

With the ubiquitous use of mobile imaging devices, the collection of perishable disaster-scene data has become unprecedentedly easy. However, computing methods are unable to understand these images, which carry significant complexity and uncertainty. In this paper, the authors investigate the problem of disaster-scene understanding through a deep-learning approach. Two image attributes are considered: hazard type and damage level. Three deep-learning models are trained, and their performance is assessed. Specifically, the best model for hazard-type prediction has an overall accuracy (OA) of 90.1%, and the best damage-level classification model has an explainable OA of 62.6%; both models adopt the Faster R-CNN architecture with a ResNet50 network as a feature extractor. It is concluded that hazard types are more identifiable than damage levels in disaster-scene images. Further insights include that damage-level recognition suffers more from inter- and intra-class variations, and that the treatment of hazard-agnostic damage leveling further contributes to the underlying uncertainties.


A Radiogenomics Ensemble to Predict EGFR and KRAS Mutations in NSCLC

Tomography ◽

10.3390/tomography7020014 ◽

2021 ◽

Vol 7 (2) ◽

pp. 154-168

Author(s):

Silvia Moreno ◽

Mario Bonfante ◽

Eduardo Zurek ◽

Dmitry Cherezov ◽

Dmitry Goldgof ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Kras Mutation ◽

Learning Approach ◽

Learning Models ◽

Kras Mutations ◽

Machine Learning Approach ◽

Class Average ◽

Public Datasets ◽

Machine Learning Models

Lung cancer causes more deaths globally than any other type of cancer. To determine the best treatment, detecting EGFR and KRAS mutations is of interest. However, non-invasive ways to obtain this information are not available. Furthermore, sufficiently large, relevant public datasets are often lacking, so the performance of single classifiers is not outstanding. In this paper, an ensemble approach is applied to increase the performance of EGFR and KRAS mutation prediction using a small dataset. A new voting scheme, Selective Class Average Voting (SCAV), is proposed, and its performance is assessed for both machine learning models and CNNs. For the EGFR mutation, the machine learning approach showed an increase in sensitivity from 0.66 to 0.75 and an increase in AUC from 0.68 to 0.70. With the deep learning approach, an AUC of 0.846 was obtained, and with SCAV, the accuracy of the model increased from 0.80 to 0.857. For the KRAS mutation, a significant increase in performance was found in both the machine learning models (0.65 to 0.71 AUC) and the deep learning models (0.739 to 0.778 AUC). The results obtained in this work show how to learn effectively from small image datasets to predict EGFR and KRAS mutations, and that using ensembles with SCAV increases the performance of machine learning classifiers and CNNs. The results provide confidence that, as large datasets become available, tools to augment clinical capabilities can be fielded.


A Deep Learning Approach with Feature Derivation and Selection for Overdue Repayment Forecasting

Applied Sciences ◽

10.3390/app10238491 ◽

2020 ◽

Vol 10 (23) ◽

pp. 8491

Author(s):

Bin Liu ◽

Zhexi Zhang ◽

Junchi Yan ◽

Ning Zhang ◽

Hongyuan Zha ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Short Term Memory ◽

Critical Time ◽

Learning Approach ◽

Learning Models ◽

Comparison Results ◽

Online Lending ◽

Machine Learning Models

Risk control has always been a major challenge in finance. Overdue repayment is a frequently encountered discreditable behavior in online lending. Motivated by the powerful capabilities of deep neural networks, we propose a fusion deep learning approach, namely AD-MBLSTM, based on the deep neural network (DNN), multi-layer bi-directional long short-term memory (BiLSTM) and the attention mechanism, for forecasting overdue repayment behavior from historical repayment records. Furthermore, we present a novel feature derivation and selection method for data preprocessing. Visualization and interpretability improvement work is also implemented to explore the critical time points and causes of overdue repayment behavior. In addition, we present a new dataset originating from a practical application scenario in online lending. We evaluate our proposed framework on the dataset and compare its performance with various general machine learning models and neural network models. Comparison results and the ablation study demonstrate that our proposed model outperforms many effective general machine learning models by a large margin, and that each sub-component plays an active and indispensable role.


Towards Generative Design of Computationally Efficient Mathematical Models with Evolutionary Learning

Entropy ◽

10.3390/e23010028 ◽

2020 ◽

Vol 23 (1) ◽

pp. 28

Author(s):

Anna V. Kalyuzhnaya ◽

Nikolay O. Nikitin ◽

Alexander Hvatov ◽

Mikhail Maslyaev ◽

Mikhail Yachmenkov ◽

...

Keyword(s):

Mathematical Models ◽

Learning Approach ◽

Model Structure ◽

Evolutionary Learning ◽

Learning Models ◽

Computationally Efficient ◽

Performance Models ◽

Generative Design ◽

Computational Resources ◽

Machine Learning Models

In this paper, we describe the concept of a generative design approach applied to the automated evolutionary learning of mathematical models in a computationally efficient way. To formalize the problems of model design and co-design, a generalized formulation of the modeling workflow is proposed. A parallelized evolutionary learning approach for the identification of model structure is described for equation-based models and composite machine learning models. Moreover, the involvement of performance models in the design process is analyzed. A set of experiments with various models and computational resources is conducted to verify different aspects of the proposed approach.


Machine Learning-Based Malicious X.509 Certificates’ Detection

Applied Sciences ◽

10.3390/app11052164 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2164

Author(s):

Jiaxin Li ◽

Zhaoxin Zhang ◽

Changyong Guo

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Ensemble Learning ◽

Traffic Analysis ◽

Learning Models ◽

Detection Model ◽

Analysis Tools ◽

Average Accuracy ◽

Machine Learning Models

X.509 certificates play an important role in encrypting the transmission of data on both sides under HTTPS. With the popularization of X.509 certificates, more and more criminals leverage certificates to prevent their communications from being exposed by malicious traffic analysis tools. Phishing sites and malware are good examples. Those X.509 certificates found in phishing sites or malware are called malicious X.509 certificates. This paper applies different machine learning models, including classical machine learning models, ensemble learning models, and deep learning models, to distinguish between malicious certificates and benign certificates with Verification for Extraction (VFE). The VFE is a system we design and implement for obtaining plentiful characteristics of certificates. The result shows that ensemble learning models are the most stable and efficient models with an average accuracy of 95.9%, which outperforms many previous works. In addition, we obtain an SVM-based detection model with an accuracy of 98.2%, which is the highest accuracy. The outcome indicates the VFE is capable of capturing essential and crucial characteristics of malicious X.509 certificates.


A Physics-Infused Deep Learning Model for the Prediction of Refractive Indices and Its Use for the Large-Scale Screening of Organic Compound Space

10.26434/chemrxiv.8796950 ◽

2019 ◽

Author(s):

Mojtaba Haghighatlari ◽

Gaurav Vishwakarma ◽

Mohammad Atif Faiz Afzal ◽

Johannes Hachmann

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Large Scale ◽

Organic Molecules ◽

Learning Model ◽

Training Data ◽

Refractive Indices ◽

Learning Models ◽

Deep Learning Model ◽

Machine Learning Models

We present a multitask, physics-infused deep learning model to accurately and efficiently predict refractive indices (RIs) of organic molecules, and we apply it to a library of 1.5 million compounds. We show that it outperforms earlier machine learning models by a significant margin, and that incorporating known physics into data-derived models provides valuable guardrails. Using a transfer learning approach, we augment the model to reproduce results consistent with higher-level computational chemistry training data, but with a considerably reduced number of corresponding calculations. Prediction errors of machine learning models are typically smallest for commonly observed target property values, consistent with the distribution of the training data. However, since our goal is to identify candidates with unusually large RI values, we propose a strategy to boost the performance of our model in the remoter areas of the RI distribution: We bias the model with respect to the under-represented classes of molecules that have values in the high-RI regime. By adopting a metric popular in web search engines, we evaluate our effectiveness in ranking top candidates. We confirm that the models developed in this study can reliably predict the RIs of the top 1,000 compounds, and are thus able to capture their ranking. We believe that this is the first study to develop a data-derived model that ensures the reliability of RI predictions by model augmentation in the extrapolation region on such a large scale. These results underscore the tremendous potential of machine learning in facilitating molecular (hyper)screening approaches on a massive scale and in accelerating the discovery of new compounds and materials, such as organic molecules with high-RI for applications in opto-electronics.


Product Review Ranking in e-Commerce using Urgency Level Classification Approach

Jurnal Online Informatika ◽

10.15575/join.v5i2.612 ◽

2020 ◽

Vol 5 (2) ◽

pp. 212

Author(s):

Hamdi Ahmad Zuhri ◽

Nur Ulfa Maulidevi

Keyword(s):

Deep Learning ◽

Classification Model ◽

Support Vector ◽

Learning Models ◽

Classification Approach ◽

Value Range ◽

High Bias ◽

Product Domains ◽

Urgency Level ◽

Bayesian Support

Review ranking is useful to give users a better experience. Review ranking studies commonly use the upvote value, which does not represent urgency and therefore causes problems in prediction. In contrast, manual labeling over a range as wide as the upvote values introduces high bias and inconsistency. The proposed solution is to use a classification approach to rank reviews, where the labels are ordinal urgency classes. The experiment involved shallow learning models (Logistic Regression, Naïve Bayesian, Support Vector Machine, and Random Forest) and deep learning models (LSTM and CNN). In constructing a classification model, the problem is broken down into several binary classifications that predict tendencies of urgency depending on the separation of classes. The results show that deep learning models outperform the other models in classification and ranking evaluation. In addition, the review data used tend to contain vocabulary from certain product domains, so further research is needed on data with a more diverse vocabulary.
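One common way to break an ordinal problem into binary ones is to train, for each class boundary k, a classifier that estimates P(y > k), and then recombine the estimates into class probabilities. The sketch below assumes that scheme; the abstract does not specify the exact decomposition used, so the function names, the four-level urgency scale, and the use of the expected class index as a ranking key are illustrative assumptions only.

```python
def ordinal_class_probs(p_greater):
    """Combine K-1 binary estimates P(y > k) into K class probabilities."""
    k = len(p_greater) + 1
    probs = []
    for c in range(k):
        upper = 1.0 if c == 0 else p_greater[c - 1]   # P(y >= c)
        lower = 0.0 if c == k - 1 else p_greater[c]   # P(y > c)
        # Independently trained classifiers can disagree, so clip at zero.
        probs.append(max(upper - lower, 0.0))
    total = sum(probs)  # always >= 1 before normalising, so never zero
    return [p / total for p in probs]

def urgency_score(p_greater):
    """Expected class index, used here as an assumed ranking key."""
    return sum(c * p for c, p in enumerate(ordinal_class_probs(p_greater)))

# Hypothetical outputs of three binary classifiers for two reviews
# on a four-level urgency scale (0 = lowest, 3 = highest).
reviews = {
    "review_a": [0.9, 0.6, 0.2],
    "review_b": [0.2, 0.1, 0.0],
}
ranked = sorted(reviews, key=lambda name: urgency_score(reviews[name]), reverse=True)
print(ranked)
```

Ranking by the expected class index keeps the ordering information that a plain argmax over classes would discard when two reviews fall in the same predicted class.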


Application of Bioactivity Profile Based Fingerprints for Building Machine Learning Models

10.26434/chemrxiv.6969584 ◽

2018 ◽

Cited By ~ 1

Author(s):

Noé Sturm ◽

Jiangming Sun ◽

Yves Vandriessche ◽

Andreas Mayr ◽

Günter Klambauer ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

High Throughput ◽

Scaffold Hopping ◽

Learning Models ◽

Industrial Data ◽

Structural Descriptors ◽

Bioactivity Profile ◽

Machine Learning Models

This article describes an application of high-throughput fingerprints (HTSFP) built upon industrial data accumulated over the years. The fingerprint was used to build machine learning models (multi-task deep learning + SVM) for compound activity predictions towards a panel of 131 targets. Quality of the predictions and the scaffold hopping potential of the HTSFP were systematically compared to traditional structural descriptors ECFP.


A Mask-guided Attention Deep Learning Model for COVID-19 Diagnosis based on an Integrated CT Scan Images Database

10.36227/techrxiv.18166667.v1 ◽

2022 ◽

Author(s):

Maede Maftouni ◽

Bo Shen ◽

Andrew Chung Chee Law ◽

Niloofar Ayoobi Yazdi ◽

Zhenyu Kong

Keyword(s):

Deep Learning ◽

Ct Scan ◽

Imaging Modality ◽

Learning Model ◽

Classification Performance ◽

Computer Assisted ◽

Learning Approach ◽

Learning Models ◽

Task Learning ◽

Data Efficiency

The global extent of COVID-19 mutations and the consequent depletion of hospital resources highlighted the necessity of effective computer-assisted medical diagnosis. COVID-19 detection mediated by deep learning models can help diagnose this highly contagious disease and lower infectivity and mortality rates. Computed tomography (CT) is the preferred imaging modality for building automatic COVID-19 screening and diagnosis models. It is well-known that the training set size significantly impacts the performance and generalization of deep learning models. However, accessing a large dataset of CT scan images from an emerging disease like COVID-19 is challenging. Therefore, data efficiency becomes a significant factor in choosing a learning model. To this end, we present a multi-task learning approach, namely, a mask-guided attention (MGA) classifier, to improve the generalization and data efficiency of COVID-19 classification on lung CT scan images. The novelty of this method is compensating for the scarcity of data by employing more supervision with lesion masks, increasing the sensitivity of the model to COVID-19 manifestations, and helping both generalization and classification performance. Our proposed model achieves better overall performance than the single-task baseline and state-of-the-art models, as measured by various popular metrics. In our experiment with different percentages of data from our curated dataset, the classification performance gain from this multi-task learning approach is more significant for the smaller training sizes. Furthermore, experimental results demonstrate that our method enhances the focus on the lesions, as witnessed by both attention and attribution maps, resulting in a more interpretable model.


Predictive modelling of turbofan engine components condition using machine and deep learning methods

Eksploatacja i Niezawodnosc - Maintenance and Reliability ◽

10.17531/ein.2021.2.16 ◽

2021 ◽

Vol 23 (2) ◽

pp. 359-370

Author(s):

Michał Matuszczak ◽

Mateusz Żbikowski ◽

Andrzej Teodorczyk

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Predictive Modelling ◽

Continuous Variable ◽

Environmental Data ◽

Bayesian Optimization ◽

Aviation Industry ◽

Learning Models ◽

Turbofan Engine ◽

Machine Learning Models

The article proposes an approach based on deep and machine learning models to predict component failure as an enhancement of the condition-based maintenance scheme of a turbofan engine, and reviews the prognostics approaches currently used in the aviation industry. A component degradation scale representing life consumption is proposed, and the condition data collected in this way are combined with engine sensor and environmental data. Using data manipulation techniques, a framework for model training is created, and the models' hyperparameters are obtained through Bayesian optimization. The models predict a continuous variable representing condition based on the input. The best-performing model is identified by determining its score on the holdout set. Deep learning models achieved an MSE of 0.71 (ensemble meta-model of neural networks) and significantly outperformed machine learning models, whose best score was 1.75. The deep learning models showed that they can predict the component condition to within less than 1 unit of error on the rank scale.
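Selecting the best model by its holdout score can be sketched in a few lines. The example below is illustrative only: the model names, holdout targets, and predictions are invented, and mean squared error (MSE) is computed by hand rather than with the authors' tooling.

```python
def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def best_model(holdout_y, predictions):
    """Return the name of the model with the lowest holdout MSE, plus all scores."""
    scores = {name: mse(holdout_y, preds) for name, preds in predictions.items()}
    return min(scores, key=scores.get), scores

# Hypothetical condition ranks on a holdout set and two candidate models.
holdout_y = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "model_a": [1.5, 2.5, 2.5, 4.5],  # every prediction is off by 0.5
    "model_b": [2.0, 3.0, 4.0, 5.0],  # every prediction is off by 1.0
}
winner, scores = best_model(holdout_y, predictions)
print(winner, scores)
```

Because MSE is in squared units, its square root gives the typical error on the original scale; for the reported 0.71 that is roughly 0.84, which is consistent with an error of less than 1 unit on the rank scale.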
