An Effective Phishing Detection Model Based on Character Level Convolutional Neural Network from URL

Electronics ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 1514
Author(s):  
Ali Aljofey ◽  
Qingshan Jiang ◽  
Qiang Qu ◽  
Mingqing Huang ◽  
Jean-Pierre Niyigena

Phishing is one of the easiest forms of cybercrime: it aims to entice people into revealing sensitive information such as account IDs, bank details, and passwords. This type of cyberattack is usually triggered by emails, instant messages, or phone calls. Existing anti-phishing techniques are based mainly either on source-code features, which require scraping the content of web pages, or on third-party services, which slow down the classification of phishing URLs. Although machine learning techniques have lately been used to detect phishing, they require substantial manual feature engineering and are not adept at detecting emerging phishing attacks. Owing to the recent rapid development of deep learning, many deep learning-based methods have also been introduced to enhance classification performance. In this paper, a fast deep learning-based model that uses a character-level convolutional neural network (CNN) to detect phishing from the URL of the website is proposed. The proposed model does not require retrieving the target website's content or using any third-party services. It captures information and sequential patterns of URL strings without requiring prior knowledge about phishing, and then uses the sequential pattern features for fast classification of the actual URL. For evaluation, comparisons are provided between different traditional machine learning models and deep learning models using various feature sets such as hand-crafted, character-embedding, character-level TF-IDF, and character-level count-vector features. According to the experiments, the proposed model achieved an accuracy of 95.02% on our dataset and accuracies of 98.58%, 95.46%, and 95.22% on benchmark datasets, outperforming the existing phishing URL models.
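A character-level model of this kind consumes URLs as fixed-length integer sequences rather than hand-crafted features. The following is a minimal sketch of that input encoding; the alphabet, the 200-character length, and the sample URL are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Assumed URL alphabet; the paper's exact character set and sequence length
# are not given here, so both are illustrative choices.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%"
CHAR2IDX = {c: i + 1 for i, c in enumerate(ALPHABET)}  # index 0 = padding/unknown

def encode_url(url: str, max_len: int = 200) -> np.ndarray:
    """Map each character of the URL to an integer index, truncating or
    zero-padding to a fixed length, as a character-level CNN expects."""
    ids = [CHAR2IDX.get(c, 0) for c in url.lower()[:max_len]]
    return np.array(ids + [0] * (max_len - len(ids)), dtype=np.int64)

x = encode_url("http://paypal-secure.example.com/login")  # hypothetical URL
```

In a full model, this integer sequence would pass through an embedding layer and stacked 1-D convolutions before the final classifier.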

Sensors ◽  
2019 ◽  
Vol 19 (1) ◽  
pp. 210 ◽  
Author(s):  
Zied Tayeb ◽  
Juri Fedjaev ◽  
Nejla Ghaboosi ◽  
Christoph Richter ◽  
Lukas Everding ◽  
...  

Non-invasive, electroencephalography (EEG)-based brain-computer interfaces (BCIs) for motor imagery translate the subject's motor intention into control signals by classifying the EEG patterns caused by different imagination tasks, e.g., hand movements. This type of BCI has been widely studied and used as an alternative mode of communication and environmental control for disabled patients, such as those suffering from a brainstem stroke or a spinal cord injury (SCI). Notwithstanding the success of traditional machine learning methods in classifying EEG signals, these methods still rely on hand-crafted features, whose extraction is difficult due to the high non-stationarity of EEG signals and is a major cause of the stagnating progress in classification performance. Remarkable advances in deep learning methods allow end-to-end learning without any feature engineering, which could benefit BCI motor imagery applications. We developed three deep learning models: (1) a long short-term memory network (LSTM); (2) a spectrogram-based convolutional neural network model (pCNN); and (3) a recurrent convolutional neural network (RCNN), for decoding motor imagery movements directly from raw EEG signals without any manual feature engineering. Results were evaluated on our own publicly available EEG dataset collected from 20 subjects and on the existing "2b EEG dataset" from "BCI Competition IV". Overall, better classification performance was achieved with the deep learning models than with state-of-the-art machine learning techniques, which could chart a route ahead for developing new robust techniques for EEG signal decoding. We underpin this point by demonstrating the successful real-time control of a robotic arm using our CNN-based BCI.
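The spectrogram-based model above takes a time-frequency image of each trial as input. A minimal sketch of that preprocessing step follows; the 250 Hz sampling rate, the window parameters, and the synthetic single-channel trial (a 10 Hz mu-rhythm oscillation plus noise) are assumptions for illustration.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 250                                   # assumed EEG sampling rate (Hz)
rng = np.random.default_rng(0)
t = np.arange(0, 4.0, 1 / fs)              # one 4 s motor-imagery trial
# Synthetic single-channel trial: a 10 Hz mu-rhythm oscillation plus noise.
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)

# Time-frequency image of the trial; a 2-D array of this shape is what a
# spectrogram-based CNN would take as input.
freqs, times, Sxx = spectrogram(eeg, fs=fs, nperseg=128, noverlap=64)
```

The dominant frequency of the spectrogram recovers the simulated 10 Hz rhythm, which is the kind of band-power structure motor-imagery classifiers rely on.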


2021 ◽  
Vol 9 ◽  
Author(s):  
Ashwini K ◽  
P. M. Durai Raj Vincent ◽  
Kathiravan Srinivasan ◽  
Chuan-Yu Chang

Neonatal infants communicate with us through cries, and infant cry signals have distinct patterns depending on the purpose of the cry. Preprocessing, feature extraction, and feature selection for audio signals require expert attention and considerable effort; deep learning techniques, by contrast, automatically extract and select the most important features, but require an enormous amount of data for effective classification. This work discriminates neonatal cries into pain, hunger, and sleepiness. The neonatal cry signals are transformed into spectrogram images using the short-time Fourier transform (STFT), and a deep convolutional neural network (DCNN) takes the spectrogram images as input. The features obtained from the convolutional neural network are passed to a support vector machine (SVM) classifier, which classifies the cries. This work thus combines the advantages of machine learning and deep learning techniques to get good results even with a moderate number of data samples. The experimental results show that CNN-based feature extraction with an SVM classifier provides promising results. Comparing the SVM kernels, namely radial basis function (RBF), linear, and polynomial, SVM-RBF provides the highest accuracy: the kernel-based infant cry classification system achieves 88.89% accuracy.
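The hybrid step described above, feeding CNN-derived features to an SVM with an RBF kernel, can be sketched as follows. The feature vectors here are random Gaussian clusters standing in for real CNN embeddings of the three cry classes; the dimensions and cluster parameters are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Placeholder for CNN-derived spectrogram features: three cry classes
# (pain, hunger, sleepiness), each a Gaussian cluster standing in for
# real embeddings.
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(60, 16)) for c in range(3)])
y = np.repeat([0, 1, 2], 60)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)   # the kernel the paper found best
acc = clf.score(X_te, y_te)
```

Swapping `kernel="rbf"` for `"linear"` or `"poly"` reproduces the kind of kernel comparison the abstract reports.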


2020 ◽  
Vol 79 (47-48) ◽  
pp. 36063-36075 ◽  
Author(s):  
Valentina Franzoni ◽  
Giulio Biondi ◽  
Alfredo Milani

Abstract: Crowds express emotions as a collective individual, which is evident from the sounds a crowd produces in particular events, e.g., collective booing, laughing, or cheering at sports matches, movies, theaters, concerts, political demonstrations, and riots. A critical question concerning the innovative concept of crowd emotions is whether the emotional content of crowd sounds can be characterized by frequency-amplitude features, using analysis techniques similar to those applied to individual voices, where deep learning classification is applied to spectrogram images derived from sound transformations. In this work, we present a technique based on the generation of sound spectrograms from fragments of fixed length, extracted from original audio clips recorded at high-attendance events where the crowd acts as a collective individual. Transfer learning techniques are used on a convolutional neural network pre-trained on low-level features using the well-known, extensive ImageNet dataset of visual knowledge. The original sound clips are filtered and normalized in amplitude for correct spectrogram generation, and the domain-specific features are then fine-tuned. Experiments on the finally trained convolutional neural network show the promising performance of the proposed model in classifying the emotions of the crowd.
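The audio preparation described above, amplitude normalization plus cutting clips into fixed-length fragments, can be sketched as below. The 0.95 peak target and the fragment length are illustrative assumptions; the paper's exact values are not given here.

```python
import numpy as np

def normalize_amplitude(clip: np.ndarray, peak: float = 0.95) -> np.ndarray:
    """Scale the clip so its maximum absolute amplitude equals `peak`,
    giving spectrograms a comparable dynamic range across recordings."""
    m = np.max(np.abs(clip))
    return clip if m == 0 else clip * (peak / m)

def split_fragments(clip: np.ndarray, frag_len: int) -> np.ndarray:
    """Cut a clip into consecutive fixed-length fragments (one spectrogram
    per fragment), dropping any trailing remainder."""
    n = len(clip) // frag_len
    return clip[: n * frag_len].reshape(n, frag_len)
```

Each fragment would then be turned into a spectrogram image and passed to the ImageNet-pretrained CNN for fine-tuning.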


Webology ◽  
2021 ◽  
Vol 18 (Special Issue 04) ◽  
pp. 944-962
Author(s):  
K. Niha ◽  
Dr.S. Amutha ◽  
Dr. Aisha Banu

Diseases in plants are a great challenge to the advancement of agriculture, affecting both farmers' yields and the plants themselves. In modern research, deep learning models have come into the spotlight by increasing plant disease detection accuracy and classification performance. The proposed CNN (Convolutional Neural Network) model detects seven plant diseases in addition to healthy leaves; the dataset considered in this work contains 8,685 leaf images from the PlantVillage dataset. The proposed model's performance is evaluated with respect to the performance metrics (F1 score, precision, and recall) and compared with SVM and ANN. The proposed CNN model outperforms the rest, with an accuracy of 96.2% and an F1 score greater than 95%. The feasibility of the proposed model in plant disease detection and classification may provide a solution to the problem faced by farmers.
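The macro-averaged metrics used to evaluate the model above can be computed as follows. The toy labels are invented for illustration (0 standing for healthy, 1-3 for three of the disease classes); they are not the paper's data.

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Toy labels: 0 = healthy leaf, 1-3 = three hypothetical disease classes.
y_true = [0, 1, 2, 2, 1, 0, 3]
y_pred = [0, 1, 2, 1, 1, 0, 3]

# Macro averaging weights every class equally, which matters when some
# diseases are rarer than others in the leaf dataset.
prec = precision_score(y_true, y_pred, average="macro")
rec = recall_score(y_true, y_pred, average="macro")
f1 = f1_score(y_true, y_pred, average="macro")
```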


2019 ◽  
Vol 147 (8) ◽  
pp. 2827-2845 ◽  
Author(s):  
David John Gagne II ◽  
Sue Ellen Haupt ◽  
Douglas W. Nychka ◽  
Gregory Thompson

Abstract: Deep learning models, such as convolutional neural networks, utilize multiple specialized layers to encode spatial patterns at different scales. In this study, deep learning models are compared with standard machine learning approaches on the task of predicting the probability of severe hail based on upper-air dynamic and thermodynamic fields from a convection-allowing numerical weather prediction model. The data for this study come from patches surrounding storms identified in NCAR convection-allowing ensemble runs from 3 May to 3 June 2016. The machine learning models are trained to predict whether the simulated surface hail size from the Thompson hail size diagnostic exceeds 25 mm over the hour following storm detection. A convolutional neural network is compared with logistic regressions using input variables derived from either the spatial means of each field or principal component analysis. The convolutional neural network statistically significantly outperforms all other methods in terms of Brier skill score and area under the receiver operating characteristic curve. Interpretation of the convolutional neural network through feature importance and feature optimization reveals that the network synthesized information about the environment and storm morphology that is consistent with our understanding of hail growth, including large lapse rates and a wind shear profile that favors wide updrafts. Different neurons in the network also record different storm modes, and the magnitude of the output of those neurons is used to analyze the spatiotemporal distributions of different storm modes in the NCAR ensemble.
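The Brier skill score used for the comparison above measures probabilistic forecast quality relative to a climatological baseline. A minimal sketch, with invented outcomes and forecast probabilities for illustration:

```python
import numpy as np

def brier_score(p: np.ndarray, y: np.ndarray) -> float:
    """Mean squared error between forecast probabilities and binary outcomes."""
    return float(np.mean((p - y) ** 2))

def brier_skill_score(p: np.ndarray, y: np.ndarray) -> float:
    """Skill relative to always forecasting the event's base rate:
    1 is a perfect forecast, 0 matches climatology, negative is worse."""
    bs = brier_score(p, y)
    bs_ref = brier_score(np.full_like(p, y.mean()), y)
    return 1.0 - bs / bs_ref

# Invented example: y = severe-hail occurrence, p = forecast probability.
y = np.array([0, 0, 1, 1, 0, 1], dtype=float)
p = np.array([0.1, 0.2, 0.8, 0.7, 0.3, 0.9])
bss = brier_skill_score(p, y)
```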


Over the past few years, tremendous development has been observed in the fields of computer technology and artificial intelligence, especially the use of machine learning in research and industry. Human effort in recognition, learning, prediction, and many other areas can be further reduced using machine learning and deep learning. Recognizing handwritten digits in documents that exist in digital form, such as images, is a challenging task. The proposed system can recognize handwritten digits in any document that has been converted into digital format. The proposed model combines a Convolutional Neural Network (CNN), a deep learning approach, with Local Binary Patterns (LBP) used for feature extraction. To classify more effectively, we also use a Support Vector Machine to distinguish similar-looking digits such as 1 and 7 or 5 and 6. The proposed CNN-and-LBP system is implemented in Python and is tested on different images of handwritten digits taken from the MNIST dataset. Using the proposed model, we were able to achieve 98.74% accuracy in predicting the digits in image format.
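The LBP feature extraction mentioned above can be sketched with plain NumPy. This is the basic 8-neighbour variant with a 256-bin code histogram; the paper's exact LBP parameters are not given here, so this is an illustrative implementation.

```python
import numpy as np

def lbp_image(img: np.ndarray) -> np.ndarray:
    """Basic 8-neighbour Local Binary Pattern: each interior pixel becomes
    an 8-bit code marking which neighbours are >= the centre pixel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # Neighbour offsets, clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    centre = img[1:-1, 1:-1]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        out |= (neigh >= centre).astype(np.uint8) << bit
    return out

def lbp_histogram(img: np.ndarray) -> np.ndarray:
    """256-bin normalised histogram of LBP codes, usable as a feature
    vector for an SVM or alongside CNN features."""
    codes = lbp_image(img)
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```

On a 28x28 MNIST image, `lbp_histogram` yields a fixed 256-dimensional texture descriptor regardless of image content.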


Author(s):  
Dr. K. Naveen Kumar

Abstract: Recently, a machine learning (ML) area called deep learning emerged in the computer-vision field and became very popular in many fields. It started from an event in late 2012, when a deep-learning approach based on a convolutional neural network (CNN) won an overwhelming victory in the best-known worldwide computer-vision competition, ImageNet Classification. Since then, researchers in many fields, including medical image analysis, have started actively participating in the explosively growing field of deep learning. In this paper, deep learning techniques and their applications to medical image analysis are surveyed. The survey covers (1) standard ML techniques in the computer-vision field, (2) what changed in ML with the introduction of deep learning, (3) ML models in deep learning, and (4) applications of deep learning to medical image analysis. The comparison between ML before and after deep learning reveals that ML with feature input (feature-based ML) was dominant before the introduction of deep learning, and that the major and essential difference introduced by deep learning is learning from image data directly, without object segmentation or feature extraction; this is the source of the power of deep learning, although the depth of the model is also an important attribute. The survey also reveals a long history of deep-learning techniques in the class of ML with image input, predating the term "deep learning" itself: even before the term existed, the class of ML with image input was applied to various problems in medical image analysis, including classification between lesions and non-lesions, classification between lesion types, segmentation of lesions or organs, and detection of lesions.
ML with image input, including deep learning, is a very powerful, versatile technology with high performance, which can bring the current state-of-the-art performance level of medical image analysis to the next level. "Deep learning", or ML with image input, in medical image analysis is an explosively growing, promising field, and it is expected to be the mainstream technology in medical image analysis in the next few decades. Keywords: Deep learning, Convolutional neural network, Massive-training artificial neural network, Computer-aided diagnosis, Medical image analysis, Classification


Author(s):  
Joy Iong-Zong Chen ◽  
Kong-Long Lai

With the exponential increase in internet usage, numerous organisations, including the financial industry, have operationalized online services, and massive financial losses occur as a result of the global growth in financial fraud. Advanced financial fraud detection systems can therefore actively detect risks such as illegal transactions and irregular attacks. Over recent years, these issues have been tackled to a large extent by means of data mining and machine learning techniques. However, in terms of unknown attack-pattern identification, big-data analytics, and computation speed, several improvements must still be made. This paper proposes a financial fraud detection scheme based on a Deep Convolutional Neural Network (DCNN). When a large volume of data is involved, detection accuracy can be enhanced by using this technique. Existing machine learning models, an auto-encoder model, and other deep learning models are compared with the proposed model on a real-time credit card fraud dataset. In the experimental results, the proposed model obtained a detection accuracy of 99% within a time duration of 45 seconds.
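The core building block of a DCNN over tabular transaction features is a 1-D convolution followed by a nonlinearity. A minimal NumPy sketch of one such forward pass follows; the feature count, filter shapes, and random filter values are illustrative assumptions, not trained weights from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def conv1d(x: np.ndarray, kernels: np.ndarray) -> np.ndarray:
    """Valid-mode 1-D convolution with ReLU: x is (length,), kernels is
    (n_filters, width). Returns (n_filters, length - width + 1) feature maps."""
    n_f, w = kernels.shape
    windows = np.lib.stride_tricks.sliding_window_view(x, w)  # (L - w + 1, w)
    return np.maximum(kernels @ windows.T, 0.0)               # ReLU activation

x = rng.normal(size=30)       # stand-in for one transaction's 30 scaled features
k = rng.normal(size=(8, 5))   # 8 filters of width 5 (random here, learned in practice)
fmap = conv1d(x, k)
```

Stacking several such layers, then pooling and a sigmoid output, yields the fraud/legitimate classifier the abstract describes.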


2021 ◽  
Vol 11 (3) ◽  
pp. 194-201
Author(s):  
Van-Tu Nguyen ◽  
◽  
Anh-Cuong Le ◽  
Ha-Nam Nguyen

Automatically determining similar questions and ranking them according to their similarity to each input question is a very important task for any community Question Answering (cQA) system. Various methods have been applied to this task, including conventional machine learning methods with feature extraction and, in some recent studies, deep learning methods. This paper addresses the problem of how to combine the advantages of different methods in one unified model. Moreover, deep learning models are usually effective only on large datasets, while training sets in cQA problems are often small, so integrating external knowledge into deep learning models becomes all the more important for this problem. To this end, we propose a neural network-based model that combines a Convolutional Neural Network (CNN) with features from other methods, so that the deep learning model is enhanced with additional knowledge sources. In our proposed model, the CNN component learns the representations of the two given questions, which are then combined with additional features through a Multilayer Perceptron (MLP) to measure the similarity between the two questions. We tested our proposed model on the SemEval 2016 Task 3 dataset and obtained better results than previous studies on the same task.
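The fusion step above, combining the two CNN question representations with additional hand-crafted features before the MLP, can be sketched as follows. The absolute-difference term and the example feature values are illustrative assumptions; the paper specifies only that representations and additional features are combined.

```python
import numpy as np

def mlp_input(q1_vec: np.ndarray, q2_vec: np.ndarray,
              extra: np.ndarray) -> np.ndarray:
    """Fuse the two question representations (stand-ins for CNN outputs)
    with hand-crafted features into one vector for the similarity MLP.
    The |q1 - q2| term is an illustrative choice, not taken from the paper."""
    return np.concatenate([q1_vec, q2_vec, np.abs(q1_vec - q2_vec), extra])

q1 = np.array([0.2, 0.7, -0.1])  # hypothetical CNN representation of question 1
q2 = np.array([0.1, 0.6, 0.3])   # hypothetical CNN representation of question 2
extra = np.array([0.85, 0.4])    # e.g., word-overlap and TF-IDF similarity scores
v = mlp_input(q1, q2, extra)
```

The resulting vector `v` would be the input layer of the MLP that scores the question pair.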

