A Malicious URL Detection Model Based on Convolutional Neural Network
With the development of Internet technology, network security faces diverse threats. In particular, attackers can spread malicious uniform resource locators (URLs) to carry out attacks such as phishing and spam. Research on malicious URL detection is therefore significant for defending against these attacks. However, current research still has some problems: malicious features cannot be extracted efficiently, and some existing detection methods are easily evaded by attackers. To solve these problems, we design a malicious URL detection model based on a dynamic convolutional neural network (DCNN). The model adds a new folding layer to the original multilayer convolutional network and replaces the pooling layer with a k-max-pooling layer. In the dynamic convolution algorithm, the width of the feature maps in the middle layers depends on the dimension of the input vector. Moreover, the pooling layer parameters are dynamically adjusted according to the length of the input URL and the depth of the current convolution layer, which helps extract deeper features over a wider range. We also propose a new embedding method in which word embedding based on character embedding is used to learn the vector representation of a URL. We conduct two groups of comparative experiments. In the first group, three experiments use the same network structure with different embedding methods; the results show that word embedding based on character embedding achieves higher accuracy. In the second group, three further experiments use the embedding method proposed in this paper with different network structures to determine which network is most suitable for our model. These experiments verify that the model designed in this paper achieves the highest accuracy (98%) in detecting malicious URLs.
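
The dynamic adjustment of the pooling parameter described above can be illustrated with a minimal sketch. It assumes the rule from the original DCNN formulation by Kalchbrenner et al., k_l = max(k_top, ceil((L - l) / L * s)), where s is the input length, L the total number of convolution layers, l the current layer, and k_top a fixed pooling size for the top layer. Whether this model uses exactly that rule is an assumption, and the function names `dynamic_k` and `k_max_pooling` are hypothetical.

```python
def dynamic_k(seq_len, layer, total_layers, k_top):
    """Pooling size k for a convolution layer (layers are 1-indexed).

    Assumed rule: k = max(k_top, ceil((L - l) / L * s)).
    """
    numerator = (total_layers - layer) * seq_len
    # Integer ceiling division avoids floating-point rounding issues.
    scaled = (numerator + total_layers - 1) // total_layers
    return max(k_top, scaled)


def k_max_pooling(values, k):
    """Keep the k largest values of a sequence in their original order."""
    k = min(k, len(values))
    # Pick the indices of the k largest values, then restore sequence order.
    largest = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in sorted(largest)]


if __name__ == "__main__":
    # Hypothetical setting: a URL of 40 tokens, 3 convolution layers, k_top = 4.
    for layer in (1, 2, 3):
        print(layer, dynamic_k(40, layer, 3, 4))
    print(k_max_pooling([0.1, 0.9, 0.3, 0.7, 0.2], 3))
```

Under this rule, longer URLs keep more activations in the lower layers, while the top layer always outputs a fixed number of values, so URLs of different lengths end up with a representation of the same size.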