EmailDetective: An Email Authorship Identification And Verification Model

2020 ◽  
Vol 63 (11) ◽  
pp. 1775-1787
Author(s):  
Yong Fang ◽  
Yue Yang ◽  
Cheng Huang

Abstract Emails are often used to illegal cybercrime today, so it is important to verify the identity of the email author. This paper proposes a general model for solving the problem of anonymous email author attribution, which can be used in email authorship identification and email authorship verification. The first situation is to find the author of an anonymous email among the many suspected targets. Another situation is to verify if an email was written by the sender. This paper extracts features from the email header and email body and analyzes the writing style and other behaviors of email authors. The behaviors of email authors are extracted through a statistical algorithm from email headers. Moreover, the author’s writing style in the email body is extracted by a sequence-to-sequence bidirectional long short-term memory (BiLSTM) algorithm. This model combines multiple factors to solve the problem of anonymous email author attribution. The experiments proved that the accuracy and other indicators of proposed model are better than other methods. In email authorship verification experiment, our average accuracy, average recall and average F1-score reached 89.9%. In email authorship identification experiment, our model’s accuracy rate is 98.9% for 10 authors, 92.9% for 25 authors and 89.5% for 50 authors.

2021 ◽  
Vol 11 (19) ◽  
pp. 8995
Author(s):  
Eunju Lee ◽  
Dohee Kim ◽  
Hyerim Bae

The purpose of this study is to improve the prediction of container volumes in Busan ports by applying external variables and time-series data decomposition methods to deep learning prediction models. Previous studies on container volume forecasting were based on traditional statistical methodologies, such as ARIMA, SARIMA, and regression. However, these methods do not explain the complexity and variability of data caused by changes in the external environment, such as the global financial crisis and economic fluctuations. Deep learning can explore the inherent patterns of data and analyze the characteristics (time series, external environmental variables, and outliers); hence, the accuracy of deep learning-based volume prediction models is better than that of traditional models. However, this does not include the study of overall trends (upward, steady, or downward). In this study, a novel deep learning prediction model is proposed that combines prediction and trend identification of container volume. The proposed model explores external variables that are related to container volume, combining port volume time-series decomposition with external variables and deep learning-based multivariate long short-term memory (LSTM) prediction. The results indicate that the proposed model performs better than the traditional LSTM model and follows the trend simultaneously.


Author(s):  
Nujud Aloshban ◽  
Anna Esposito ◽  
Alessandro Vinciarelli

AbstractDepression is one of the most common mental health issues. (It affects more than 4% of the world’s population, according to recent estimates.) This article shows that the joint analysis of linguistic and acoustic aspects of speech allows one to discriminate between depressed and nondepressed speakers with an accuracy above 80%. The approach used in the work is based on networks designed for sequence modeling (bidirectional Long-Short Term Memory networks) and multimodal analysis methodologies (late fusion, joint representation and gated multimodal units). The experiments were performed over a corpus of 59 interviews (roughly 4 hours of material) involving 29 individuals diagnosed with depression and 30 control participants. In addition to an accuracy of 80%, the results show that multimodal approaches perform better than unimodal ones owing to people’s tendency to manifest their condition through one modality only, a source of diversity across unimodal approaches. In addition, the experiments show that it is possible to measure the “confidence” of the approach and automatically identify a subset of the test data in which the performance is above a predefined threshold. It is possible to effectively detect depression by using unobtrusive and inexpensive technologies based on the automatic analysis of speech and language.


Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 495
Author(s):  
Imayanmosha Wahlang ◽  
Arnab Kumar Maji ◽  
Goutam Saha ◽  
Prasun Chakrabarti ◽  
Michal Jasinski ◽  
...  

This article experiments with deep learning methodologies in echocardiogram (echo), a promising and vigorously researched technique in the preponderance field. This paper involves two different kinds of classification in the echo. Firstly, classification into normal (absence of abnormalities) or abnormal (presence of abnormalities) has been done, using 2D echo images, 3D Doppler images, and videographic images. Secondly, based on different types of regurgitation, namely, Mitral Regurgitation (MR), Aortic Regurgitation (AR), Tricuspid Regurgitation (TR), and a combination of the three types of regurgitation are classified using videographic echo images. Two deep-learning methodologies are used for these purposes, a Recurrent Neural Network (RNN) based methodology (Long Short Term Memory (LSTM)) and an Autoencoder based methodology (Variational AutoEncoder (VAE)). The use of videographic images distinguished this work from the existing work using SVM (Support Vector Machine) and also application of deep-learning methodologies is the first of many in this particular field. It was found that deep-learning methodologies perform better than SVM methodology in normal or abnormal classification. Overall, VAE performs better in 2D and 3D Doppler images (static images) while LSTM performs better in the case of videographic images.


Author(s):  
Azim Heydari ◽  
Meysam Majidi Nezhad ◽  
Davide Astiaso Garcia ◽  
Farshid Keynia ◽  
Livio De Santoli

AbstractAir pollution monitoring is constantly increasing, giving more and more attention to its consequences on human health. Since Nitrogen dioxide (NO2) and sulfur dioxide (SO2) are the major pollutants, various models have been developed on predicting their potential damages. Nevertheless, providing precise predictions is almost impossible. In this study, a new hybrid intelligent model based on long short-term memory (LSTM) and multi-verse optimization algorithm (MVO) has been developed to predict and analysis the air pollution obtained from Combined Cycle Power Plants. In the proposed model, long short-term memory model is a forecaster engine to predict the amount of produced NO2 and SO2 by the Combined Cycle Power Plant, where the MVO algorithm is used to optimize the LSTM parameters in order to achieve a lower forecasting error. In addition, in order to evaluate the proposed model performance, the model has been applied using real data from a Combined Cycle Power Plant in Kerman, Iran. The datasets include wind speed, air temperature, NO2, and SO2 for five months (May–September 2019) with a time step of 3-h. In addition, the model has been tested based on two different types of input parameters: type (1) includes wind speed, air temperature, and different lagged values of the output variables (NO2 and SO2); type (2) includes just lagged values of the output variables (NO2 and SO2). The obtained results show that the proposed model has higher accuracy than other combined forecasting benchmark models (ENN-PSO, ENN-MVO, and LSTM-PSO) considering different network input variables. Graphic abstract


2021 ◽  
pp. 1-10
Author(s):  
Hye-Jeong Song ◽  
Tak-Sung Heo ◽  
Jong-Dae Kim ◽  
Chan-Young Park ◽  
Yu-Seop Kim

Sentence similarity evaluation is a significant task used in machine translation, classification, and information extraction in the field of natural language processing. When two sentences are given, an accurate judgment should be made whether the meaning of the sentences is equivalent even if the words and contexts of the sentences are different. To this end, existing studies have measured the similarity of sentences by focusing on the analysis of words, morphemes, and letters. To measure sentence similarity, this study uses Sent2Vec, a sentence embedding, as well as morpheme word embedding. Vectors representing words are input to the 1-dimension convolutional neural network (1D-CNN) with various sizes of kernels and bidirectional long short-term memory (Bi-LSTM). Self-attention is applied to the features transformed through Bi-LSTM. Subsequently, vectors undergoing 1D-CNN and self-attention are converted through global max pooling and global average pooling to extract specific values, respectively. The vectors generated through the above process are concatenated to the vector generated through Sent2Vec and are represented as a single vector. The vector is input to softmax layer, and finally, the similarity between the two sentences is determined. The proposed model can improve the accuracy by up to 5.42% point compared with the conventional sentence similarity estimation models.


2021 ◽  
Vol 13 (2) ◽  
pp. 1-12
Author(s):  
Sumit Das ◽  
Manas Kumar Sanyal ◽  
Sarbajyoti Mallik

There is a lot of fake news roaming around various mediums, which misleads people. It is a big issue in this advanced intelligent era, and there is a need to find some solution to this kind of situation. This article proposes an approach that analyzes fake and real news. This analysis is focused on sentiment, significance, and novelty, which are a few characteristics of this news. The ability to manipulate daily information mathematically and statistically is allowed by expressing news reports as numbers and metadata. The objective of this article is to analyze and filter out the fake news that makes trouble. The proposed model is amalgamated with the web application; users can get real data and fake data by using this application. The authors have used the AI (artificial intelligence) algorithms, specifically logistic regression and LSTM (long short-term memory), so that the application works well. The results of the proposed model are compared with existing models.


2021 ◽  
pp. 1-17
Author(s):  
Enda Du ◽  
Yuetian Liu ◽  
Ziyan Cheng ◽  
Liang Xue ◽  
Jing Ma ◽  
...  

Summary Accurate production forecasting is an essential task and accompanies the entire process of reservoir development. With the limitation of prediction principles and processes, the traditional approaches are difficult to make rapid predictions. With the development of artificial intelligence, the data-driven model provides an alternative approach for production forecasting. To fully take the impact of interwell interference on production into account, this paper proposes a deep learning-based hybrid model (GCN-LSTM), where graph convolutional network (GCN) is used to capture complicated spatial patterns between each well, and long short-term memory (LSTM) neural network is adopted to extract intricate temporal correlations from historical production data. To implement the proposed model more efficiently, two data preprocessing procedures are performed: Outliers in the data set are removed by using a box plot visualization, and measurement noise is reduced by a wavelet transform. The robustness and applicability of the proposed model are evaluated in two scenarios of different data types with the root mean square error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE). The results show that the proposed model can effectively capture spatial and temporal correlations to make a rapid and accurate oil production forecast.


Author(s):  
Preethi D. ◽  
Neelu Khare

This chapter presents an ensemble-based feature selection with long short-term memory (LSTM) model. A deep recurrent learning model is proposed for classifying network intrusion. This model uses ensemble-based feature selection (EFS) for selecting the appropriate features from the dataset and long short-term memory for the classification of network intrusions. The EFS combines five feature selection techniques, namely information gain, gain ratio, chi-square, correlation-based feature selection, and symmetric uncertainty-based feature selection. The experiments were conducted using the standard benchmark NSL-KDD dataset and implemented using tensor flow and python. The proposed model is evaluated using the classification performance metrics and also compared with all the 41 features without any feature selection as well as with each individual feature selection technique and classified using LSTM. The performance study showed that the proposed model performs better, with 99.8% accuracy, with a higher detection and lower false alarm rates.


2020 ◽  
pp. 1-15
Author(s):  
Hongchang Sun ◽  
Yadong wang ◽  
Lanqiang Niu ◽  
Fengyu Zhou ◽  
Heng Li

Building energy consumption (BEC) prediction is very important for energy management and conservation. This paper presents a short-term energy consumption prediction method that integrates the Fuzzy Rough Set (FRS) theory and the Long Short-Term Memory (LSTM) model, and is thus named FRS-LSTM. This method can find the most directly related factors from the complex and diverse factors influencing the energy consumption, which improves the prediction accuracy and efficiency. First, the FRS is used to reduce the redundancy of the input features by the attribute reduction of the factors affecting the energy consumption forecasting, and solves the data loss problem caused by the data discretization of a classical rough set. Then, the final attribute set after reduction is taken as the input of the LSTM networks to obtain the final prediction results. To validate the effectiveness of the proposed model, this study used the actual data of a public building to predict the building’s energy consumption, and compared the proposed model with the LSTM, Levenberg-Marquardt Back Propagation (LM-BP), and Support Vector Regression (SVR) models. The experimental results reveal that the presented FRS-LSTM model achieves higher prediction accuracy compared with other comparative models.


Information ◽  
2020 ◽  
Vol 11 (3) ◽  
pp. 145 ◽  
Author(s):  
Zhenglong Xiang ◽  
Xialei Dong ◽  
Yuanxiang Li ◽  
Fei Yu ◽  
Xing Xu ◽  
...  

Most of the existing research papers study the emotion recognition of Minnan songs from the perspectives of music analysis theory and music appreciation. However, these investigations do not explore any possibility of carrying out an automatic emotion recognition of Minnan songs. In this paper, we propose a model that consists of four main modules to classify the emotion of Minnan songs by using the bimodal data—song lyrics and audio. In the proposed model, an attention-based Long Short-Term Memory (LSTM) neural network is applied to extract lyrical features, and a Convolutional Neural Network (CNN) is used to extract the audio features from the spectrum. Then, two kinds of extracted features are concatenated by multimodal compact bilinear pooling, and finally, the concatenated features are input to the classifying module to determine the song emotion. We designed three experiment groups to investigate the classifying performance of combinations of the four main parts, the comparisons of proposed model with the current approaches and the influence of a few key parameters on the performance of emotion recognition. The results show that the proposed model exhibits better performance over all other experimental groups. The accuracy, precision and recall of the proposed model exceed 0.80 in a combination of appropriate parameters.


Sign in / Sign up

Export Citation Format

Share Document