scholarly journals Geospatial Data Analytics- A Deep Learning Perspective

2020 ◽  
Author(s):  
Saneev Kumar Das ◽  
Sujit Bebortta

To visualize high-dimensional geospatial data has achieved much importance in last decades. But to analyze it, the technologies used of machine learning are not so convincing and thus it is high time to switch to a sub-domain of machine learning called deep learning, which has gained popularity because of its accuracy in processing and analyzing high-dimensional data. The convergence of deep learning with geospatial data analytics shall prove to be a boon to those who actually has a need to predict specific outputs over geospatial data. In this paper, we have presented some geospatial data generated using the GIS (Geographic Information Systems) technology and proposed ways to implement deep learning over these data. GIS technology is a mapping technology which signifies high-dimensional geospatial data and our aim is to propose a model where GIS converges with highly efficient deep neural networks such as CNN (Convolutional Neural Network), RNN (Recurrent Neural Network) and LSTM (Long Short-Term Memory Network). We have provided geospatial as well as statistical results in this paper to visualize practically the GIS technology. This paper further provides future scope of the proposed model which shall present the challenges that needs to be tackled in future and its applicability in many relevant domains.

Author(s):  
Surenthiran Krishnan ◽  
Pritheega Magalingam ◽  
Roslina Ibrahim

<span>This paper proposes a new hybrid deep learning model for heart disease prediction using recurrent neural network (RNN) with the combination of multiple gated recurrent units (GRU), long short-term memory (LSTM) and Adam optimizer. This proposed model resulted in an outstanding accuracy of 98.6876% which is the highest in the existing model of RNN. The model was developed in Python 3.7 by integrating RNN in multiple GRU that operates in Keras and Tensorflow as the backend for deep learning process, supported by various Python libraries. The recent existing models using RNN have reached an accuracy of 98.23% and deep neural network (DNN) has reached 98.5%. The common drawbacks of the existing models are low accuracy due to the complex build-up of the neural network, high number of neurons with redundancy in the neural network model and imbalance datasets of Cleveland. Experiments were conducted with various customized model, where results showed that the proposed model using RNN and multiple GRU with synthetic minority oversampling technique (SMOTe) has reached the best performance level. This is the highest accuracy result for RNN using Cleveland datasets and much promising for making an early heart disease prediction for the patients.</span>


2021 ◽  
Vol 7 ◽  
pp. e365
Author(s):  
Nikita Bhandari ◽  
Satyajeet Khare ◽  
Rahee Walambe ◽  
Ketan Kotecha

Gene promoters are the key DNA regulatory elements positioned around the transcription start sites and are responsible for regulating gene transcription process. Various alignment-based, signal-based and content-based approaches are reported for the prediction of promoters. However, since all promoter sequences do not show explicit features, the prediction performance of these techniques is poor. Therefore, many machine learning and deep learning models have been proposed for promoter prediction. In this work, we studied methods for vector encoding and promoter classification using genome sequences of three distinct higher eukaryotes viz. yeast (Saccharomyces cerevisiae), A. thaliana (plant) and human (Homo sapiens). We compared one-hot vector encoding method with frequency-based tokenization (FBT) for data pre-processing on 1-D Convolutional Neural Network (CNN) model. We found that FBT gives a shorter input dimension reducing the training time without affecting the sensitivity and specificity of classification. We employed the deep learning techniques, mainly CNN and recurrent neural network with Long Short Term Memory (LSTM) and random forest (RF) classifier for promoter classification at k-mer sizes of 2, 4 and 8. We found CNN to be superior in classification of promoters from non-promoter sequences (binary classification) as well as species-specific classification of promoter sequences (multiclass classification). In summary, the contribution of this work lies in the use of synthetic shuffled negative dataset and frequency-based tokenization for pre-processing. This study provides a comprehensive and generic framework for classification tasks in genomic applications and can be extended to various classification problems.


Information ◽  
2021 ◽  
Vol 12 (9) ◽  
pp. 374
Author(s):  
Babacar Gaye ◽  
Dezheng Zhang ◽  
Aziguli Wulamu

With the extensive availability of social media platforms, Twitter has become a significant tool for the acquisition of peoples’ views, opinions, attitudes, and emotions towards certain entities. Within this frame of reference, sentiment analysis of tweets has become one of the most fascinating research areas in the field of natural language processing. A variety of techniques have been devised for sentiment analysis, but there is still room for improvement where the accuracy and efficacy of the system are concerned. This study proposes a novel approach that exploits the advantages of the lexical dictionary, machine learning, and deep learning classifiers. We classified the tweets based on the sentiments extracted by TextBlob using a stacked ensemble of three long short-term memory (LSTM) as base classifiers and logistic regression (LR) as a meta classifier. The proposed model proved to be effective and time-saving since it does not require feature extraction, as LSTM extracts features without any human intervention. We also compared our proposed approach with conventional machine learning models such as logistic regression, AdaBoost, and random forest. We also included state-of-the-art deep learning models in comparison with the proposed model. Experiments were conducted on the sentiment140 dataset and were evaluated in terms of accuracy, precision, recall, and F1 Score. Empirical results showed that our proposed approach manifested state-of-the-art results by achieving an accuracy score of 99%.


In this paper we propose a novel supervised machine learning model to predict the polarity of sentiments expressed in microblogs. The proposed model has a stacked neural network structure consisting of Long Short Term Memory (LSTM) and Convolutional Neural Network (CNN) layers. In order to capture the long-term dependencies of sentiments in the text ordering of a microblog, the proposed model employs an LSTM layer. The encodings produced by the LSTM layer are then fed to a CNN layer, which generates localized patterns of higher accuracy. These patterns are capable of capturing both local and global long-term dependences in the text of the microblogs. It was observed that the proposed model performs better and gives improved prediction accuracy when compared to semantic, machine learning and deep neural network approaches such as SVM, CNN, LSTM, CNN-LSTM, etc. This paper utilizes the benchmark Stanford Large Movie Review dataset to show the significance of the new approach. The prediction accuracy of the proposed approach is comparable to other state-of-art approaches.


Author(s):  
Hoseok Choi ◽  
Seokbeen Lim ◽  
Kyeongran Min ◽  
Kyoung-ha Ahn ◽  
Kyoung-Min Lee ◽  
...  

Abstract Objective: With the development in the field of neural networks, Explainable AI (XAI), is being studied to ensure that artificial intelligence models can be explained. There are some attempts to apply neural networks to neuroscientific studies to explain neurophysiological information with high machine learning performances. However, most of those studies have simply visualized features extracted from XAI and seem to lack an active neuroscientific interpretation of those features. In this study, we have tried to actively explain the high-dimensional learning features contained in the neurophysiological information extracted from XAI, compared with the previously reported neuroscientific results. Approach: We designed a deep neural network classifier using 3D information (3D DNN) and a 3D class activation map (3D CAM) to visualize high-dimensional classification features. We used those tools to classify monkey electrocorticogram (ECoG) data obtained from the unimanual and bimanual movement experiment. Main results: The 3D DNN showed better classification accuracy than other machine learning techniques, such as 2D DNN. Unexpectedly, the activation weight in the 3D CAM analysis was high in the ipsilateral motor and somatosensory cortex regions, whereas the gamma-band power was activated in the contralateral areas during unimanual movement, which suggests that the brain signal acquired from the motor cortex contains information about both contralateral movement and ipsilateral movement. Moreover, the hand-movement classification system used critical temporal information at movement onset and offset when classifying bimanual movements. Significance: As far as we know, this is the first study to use high-dimensional neurophysiological information (spatial, spectral, and temporal) with the deep learning method, reconstruct those features, and explain how the neural network works. We expect that our methods can be widely applied and used in neuroscience and electrophysiology research from the point of view of the explainability of XAI as well as its performance.


2021 ◽  
Vol 11 (17) ◽  
pp. 7940
Author(s):  
Mohammed Al-Sarem ◽  
Abdullah Alsaeedi ◽  
Faisal Saeed ◽  
Wadii Boulila ◽  
Omair AmeerBakhsh

Spreading rumors in social media is considered under cybercrimes that affect people, societies, and governments. For instance, some criminals create rumors and send them on the internet, then other people help them to spread it. Spreading rumors can be an example of cyber abuse, where rumors or lies about the victim are posted on the internet to send threatening messages or to share the victim’s personal information. During pandemics, a large amount of rumors spreads on social media very fast, which have dramatic effects on people’s health. Detecting these rumors manually by the authorities is very difficult in these open platforms. Therefore, several researchers conducted studies on utilizing intelligent methods for detecting such rumors. The detection methods can be classified mainly into machine learning-based and deep learning-based methods. The deep learning methods have comparative advantages against machine learning ones as they do not require preprocessing and feature engineering processes and their performance showed superior enhancements in many fields. Therefore, this paper aims to propose a Novel Hybrid Deep Learning Model for Detecting COVID-19-related Rumors on Social Media (LSTM–PCNN). The proposed model is based on a Long Short-Term Memory (LSTM) and Concatenated Parallel Convolutional Neural Networks (PCNN). The experiments were conducted on an ArCOV-19 dataset that included 3157 tweets; 1480 of them were rumors (46.87%) and 1677 tweets were non-rumors (53.12%). The findings of the proposed model showed a superior performance compared to other methods in terms of accuracy, recall, precision, and F-score.


Electronics ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 1514
Author(s):  
Ali Aljofey ◽  
Qingshan Jiang ◽  
Qiang Qu ◽  
Mingqing Huang ◽  
Jean-Pierre Niyigena

Phishing is the easiest way to use cybercrime with the aim of enticing people to give accurate information such as account IDs, bank details, and passwords. This type of cyberattack is usually triggered by emails, instant messages, or phone calls. The existing anti-phishing techniques are mainly based on source code features, which require to scrape the content of web pages, and on third-party services which retard the classification process of phishing URLs. Although the machine learning techniques have lately been used to detect phishing, they require essential manual feature engineering and are not an expert at detecting emerging phishing offenses. Due to the recent rapid development of deep learning techniques, many deep learning-based methods have also been introduced to enhance the classification performance. In this paper, a fast deep learning-based solution model, which uses character-level convolutional neural network (CNN) for phishing detection based on the URL of the website, is proposed. The proposed model does not require the retrieval of target website content or the use of any third-party services. It captures information and sequential patterns of URL strings without requiring a prior knowledge about phishing, and then uses the sequential pattern features for fast classification of the actual URL. For evaluations, comparisons are provided between different traditional machine learning models and deep learning models using various feature sets such as hand-crafted, character embedding, character level TF-IDF, and character level count vectors features. According to the experiments, the proposed model achieved an accuracy of 95.02% on our dataset and an accuracy of 98.58%, 95.46%, and 95.22% on benchmark datasets which outperform the existing phishing URL models.


2021 ◽  
Vol 7 (2) ◽  
pp. 113-121
Author(s):  
Firman Pradana Rachman

Setiap orang mempunyai pendapat atau opini terhadap suatu produk, tokoh masyarakat, atau pun sebuah kebijakan pemerintah yang tersebar di media sosial. Pengolahan data opini itu di sebut dengan sentiment analysis. Dalam pengolahan data opini yang besar tersebut tidak hanya cukup menggunakan machine learning, namun bisa juga menggunakan deep learning yang di kombinasikan dengan teknik NLP (Natural Languange Processing). Penelitian ini membandingkan beberapa model deep learning seperti CNN (Convolutional Neural Network), RNN (Recurrent Neural Networks), LSTM (Long Short-Term Memory) dan beberapa variannya untuk mengolah data sentiment analysis dari review produk amazon dan yelp.


2020 ◽  
Vol 10 (23) ◽  
pp. 8491
Author(s):  
Bin Liu ◽  
Zhexi Zhang ◽  
Junchi Yan ◽  
Ning Zhang ◽  
Hongyuan Zha ◽  
...  

Risk control has always been a major challenge in finance. Overdue repayment is a frequently encountered discreditable behavior in online lending. Motivated by the powerful capabilities of deep neural networks, we propose a fusion deep learning approach, namely AD-MBLSTM, based on the deep neural network (DNN), multi-layer bi-directional long short-term memory (LSTM) (BiLSTM) and the attention mechanism for overdue repayment behavior forecasting according to historical repayment records. Furthermore, we present a novel feature derivation and selection method for the procedure of data preprocessing. Visualization and interpretability improvement work is also implemented to explore the critical time points and causes of overdue repayment behavior. In addition, we present a new dataset originating from a practical application scenario in online lending. We evaluate our proposed framework on the dataset and compare the performance with various general machine learning models and neural network models. Comparison results and the ablation study demonstrate that our proposed model outperforms many effective general machine learning models by a large margin, and each indispensable sub-component takes an active role.


Over the past few years there has been a tremendous developments observed in the field of computer technology and artificial intelligence, especially the use of machine learning concepts in Research and Industries. The human effort can be further more reduced in recognition, learning, predicting and many other areas using machine learning and deep learning. Any information which has been handwritten documents consisting of digits in digital form like images, recognizing such digits is a challenging task. The proposed system can recognize any handwritten digits in the document which has been converted into digital format. The proposed model includes Convolutional Neural Network (CNN), a deep learning approach with Linear Binary Pattern (LBP) used for feature extraction. In order to classify more effectively we also have used Support Vector Machine to recognize mere similar digits like 1 and 7, 5 and 6 and many others. The proposed system CNN and LBP is implemented on python language; also the system is tested with different images of handwritten digits taken from MNIST dataset. By using proposed model we could able to achieve 98.74% accuracy in predicting the digits in image format.


Sign in / Sign up

Export Citation Format

Share Document