Using deep learning for trajectory classification in imbalanced dataset

Author(s):  
Nicksson Ckayo Arrais de Freitas ◽  
Ticiana L. Coelho Da Silva ◽  
José Antônio Fernandes De Macêdo ◽  
Leopoldo Melo Júnior

Deep learning has gained much popularity in recent years due to GPU advancements, cloud computing improvements, and its superior accuracy when trained on massive datasets. As with other machine learning models, deep learning models may perform poorly when trained on imbalanced datasets. In this paper, we focus on the trajectory classification problem, and we examine deep learning techniques for coping with imbalanced class data. We extend a deep learning model, called DeepeST (Deep Learning for Sub-Trajectory classification), to predict the class or label for sub-trajectories from imbalanced datasets. To the best of the authors' knowledge, DeepeST is the first deep learning model for trajectory classification that provides approaches for coping with imbalanced datasets. We perform experiments with three real datasets of LBSN (Location-Based Social Network) trajectories to identify the user of a sub-trajectory (similar to the Trajectory-User Linking problem). We show that DeepeST outperforms other state-of-the-art deep learning approaches in terms of accuracy, precision, recall, and F1-score.
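The four metrics named above behave very differently on imbalanced data, which a small sketch can illustrate. The abstract does not say how precision, recall, and F1-score were averaged across classes; macro-averaging is assumed here, and the function name `macro_scores` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Macro-averaging is an assumption: every class counts equally,
    no matter how few samples it has.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical toy data: nine sub-trajectories from user "a", one from user "b".
# A classifier that always answers "a" looks accurate but never finds "b".
y_true = ["a"] * 9 + ["b"]
y_pred = ["a"] * 10
acc, prec, rec, f1 = macro_scores(y_true, y_pred)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
```

In this toy case accuracy stays high while macro recall drops to one half, which is why accuracy alone is a weak indicator on imbalanced datasets.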

Entropy ◽  
2021 ◽  
Vol 23 (3) ◽  
pp. 344
Author(s):  
Jeyaprakash Hemalatha ◽  
S. Abijah Roseline ◽  
Subbiah Geetha ◽  
Seifedine Kadry ◽  
Robertas Damaševičius

Recently, there has been a huge rise in malware growth, which creates a significant security threat to organizations and individuals. Despite the incessant efforts of cybersecurity research to defend against malware threats, malware developers discover new ways to evade these defense techniques. Traditional static and dynamic analysis methods are ineffective in identifying new malware and incur high overhead in terms of memory and time. Typical machine learning approaches that train a classifier based on handcrafted features are also not sufficiently potent against these evasive techniques and require more effort due to feature engineering. Recent malware detectors show performance degradation due to class imbalance in malware datasets. To resolve these challenges, this work adopts a visualization-based method, where malware binaries are depicted as two-dimensional images and classified by a deep learning model. We propose an efficient malware detection system based on deep learning. The system uses a reweighted class-balanced loss function in the final classification layer of the DenseNet model to achieve significant performance improvements in classifying malware by handling imbalanced data issues. Comprehensive experiments performed on four benchmark malware datasets show that the proposed approach can detect new malware samples with higher accuracy (98.23% for the Malimg dataset, 98.46% for the BIG 2015 dataset, 98.21% for the MaleVis dataset, and 89.48% for the unseen Malicia dataset) and reduced false-positive rates when compared with conventional malware mitigation techniques while maintaining low computational time. The proposed malware detection solution is also reliable and effective against obfuscation attacks.
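As a rough illustration of what a reweighted class-balanced loss can look like, the sketch below uses the "effective number of samples" weighting, in which a class with n samples receives a weight proportional to (1 - beta) / (1 - beta^n). The abstract does not state which formulation the authors used, so this choice, the value of `beta`, and the function names are all assumptions.

```python
import math


def class_balanced_weights(counts, beta=0.999):
    """Per-class weights from the 'effective number of samples' idea.

    Assumed formulation, not taken from the abstract: a class with n
    samples gets a raw weight (1 - beta) / (1 - beta**n). Weights are
    rescaled so that they sum to the number of classes.
    Every count must be at least 1.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    return -weights[label] * math.log(probs[label])


# Hypothetical malware family sizes: one large family, one rare family.
weights = class_balanced_weights([1000, 10])
loss_common = weighted_cross_entropy([0.5, 0.5], 0, weights)
loss_rare = weighted_cross_entropy([0.5, 0.5], 1, weights)
print(weights, loss_common, loss_rare)
```

With equal predicted probabilities, an error on the rare family contributes more to the loss than an error on the common one. Setting `beta=0.0` recovers uniform weights, that is, the ordinary unweighted loss.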


2021 ◽  
Vol 11 (11) ◽  
pp. 4753
Author(s):  
Gen Ye ◽  
Chen Du ◽  
Tong Lin ◽  
Yan Yan ◽  
Jack Jiang

(1) Background: Deep learning has become ubiquitous due to its impressive performance in domains as varied as computer vision, natural language and speech processing, and game-playing. In this work, we investigated the performance of recent deep learning approaches on the laryngopharyngeal reflux (LPR) diagnosis task. (2) Methods: Our dataset is composed of 114 subjects with 37 pH-positive cases and 77 control cases. In contrast to prior work based on either reflux finding score (RFS) or pH monitoring, we directly take laryngoscope images as inputs to neural networks, as laryngoscopy is the most common and simple diagnostic method. The diagnosis task is formulated as a binary classification problem. We first tested a powerful backbone network that incorporates residual modules, attention mechanism and data augmentation. Furthermore, recent methods in transfer learning and few-shot learning were investigated. (3) Results: On our dataset, the best test classification accuracy is 73.4%, while the best AUC value is 76.2%. (4) Conclusions: This study demonstrates that deep learning techniques can be applied to classify LPR images automatically. Although the number of pH-positive images used for training is limited, deep networks are still capable of learning discriminative features with the help of these techniques.


Water ◽  
2021 ◽  
Vol 13 (19) ◽  
pp. 2664
Author(s):  
Sunil Saha ◽  
Jagabandhu Roy ◽  
Tusar Kanti Hembram ◽  
Biswajeet Pradhan ◽  
Abhirup Dikshit ◽  
...  

Deep learning and tree-based machine learning approaches have gained immense popularity in various fields owing to their efficiency. A deep learning model, namely a convolutional neural network (CNN), an artificial neural network (ANN), and four tree-based machine learning models, namely alternating decision tree (ADTree), classification and regression tree (CART), functional tree (FTree) and logistic model tree (LMT), were used for landslide susceptibility mapping in the East Sikkim Himalaya region of India, and the results were compared. Landslide areas were delimited and mapped as a landslide inventory map (LIM) after gathering information from historical records and periodic field investigations. In the LIM, 91 landslides were plotted and randomly divided into training (64 landslides) and testing (27 landslides) subsets to train and validate the models. A total of 21 landslide conditioning factors (LCFs) were considered as model inputs, and the results of each model were categorised under five susceptibility classes. The receiver operating characteristic curve and 21 statistical measures were used to evaluate and prioritise the models. The CNN deep learning model achieved priority rank 1, with an area under the curve of 0.918 and 0.933 on the training and testing data, respectively, and classified 23.02% and 14.40% of the area as very highly and highly susceptible; it was followed by the ANN, ADTree, CART, FTree and LMT models. This research might be useful in landslide studies, especially in locations with comparable geophysical and climatological characteristics, to aid in decision making for land use planning.
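The area under the ROC curve used to rank the models can be read as the probability that a randomly chosen landslide location receives a higher susceptibility score than a randomly chosen non-landslide location. A minimal sketch of that pairwise definition follows; the function name `roc_auc` and the sample scores are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive (e.g. landslide) sample, 0 for a negative one.
    scores: model outputs, where higher means more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical susceptibility scores for four mapped locations.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1])
print(auc)
```

The double loop is quadratic in the number of samples, which is fine for a test set of this size; a rank-based computation would be the usual choice for larger data.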


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Noha E. El-Attar ◽  
Mohamed K. Hassan ◽  
Othman A. Alghamdi ◽  
Wael A. Awad

Reliance on deep learning techniques has become an important trend in several science domains, including biological science, due to its proven efficiency in handling big data that are often characterized by non-linear processes and complicated relationships. In this study, a Convolutional Neural Network (CNN), one of the deep learning techniques, is used to classify and predict the biological activities of essential oil-producing plants from their chemical compositions. The model is built on the available chemical composition information of a set of endemic Egyptian plants and their biological activities. Another type of machine learning algorithm, the Multiclass Neural Network (MNN), has been applied to the same Essential Oils (EO) dataset in order to fairly evaluate the performance of the proposed CNN model. The recorded accuracy in the testing process for CNN and MNN is 98.13% and 81.88%, respectively. Finally, the CNN technique has been adopted as a reliable model for classifying and predicting the bioactivities of the Egyptian EO-containing plants. The overall accuracy for the final prediction process is reported as approximately 97%. The proposed deep learning model could therefore be used as an efficient model for predicting the bioactivities of EO-producing plants, at least Egyptian ones.


Author(s):  
P. Nagaraj ◽  
P. Deepalakshmi

Diabetes, caused by a rise in the level of glucose in the blood, can now be identified from blood samples by many recent devices. When unnoticed, diabetes may lead to serious conditions such as heart attack and kidney disease. There is therefore a need for solid research and improved learning models in the field of gestational diabetes identification and analysis. The Support Vector Machine (SVM) is a powerful classification model in machine learning, and the Deep Neural Network is similarly powerful among deep learning models. In this work, we applied an Enhanced Support Vector Machine and a Deep Neural Network for diabetes prediction and screening. The proposed method uses a Deep Neural Network that takes its input from the output of the Enhanced Support Vector Machine, thus combining the efficacy of both. The dataset we considered includes data from 768 patients with eight major features and a target column with the result "Positive" or "Negative". The experiment was carried out in Python, and the results show that the deep learning model is more efficient for diabetes prediction.


2021 ◽  
Vol 11 (24) ◽  
pp. 11659
Author(s):  
Sheng-Chieh Hung ◽  
Hui-Ching Wu ◽  
Ming-Hseng Tseng

Through the continued development of technology, the application of deep learning to remote sensing scene classification tasks has become quite mature. The keys to effective deep learning model training are model architecture, training strategies, and image quality. Previous studies by the authors using explainable artificial intelligence (XAI) showed that, when the model has adequate capacity, incorrectly classified images can be classified correctly after manual image quality correction; however, the manual correction process takes a significant amount of time. Therefore, this research integrates technologies such as noise reduction, sharpening, partial color area equalization, and color channel adjustment to evaluate a set of automated strategies for enhancing image quality. These methods can enhance details, light and shadow, color, and other image features, which helps the deep learning model extract image features and further improves classification efficiency. In this study, we demonstrate that the proposed image quality enhancement strategy and deep learning techniques can effectively improve the scene classification performance of remote sensing images and outperform previous state-of-the-art approaches.


Diagnostics ◽  
2021 ◽  
Vol 11 (9) ◽  
pp. 1732
Author(s):  
Gurmail Singh ◽  
Kin-Choong Yow

New strains of the pandemic Covid-19 are still looming. It is important to develop multiple approaches for timely and accurate detection of Covid-19 and its variants. Deep learning techniques are well proven for their efficiency in providing solutions to many social and economic problems. However, transparency of the reasoning process of a deep learning model is a necessity when it is involved in a high-stakes decision. In this work, we propose an interpretable deep learning model, Ps-ProtoPNet, to detect Covid-19 from medical images. Ps-ProtoPNet classifies the images by recognizing the objects in them rather than their background. We demonstrate our model on a dataset of chest CT-scan images. The highest accuracy that our model achieves is 99.29%.


2022 ◽  
Vol 30 (1) ◽  
pp. 641-654
Author(s):  
Ali Abd Almisreb ◽  
Nooritawati Md Tahir ◽  
Sherzod Turaev ◽  
Mohammed A. Saleh ◽  
Syed Abdul Mutalib Al Junid

Arabic handwriting differs slightly from the handwriting of other languages; hence it is possible to distinguish native from non-native writers based on their handwriting. However, classifying Arabic handwriting is challenging using traditional text recognition algorithms. Thus, this study evaluated and validated the use of deep transfer learning models to overcome this issue. Seven deep transfer learning models, namely AlexNet, GoogleNet, ResNet18, ResNet50, ResNet101, VGG16, and VGG19, were used to determine the most suitable model for classifying handwritten images as written by native or non-native writers. Two datasets comprising Arabic handwriting images were used to evaluate and validate the models, each of which classifies the writer as either native or foreign (non-native). Training and validation were conducted using both original and augmented datasets. Results showed that the GoogleNet model achieved the highest accuracy on both datasets, attaining 93.2% using normal data and 95.5% using augmented data in classifying native handwriting.


2021 ◽  
Vol 13 (21) ◽  
pp. 11889
Author(s):  
Inchoon Yeo ◽  
Yunsoo Choi

This paper proposes a deep learning model that integrates a convolutional neural network with a gate circulation unit to capture patterns of high-peak PM2.5 concentrations. The purpose is to accurately predict high-peak PM2.5 concentrations, which general deep learning models fail to learn. To train the proposed model, we used all available weather and air quality data for the three years from 2015 to 2017 from 25 stations of the National Institute of Environmental Research (NIER) and the Korea Meteorological Administration (KMA) observatory in Seoul, South Korea. Our model was trained on three years of data and predicted high-peak PM2.5 concentrations for the year 2018. In addition, we propose a Gaussian filter algorithm as a preprocessing method for capturing high concentrations of PM2.5 in the Seoul area and predicting them more accurately. This model overcomes the limitations of conventional deep learning approaches that are unable to predict high-peak PM2.5 concentrations. Comparing model predictions with measurements at each of the 25 monitoring sites in 2018, we found that the deep learning model with a Gaussian filter achieved an index of agreement of 0.73–0.89 and a proportion of correctness (POC) of 0.89–0.96. Compared to the conventional deep learning method (average POC = 0.85), the Gaussian filter algorithm (average POC = 0.94) improved the accuracy of high-concentration PM2.5 prediction by about 9 percentage points on average. Applying this algorithm in the preprocessing stage could help predict the risk of high PM2.5 concentrations in real time.
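The abstract reports an index of agreement without defining it. A minimal sketch follows, assuming the widely used Willmott form, d = 1 - sum((P - O)^2) / sum((|P - mean(O)| + |O - mean(O)|)^2); the function name and the PM2.5 values are hypothetical and are not data from the study.

```python
def index_of_agreement(observed, predicted):
    """Willmott-style index of agreement (assumed definition).

    Returns 1.0 for perfect agreement; values near 0 mean the
    predictions carry little information about the observations.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    mean_obs = sum(observed) / len(observed)
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    potential_error = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    if potential_error == 0:
        raise ValueError("index is undefined when all values equal the observed mean")
    return 1.0 - squared_error / potential_error


# Hypothetical hourly PM2.5 concentrations (micrograms per cubic metre).
observed = [10.0, 20.0, 30.0]
perfect = index_of_agreement(observed, [10.0, 20.0, 30.0])
flat = index_of_agreement(observed, [20.0, 20.0, 20.0])
print(perfect, flat)
```

A model that always predicts the observed mean scores 0 under this definition, so the index rewards a forecast for tracking peaks rather than only matching the average level.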


2019 ◽  
Vol 26 (2) ◽  
pp. 945-962
Author(s):  
Okyaz Eminaga ◽  
Omran Al-Hamad ◽  
Martin Boegemann ◽  
Bernhard Breil ◽  
Axel Semjonow

This study introduces, as a proof of concept, a combination model for the classification of prostate cancer using deep learning approaches. We used data from patients with prostate cancer who underwent surgical treatment, representing the various conditions of disease progression. All possible combinations of significant variables from logistic regression and correlation analyses were determined from the study data sets. The combination possibility and deep learning model was developed to predict these combinations, which represented clinically meaningful patient subgroups. The observed relative frequencies of different tumor stages and Gleason score (Gls) changes from biopsy to prostatectomy were available for each group. Deep learning models and seven machine learning approaches were compared on their classification performance for Gleason score changes and pT2 stage. Deep models achieved the highest F1 scores for pT2 tumors (0.849) and Gls change (0.574). The combination possibility and deep learning model is a useful decision-aid tool for prostate cancer and for grouping patients with prostate cancer into clinically meaningful groups.

