Application of Data Augmentation Techniques for Hate Speech Detection with Deep Learning

2021 ◽  
pp. 778-787
Author(s):  
Lígia Iunes Venturott ◽  
Patrick Marques Ciarelli
2021 ◽  
Author(s):  
Loay Hassan ◽  
Mohamed Abdel-Nasser ◽  
Adel Saleh ◽  
Domenec Puig

Digital breast tomosynthesis (DBT) is a powerful breast cancer screening technology. DBT can improve the ability of radiologists to detect breast cancer, especially in dense breasts, where it outperforms mammography. Although many automated methods have been proposed to detect breast lesions in mammographic images, very few have been proposed for DBT, owing to the lack of sufficient annotated DBT images for training object detectors. In this paper, we present fully automated deep-learning breast lesion detection methods. Specifically, we study the effectiveness of two data augmentation techniques (channel replication and channel concatenation) with five state-of-the-art deep learning detection models. Our preliminary results on a challenging publicly available DBT dataset show that the channel-concatenation data augmentation technique can significantly improve breast lesion detection results for deep learning-based detectors.
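The two augmentation schemes can be sketched in a few lines of NumPy (an illustrative reconstruction, not the authors' code): channel replication repeats one grayscale DBT slice across three channels, while channel concatenation stacks a slice with its two neighbours, both yielding the 3-channel input that ImageNet-pretrained detectors typically expect.

```python
import numpy as np

def channel_replication(slice_2d):
    # Repeat a single grayscale slice across 3 channels so it matches
    # the RGB input shape expected by pretrained object detectors.
    return np.stack([slice_2d] * 3, axis=-1)

def channel_concatenation(volume, idx):
    # Stack the slice with its two neighbouring slices as 3 channels,
    # injecting through-plane context into the detector input.
    prev_s = volume[max(idx - 1, 0)]
    next_s = volume[min(idx + 1, volume.shape[0] - 1)]
    return np.stack([prev_s, volume[idx], next_s], axis=-1)

vol = np.random.rand(5, 64, 64)       # toy DBT volume: 5 slices of 64x64
rep = channel_replication(vol[2])
cat = channel_concatenation(vol, 2)
print(rep.shape, cat.shape)           # (64, 64, 3) (64, 64, 3)
```

Edge slices are handled here by clamping the neighbour index, one simple convention among several possible ones.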


Author(s):  
Kottilingam Kottursamy

The role of facial expression recognition in social science and human-computer interaction has received a lot of attention. Advances in deep learning have pushed results in this field beyond human-level accuracy. This article discusses several common deep learning algorithms for emotion recognition, utilising the eXnet library to achieve improved accuracy. Memory and computation costs, however, remain obstacles, and overfitting is an issue with large models; one response to this challenge is to reduce the generalization error. We employ a novel Convolutional Neural Network (CNN) named eXnet (Expression Net), built on parallel feature extraction. The most recent eXnet model reduces the previous model's error while using far fewer parameters. Long-established data augmentation techniques are applied to the generalized eXnet, which employs effective ways to reduce overfitting while keeping the overall model size under control.
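As a toy illustration of the parallel feature-extraction idea (a hedged sketch, not the published eXnet architecture), several kernels of different sizes can be run on the same input in parallel and their pooled responses concatenated into a single descriptor:

```python
import numpy as np

def conv2d_valid(img, kernel):
    # Naive 'valid' 2D correlation; enough for illustration.
    kh, kw = kernel.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def parallel_features(img, kernels):
    # Parallel branches: each kernel sees the same input; the globally
    # pooled responses are concatenated into one feature vector.
    return np.array([conv2d_valid(img, k).mean() for k in kernels])

img = np.random.rand(48, 48)
branches = [np.ones((3, 3)) / 9.0, np.ones((5, 5)) / 25.0]  # 3x3 and 5x5 branch
feats = parallel_features(img, branches)
print(feats.shape)  # (2,)
```

In a real network each branch would hold learned filters and the concatenated maps would feed further layers; global pooling here stands in for that downstream processing.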


Electronics ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 73
Author(s):  
Kuldoshbay Avazov ◽  
Mukhriddin Mukhiddinov ◽  
Fazliddin Makhmudov ◽  
Young Im Cho

In the construction of new smart cities, traditional fire-detection systems can be replaced with vision-based systems to establish fire safety in society using emerging technologies such as digital cameras, computer vision, artificial intelligence, and deep learning. In this study, we developed a fire detector that accurately detects even small sparks and sounds an alarm within 8 s of a fire outbreak. A novel convolutional neural network was developed to detect fire regions using an enhanced You Only Look Once (YOLO) v4 network. Based on the improved YOLOv4 algorithm, we adapted the network to operate on the Banana Pi M3 board using only three layers. Initially, we examined the original YOLOv4 approach to determine the accuracy of its predictions of candidate fire regions. However, the anticipated results were not observed after several experiments with this approach for detecting fire accidents. We improved the traditional YOLOv4 network by increasing the size of the training dataset through data augmentation techniques for the real-time monitoring of fire disasters. By modifying the network structure through automatic color augmentation, parameter reduction, and similar changes, the proposed method successfully detected and reported disastrous fires with high speed and accuracy in different weather environments (sunny or cloudy, day or night). Experimental results revealed that the proposed method can be used successfully to protect smart cities and to monitor fires in urban areas. Finally, we compared the performance of our method with that of recently reported fire-detection approaches, using widely adopted performance metrics to assess the fire classification results.
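A minimal sketch of the kind of automatic color augmentation described above; the jitter ranges here are assumptions for illustration, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def color_augment(img):
    # Random brightness/contrast jitter plus a per-channel gain,
    # approximating automatic color augmentation used to enlarge
    # a fire-image training set. Input/output are floats in [0, 1].
    brightness = rng.uniform(-0.1, 0.1)
    contrast = rng.uniform(0.8, 1.2)
    gains = rng.uniform(0.9, 1.1, size=3)
    out = (img - 0.5) * contrast + 0.5 + brightness
    out = out * gains
    return np.clip(out, 0.0, 1.0)

img = rng.random((32, 32, 3))   # toy RGB frame
aug = color_augment(img)
print(aug.shape)                # (32, 32, 3)
```

Applying such transforms at training time multiplies the effective dataset size, which is especially useful when fire footage under varied lighting is scarce.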


2021 ◽  
Vol 11 (9) ◽  
pp. 842
Author(s):  
Shruti Atul Mali ◽  
Abdalla Ibrahim ◽  
Henry C. Woodruff ◽  
Vincent Andrearczyk ◽  
Henning Müller ◽  
...  

Radiomics converts medical images into mineable data via high-throughput extraction of quantitative features used for clinical decision support. However, these radiomic features are susceptible to variation across scanners, acquisition protocols, and reconstruction settings. Various investigations have assessed the reproducibility and validation of radiomic features across these discrepancies. In this narrative review, we combine systematic keyword searches with prior domain knowledge to discuss various harmonization solutions for making radiomic features more reproducible across scanners and protocol settings. The harmonization solutions discussed are divided into two main categories: image domain and feature domain. The image-domain category comprises methods such as the standardization of image acquisition, post-processing of raw sensor-level image data, data augmentation techniques, and style transfer. The feature-domain category consists of methods such as the identification of reproducible features and normalization techniques such as statistical normalization, intensity harmonization, ComBat and its derivatives, and normalization using deep learning. We also reflect on the importance of deep learning solutions for addressing variability across multi-centric radiomic studies, especially those using generative adversarial networks (GANs), neural style transfer (NST) techniques, or a combination of both. We cover a broader range of methods than previous reviews, treating GANs and NST methods in particular detail.
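The core of ComBat-style feature-domain harmonization can be sketched as a per-scanner location-scale alignment (a simplified illustration: full ComBat additionally applies empirical-Bayes shrinkage to the batch estimates, which is omitted here):

```python
import numpy as np

def location_scale_harmonize(features, batches):
    # Align each scanner's (batch's) feature means and stds to the
    # pooled statistics: the location-scale core of ComBat, without
    # empirical-Bayes shrinkage or covariate preservation.
    features = np.asarray(features, dtype=float)
    out = features.copy()
    grand_mean = features.mean(axis=0)
    grand_std = features.std(axis=0)
    for b in np.unique(batches):
        mask = batches == b
        mu = features[mask].mean(axis=0)
        sd = features[mask].std(axis=0)
        out[mask] = (features[mask] - mu) / sd * grand_std + grand_mean
    return out

rng = np.random.default_rng(0)
scanner = np.array([0] * 20 + [1] * 20)     # two simulated scanners
feats = rng.standard_normal((40, 3))
feats[scanner == 1] += 2.0                  # simulated scanner offset
harmonized = location_scale_harmonize(feats, scanner)
```

After alignment the per-scanner feature means coincide; real studies would preserve biological covariates while removing only the scanner effect.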


Author(s):  
Junanda Patihullah ◽  
Edi Winarko

Social media has changed people's mindset in expressing thoughts and moods. As social media activity increases, the crime of spreading hate speech can also spread quickly and widely, making manual detection of hate speech infeasible. GRU is a deep learning method with the ability to carry information from previous time steps to the present. In this research, word2vec is used for feature extraction because of its ability to learn the semantics between words. The performance of GRU is compared with other supervised methods such as support vector machine, naive Bayes, decision tree and logistic regression. The results show that the best accuracy, 92.96%, is achieved by the GRU model with word2vec feature extraction. For the comparison methods, word2vec features performed worse than tf and tf-idf features.
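The GRU recurrence the abstract refers to, carrying information from previous time steps to the present, can be written as a single NumPy step (an illustrative sketch with toy dimensions and random weights, not the trained model):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    # One GRU step: update gate z decides how much of the old state
    # to keep; reset gate r modulates the candidate state h_tilde.
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(x @ Wz + h @ Uz)
    r = sigmoid(x @ Wr + h @ Ur)
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(1)
d_in, d_h = 8, 4   # toy word2vec dimension and hidden size

def init(shape):
    return rng.standard_normal(shape) * 0.1

params = [init((d_in, d_h)), init((d_h, d_h)),   # Wz, Uz
          init((d_in, d_h)), init((d_h, d_h)),   # Wr, Ur
          init((d_in, d_h)), init((d_h, d_h))]   # Wh, Uh

h = np.zeros(d_h)
for x in rng.standard_normal((5, d_in)):  # 5 toy "word vectors"
    h = gru_step(x, h, params)
print(h.shape)  # (4,)
```

The final hidden state summarizes the whole token sequence and would feed a classification layer in a hate speech detector.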


2020 ◽  
Vol 16 (3) ◽  
pp. 295-313
Author(s):  
Imane Guellil ◽  
Ahsan Adeel ◽  
Faical Azouaou ◽  
Sara Chennoufi ◽  
Hanene Maafi ◽  
...  

Purpose This paper aims to propose an approach for hate speech detection against politicians in the Arabic community on social media (e.g. YouTube). Similar works have been presented in the literature for other languages such as English; however, to the best of the authors' knowledge, little work has been conducted for Arabic. Design/methodology/approach This approach uses both classical classification algorithms and deep learning algorithms. For the classical algorithms, the authors use Gaussian NB (GNB), Logistic Regression (LR), Random Forest (RF), SGD Classifier (SGD) and Linear SVC (LSVC). For the deep learning classification, four different algorithms are applied: convolutional neural network (CNN), multilayer perceptron (MLP), long short-term memory (LSTM) and bi-directional long short-term memory (Bi-LSTM). For feature extraction, the authors use both Word2vec and FastText with their two implementations, namely, Skip-Gram (SG) and Continuous Bag of Words (CBOW). Findings Simulation results demonstrate that LSVC, Bi-LSTM and MLP perform best, achieving an accuracy of up to 91% when associated with the SG model. The results also show that classification on the balanced corpus is more accurate than on the unbalanced corpus. Originality/value The principal originality of this paper is the construction of a new hate speech corpus (Arabic_fr_en), annotated by three different annotators. This corpus contains the three languages used by the Arabic community: Arabic, French and English. For Arabic, the corpus contains both Arabic script and Arabizi (i.e. Arabic words written with Latin letters). Another originality is to rely on both shallow and deep learning classification, using different feature-extraction models such as Word2vec and FastText with their two implementations, SG and CBOW.
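The difference between the two Word2vec/FastText training objectives, SG and CBOW, can be made concrete by generating their respective training pairs from a token list (a plain-Python sketch of the objectives, not the gensim implementation):

```python
def skipgram_pairs(tokens, window=2):
    # Skip-gram: predict each context word from the centre word,
    # yielding one (centre, context) pair per neighbour.
    pairs = []
    for i, centre in enumerate(tokens):
        for j in range(max(0, i - window),
                       min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((centre, tokens[j]))
    return pairs

def cbow_pairs(tokens, window=2):
    # CBOW: predict the centre word from its whole context window,
    # yielding one (context tuple, centre) pair per position.
    pairs = []
    for i, centre in enumerate(tokens):
        context = [tokens[j] for j in range(max(0, i - window),
                                            min(len(tokens), i + window + 1))
                   if j != i]
        if context:
            pairs.append((tuple(context), centre))
    return pairs

tokens = ["the", "hate", "speech", "corpus"]
print(skipgram_pairs(tokens, window=1))  # 6 (centre, context) pairs
print(cbow_pairs(tokens, window=1))      # 4 (context, centre) pairs
```

SG produces many small prediction tasks and tends to work better for rare words; CBOW averages the context and trains faster, which is one reason papers such as this one report both.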


2021 ◽  
Author(s):  
Ashwini Kumar ◽  
Vishu Tyagi ◽  
Sanjoy Das
