An Algorithm for the Detection of Hidden Propaganda in Mixed-Code Text over the Internet

Internet-based communication systems have become an increasing tool for spreading misinformation and propaganda. Even though there exist mechanisms that are able to track unwarranted information and messages, users made up different ways to avoid their scrutiny and detection. An example is represented by the mixed-code language, that is text written in an unconventional form by combining different languages, symbols, scripts and shapes. It aims to make more difficult the detection of specific content, due to its custom and ever changing appearance, by using special characters to substitute for alphabet letters. Indeed, such substitute combinations of symbols, which tries to resemble the shape of the intended alphabet’s letter, makes it still intuitively readable to humans, however nonsensical to machines. In this context, the paper explores the possibility of identifying propaganda in such mixed-code texts over the Internet, centred on a machine learning based approach. In particular, an algorithm in combination with a deep learning models for character identification is proposed in order to detect and analyse whether an element contains propaganda related content. The overall approach is presented, the results gathered from its experimentation are discussed and the achieved performances are compared with the related works.

Download Full-text

Identification of Sarcasm in Textual Data: A Comparative Study

Journal of Data and Information Science ◽

10.2478/jdis-2019-0021 ◽

2019 ◽

Vol 4 (4) ◽

pp. 56-83

Author(s):

Pulkit Mehndiratta ◽

Devpriya Soni

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Hybrid Methods ◽

Public Domain ◽

Hybrid Models ◽

The Internet ◽

Correct Prediction ◽

Learning Models ◽

Textual Data ◽

Practical Implications

Abstract Purpose Ever increasing penetration of the Internet in our lives has led to an enormous amount of multimedia content generation on the internet. Textual data contributes a major share towards data generated on the world wide web. Understanding people’s sentiment is an important aspect of natural language processing, but this opinion can be biased and incorrect, if people use sarcasm while commenting, posting status updates or reviewing any product or a movie. Thus, it is of utmost importance to detect sarcasm correctly and make a correct prediction about the people’s intentions. Design/methodology/approach This study tries to evaluate various machine learning models along with standard and hybrid deep learning models across various standardized datasets. We have performed vectorization of text using word embedding techniques. This has been done to convert the textual data into vectors for analytical purposes. We have used three standardized datasets available in public domain and used three word embeddings i.e Word2Vec, GloVe and fastText to validate the hypothesis. Findings The results were analyzed and conclusions are drawn. The key finding is: the hybrid models that include Bidirectional LongTerm Short Memory (Bi-LSTM) and Convolutional Neural Network (CNN) outperform others conventional machine learning as well as deep learning models across all the datasets considered in this study, making our hypothesis valid. Research limitations Using the data from different sources and customizing the models according to each dataset, slightly decreases the usability of the technique. But, overall this methodology provides effective measures to identify the presence of sarcasm with a minimum average accuracy of 80% or above for one dataset and better than the current baseline results for the other datasets. Practical implications The results provide solid insights for the system developers to integrate this model into real-time analysis of any review or comment posted in the public domain. This study has various other practical implications for businesses that depend on user ratings and public opinions. This study also provides a launching platform for various researchers to work on the problem of sarcasm identification in textual data. Originality/value This is a first of its kind study, to provide us the difference between conventional and the hybrid methods of prediction of sarcasm in textual data. The study also provides possible indicators that hybrid models are better when applied to textual data for analysis of sarcasm.

Download Full-text

Data science in economics: comprehensive review of advanced machine learning and deep learning methods

10.31232/osf.io/4pxq2 ◽

2020 ◽

Author(s):

Saeed Nosratabadi ◽

Amir Mosavi ◽

Puhong Duan ◽

Pedram Ghamisi ◽

Ferdinand Filip ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Data Science ◽

State Of The Art ◽

Science Methods ◽

Learning Models ◽

Diverse Range ◽

Hybrid Machine ◽

Economics Research

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis was performed on novel data science methods in four individual classes of deep learning models, hybrid deep learning models, hybrid machine learning, and ensemble models. Application domains include a wide and diverse range of economics research from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. Prisma method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, based on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward the advancements of sophisticated hybrid deep learning models.

Download Full-text

Utilizing Blockchain Technology in Social Media Bot Identification

10.36227/techrxiv.12049374 ◽

2020 ◽

Author(s):

Shreya Reddy ◽

Lisa Ewen ◽

Pankti Patel ◽

Prerak Patel ◽

Ankit Kundal ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Gold Standard ◽

The Internet ◽

Learning Models ◽

Current Time ◽

Machine Learning Methods ◽

Blockchain Technology ◽

Modern Age ◽

Machine Learning Models

<p>As bots become more prevalent and smarter in the modern age of the internet, it becomes ever more important that they be identified and removed. Recent research has dictated that machine learning methods are accurate and the gold standard of bot identification on social media. Unfortunately, machine learning models do not come without their negative aspects such as lengthy training times, difficult feature selection, and overwhelming pre-processing tasks. To overcome these difficulties, we are proposing a blockchain framework for bot identification. At the current time, it is unknown how this method will perform, but it serves to prove the existence of an overwhelming gap of research under this area.<i></i></p>

Download Full-text

Deep Learning in Disease Diagnosis: Models and Datasets

Current Bioinformatics ◽

10.2174/1574893615999201002124021 ◽

2020 ◽

Vol 15 ◽

Author(s):

Deeksha Saxena ◽

Mohammed Haris Siddiqui ◽

Rajnish Kumar

Keyword(s):

Biological Sciences ◽

Machine Learning ◽

Deep Learning ◽

Disease Diagnosis ◽

Learning Models ◽

Data Types ◽

Related Data ◽

Abstract Level ◽

Experimental Validations ◽

Selection Of

Background: Deep learning (DL) is an Artificial neural network-driven framework with multiple levels of representation for which non-linear modules combined in such a way that the levels of representation can be enhanced from lower to a much abstract level. Though DL is used widely in almost every field, it has largely brought a breakthrough in biological sciences as it is used in disease diagnosis and clinical trials. DL can be clubbed with machine learning, but at times both are used individually as well. DL seems to be a better platform than machine learning as the former does not require an intermediate feature extraction and works well with larger datasets. DL is one of the most discussed fields among the scientists and researchers these days for diagnosing and solving various biological problems. However, deep learning models need some improvisation and experimental validations to be more productive. Objective: To review the available DL models and datasets that are used in disease diagnosis. Methods: Available DL models and their applications in disease diagnosis were reviewed discussed and tabulated. Types of datasets and some of the popular disease related data sources for DL were highlighted. Results: We have analyzed the frequently used DL methods, data types and discussed some of the recent deep learning models used for solving different biological problems. Conclusion: The review presents useful insights about DL methods, data types, selection of DL models for the disease diagnosis.

Download Full-text

Machine Learning-Based Malicious X.509 Certificates’ Detection

Applied Sciences ◽

10.3390/app11052164 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2164

Author(s):

Jiaxin Li ◽

Zhaoxin Zhang ◽

Changyong Guo

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Ensemble Learning ◽

Traffic Analysis ◽

Learning Models ◽

Detection Model ◽

Analysis Tools ◽

Average Accuracy ◽

Machine Learning Models

X.509 certificates play an important role in encrypting the transmission of data on both sides under HTTPS. With the popularization of X.509 certificates, more and more criminals leverage certificates to prevent their communications from being exposed by malicious traffic analysis tools. Phishing sites and malware are good examples. Those X.509 certificates found in phishing sites or malware are called malicious X.509 certificates. This paper applies different machine learning models, including classical machine learning models, ensemble learning models, and deep learning models, to distinguish between malicious certificates and benign certificates with Verification for Extraction (VFE). The VFE is a system we design and implement for obtaining plentiful characteristics of certificates. The result shows that ensemble learning models are the most stable and efficient models with an average accuracy of 95.9%, which outperforms many previous works. In addition, we obtain an SVM-based detection model with an accuracy of 98.2%, which is the highest accuracy. The outcome indicates the VFE is capable of capturing essential and crucial characteristics of malicious X.509 certificates.

Download Full-text

Quantum algorithm for quicker clinical prognostic analysis: an application and experimental study using CT scan images of COVID-19 patients

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01588-6 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Kinshuk Sengupta ◽

Praveen Ranjan Srivastava

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Image Classification ◽

Machine Learning Algorithms ◽

Classification Task ◽

Clinical Image ◽

Prototype Model ◽

Learning Models ◽

Accuracy Measure ◽

Quantum Machine Learning

Abstract Background In medical diagnosis and clinical practice, diagnosing a disease early is crucial for accurate treatment, lessening the stress on the healthcare system. In medical imaging research, image processing techniques tend to be vital in analyzing and resolving diseases with a high degree of accuracy. This paper establishes a new image classification and segmentation method through simulation techniques, conducted over images of COVID-19 patients in India, introducing the use of Quantum Machine Learning (QML) in medical practice. Methods This study establishes a prototype model for classifying COVID-19, comparing it with non-COVID pneumonia signals in Computed tomography (CT) images. The simulation work evaluates the usage of quantum machine learning algorithms, while assessing the efficacy for deep learning models for image classification problems, and thereby establishes performance quality that is required for improved prediction rate when dealing with complex clinical image data exhibiting high biases. Results The study considers a novel algorithmic implementation leveraging quantum neural network (QNN). The proposed model outperformed the conventional deep learning models for specific classification task. The performance was evident because of the efficiency of quantum simulation and faster convergence property solving for an optimization problem for network training particularly for large-scale biased image classification task. The model run-time observed on quantum optimized hardware was 52 min, while on K80 GPU hardware it was 1 h 30 min for similar sample size. The simulation shows that QNN outperforms DNN, CNN, 2D CNN by more than 2.92% in gain in accuracy measure with an average recall of around 97.7%. Conclusion The results suggest that quantum neural networks outperform in COVID-19 traits’ classification task, comparing to deep learning w.r.t model efficacy and training time. However, a further study needs to be conducted to evaluate implementation scenarios by integrating the model within medical devices.

Download Full-text

Reviewing the relationship between machines and radiology: the application of artificial intelligence

Acta Radiologica Open ◽

10.1177/2058460121990296 ◽

2021 ◽

Vol 10 (2) ◽

pp. 205846012199029

Author(s):

Rani Ahmad

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Deep Learning ◽

Health Care Professionals ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Health Science ◽

Computer Algorithms ◽

Learning Models ◽

Specificity And Sensitivity

Background The scope and productivity of artificial intelligence applications in health science and medicine, particularly in medical imaging, are rapidly progressing, with relatively recent developments in big data and deep learning and increasingly powerful computer algorithms. Accordingly, there are a number of opportunities and challenges for the radiological community. Purpose To provide review on the challenges and barriers experienced in diagnostic radiology on the basis of the key clinical applications of machine learning techniques. Material and Methods Studies published in 2010–2019 were selected that report on the efficacy of machine learning models. A single contingency table was selected for each study to report the highest accuracy of radiology professionals and machine learning algorithms, and a meta-analysis of studies was conducted based on contingency tables. Results The specificity for all the deep learning models ranged from 39% to 100%, whereas sensitivity ranged from 85% to 100%. The pooled sensitivity and specificity were 89% and 85% for the deep learning algorithms for detecting abnormalities compared to 75% and 91% for radiology experts, respectively. The pooled specificity and sensitivity for comparison between radiology professionals and deep learning algorithms were 91% and 81% for deep learning models and 85% and 73% for radiology professionals (p < 0.000), respectively. The pooled sensitivity detection was 82% for health-care professionals and 83% for deep learning algorithms (p < 0.005). Conclusion Radiomic information extracted through machine learning programs form images that may not be discernible through visual examination, thus may improve the prognostic and diagnostic value of data sets.

Download Full-text

Statistical Boosting: A Preprocessing Technique to Enhance Performance of Machine Learning and Deep Learning Models on Partially Occluded Traffic Signs

2020 IEEE 23rd International Multitopic Conference (INMIC) ◽

10.1109/inmic50486.2020.9318174 ◽

2020 ◽

Author(s):

Abdul Mannan ◽

Kashif Javed ◽

Serosh Karim Noon

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Learning Models ◽

Traffic Signs ◽

Preprocessing Technique ◽

Partially Occluded

Download Full-text

Implementation of conventional communication system in deep learning

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i1.1.10835 ◽

2017 ◽

Vol 7 (1.1) ◽

pp. 696

Author(s):

Satyanarayana P ◽

Charishma Devi. V ◽

Sowjanya P ◽

Satish Babu ◽

N Syam Kumar ◽

...

Keyword(s):

Machine Learning ◽

Computer Vision ◽

Communication Network ◽

Deep Learning ◽

Communication System ◽

Communication Systems ◽

Physical Layer ◽

Learning Capacity ◽

Constrained Learning ◽

Complex Channel

Machine learning (ML) has been broadly connected to the upper layers of communication systems for different purposes, for example, arrangement of cognitive radio and communication network. Nevertheless, its application to the physical layer is hindered by complex channel conditions and constrained learning capacity of regular ML algorithms. Deep learning (DL) has been as of late connected for some fields, for example, computer vision and normal dialect preparing, given its expressive limit and advantageous enhancement ability. This paper describes about a novel use of DL for the physical layer. By deciphering a communication system as an auto encoder, we build up an essential better approach to consider communication system outline as a conclusion to-end reproduction undertaking that tries to together enhance transmitter and receiver in a solitary procedure. This DL based technique demonstrates promising execution change than traditional communication system.

Download Full-text

Validating Deep Neural Networks for Online Decoding of Motor Imagery Movements from EEG Signals

Sensors ◽

10.3390/s19010210 ◽

2019 ◽

Vol 19 (1) ◽

pp. 210 ◽

Cited By ~ 32

Author(s):

Zied Tayeb ◽

Juri Fedjaev ◽

Nejla Ghaboosi ◽

Christoph Richter ◽

Lukas Everding ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Convolutional Neural Network ◽

Motor Imagery ◽

Classification Performance ◽

Feature Engineering ◽

Learning Models ◽

Eeg Signals ◽

Learning Methods

Non-invasive, electroencephalography (EEG)-based brain-computer interfaces (BCIs) on motor imagery movements translate the subject’s motor intention into control signals through classifying the EEG patterns caused by different imagination tasks, e.g., hand movements. This type of BCI has been widely studied and used as an alternative mode of communication and environmental control for disabled patients, such as those suffering from a brainstem stroke or a spinal cord injury (SCI). Notwithstanding the success of traditional machine learning methods in classifying EEG signals, these methods still rely on hand-crafted features. The extraction of such features is a difficult task due to the high non-stationarity of EEG signals, which is a major cause by the stagnating progress in classification performance. Remarkable advances in deep learning methods allow end-to-end learning without any feature engineering, which could benefit BCI motor imagery applications. We developed three deep learning models: (1) A long short-term memory (LSTM); (2) a spectrogram-based convolutional neural network model (CNN); and (3) a recurrent convolutional neural network (RCNN), for decoding motor imagery movements directly from raw EEG signals without (any manual) feature engineering. Results were evaluated on our own publicly available, EEG data collected from 20 subjects and on an existing dataset known as 2b EEG dataset from “BCI Competition IV”. Overall, better classification performance was achieved with deep learning models compared to state-of-the art machine learning techniques, which could chart a route ahead for developing new robust techniques for EEG signal decoding. We underpin this point by demonstrating the successful real-time control of a robotic arm using our CNN based BCI.

Download Full-text