Social Media Cross-Source and Cross-Domain Sentiment Classification

2019 ◽  
Vol 18 (05) ◽  
pp. 1469-1499 ◽  
Author(s):  
Paola Zola ◽  
Paulo Cortez ◽  
Costantino Ragno ◽  
Eugenio Brentari

Due to the expansion of the Internet and the Web 2.0 phenomenon, there is a growing interest in sentiment analysis of freely opinionated text. In this paper, we propose a novel cross-source cross-domain sentiment classification approach, in which cross-domain-labeled Web sources (Amazon and Tripadvisor) are used to train supervised learning models (including two deep learning algorithms) that are tested on typically nonlabeled social media reviews (Facebook and Twitter). We explored a three-step methodology, in which distinct balanced training, text preprocessing and machine learning methods were tested, using two languages: English and Italian. The best results were achieved using undersampling training and a Convolutional Neural Network. Interesting cross-source classification performances were achieved, in particular when using Amazon and Tripadvisor reviews to train a model that is tested on Facebook data for both English and Italian.
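The "undersampling training" mentioned in this abstract can be illustrated with a minimal sketch of random undersampling, in which examples from larger classes are dropped until every class matches the rarest one. The function name `undersample`, the toy reviews, and the fixed seed are assumptions for illustration; the abstract does not say which undersampling variant the authors used.

```python
import random
from collections import defaultdict


def undersample(samples, labels, seed=0):
    """Randomly drop examples so that every class keeps as many
    examples as the rarest class (illustrative, not the paper's code)."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    # The rarest class sets the size every class is cut down to.
    target = min(len(group) for group in by_label.values())

    balanced = []
    for label, group in by_label.items():
        for sample in rng.sample(group, target):
            balanced.append((sample, label))

    rng.shuffle(balanced)  # avoid blocks of same-class examples
    return balanced


# Hypothetical toy data: four positive and two negative reviews.
reviews = ["great", "love it", "fine", "awful", "good value", "broken"]
sentiments = ["pos", "pos", "pos", "neg", "pos", "neg"]
balanced = undersample(reviews, sentiments)
```

After the call, `balanced` holds two positive and two negative reviews, so a classifier trained on it sees both classes equally often.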

Sensors ◽  
2019 ◽  
Vol 19 (1) ◽  
pp. 210 ◽  
Author(s):  
Zied Tayeb ◽  
Juri Fedjaev ◽  
Nejla Ghaboosi ◽  
Christoph Richter ◽  
Lukas Everding ◽  
...  

Non-invasive, electroencephalography (EEG)-based brain-computer interfaces (BCIs) on motor imagery movements translate the subject’s motor intention into control signals through classifying the EEG patterns caused by different imagination tasks, e.g., hand movements. This type of BCI has been widely studied and used as an alternative mode of communication and environmental control for disabled patients, such as those suffering from a brainstem stroke or a spinal cord injury (SCI). Notwithstanding the success of traditional machine learning methods in classifying EEG signals, these methods still rely on hand-crafted features. The extraction of such features is a difficult task due to the high non-stationarity of EEG signals, which is a major cause of the stagnating progress in classification performance. Remarkable advances in deep learning methods allow end-to-end learning without any feature engineering, which could benefit BCI motor imagery applications. We developed three deep learning models: (1) a long short-term memory (LSTM); (2) a spectrogram-based convolutional neural network model (CNN); and (3) a recurrent convolutional neural network (RCNN), for decoding motor imagery movements directly from raw EEG signals without (any manual) feature engineering. Results were evaluated on our own publicly available EEG data collected from 20 subjects and on an existing dataset known as 2b EEG dataset from “BCI Competition IV”. Overall, better classification performance was achieved with deep learning models compared to state-of-the-art machine learning techniques, which could chart a route ahead for developing new robust techniques for EEG signal decoding. We underpin this point by demonstrating the successful real-time control of a robotic arm using our CNN-based BCI.


Electronics ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 1514
Author(s):  
Ali Aljofey ◽  
Qingshan Jiang ◽  
Qiang Qu ◽  
Mingqing Huang ◽  
Jean-Pierre Niyigena

Phishing is the easiest way to commit cybercrime, with the aim of enticing people to disclose accurate information such as account IDs, bank details, and passwords. This type of cyberattack is usually triggered by emails, instant messages, or phone calls. The existing anti-phishing techniques are mainly based on source code features, which require scraping the content of web pages, and on third-party services, which slow down the classification of phishing URLs. Although machine learning techniques have lately been used to detect phishing, they require substantial manual feature engineering and are not good at detecting emerging phishing offenses. Due to the recent rapid development of deep learning techniques, many deep learning-based methods have also been introduced to enhance the classification performance. In this paper, a fast deep learning-based solution model, which uses a character-level convolutional neural network (CNN) for phishing detection based on the URL of the website, is proposed. The proposed model does not require the retrieval of target website content or the use of any third-party services. It captures information and sequential patterns of URL strings without requiring prior knowledge about phishing, and then uses the sequential pattern features for fast classification of the actual URL. For evaluations, comparisons are provided between different traditional machine learning models and deep learning models using various feature sets such as hand-crafted, character embedding, character-level TF-IDF, and character-level count vector features. According to the experiments, the proposed model achieved an accuracy of 95.02% on our dataset and accuracies of 98.58%, 95.46%, and 95.22% on benchmark datasets, outperforming the existing phishing URL models.
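A character-level CNN needs each URL turned into a fixed-length sequence of integer indices before an embedding layer can consume it. The following is a minimal sketch of that encoding step only; the alphabet, the reserved indices `PAD` and `UNKNOWN`, and the default `max_len` of 200 are assumptions for illustration and are not taken from the paper.

```python
import string

# Assumed alphabet: lowercase letters, digits, and ASCII punctuation.
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation

# Index 0 is reserved for padding and 1 for characters outside the alphabet.
PAD, UNKNOWN = 0, 1
CHAR_TO_INDEX = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def encode_url(url, max_len=200):
    """Map a URL to a fixed-length list of character indices.

    Longer URLs are truncated, shorter ones are right-padded with PAD.
    """
    indices = [CHAR_TO_INDEX.get(ch, UNKNOWN) for ch in url.lower()[:max_len]]
    return indices + [PAD] * (max_len - len(indices))


# Hypothetical example URL.
encoded = encode_url("http://example.com/login", max_len=32)
```

Every encoded URL has the same length, so a batch of them forms a rectangular integer array that can feed an embedding layer followed by one-dimensional convolutions.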


Author(s):  
Zied Tayeb ◽  
Juri Fedjaev ◽  
Nejla Ghaboosi ◽  
Christoph Richter ◽  
Lukas Everding ◽  
...  

Non-invasive, electroencephalography (EEG)-based brain-computer interfaces (BCIs) on motor imagery movements translate the subject’s motor intention into control signals through classifying the EEG patterns caused by different imagination tasks, e.g. hand movements. This type of BCI has been widely studied and used as an alternative mode of communication and environmental control for disabled patients, such as those suffering from a brainstem stroke or a spinal cord injury (SCI). Notwithstanding the success of traditional machine learning methods in classifying EEG signals, these methods still rely on hand-crafted features. The extraction of such features is a difficult task due to the high non-stationarity of EEG signals, which is a major cause for the stagnating progress in classification performance. Remarkable advances in deep learning methods allow end-to-end learning without any feature engineering, which could benefit BCI motor imagery applications. We developed three deep learning models: 1) a long short-term memory (LSTM); 2) a proposed spectrogram-based convolutional neural network model (pCNN); and 3) a recurrent convolutional neural network (RCNN), for decoding motor imagery movements directly from raw EEG signals without (manual) feature engineering. Results were evaluated on our own publicly available EEG data collected from 20 subjects and on an existing dataset known as 2b EEG dataset from "BCI Competition IV". Overall, better classification performance was achieved with deep learning models compared to state-of-the-art machine learning techniques, which could chart a route ahead for developing new robust techniques for EEG signal decoding. We underpin this point by demonstrating the successful real-time control of a robotic arm using our CNN-based BCI.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Sorayya Rezayi ◽  
Niloofar Mohammadzadeh ◽  
Hamid Bouraghi ◽  
Soheila Saeedi ◽  
Ali Mohammadpour

Background. Leukemia is a fatal cancer in both children and adults and is divided into acute and chronic forms. Acute lymphoblastic leukemia (ALL) is a subtype of this cancer. Early diagnosis can have a significant impact on the treatment of this disease. Computational intelligence-oriented techniques can be used to help physicians identify and classify ALL rapidly. Materials and Method. In this study, the utilized dataset was collected from a CodaLab competition to classify leukemic cells from normal cells in microscopic images. Two well-known deep learning networks, residual neural network (ResNet-50) and VGG-16, were employed. These two networks were trained with our assigned parameters, meaning we did not use the stored weights; we also adjusted the weights and learning parameters. In addition, a convolutional network with ten convolutional layers and 2 × 2 max-pooling layers (with stride 2) was proposed, and six common machine learning techniques were developed to classify acute lymphoblastic leukemia into two classes. Results. The validation accuracies (the mean accuracy of training and test networks for 100 training cycles) of the ResNet-50, VGG-16, and the proposed convolutional network were found to be 81.63%, 84.62%, and 82.10%, respectively. Among the applied machine learning methods, the lowest accuracy was obtained by the multilayer perceptron (27.33%) and the highest by random forest (81.72%). Conclusion. This study showed that the proposed convolutional neural network has optimal accuracy in the diagnosis of ALL. By comparing various convolutional neural networks and machine learning methods in diagnosing this disease, the convolutional neural network achieved good performance and optimal execution time without latency. This proposed network is less complex than the two pretrained networks and can be employed by pathologists and physicians in clinical systems for leukemia diagnosis.
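The 2 × 2 max-pooling with stride 2 mentioned in this abstract halves each spatial dimension by keeping only the largest value in every non-overlapping 2 × 2 window. A minimal pure-Python sketch of that single operation follows; the function name `max_pool_2x2` and the toy feature map are assumptions for illustration, and a real network would use a deep learning library's pooling layer instead.

```python
def max_pool_2x2(feature_map):
    """Apply 2x2 max pooling with stride 2 to a 2D list of numbers.

    A trailing odd row or column is dropped, which matches pooling
    without padding.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            # The four values covered by the current 2x2 window.
            window = (
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 feature map, reduced to 2x2 by pooling.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool_2x2(feature_map)  # [[4, 2], [2, 8]]
```

Because the stride equals the window size, each pooling layer cuts the number of activations to a quarter, which is one reason a stack of such layers stays inexpensive to evaluate.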


2019 ◽  
Vol 147 (8) ◽  
pp. 2827-2845 ◽  
Author(s):  
David John Gagne II ◽  
Sue Ellen Haupt ◽  
Douglas W. Nychka ◽  
Gregory Thompson

Deep learning models, such as convolutional neural networks, utilize multiple specialized layers to encode spatial patterns at different scales. In this study, deep learning models are compared with standard machine learning approaches on the task of predicting the probability of severe hail based on upper-air dynamic and thermodynamic fields from a convection-allowing numerical weather prediction model. The data for this study come from patches surrounding storms identified in NCAR convection-allowing ensemble runs from 3 May to 3 June 2016. The machine learning models are trained to predict whether the simulated surface hail size from the Thompson hail size diagnostic exceeds 25 mm over the hour following storm detection. A convolutional neural network is compared with logistic regressions using input variables derived from either the spatial means of each field or principal component analysis. The convolutional neural network statistically significantly outperforms all other methods in terms of Brier skill score and area under the receiver operator characteristic curve. Interpretation of the convolutional neural network through feature importance and feature optimization reveals that the network synthesized information about the environment and storm morphology that is consistent with our understanding of hail growth, including large lapse rates and a wind shear profile that favors wide updrafts. Different neurons in the network also record different storm modes, and the magnitude of the output of those neurons is used to analyze the spatiotemporal distributions of different storm modes in the NCAR ensemble.
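The Brier skill score used for evaluation here compares the mean squared error of probabilistic forecasts against that of a reference forecast. The sketch below assumes the reference is climatology, that is, always forecasting the observed base rate; the abstract does not state which reference the authors used, and the sample numbers are invented for illustration.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)


def brier_skill_score(probs, outcomes):
    """Skill relative to a climatological forecast (assumed reference).

    1.0 is a perfect forecast, 0.0 matches climatology, and negative
    values are worse than climatology. Undefined when every outcome is
    the same, because the reference score is then zero.
    """
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    return 1.0 - brier_score(probs, outcomes) / reference


# Hypothetical forecasts of severe hail (1 = hail exceeded the threshold).
outcomes = [1, 0, 0, 1]
probs = [0.9, 0.1, 0.2, 0.8]
bss = brier_skill_score(probs, outcomes)  # approximately 0.9
```

Here the forecasts have a Brier score of 0.025 against a climatological score of 0.25, giving a skill score of about 0.9.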


Author(s):  
Siji George C G et al.

Sentiment analysis is one of the active research areas in the field of data mining. Machine learning algorithms are capable of implementing sentiment analysis. Due to their capacity for self-learning and massive data handling, most researchers use deep learning neural networks to solve sentiment classification tasks. In this paper, a new model is designed under a hybrid framework of machine learning and deep learning which couples a Convolutional Neural Network and a Random Forest classifier for fine-grained sentiment analysis. The Continuous Bag-of-Words (CBOW) model is used to vectorize the text input. The most important features are extracted by the Convolutional Neural Network (CNN). The extracted features are used by the Random Forest (RF) classifier for sentiment classification. The performance of the proposed hybrid CNNRF model is compared with base models such as the Convolutional Neural Network (CNN) and Random Forest (RF) classifier. The experimental results show that the proposed model clearly outperforms the existing base models in terms of classification accuracy and effectively integrates a genetically-modified CNN with a Random Forest classifier.


Author(s):  
Chuan Jiang ◽  
Qianmin Su ◽  
Lele Zhang ◽  
Bo Huang

As a typical cyber-physical-social system (CPSS), the waste collection system profoundly changes the current waste processing mode and greatly relieves the dilemma of waste disposal. However, the existing waste collection system does not provide a function that guides people to deliver waste into the correct trash bin. In order to improve the efficiency of the waste collection system, we propose an automatic question answering system based on a convolutional neural network (CNN) to help people classify waste correctly. The construction process of the automatic question answering system is divided into the following steps. We first construct a question answering dataset about waste classification, in which question answering pairs from the four waste categories (recyclable waste, harmful waste, dry waste, and wet waste) are included. After the dataset is constructed, we perform text preprocessing on the dataset, which includes denoising, Chinese word segmentation, and removing stop words. After text preprocessing, we use the Word2vec model as feature representation. Then, we construct a CNN and utilize the word embeddings as input to train the model. Finally, we deploy the trained model to the waste collection system, which can answer the questions about waste classification that people ask. We also present a comparative analysis of the proposed method and traditional machine learning methods. The experiment shows that the proposed method achieves higher waste classification accuracy than traditional machine learning methods.


2018 ◽  
Vol 8 (9) ◽  
pp. 1573 ◽  
Author(s):  
Vladimir Kulyukin ◽  
Sarbajit Mukherjee ◽  
Prakhar Amlathe

Electronic beehive monitoring extracts critical information on colony behavior and phenology without invasive beehive inspections and transportation costs. As an integral component of electronic beehive monitoring, audio beehive monitoring has the potential to automate the identification of various stressors for honeybee colonies from beehive audio samples. In this investigation, we designed several convolutional neural networks and compared their performance with four standard machine learning methods (logistic regression, k-nearest neighbors, support vector machines, and random forests) in classifying audio samples from microphones deployed above landing pads of Langstroth beehives. On a dataset of 10,260 audio samples where the training and testing samples were separated from the validation samples by beehive and location, a shallower raw audio convolutional neural network with a custom layer outperformed three deeper raw audio convolutional neural networks without custom layers and performed on par with the four machine learning methods trained to classify feature vectors extracted from raw audio samples. On a more challenging dataset of 12,914 audio samples where the training and testing samples were separated from the validation samples by beehive, location, time, and bee race, all raw audio convolutional neural networks performed better than the four machine learning methods and a convolutional neural network trained to classify spectrogram images of audio samples. A trained raw audio convolutional neural network was successfully tested in situ on a low voltage Raspberry Pi computer, which indicates that convolutional neural networks can be added to a repertoire of in situ audio classification algorithms for electronic beehive monitoring. 
The main trade-off between deep learning and standard machine learning is between feature engineering and training time: while the convolutional neural networks required no feature engineering and generalized better on the second, more challenging dataset, they took considerably more time to train than the machine learning methods. To ensure the replicability of our findings and to provide performance benchmarks for interested research and citizen science communities, we have made public our source code and our curated datasets.


2019 ◽  
Vol 8 (2S11) ◽  
pp. 3464-3468

Psychological stress, which is a mental illness, also causes physical problems. Nowadays social media plays an important role in communication, letting people share their thoughts with friends and family. Social media analysis is the process of detecting and predicting users' thoughts and opinions, which is also an important perspective in the developing business environment. Overwhelming and long-term stress sometimes leads to suicidal ideation. Analyzing social media content to predict the overwhelming stress state of users at an early stage can reduce psychological stress and the suicide rate. In this paper, we address the problem of stress prediction using social media. Machine learning and deep learning methods are used to perform the stress classification. Both image and text tweet data are used: the images are processed with Optical Character Recognition, and the text data are processed using Natural Language Processing and a Convolutional Neural Network to classify the user's tweet content as stressed or non-stressed. Furthermore, advances in machine learning and deep learning classification methods give better results in terms of performance and prediction accuracy.

