Textual data dimensionality reduction - a deep learning approach

2018 ◽  
Vol 79 (15-16) ◽  
pp. 11039-11050 ◽  
Author(s):  
Neetu Kushwaha ◽  
Millie Pant

2018 ◽  
Vol 12 (2) ◽  
pp. 21-34
Author(s):  
Mostefai Abdelkader

In recent years, increasing attention has been paid to sentiment analysis on microblogging platforms such as Twitter. Sentiment analysis is the task of detecting whether a textual item (e.g., a tweet) expresses an opinion about a topic. This paper proposes a probabilistic deep learning approach to sentiment analysis. The deep learning model used is a convolutional neural network (CNN). The main contribution of this approach is a new probabilistic representation of the text fed as input to the CNN: a matrix that stores, for each word in the message, the probability that it belongs to the positive class and the probability that it belongs to the negative class. The proposed approach is evaluated on four well-known datasets: HCR, OMD, STS-Gold, and a dataset provided by the SemEval-2017 workshop. The experimental results show that the proposed approach is competitive with state-of-the-art sentiment analyzers and can detect sentiment in textual data effectively.
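The per-word probability matrix described above can be sketched as follows. This is a minimal illustration under assumed details (add-one smoothing, binary pos/neg labels); the abstract does not specify how the probabilities are estimated, so the estimation scheme here is an assumption.

```python
from collections import Counter

def class_probabilities(corpus):
    """Estimate per-word class probabilities from (tokens, label) pairs.

    `label` is 'pos' or 'neg'; add-one smoothing is an assumed choice,
    not taken from the paper.
    """
    pos, neg = Counter(), Counter()
    for tokens, label in corpus:
        (pos if label == "pos" else neg).update(tokens)
    probs = {}
    for w in set(pos) | set(neg):
        p, n = pos[w] + 1, neg[w] + 1          # add-one smoothing
        probs[w] = (p / (p + n), n / (p + n))  # (P(pos|w), P(neg|w))
    return probs

def message_matrix(tokens, probs, default=(0.5, 0.5)):
    """Build the per-message input matrix: one [P(pos), P(neg)] row per word."""
    return [list(probs.get(w, default)) for w in tokens]

train = [(["good", "great", "movie"], "pos"),
         (["bad", "awful", "movie"], "neg")]
probs = class_probabilities(train)
m = message_matrix(["great", "movie", "unknown"], probs)
```

A matrix built this way (one row per word, two probability columns) is what would then be fed to the CNN's convolutional layers.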


2016 ◽  
Vol 28 (7) ◽  
pp. 1779-1789 ◽  
Author(s):  
Lei Xu ◽  
Chunxiao Jiang ◽  
Yong Ren ◽  
Hsiao-Hwa Chen

2019 ◽  
Vol 8 (3) ◽  
pp. 7153-7160

In the analysis of big data, dimensionality reduction techniques play a significant role in fields where the data are huge, with many columns or classes. High-dimensional data contain thousands of features, many of which carry useful information, but also many redundant or irrelevant features that reduce data quality, degrade performance, and decrease computational efficiency. Mathematical procedures for reducing the number of dimensions are known as dimensionality reduction techniques. The main aim of dimensionality reduction algorithms such as Principal Component Analysis (PCA), Random Projection (RP), and Non-negative Matrix Factorization (NMF) is to remove inappropriate information from the data; moreover, the features and attributes obtained from these algorithms are often unable to characterize the data as distinct classes. This paper reviews the traditional machine learning methods used for reducing dimensionality and proposes a view of how deep learning can be used for dimensionality reduction.
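Of the classical methods the review names, PCA is the most common baseline. A minimal NumPy sketch of PCA via the singular value decomposition (centre the data, keep the top-k right singular vectors as projection axes) is shown below; the data here are synthetic and purely illustrative.

```python
import numpy as np

def pca_reduce(X, k):
    """Project rows of X (n_samples x n_features) onto the top-k
    principal components, returning an n_samples x k score matrix."""
    Xc = X - X.mean(axis=0)                          # centre each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                             # scores on top-k axes

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))   # illustrative random data
Z = pca_reduce(X, 2)             # 20 features reduced to 2
```

Random Projection and NMF follow the same reduce-to-k-columns pattern but choose the projection differently (random matrices and non-negative factor matrices, respectively).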


2018 ◽  
Vol 6 (3) ◽  
pp. 122-126
Author(s):  
Mohammed Ibrahim Khan ◽  
Akansha Singh ◽  
Anand Handa ◽  
...  

2020 ◽  
Vol 17 (3) ◽  
pp. 299-305 ◽  
Author(s):  
Riaz Ahmad ◽  
Saeeda Naz ◽  
Muhammad Afzal ◽  
Sheikh Rashid ◽  
Marcus Liwicki ◽  
...  

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. This paper contributes in three main aspects: (1) pre-processing, (2) a deep-learning-based approach, and (3) data augmentation. The pre-processing step includes pruning extra white space and de-skewing skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). MDLSTM has the advantage of scanning Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes, and fine inflections. Combining data augmentation with the deep learning approach yields a promising improvement in results, raising Character Recognition (CR) from a 75.08% baseline to 80.02%.
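CTC, used here on top of the MDLSTM outputs, turns per-timestep label distributions into a transcript by collapsing repeated labels and removing blanks. Below is a minimal greedy (best-path) decoder sketch; the toy two-symbol alphabet and blank index are illustrative assumptions, not the paper's actual Arabic character set.

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Greedy CTC decoding: argmax label per frame, merge consecutive
    duplicates, then drop blanks. `frame_probs` is a list of per-timestep
    probability lists over [blank] + alphabet."""
    path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    out, prev = [], None
    for idx in path:
        if idx != prev and idx != blank:
            out.append(alphabet[idx - 1])
        prev = idx
    return "".join(out)

# Toy frames whose argmax sequence is: a, a, blank, b
frames = [[0.1, 0.8, 0.1],
          [0.2, 0.7, 0.1],
          [0.9, 0.05, 0.05],
          [0.1, 0.2, 0.7]]
decoded = ctc_greedy_decode(frames, alphabet="ab")
```

In training, CTC instead sums over all label paths that collapse to the target transcript, which is what lets the network learn without frame-level alignments.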

