A Text Preprocessing Framework for Text Mining on Big Data Infrastructure

Author(s):  
Watcharaporn Sriyanong ◽  
Nunnapus Moungmingsuk ◽  
Nattawat Khamphakdee
2017 ◽  
Vol 13 (3) ◽  
pp. 47-67 ◽  
Author(s):  
Carina Sofia Andrade ◽  
Maribel Yasmina Santos

The evolution of technology, along with the common use of different devices connected to the Internet, has driven vast growth in the volume and variety of data generated daily at high velocity, a phenomenon commonly referred to as Big Data. In this context, several Text Mining techniques make it possible to extract useful insights from that data, benefiting the decision-making process across multiple areas through the information, models, patterns or tendencies that these techniques are able to identify. With Sentiment Analysis, it is possible to understand which sentiments and opinions are implicit in this data. This paper proposes an architecture for Sentiment Analysis that uses data from Twitter and is able to collect, store, process and analyse data in real time. To demonstrate its utility, practical applications are developed using real-world examples in which Sentiment Analysis brings benefits. The demonstration case makes it possible to verify the role of each technology used and the techniques adopted for Sentiment Analysis.


2017 ◽  
Vol 8 (1) ◽  
pp. 51-72
Author(s):  
Jin-seo Park

Qualitative research methods based on literature review or expert judgement have been used to find core issues, analyze emerging trends and discover promising areas for the future. Deriving results from large amounts of information under this approach is both costly and time consuming. Besides, there is a risk that the results may be influenced by the subjective opinion of experts. To make up for such weaknesses, the analysis paradigm for identifying future emerging trends is shifting toward implementing qualitative research methods along with quantitative research methods like text mining in a mutually complementary manner. This change is visible in recent studies across various areas such as the steel industry, the information and communications technology industry, the construction industry in architectural engineering and so on. This study focused on retrieving aviation-related core issues and promising areas for the future from research papers covering aviation as a whole through text mining, which is one of the big data analysis techniques. This study has limitations in that its analysis for retrieving the aviation-related core issues and promising fields was restricted to research papers containing the keyword "aviation." However, it has significance in that it prepared a quantitative analysis model for continuously monitoring the derived core issues and emerging trends regarding the promising areas for the future in the aviation industry through the application of a big data-based descriptive approach.


2018 ◽  
Vol 4 (2) ◽  
pp. 148-149 ◽  
Author(s):  
Jinjun Chen ◽  
Honggang Wang
