Text Mining and Big Textual Data: Relevant Statistical Models

Author(s):  
Fionn Murtagh


Author(s):  
Annie T. Chen ◽  
Shu-Hong Zhu ◽  
Mike Conway

Our aim in this work is to apply text mining and novel visualization techniques to textual data derived from online health discussion forums in order to better understand consumers' experiences and perceptions of electronic cigarettes and hookah.


Author(s):  
Mohammed M. Tumala ◽  
Babatunde S. Omotosho

This paper employs text-mining techniques to analyse the communication strategy of the Central Bank of Nigeria (CBN) during the period 2004-2019. Since the policy communique released after each meeting of the CBN’s monetary policy committee (MPC) represents an important tool of central bank communication, we construct a corpus based on 87 policy communiques with a total of 123,353 words. Having processed the textual data into a form suitable for analysis, we examine the readability, sentiments, and topics of the policy documents. While the CBN’s communication has increased substantially over the years, implying increased monetary policy transparency, the computed Coleman-Liau readability index shows that the word and sentence structures of the policy communiques have become more complex, thus reducing their readability. In terms of monetary policy sentiments, we find an average net score of -10.5 per cent, reflecting the level of policy uncertainties faced by the MPC over the sample period. In addition, our results indicate that the topics driving the linguistic contents of the communiques were influenced by the Bank’s policy objectives as well as the nature of the shocks hitting the economy in each period.
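
As an illustration of the readability measure mentioned above, the following is a minimal sketch of the standard Coleman-Liau index, CLI = 0.0588 L - 0.296 S - 15.8, where L is the average number of letters per 100 words and S the average number of sentences per 100 words. The function name and the tokenisation rules (alphabetic runs as words, runs of `.`, `!` or `?` as sentence boundaries) are assumptions made for this sketch; the abstract does not describe the authors' preprocessing.

```python
import re


def coleman_liau_index(text: str) -> float:
    """Return the Coleman-Liau readability index of `text`.

    Assumption: words are runs of ASCII letters and sentences end at
    runs of '.', '!' or '?'. A decimal point such as the one in "10.5"
    would therefore be miscounted as a sentence boundary.
    """
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        raise ValueError("text contains no words")
    letters = sum(len(w) for w in words)
    # Treat text without terminal punctuation as a single sentence.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    letters_per_100 = letters / len(words) * 100
    sentences_per_100 = sentences / len(words) * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


if __name__ == "__main__":
    simple = "The cat sat. The dog ran."
    dense = ("Monetary authorities communicated increasingly "
             "sophisticated macroeconomic assessments.")
    print(round(coleman_liau_index(simple), 2))
    print(round(coleman_liau_index(dense), 2))
```

A higher score indicates text that is harder to read, so a rising index over successive communiques would be consistent with the declining readability reported above.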


2020 ◽  
Vol 11 (2) ◽  
pp. 66-81
Author(s):  
Badia Klouche ◽  
Sidi Mohamed Benslimane ◽  
Sakina Rim Bennabi

Sentiment analysis is an emerging research area concerned with classifying sentiment polarity in text, driven in particular by the considerable number of opinions available on social media. Like other operators, the Algerian telephone operator Ooredoo seeks, as part of its new strategy, to win new customers by exploiting their opinions through sentiment analysis. The purpose of this work is to set up a system called “Ooredoo Rayek”, whose objective is to collect, transliterate, translate and classify the textual data expressed by the Ooredoo operator's customers. This article develops a set of rules allowing the transliteration from Algerian Arabizi to Algerian dialect. Furthermore, the authors use Naïve Bayes (NB) and Support Vector Machine (SVM) classifiers to assign polarity tags to Facebook comments from the official pages of Ooredoo, written in a multilingual and multi-dialect context. Experimental results show that the system performs well, with 83% accuracy.
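
To make the classification step concrete, here is a minimal sketch of a multinomial Naïve Bayes polarity classifier with add-one smoothing. It is not the “Ooredoo Rayek” system: the class name, the whitespace tokenisation and the toy English comments are assumptions for illustration only, and the transliteration and translation stages are omitted.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesPolarity:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            # Assumption: lower-cased whitespace tokens are the features.
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocab:  # ignore unseen words
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Hypothetical toy comments, not data from the article.
    texts = ["good network fast", "great service good",
             "bad network slow", "terrible service bad"]
    labels = ["pos", "pos", "neg", "neg"]
    model = NaiveBayesPolarity().fit(texts, labels)
    print(model.predict("good fast"))
    print(model.predict("slow bad"))
```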


Author(s):  
Masaomi Kimura ◽  

Text mining has been growing, mainly due to the need to extract useful information from vast amounts of textual data. Our target here is freely described text data collected from questionnaires. Unlike research papers, newspaper articles, call-center logs and web pages, which are the usual targets of text mining analysis, the freely described data contained in questionnaire responses have specific characteristics: each piece of data consists of a small number of short sentences, and the wide variety of content precludes the application of the clustering algorithms normally used to classify such data. In this paper, we suggest a way to extract the opinions expressed by multiple respondents, based on the modification relationships included in each sentence of the freely described data. Certain applications of our method are also presented after the introduction of our approach.


2014 ◽  
Vol 18 (03) ◽  
pp. 1440004 ◽  
Author(s):  
VICTORIA KAYSER ◽  
KERSTIN GOLUCHOWICZ ◽  
ANTJE BIERWISCH

Technology roadmapping is a well-established method used in strategy development to map alternative future paths, while text mining offers untapped potential for early detection and environmental scanning. In this paper, the roadmapping process is split into different steps in order to analyse which text mining methods could add further value within each. This leads to a two-layered process model, which includes text mining techniques to systematically integrate external information in ongoing roadmapping processes. Textual data can be used for a structured analysis and exploration of thematic fields and an objective, quantitative summary of actual developments. To demonstrate some of the benefits, the field of "cloud computing" is used to illustrate the procedure. As this article will show, the results provided by this approach extend the existing methodology, integrate an external view and complement expert opinion.


Author(s):  
Axel Philipps

Current text mining applications work statistically on the basis of linguistic models and theories and certain parameter settings. This enables researchers to classify, group and rank a large textual corpus, a useful feature for scholars who study all forms of written text. However, these underlying conditions differ from the way interpretively-oriented social scientists approach textual data. They aim to understand the meaning of text by heuristically using known categorisations, concepts and other formal methods. More importantly, they are primarily interested in documents that are incomprehensible with our current knowledge, because these documents offer a chance to formulate new empirically-grounded typifications, hypotheses, and theories. In this paper, therefore, I propose a text mining technique with different aims and procedures. It includes a shift away from methods of grouping and clustering the whole text corpus to a process that sorts out uncategorisable documents. Such an approach will be demonstrated using a simple example. While more elaborate text mining techniques might become tools for more complex tasks, the given example just presents the essence of a possible working principle. As such, it supports social inquiries that search for and examine unfamiliar patterns and regularities.
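
One possible reading of the "sorting out" principle described above can be sketched as a filter that sets aside every document matching none of the known categories. This is an illustrative assumption, not the author's procedure: the function name, the keyword-set representation of categories and the `min_hits` threshold are all hypothetical.

```python
def sort_out_uncategorisable(docs, categories, min_hits=1):
    """Split `docs` into categorised documents and a residual list.

    `categories` maps a category name to a set of known terms
    (hypothetical representation). A document is treated as
    uncategorisable when it shares fewer than `min_hits` terms
    with every category.
    """
    categorised, residual = {}, []
    for doc in docs:
        tokens = set(doc.lower().split())
        matches = [name for name, terms in categories.items()
                   if len(tokens & terms) >= min_hits]
        if matches:
            categorised[doc] = matches
        else:
            residual.append(doc)  # candidates for interpretive analysis
    return categorised, residual


if __name__ == "__main__":
    # Hypothetical categories and documents, for illustration only.
    categories = {"economy": {"inflation", "bank"},
                  "health": {"hospital", "vaccine"}}
    docs = ["the bank raised rates",
            "vaccine rollout begins",
            "strange lights over the lake"]
    categorised, residual = sort_out_uncategorisable(docs, categories)
    print(residual)
```

The residual list, rather than the categorised output, is the point of interest here, since it collects the documents that existing categorisations fail to cover.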


2020 ◽  
Vol 2 (2) ◽  
pp. 153-171
Author(s):  
Zulkifli Arsyad

Text mining is widely used to find hidden patterns and information in large numbers of semi-structured and unstructured texts. Text mining extracts interesting patterns to explore knowledge from textual data sources. The association rule extraction method GARW (Generating Association Rule using Weighting Scheme) can be used to find knowledge in a collection of web content without having to read manually all the web content returned by crawlers. The GARW algorithm is a development of the Apriori algorithm that produces relevant association rules. The results of this knowledge discovery can help internet users find relevant information from search keywords without having to review, one by one, the web content returned by search engines.
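
For readers unfamiliar with association rules, the sketch below mines single-keyword rules (A -> B) from keyword sets using plain support and confidence thresholds. It is not GARW, whose weighting scheme is not described in the abstract; the function name, the thresholds and the toy keyword lists are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(docs, min_support=0.5, min_confidence=0.7):
    """Return rules (lhs, rhs, support, confidence) over keyword pairs.

    support    = share of documents containing both keywords
    confidence = support(lhs and rhs) / support(lhs)
    """
    n = len(docs)
    item_counts, pair_counts = Counter(), Counter()
    for doc in docs:
        items = sorted(set(doc))  # deduplicate keywords per document
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # infrequent pair: no rule can be generated from it
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    # Hypothetical keyword lists extracted from four web pages.
    docs = [["text", "mining", "web"],
            ["text", "mining"],
            ["text", "web"],
            ["mining", "rules"]]
    for rule in pair_rules(docs):
        print(rule)
```

With these toy inputs the only rule that passes both thresholds is web -> text: every page mentioning "web" also mentions "text", whereas the reverse holds for only two of the three "text" pages.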


Author(s):  
Vladimer B. Kobayashi ◽  
Stefan T. Mol ◽  
Jarno Vrolijk ◽  
Gábor Kismihók