Proceedings of The Third Workshop on Analytics for Noisy Unstructured Text Data - AND '09

As the amount of data inside and outside the enterprise grows rapidly, it is becoming important to analyze all of it seamlessly for total business intelligence. The data can be classified into two categories: structured and unstructured. Because most business data are unstructured text documents, including Web pages on the Internet, we need a Text OLAP solution that performs multidimensional analysis of text documents in the same way as for structured relational data. We first survey representative works that demonstrate how text mining and information retrieval, the major technologies for handling text data, can be applied to multidimensional analysis of text documents. We then survey representative works that demonstrate how unstructured text documents and structured relational data can be associated and consolidated to obtain total business intelligence. Finally, we present a future business intelligence platform architecture and related research topics. We expect that the proposed total heterogeneous business intelligence architecture, which integrates information retrieval, text mining, and information extraction technologies with relational OLAP technologies, will make a better platform for total business intelligence.
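To make the idea of multidimensional analysis over text concrete, here is a minimal Python sketch of an OLAP-style roll-up in which the measure is a bag of term counts. It is an illustration only, not the architecture surveyed in the paper; the function name `roll_up`, the dimension names `year` and `region`, and the sample documents are all hypothetical.

```python
from collections import Counter, defaultdict


def roll_up(docs, dimension):
    """Aggregate term counts per value of one dimension (e.g. 'year').

    Each document is a dict with dimension attributes plus a 'text' field.
    Tokenization is a plain lowercase whitespace split (an assumption).
    """
    cube = defaultdict(Counter)
    for doc in docs:
        cube[doc[dimension]].update(doc["text"].lower().split())
    return cube


# Hypothetical documents carrying structured dimensions and free text.
docs = [
    {"year": 2008, "region": "EU", "text": "late delivery complaint"},
    {"year": 2008, "region": "US", "text": "delivery praise"},
    {"year": 2009, "region": "EU", "text": "billing complaint"},
]

by_year = roll_up(docs, "year")
by_region = roll_up(docs, "region")
```

The same documents can be rolled up along either dimension, which is the sense in which text is analyzed "in the same way as" relational data in this sketch.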

Text-Driven Reasoning and Multi-Structured Data Analytics for Business Intelligence

Business Intelligence ◽

10.4018/978-1-4666-9562-7.ch001 ◽

2016 ◽

pp. 1-32 ◽

Cited By ~ 1

Author(s):

Lipika Dey ◽

Ishan Verma

Keyword(s):

Business Intelligence ◽

Data Analytics ◽

Statistical Significance ◽

Heterogeneous Data ◽

Structured Data ◽

Text Data ◽

Business Operations ◽

Unstructured Text ◽

Business Data ◽

Heterogeneous Resources

Business Intelligence (BI) refers to an organization's capability to gather and analyze data about business operations and transactions in order to evaluate its performance. The abundance of information both within and outside the enterprise has necessitated a change in traditional Business Intelligence practices: heterogeneous resources need to be exploited. Text data such as news and analyst reports help in interpreting business data. In this chapter, the authors present a futuristic BI framework that facilitates the acquisition, indexing, and analysis of heterogeneous data for extracting business intelligence. It seamlessly integrates unstructured text data and structured business data to generate insights. The authors propose methods that extract events, or significant happenings, from both unstructured and structured data, correlate the events, and then reason over them to generate insights. The extracted insights can be validated as cause-effect pairs based on the statistical significance of the co-occurrence of events.
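The abstract does not say which significance test is used, so the following Python sketch assumes one common choice: a Pearson chi-square test on a 2x2 table of time windows in which two events do or do not occur. The function name, the window counts, and the 0.05 significance level are assumptions made for illustration.

```python
def chi_square_2x2(both, only_a, only_b, neither):
    """Pearson chi-square statistic for a 2x2 co-occurrence table.

    both    : windows containing event A and event B
    only_a  : windows containing A but not B
    only_b  : windows containing B but not A
    neither : windows containing neither event
    """
    n = both + only_a + only_b + neither
    row_a, row_not_a = both + only_a, only_b + neither
    col_b, col_not_b = both + only_b, only_a + neither
    # A zero margin means one event always or never occurs: no evidence.
    if 0 in (row_a, row_not_a, col_b, col_not_b):
        return 0.0
    return n * (both * neither - only_a * only_b) ** 2 / (
        row_a * row_not_a * col_b * col_not_b
    )


# Critical value of chi-square with 1 degree of freedom at alpha = 0.05.
CRITICAL_005 = 3.841

# Hypothetical counts: the two events always appear together.
stat = chi_square_2x2(both=10, only_a=0, only_b=0, neither=10)
is_significant = stat > CRITICAL_005
```

A statistic above the critical value only indicates that the events co-occur more often than independence would predict; deciding which event is the cause would need additional information such as temporal order.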

Discovering Business Processes in CRM Systems by Leveraging Unstructured Text Data

2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS) ◽

10.1109/hpcc/smartcity/dss.2018.00257 ◽

2018 ◽

Cited By ~ 2

Author(s):

Rolf B. Banziger ◽

Artie Basukoski ◽

Thierry Chaussalet

Keyword(s):

Business Processes ◽

Text Data ◽

Unstructured Text

Modelling the sensory space of varietal wines: Mining of large, unstructured text data and visualisation of style patterns

Scientific Reports ◽

10.1038/s41598-018-23347-w ◽

2018 ◽

Vol 8 (1) ◽

Author(s):

Carlo C. Valente ◽

Florian F. Bauer ◽

Fritz Venter ◽

Bruce Watson ◽

Hélène H. Nieuwoudt

Keyword(s):

Text Data ◽

Unstructured Text ◽

Sensory Space

Changes in industrial network logics: the case of the Japanese retail industry

Journal of Business and Industrial Marketing ◽

10.1108/jbim-05-2019-0213 ◽

2020 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Yoritoshi Hara

Keyword(s):

Content Analysis ◽

Design Methodology ◽

Retail Industry ◽

Business Networks ◽

Text Data ◽

Japanese Market ◽

Content Type ◽

The Third ◽

Micro Level ◽

Industrial Network

Purpose: This study examines changes in "network logics", the socially accepted cognitive views that actors hold about the network. These logics provide organizations with templates on how to act in business networks. The study investigates the causes, processes, and phases of network logic changes.

Design/methodology/approach: This study relies on content analysis of text data from newspaper articles on global retailers entering the Japanese retail industry. Three different logics were found to describe the actions of the retailers. Two of the logics are related to institutional and strategic logics, including network logics, while the third is associated with institutional work, that is, actions to create, maintain, and disrupt institutions.

Findings: With regard to transitions in network logics in the Japanese retail industry, the analysis identified four phases: politicization, reflection, establishment, and evaluation. Changes in regulative and normative logics resulted from the institutional work of the global retailers in the Japanese market. The findings also include an empirical description of how network changes progress through interactions among business actors. Additionally, cultural-cognitive logics appear more difficult to influence than regulative and normative logics.

Originality/value: Business networks often transform with changes in network logics. This study contributes to the literature on industrial network changes by exploring the interactions between macro-level structural states and micro-level events in network logic transitions.

A graph construction study for graph-based semi-supervised learning: Case study on unstructured text data

2019 IEEE International Conference on Big Data (Big Data) ◽

10.1109/bigdata47090.2019.9006465 ◽

2019 ◽

Author(s):

Sumedh Yadav ◽

Gautam Kumar ◽

Shivam Kumar

Keyword(s):

Supervised Learning ◽

Text Data ◽

Unstructured Text

Effective processing of unstructured data using python in Hadoop map reduce

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.21.12456 ◽

2018 ◽

Vol 7 (2.21) ◽

pp. 417

Author(s):

K Kousalya ◽

Shaik Javed Parvez

Keyword(s):

Open Source ◽

Unstructured Data ◽

Map Reduce ◽

Text Data ◽

Apache Hadoop ◽

Unstructured Text ◽

Wide Range ◽

Two Stages

At present, the growing volume of data is largely unstructured, which makes handling such a wide range of data difficult. This paper proposes processing unstructured text data effectively in Hadoop MapReduce using Python. Apache Hadoop is an open source platform that widely uses the MapReduce framework. MapReduce is popular and effective for processing unstructured data in parallel. There are two stages in MapReduce, namely map (transform) and reduce (repository). The input is split into small blocks, and worker nodes process the individual blocks in parallel. MapReduce is generally based on Java, while Hadoop Streaming allows mappers and reducers to be written in other languages such as Python. In this paper, we show an alternative way of processing growing unstructured content data using Python. We also compare the performance of Java-based and non-Java-based programs.
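As a rough illustration of the mapper/reducer split described in the abstract, the sketch below implements a word count in Python and simulates the shuffle step locally. It is not the paper's code: the function names and sample input are hypothetical. In an actual Hadoop Streaming job, the mapper and reducer would be separate scripts that read lines from standard input and write tab-separated key/value lines to standard output.

```python
from itertools import groupby


def mapper(lines):
    """Map stage: emit (word, 1) for every whitespace-separated token."""
    for line in lines:
        for word in line.strip().lower().split():
            yield word, 1


def reducer(pairs):
    """Reduce stage: sum the counts for each word.

    Expects pairs sorted by key, which is how the framework
    delivers mapper output to a reducer.
    """
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> sort (shuffle) -> reduce in a single process."""
    shuffled = sorted(mapper(lines), key=lambda kv: kv[0])
    return dict(reducer(shuffled))


# Hypothetical input split of two lines.
counts = run_local(["big data big", "Data text"])
```

Keeping the two stages as plain generator functions makes them easy to test locally before wrapping them in standard-input/standard-output scripts for a cluster.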

Introduction of topic modeling for extracting potential information from unstructured text data: Issue analysis on news article of dementia-related physical activity

Korean Journal of Sport Science ◽

10.24985/kjss.2019.30.3.501 ◽

2019 ◽

Vol 30 (3) ◽

pp. 501-512

Author(s):

윤효준 ◽

Jiwun Yoon ◽

JaeHyeon

Keyword(s):

Physical Activity ◽

Topic Modeling ◽

News Article ◽

Text Data ◽

Unstructured Text ◽

Issue Analysis

Recognition of Disease Genetic Information from Unstructured Text Data Based on BiLSTM-CRF for Molecular Mechanisms

Security and Communication Networks ◽

10.1155/2021/6635027 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Lejun Gong ◽

Xingxing Zhang ◽

Tianyin Chen ◽

Li Zhang

Keyword(s):

Genetic Information ◽

Molecular Mechanisms ◽

Short Term Memory ◽

Conditional Random Field ◽

Autism Spectrum ◽

Biomedical Literature ◽

Repetitive Behaviour ◽

Biomedical Knowledge ◽

Text Data ◽

Unstructured Text

Recognizing disease-relevant entities is an important task in mining unstructured text data from the biomedical literature to obtain biomedical knowledge. Autism spectrum disorder (ASD) is a neurological and developmental disorder characterized by deficits in communication and social interaction and by repetitive behaviour. However, the disease remains poorly understood to date. This study uses machine learning to identify disease-associated entities in a text data collection, with a view to the molecular mechanisms related to ASD. Entities related to the disease are extracted from the autism-related biomedical literature using deep learning with a bidirectional long short-term memory (BiLSTM) and conditional random field (CRF) model. Compared with previous works, the approach is promising for identifying entities related to disease. The proposed approach, which covers five types of molecular entities, is evaluated on the GENIA corpus and obtains an F-score of 76.81%. The work extracted 9146 proteins, 145 RNAs, 7680 DNAs, 1058 cell-types, and 981 cell-lines from the autism biomedical literature after removing repeated molecular entities. Finally, we perform GO and KEGG analyses of the test dataset. This study could serve as a reference for further studies on the etiology of disease on the basis of molecular mechanisms and provides a way to explore disease genetic information.
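The abstract reports an F-score without stating how it is computed. The Python sketch below shows one standard definition, the entity-level F1 (harmonic mean of precision and recall over exactly matching spans). The span representation `(start, end, type)` and the example annotations are assumptions and do not reproduce the paper's evaluation.

```python
def entity_f1(gold, predicted):
    """Entity-level precision, recall, and F1 over exact span matches.

    gold, predicted: sets of (start, end, entity_type) tuples.
    A prediction counts as correct only if all three fields match.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: two of three predictions match the gold spans.
gold = {(0, 2, "protein"), (5, 6, "DNA"), (8, 9, "RNA")}
predicted = {(0, 2, "protein"), (5, 6, "DNA"), (10, 11, "cell_line")}

precision, recall, f1 = entity_f1(gold, predicted)
```

Exact-match scoring is the stricter convention; partial-overlap variants exist and would give higher numbers for the same predictions.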
