Exploring technology opportunities and evolution of IoT-related logistics services with text mining

Author(s):  
Mu-Chen Chen ◽  
Pui Hung Ho

Many IoT technologies have been applied in the logistics industry in recent years, and they have had a substantial impact on sectors such as shipping, air freight, warehousing, and inventory. Exploring technology opportunities and analyzing technological trends are essential to understanding IoT's evolution, and many techniques exist for doing so. In this paper, data analysis and text mining techniques, namely technology opportunity analysis (TOA) and technology-service evolution analysis (TSEA), are applied to analyze and observe the evolution of IoT technologies and services. Academic journals, market reports, and patents on IoT in the logistics field are collected and reviewed. Using TOA, technology opportunities are analyzed to explore IoT-related logistics services. The TOA results show, for example, that cloud technology is essential for developing smart logistics services, while communication and RFID technologies are key to developing information logistics services. Finally, TSEA enables the observation of IoT technology and logistics service evolution by combining unstructured and semi-structured data from text documents. From the TSEA results, the evolution of IoT in logistics is identified, and these results also confirm those of TOA using only unstructured or semi-structured text data from documents. The results are discussed and compared with those of previous review studies. In summary, this paper provides methodological guidelines for a comprehensive understanding of IoT-related logistics services.
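The abstract does not spell out how TOA links technologies to services, but a common text mining reading is to count how often technology keywords and service keywords co-occur in the same documents. The sketch below illustrates only that simplified idea, not the authors' actual TOA/TSEA pipeline; the keyword lists and documents are placeholders.

```python
# Minimal co-occurrence sketch (not the paper's TOA/TSEA implementation):
# link technology keywords to logistics-service keywords by counting
# documents in which both appear.
from collections import Counter
from itertools import product

TECH_TERMS = ["rfid", "cloud", "sensor", "5g"]                     # illustrative
SERVICE_TERMS = ["smart logistics", "information logistics", "warehousing"]

documents = [
    "cloud platforms enable smart logistics services for warehousing",
    "rfid tags provide information logistics visibility in shipping",
    "sensor networks and cloud analytics support smart logistics",
]

cooccurrence = Counter()
for doc in documents:
    text = doc.lower()
    for tech, service in product(TECH_TERMS, SERVICE_TERMS):
        if tech in text and service in text:
            cooccurrence[(tech, service)] += 1

# Technology-service pairs with the strongest co-occurrence are candidate
# "technology opportunities" in this simplified reading.
for (tech, service), count in cooccurrence.most_common():
    print(f"{tech:8s} -> {service:22s} {count}")
```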

Author(s):  
Byung-Kwon Park ◽  
Il-Yeol Song

As the amount of data inside and outside an enterprise grows rapidly, it becomes important to analyze structured and unstructured data seamlessly to obtain total business intelligence. Since most business data consist of unstructured text documents, including Web pages on the Internet, a Text OLAP solution is needed to perform multidimensional analysis of text documents in the same way as structured relational data. We first survey representative works that demonstrate how text mining and information retrieval, the major technologies for handling text data, can be applied to the multidimensional analysis of text documents. We then survey representative works that demonstrate how unstructured text documents and structured relational data can be associated and consolidated to obtain total business intelligence. Finally, we present a future business intelligence platform architecture along with related research topics. We expect the proposed total heterogeneous business intelligence architecture, which integrates information retrieval, text mining, information extraction, and relational OLAP technologies, to provide a better platform toward total business intelligence.
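As a rough illustration of what "multidimensional analysis of text documents" can mean in practice, the sketch below attaches OLAP-like dimensions (source, year, topic) to documents and rolls up a text-derived measure along them. The schema, data, and keyword are invented for illustration and do not come from the surveyed works.

```python
# A minimal "Text OLAP"-style sketch: documents carry dimension attributes
# plus raw text; a keyword-frequency measure is aggregated along dimensions.
import pandas as pd

docs = pd.DataFrame([
    {"source": "web",   "year": 2020, "topic": "sales",   "text": "customer churn and customer retention"},
    {"source": "email", "year": 2020, "topic": "sales",   "text": "quarterly customer pipeline review"},
    {"source": "web",   "year": 2021, "topic": "support", "text": "ticket backlog and response time"},
])

keyword = "customer"
docs["keyword_freq"] = docs["text"].str.lower().str.count(keyword)

# Roll-up: aggregate the text-derived measure along OLAP-like dimensions.
cube = docs.groupby(["year", "topic"])["keyword_freq"].sum().reset_index()
print(cube)
```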


Author(s):  
Junzo Watada ◽  
Keisuke Aoki ◽  
Masahiro Kawano ◽  
Muhammad Suzuri Hitam ◽  
...  

The availability of multimedia text document information has spread text mining among researchers. Text documents integrate numerical and linguistic data, making text mining interesting and challenging. We propose text mining based on a fuzzy quantification model and a fuzzy thesaurus. In our text mining, we focus on: 1) sentences in Japanese text that are broken down into words; 2) a fuzzy thesaurus for finding words that match keywords in the text; 3) fuzzy multivariate analysis to analyze semantic meaning in predefined case studies. We use a fuzzy thesaurus to translate words written in Chinese and Japanese characters into keywords, which speeds up processing without requiring a dictionary to separate words. Fuzzy multivariate analysis is used to analyze the processed data and to extract latent mutually related structures in the text, i.e., to extract otherwise obscured knowledge. We apply dual scaling to mining library and Web page text information, and we propose integrating the results into Kansei engineering for possible applications in sales, marketing, and production.
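The fuzzy-thesaurus step can be pictured as mapping each surface word to keywords with a membership degree and taking, per document, the maximum degree over its words. The sketch below shows only that lookup step with invented thesaurus entries; the paper's fuzzy multivariate analysis (dual scaling) is not reproduced.

```python
# Minimal fuzzy-thesaurus lookup sketch: word -> {keyword: membership degree},
# document degree for a keyword = max over the document's words.
FUZZY_THESAURUS = {                      # illustrative entries, not the paper's
    "truck":    {"transport": 0.9, "logistics": 0.6},
    "delivery": {"transport": 0.7, "service": 0.8},
    "library":  {"knowledge": 0.9},
}

def document_memberships(words):
    degrees = {}
    for word in words:
        for keyword, mu in FUZZY_THESAURUS.get(word, {}).items():
            degrees[keyword] = max(degrees.get(keyword, 0.0), mu)
    return degrees

print(document_memberships(["truck", "delivery", "schedule"]))
# {'transport': 0.9, 'logistics': 0.6, 'service': 0.8}
```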


2018 ◽  
Vol 25 (4) ◽  
pp. 74
Author(s):  
Alfredo Silveira Araújo Neto ◽  
Marcos Negreiros

Rapid advances in technologies for capturing and storing data in digital format have allowed organizations to accumulate extremely large volumes of information, most of it in unstructured format, represented by texts. However, retrieving useful information from these large repositories remains very challenging. In this context, data mining is presented as a self-discovery process that acts on large databases and enables knowledge extraction from raw text documents. Among the many sources of textual documents are the electronic diaries of justice, which are intended to officially publicize all acts of the Judiciary. Although publication in digital form has removed the imperfections associated with printed divulgation, applying data mining methods could make the analysis of its contents faster. In this sense, this article establishes a tool capable of automatically grouping and categorizing digital procedural acts, based on the evaluation of text mining techniques applied to the group-determination activity. In addition, the strategy for defining the group descriptors, usually based on the most frequent words in the documents, was evaluated and remodeled to use, instead of words, the concepts most regularly identified in the texts.
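A minimal sketch of the grouping-and-description step, assuming a standard TF-IDF plus k-means pipeline with descriptors taken from the highest-weight centroid terms. The documents are toy procedural acts, and the paper's actual contribution, replacing frequent words with extracted concepts as descriptors, is not shown here.

```python
# TF-IDF vectors, k-means clustering, and word-based cluster descriptors
# from the top-weighted centroid terms (a simplified baseline, not the
# concept-based descriptors proposed in the article).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

acts = [
    "summons issued to the defendant for hearing",
    "hearing scheduled and summons delivered",
    "appeal filed against the sentence decision",
    "sentence appeal admitted by the court",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(acts)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

terms = vectorizer.get_feature_names_out()
for i, centroid in enumerate(km.cluster_centers_):
    top = centroid.argsort()[::-1][:3]          # three strongest terms
    print(f"cluster {i}: {[terms[j] for j in top]}")
```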


2021 ◽  
Vol 46 (04) ◽  
Author(s):  
HÀ CÔNG NGUYÊN ◽  
NGUYỄN THANH DƯƠNG ◽  
NGUYỄN TIẾN HOÀNG

The study identifies the factors influencing the use of logistics services by seafood exporting enterprises in Ho Chi Minh City and measures their influence. Research models from the same and related fields are reviewed to provide a comprehensive background for the topic. On this basis, the research model and hypotheses are proposed, with 26 observed variables for five factors affecting the use of logistics services. The survey was conducted in Ho Chi Minh City, with 161 valid responses accepted for analysis. After applying various statistical techniques to fully analyze the data, the findings show a statistically significant relationship between all five factors and the use of logistics services by seafood exporting enterprises. Transportation time is the most influential factor, followed by reliability, cost, reputation, and service quality. In line with these results, implications are proposed for both seafood exporting enterprises and logistics service providers to improve their operations and business performance. Despite limitations during the research process, the study makes a significant contribution to the literature on B2B buying relationships in the logistics industry.
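The abstract names the kind of analysis (five factors, 161 responses, statistical significance) without specifying the exact technique. Purely as an illustration, and assuming a multiple regression of the sort such studies typically report, the sketch below fits factor scores against a service-use outcome on randomly generated stand-in data, not the study's survey data.

```python
# Illustrative regression only: five factor scores regressed on a stand-in
# "use of logistics services" outcome; all data here are synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 161  # sample size reported in the study
factors = ["transport_time", "reliability", "cost", "reputation", "service_quality"]

X = rng.normal(size=(n, len(factors)))                 # stand-in factor scores
true_beta = np.array([0.45, 0.35, 0.30, 0.20, 0.15])   # arbitrary weights
y = X @ true_beta + rng.normal(scale=0.5, size=n)      # stand-in outcome

model = LinearRegression().fit(X, y)
for name, coef in zip(factors, model.coef_):
    print(f"{name:16s} beta = {coef:+.3f}")
```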


2020 ◽  
Vol 25 (6) ◽  
pp. 755-769
Author(s):  
Noorullah R. Mohammed ◽  
Moulana Mohammed

Text data clustering organizes a set of text documents into a desired number of coherent and meaningful sub-clusters. Modeling text documents in terms of derived topics is a vital task in text data clustering. Each tweet is treated as a text document, and various topic models are used to model tweets. In existing topic models, the clustering tendency of tweets is initially assessed based on Euclidean dissimilarity features. The cosine metric is more suitable for a more informative assessment, especially for text clustering. Thus, this paper develops a novel cosine-based internal and external validity assessment of cluster tendency to improve the computational efficiency of tweet data clustering. In the experiments, tweet clustering results are evaluated using cluster validity indices. The experiments show that the cosine-based internal and external validity metrics outperform the others on benchmark and Twitter-based datasets.
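To make the Euclidean-versus-cosine contrast concrete, the sketch below scores one clustering of toy tweets with a silhouette index under both metrics. The silhouette index merely stands in for the paper's own internal and external validity measures, and the tweets are placeholders.

```python
# Compare Euclidean and cosine cluster-validity scores on TF-IDF tweet vectors.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

tweets = [
    "new phone battery lasts all day",
    "camera quality on this phone is great",
    "rain expected over the coast tomorrow",
    "storm warning issued for the weekend",
]

X = TfidfVectorizer().fit_transform(tweets)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

print("euclidean silhouette:", silhouette_score(X, labels, metric="euclidean"))
print("cosine silhouette:   ", silhouette_score(X, labels, metric="cosine"))
```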


2021 ◽  
pp. 2141001
Author(s):  
Sanqiang Wei ◽  
Hongxia Hou ◽  
Hua Sun ◽  
Wei Li ◽  
Wenxia Song

The plots in certain literary works are very complicated and hinder readers from understanding them. Tools should therefore be proposed to support readers' comprehension of complex literary works by providing them with the most important information. A human reader must capture multiple levels of abstraction and meaning to formulate an understanding of a document. Hence, in this paper, an improved K-means clustering algorithm (IKCA) is proposed for literary word classification. For text data, words that express exact semantics within a class are generally better features. The proposed technique captures numerous cluster centroids for every class and then selects the high-frequency words in the centroids as text features for classification. Furthermore, neural networks are used to classify text documents and K-means to cluster them, developing a model based on both unsupervised and supervised techniques to identify the similarity between documents. The numerical results show that the suggested model improves on the existing algorithm and the standard K-means algorithm: the accuracy in the comparison of ALA and IKCA is 95.2%, clustering time is less than 2 hours, the success rate is 97.4%, and the performance ratio is 98.1%.
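A rough sketch of the general idea described above, not the paper's IKCA: per-class k-means on TF-IDF vectors, the highest-weight centroid terms kept as a reduced feature vocabulary, and a small neural classifier trained on those features. The documents, class labels, and parameter choices are all illustrative.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.neural_network import MLPClassifier

docs = [
    ("the knight rode through the dark forest", "adventure"),
    ("a hidden treasure waited beyond the forest", "adventure"),
    ("she pondered the meaning of the old letter", "drama"),
    ("the letter revealed a painful family secret", "drama"),
]
texts, labels = zip(*docs)

vec = TfidfVectorizer()
X = vec.fit_transform(texts)
terms = vec.get_feature_names_out()

# Collect top centroid terms from each class's clusters as the feature set.
selected = set()
for cls in set(labels):
    idx = [i for i, lab in enumerate(labels) if lab == cls]
    km = KMeans(n_clusters=1, n_init=10, random_state=0).fit(X[idx])
    for centroid in km.cluster_centers_:
        selected.update(terms[j] for j in centroid.argsort()[::-1][:5])

cols = [i for i, t in enumerate(terms) if t in selected]
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
clf.fit(X[:, cols].toarray(), labels)
print(clf.predict(vec.transform(["a secret letter from the family"])[:, cols].toarray()))
```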


Author(s):  
Harsha Patil ◽  
R. S. Thakur

As we know, use of the Internet is flourishing at full velocity and in all dimensions. The enormous availability of text documents in digital form (email, web pages, blog posts, news articles, e-books, and other text files) on the Internet challenges technology to retrieve the appropriate documents in response to any search query. As a result, there has been an eruption of interest in mining these vast resources and classifying them properly, which motivates researchers and developers to work on numerous approaches to document clustering. The aim of this chapter is to summarize the different document clustering algorithms used by researchers.
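As a concrete taste of one classic family the chapter covers, the sketch below runs average-linkage hierarchical clustering over cosine distances between TF-IDF vectors of a few toy documents. It is only an illustrative example, not any specific algorithm surveyed in the chapter.

```python
# TF-IDF + cosine distance + average-linkage hierarchical clustering.
from sklearn.feature_extraction.text import TfidfVectorizer
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

docs = [
    "python tutorial for web scraping",
    "scraping web pages with python scripts",
    "healthy breakfast recipes for busy mornings",
    "quick recipes for a healthy dinner",
]

X = TfidfVectorizer().fit_transform(docs).toarray()
Z = linkage(pdist(X, metric="cosine"), method="average")
print(fcluster(Z, t=2, criterion="maxclust"))  # cluster label per document
```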


2020 ◽  
Vol 28 (4) ◽  
pp. 445-468 ◽  
Author(s):  
Reagan Mozer ◽  
Luke Miratrix ◽  
Aaron Russell Kaufman ◽  
L. Jason Anastasopoulos

Matching for causal inference is a well-studied problem, but standard methods fail when the units to match are text documents: the high-dimensional and rich nature of the data renders exact matching infeasible, causes propensity scores to produce incomparable matches, and makes assessing match quality difficult. In this paper, we characterize a framework for matching text documents that decomposes existing methods into (1) the choice of text representation and (2) the choice of distance metric. We investigate how different choices within this framework affect both the quantity and quality of matches identified through a systematic multifactor evaluation experiment using human subjects. Altogether, we evaluate over 100 unique text-matching methods along with 5 comparison methods taken from the literature. Our experimental results identify methods that generate matches with higher subjective match quality than current state-of-the-art techniques. We enhance the precision of these results by developing a predictive model to estimate the match quality of pairs of text documents as a function of our various distance scores. This model, which we find successfully mimics human judgment, also allows for approximate and unsupervised evaluation of new procedures in our context. We then employ the identified best method to illustrate the utility of text matching in two applications. First, we engage with a substantive debate in the study of media bias by using text matching to control for topic selection when comparing news articles from thirteen news sources. We then show how conditioning on text data leads to more precise causal inferences in an observational study examining the effects of a medical intervention.
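The framework's two choices, text representation and distance metric, can be pictured with a very small sketch: one representation (TF-IDF) and one metric (cosine) used to match each "treated" document to its nearest "control", with a caliper to drop poor matches. The documents and caliper value are illustrative, and this is only one cell of the much larger design space the paper evaluates.

```python
# Nearest-neighbor text matching under one representation/metric choice.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_distances

treated = ["senate passes budget bill after long debate"]
control = [
    "house committee reviews annual budget proposal",
    "local team wins championship in overtime thriller",
]

vec = TfidfVectorizer().fit(treated + control)
D = cosine_distances(vec.transform(treated), vec.transform(control))

caliper = 0.9  # discard matches farther than this distance (arbitrary value)
for i, row in enumerate(D):
    j = int(np.argmin(row))
    if row[j] <= caliper:
        print(f"treated {i} matched to control {j} (distance {row[j]:.2f})")
    else:
        print(f"treated {i} left unmatched")
```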


2019 ◽  
Vol 1 (2) ◽  
pp. 575-589 ◽  
Author(s):  
Blaž Škrlj ◽  
Jan Kralj ◽  
Nada Lavrač ◽  
Senja Pollak

Deep neural networks are becoming ubiquitous in text mining and natural language processing, but semantic resources, such as taxonomies and ontologies, are yet to be fully exploited in a deep learning setting. This paper presents an efficient semantic text mining approach, which converts semantic information related to a given set of documents into a set of novel features that are used for learning. The proposed Semantics-aware Recurrent deep Neural Architecture (SRNA) enables the system to learn simultaneously from the semantic vectors and from the raw text documents. We test the effectiveness of the approach on three text classification tasks: news topic categorization, sentiment analysis and gender profiling. The experiments show that the proposed approach outperforms the approach without semantic knowledge, with the highest accuracy gain (up to 10%) achieved on short document fragments.
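The core idea of learning jointly from semantic vectors and raw text can be sketched as a recurrent encoder over token ids whose final state is concatenated with a precomputed semantic feature vector before classification. This is not the authors' SRNA implementation; all dimensions, names, and inputs below are arbitrary placeholders.

```python
# Minimal PyTorch sketch: LSTM over raw tokens + concatenated semantic features.
import torch
import torch.nn as nn

class SemanticsAwareRNN(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=32, hidden=64,
                 semantic_dim=16, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden + semantic_dim, num_classes)

    def forward(self, token_ids, semantic_vec):
        _, (h, _) = self.rnn(self.embed(token_ids))   # final hidden state
        joint = torch.cat([h[-1], semantic_vec], dim=1)
        return self.classifier(joint)

model = SemanticsAwareRNN()
tokens = torch.randint(0, 1000, (4, 20))   # batch of 4 documents, 20 tokens each
semantics = torch.rand(4, 16)              # stand-in taxonomy-derived features
print(model(tokens, semantics).shape)      # torch.Size([4, 3])
```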


2015 ◽  
Vol 12 (4) ◽  
pp. 56-68
Author(s):  
Ana Alão Freitas ◽  
Hugo Costa ◽  
Isabel Rocha

To better understand the dynamic behavior of metabolic networks under a wide variety of conditions, the field of Systems Biology has become increasingly interested in the use of kinetic models. The databases available today do not contain enough data on this topic. Given that a significant part of the information relevant to the development of such models is still spread across the literature, it becomes essential to develop specific and powerful text mining tools to collect these data. In this context, the main objective of this work is the development of a text mining tool to extract, from the scientific literature, kinetic parameters, their respective values, and their relations with enzymes and metabolites. The proposed approach integrates the development of a novel plug-in for the text mining framework @Note2. Finally, the developed pipeline was validated with a case study on Kluyveromyces lactis, covering the analysis and results of 20 full-text documents.
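The actual tool is a plug-in for the @Note2 framework; the toy sketch below only conveys the core extraction idea, spotting kinetic parameter names (Km, Vmax, kcat) together with their values and units in a sentence. The sentence, pattern, and unit list are illustrative.

```python
# Toy rule-based extraction of kinetic parameters, values, and units.
import re

SENTENCE = ("The Km of beta-galactosidase for lactose was 2.5 mM, "
            "with a Vmax of 0.8 umol/min.")

PATTERN = re.compile(
    r"(?P<param>Km|Vmax|kcat)\D{0,40}?"      # parameter name, short non-digit gap
    r"(?P<value>\d+(?:\.\d+)?)\s*"           # numeric value
    r"(?P<unit>mM|uM|umol/min|s-1)",         # a few illustrative units
    re.IGNORECASE,
)

for m in PATTERN.finditer(SENTENCE):
    print(m.group("param"), m.group("value"), m.group("unit"))
# Km 2.5 mM
# Vmax 0.8 umol/min
```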

