Business Information Systems
Latest Publications


TOTAL DOCUMENTS

33
(FIVE YEARS 33)

H-INDEX

0
(FIVE YEARS 0)

Published By TIB Open Publishing

2747-9986

2021 ◽  
pp. 257-268
Author(s):  
Krzysztof Wecel ◽  
Marcin Szmydt ◽  
Milena Stróżyna

Recently, we have observed a significant increase in the amount of easily accessible data on transport and mobility. This data mostly consists of massive streams of high velocity, magnitude, and heterogeneity, representing flows of goods, shipments, and fleet movements. It is therefore necessary to develop a scalable framework and apply tools capable of handling these streams. In this paper, we propose an approach for selecting software for stream processing solutions that may be used in the transportation domain. We provide an overview of potential stream processing technologies, followed by a method for choosing software for real-time analysis of data streams coming from objects in motion. We selected two solutions, Apache Spark Streaming and Apache Flink, and benchmarked them on a real-world task. We identify the caveats and challenges that arise when implementing such a solution in practice.
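The kind of stateful computation that engines like Spark Streaming and Flink perform at scale can be illustrated with a minimal pure-Python sketch of a tumbling-window aggregation over moving-object events; the event shape and window size here are illustrative assumptions, not details from the paper's benchmark.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_size):
    """Group (timestamp, object_id) events into fixed, non-overlapping
    time windows and count events per window -- a basic stateful
    aggregation that stream processors run continuously over unbounded
    input."""
    counts = defaultdict(int)
    for timestamp, _object_id in events:
        window_start = (timestamp // window_size) * window_size
        counts[window_start] += 1
    return dict(counts)

# Three position reports in window [0, 10), one in window [10, 20)
events = [(0, "A"), (3, "B"), (5, "A"), (12, "C")]
print(tumbling_window_counts(events, 10))  # {0: 3, 10: 1}
```

In a real deployment the engine maintains this window state across a cluster and handles late or out-of-order events, which is precisely where the benchmarked frameworks differ.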


2021 ◽  
pp. 245-256
Author(s):  
Umar Ali Bukar ◽  
Marzanah A Jabar ◽  
Fatimah Sidi ◽  
Rozi Nor Haizan Nor ◽  
Salfarina Abdullah

There is an ongoing discussion about the effectiveness of social media usage on people’s ability to recover from a crisis. However, existing social media crisis communication models do not address the dynamic nature of social media users and crises. Therefore, the objective of this study is to conduct a preliminary investigation of a social media crisis communication model for building public resilience. To this end, 34 items were generated from the literature concerning crisis, crisis response, social interaction, and resilience. The items were validated by three experts via the content validity index and modified kappa statistics. After passing the validation test, the instruments were pre-tested by 32 participants. The reliability of the items was analyzed using Cronbach’s alpha. Model fit and mediation were examined with a regression model, and the hypotheses were independently assessed using PROCESS macro models. Based on the results obtained, each of the constructs satisfied the internal consistency requirement: crisis (0.743), crisis response (0.724), social media interaction (0.716), and resilience (0.827). Furthermore, the results indicate that the regression model is a good fit for the data: the independent variables statistically significantly predict the dependent variable (p < 0.05). The results of the PROCESS macro models also indicate that all the hypotheses are independently supported.
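The internal-consistency check reported above can be reproduced in a few lines: Cronbach's alpha compares the sum of per-item variances with the variance of the total score. This sketch uses invented toy data, not the study's instrument.

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score matrix,
    using population variances: alpha = k/(k-1) * (1 - sum(item
    variances) / variance of total scores)."""
    k = len(rows[0])
    item_vars = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_var = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Perfectly consistent toy responses (4 respondents x 3 items)
scores = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print(round(cronbach_alpha(scores), 3))  # 1.0
```

Values above roughly 0.7, as reported for all four constructs, are conventionally taken to indicate acceptable internal consistency.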


2021 ◽  
pp. 13-26
Author(s):  
Felix Kruse ◽  
Jan-Philipp Awick ◽  
Jorge Marx Gómez ◽  
Peter Loos

This paper explores record linkage, a step in the data integration process, focusing on the entity type company. For the integration of company data, the company name is a crucial attribute, which often includes the legal form. This legal form is not represented concisely and consistently across different data sources, which leads to considerable data quality problems for the subsequent steps in record linkage. To solve these problems, we classify and extract the legal form from the company name attribute. For this purpose, we iteratively developed four different approaches and compared them in a benchmark. The best approach is a hybrid one combining a rule set with a supervised machine learning model. With our hybrid approach, any company data set from research or business can be processed. Thus, the data quality for subsequent data processing steps such as record linkage can be improved. Furthermore, our approach can be adapted to solve the same data quality problems in other attributes.
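The rule-set half of such a hybrid can be sketched as longest-match-first pattern matching on name suffixes; the legal forms listed below are a tiny illustrative sample, and the paper's actual rule set and fallback ML model are far richer.

```python
import re

# Toy rule set of legal-form suffixes (illustrative, not exhaustive)
LEGAL_FORMS = ["GmbH & Co. KG", "GmbH", "AG", "Ltd", "Inc", "LLC", "S.A."]

def extract_legal_form(company_name):
    """Return (base name, legal form) by trying the longest suffix
    rules first; a hybrid system would fall back to a supervised
    classifier when no rule fires."""
    for form in sorted(LEGAL_FORMS, key=len, reverse=True):
        pattern = r"[\s,.]*" + re.escape(form) + r"\.?\s*$"
        match = re.search(pattern, company_name, flags=re.IGNORECASE)
        if match:
            return company_name[:match.start()].strip(), form
    return company_name, None

print(extract_legal_form("Acme Holding GmbH"))  # ('Acme Holding', 'GmbH')
print(extract_legal_form("Example Corp"))       # ('Example Corp', None)
```

Sorting rules by length avoids truncating "GmbH & Co. KG" to a bare "GmbH" match, one of the ambiguities that motivates adding a learned model on top of the rules.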


2021 ◽  
pp. 187-198
Author(s):  
Shima Zahmatkesh ◽  
Alessio Bernardo ◽  
Emanuele Falzone ◽  
Edgardo Di Nicola Carena ◽  
Emanuele Della Valle

Industries that sell products with short-term or seasonal life cycles must regularly introduce new products. Forecasting the demand for a New Product Introduction (NPI) can be challenging due to fluctuations in many factors such as trend, seasonality, or other external and unpredictable phenomena (e.g., the COVID-19 pandemic). Traditionally, NPI is an expert-centric process. This paper presents a study on automating the forecast of NPI demand using statistical Machine Learning (namely, Gradient Boosting and XGBoost). We show how to overcome shortcomings of the traditional data preparation that underpins the manual process. Moreover, we illustrate the role of cross-validation techniques in hyper-parameter tuning and model validation. Finally, we provide empirical evidence that statistical Machine Learning can forecast NPI demand better than experts.
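For demand time series, cross-validation must respect temporal order so that models are never tuned on future data. A minimal expanding-window splitter, sketched here in pure Python under assumed fold sizes (the paper's exact CV scheme is not reproduced), makes the idea concrete:

```python
def expanding_window_splits(n_obs, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs where the training
    window grows over time and the test window always lies strictly
    in the future -- avoiding the leakage a shuffled K-fold would
    introduce for demand series."""
    splits = []
    for i in range(n_splits):
        test_end = n_obs - (n_splits - 1 - i) * test_size
        test_start = test_end - test_size
        splits.append((list(range(test_start)),
                       list(range(test_start, test_end))))
    return splits

for train, test in expanding_window_splits(10, 3, 2):
    print(train, test)
```

Each fold's score can then drive hyper-parameter search for a gradient-boosting model without optimistically biased estimates.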


2021 ◽  
pp. 151-161
Author(s):  
Dominik Filipiak ◽  
Anna Fensel ◽  
Agata Filipowska

Knowledge graphs are used as a source of prior knowledge in numerous computer vision tasks. However, such an approach requires a mapping between ground truth data labels and the target knowledge graph. We linked the labels of the ILSVRC 2012 dataset (often simply referred to as ImageNet) to Wikidata entities. This enables the use of the rich knowledge graph structure and contextual information in several computer vision tasks traditionally benchmarked with ImageNet and its variations. For instance, in few-shot learning classification scenarios with neural networks, this mapping can be leveraged for weight initialisation, which can improve the final performance metrics. We mapped all 1000 ImageNet labels: 461 were already directly linked with the exact match property (P2888), 467 have exact match candidates, and 72 cannot be matched directly. For these 72 labels, we discuss different problem categories stemming from the inability to find an exact match. Semantically close non-exact match candidates are presented as well. The mapping is publicly available at https://github.com/DominikFilipiak/imagenet-to-wikidata-mapping.
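A consumer of such a mapping typically partitions labels by match quality before deciding how to use them. The sketch below mirrors the exact / candidate / unmatched breakdown with an invented three-entry fragment; the synset IDs are real ImageNet identifiers, but the QIDs are placeholders, not the verified entries from the published mapping.

```python
# Toy fragment of a label-to-Wikidata mapping (QIDs are placeholders)
mapping = {
    "n01440764": {"qid": "Q1", "match": "exact"},      # direct P2888 link
    "n01443537": {"qid": "Q2", "match": "candidate"},  # exact-match candidate
    "n01484850": {"qid": None, "match": "none"},       # no direct match
}

def partition_by_match(mapping):
    """Split synset IDs by match quality, mirroring the reported
    461 exact / 467 candidate / 72 unmatched breakdown for the
    full 1000 labels."""
    groups = {"exact": [], "candidate": [], "none": []}
    for synset, entry in mapping.items():
        groups[entry["match"]].append(synset)
    return groups

print({k: len(v) for k, v in partition_by_match(mapping).items()})
```

Only the "exact" group can safely seed tasks like knowledge-informed weight initialisation; candidate matches need human review first.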


2021 ◽  
pp. 163-173
Author(s):  
Marcin Szmydt

Many personality theories suggest that personality influences customer shopping preferences. This research therefore analyses whether the accuracy of a collaborative filtering recommender system can be improved by incorporating Five-Factor Model personality traits derived from customer text reviews. The study uses a large Amazon dataset with customer reviews and information about verified customer product purchases. However, the evaluation results show that a model leveraging big data by using the whole Amazon dataset provides better recommendations than recommender systems trained in the context of customer personality traits.
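The collaborative filtering baseline in question can be illustrated with a minimal user-based variant: rate an unseen item by a similarity-weighted average over other users. The ratings below are invented, and the study's actual algorithm and scale are not reproduced here.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    common = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def predict(target, others, item):
    """Similarity-weighted average of neighbours' ratings for one item."""
    num = den = 0.0
    for other in others:
        if item in other:
            sim = cosine(target, other)
            num += sim * other[item]
            den += sim
    return num / den if den else None

alice = {"book": 5, "lamp": 1}
others = [{"book": 4, "lamp": 1, "desk": 5}, {"book": 1, "desk": 2}]
print(predict(alice, others, "desk"))
```

A trait-aware variant would restrict or reweight the neighbour set by personality similarity; the study's finding is that, on this dataset, the untouched whole-population neighbourhood performed better.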


2021 ◽  
pp. 73-82
Author(s):  
Dorina Bano ◽  
Tom Lichtenstein ◽  
Finn Klessascheck ◽  
Mathias Weske

Process mining is widely adopted in organizations to gain deep insights into running business processes. This can be achieved by applying different process mining techniques such as discovery, conformance checking, and performance analysis. These techniques are applied to event logs, which need to be extracted from the organization’s databases beforehand. This implies not only access to databases but also detailed knowledge of the database schema, which is often not available. In many real-world scenarios, however, process execution data is available as redo logs. Such logs are used to bring a database into a consistent state in case of a system failure. This paper proposes a semi-automatic approach to extracting an event log from redo logs alone. It requires neither access to the database nor knowledge of the database schema. The feasibility of the proposed approach is evaluated on two synthetic redo logs.
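The core extraction idea can be sketched as parsing redo entries into (case id, activity, timestamp) tuples, using the affected row identifier as the case. The log format below is heavily simplified and invented for illustration; real redo logs (e.g., Oracle LogMiner output) are considerably more involved.

```python
import re

PATTERN = re.compile(
    r"(?P<ts>\S+) (?P<op>INSERT|UPDATE) .*?"
    r"(?:VALUES \((?P<ins_id>\d+)|WHERE id = (?P<upd_id>\d+))"
)

def extract_events(lines):
    """Derive (case_id, activity, timestamp) event tuples from
    simplified redo-log lines, treating the affected row id as the
    case identifier."""
    events = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            case_id = m.group("ins_id") or m.group("upd_id")
            events.append((case_id, m.group("op"), m.group("ts")))
    return events

redo_log = [
    "2021-03-01T09:00:00 INSERT INTO orders (id, status) VALUES (42, 'created')",
    "2021-03-01T09:05:00 UPDATE orders SET status = 'shipped' WHERE id = 42",
]
print(extract_events(redo_log))
```

The "semi-automatic" part of the approach lies in choices this sketch hard-codes, such as which column plays the role of the case identifier and how operations map to activity names.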


2021 ◽  
pp. 39-47
Author(s):  
Justin Loye ◽  
Katia Jaffrès-Runser ◽  
Dima L. Shepelyansky

We develop a Google matrix analysis of the multiproduct world trade network obtained from the UN COMTRADE database for recent years. We compare this new approach with the usual Import-Export description of the world trade network. The Google matrix analysis takes into account the multiplicity of trade transactions, thus better highlighting the worldwide influence of specific countries and products. It shows that after Brexit, the European Union of 27 countries holds the leading position in the world trade network ranking, ahead of the USA and China. Our approach also determines the sensitivity of a country’s trade balance to specific products, showing the dominant role of machinery and mineral fuels in multiproduct exchanges. It also underlines the growing influence of Asian countries.
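Google matrix ranking rests on the same power iteration that underlies PageRank: G = αS + (1-α)E, where S is the column-normalised link matrix and E spreads a uniform teleport term. The toy three-node graph below is illustrative only, not COMTRADE data.

```python
def pagerank(links, alpha=0.85, iters=100):
    """Power iteration on the Google matrix for a graph given as an
    adjacency list: links[i] holds the nodes that node i points to."""
    n = len(links)
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [(1 - alpha) / n] * n          # uniform teleport term
        for i, outs in enumerate(links):
            if outs:
                share = alpha * rank[i] / len(outs)
                for j in outs:
                    new[j] += share
            else:                            # dangling node: spread uniformly
                for j in range(n):
                    new[j] += alpha * rank[i] / n
        rank = new
    return rank

# Toy trade graph: 0 -> 1, 1 -> 2, 2 -> {0, 1}
ranks = pagerank([[1], [2], [0, 1]])
print([round(r, 3) for r in ranks])
```

In the trade setting, edges weighted by transaction volumes replace the uniform out-link shares, and running the same iteration on the transposed network yields the CheiRank that complements the import/export picture.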


2021 ◽  
pp. 139-150
Author(s):  
Jakub Flotyński ◽  
Paweł Sobociński ◽  
Sergiusz Strykowski ◽  
Dominik Strugała ◽  
Paweł Buń ◽  
...  

Domain-specific knowledge representation is an essential element of efficient management of professional training. A formal and powerful knowledge representation for training systems can be built upon semantic web standards, which enable reasoning and complex queries against the content. Virtual reality training is currently used in multiple domains, in particular when the activities are potentially dangerous for trainees or require advanced skills or expensive equipment. However, the available methods and tools for creating VR training systems do not use knowledge representation. Therefore, the creation, modification, and management of training scenarios are problematic for domain experts without expertise in programming and computer graphics. In this paper, we propose an approach to creating semantic virtual training scenarios in which users’ activities and mistakes, as well as equipment and its possible errors, are represented using domain knowledge understandable to domain experts. We verified the approach by developing a user-friendly editor of VR training scenarios for electrical operators of high-voltage installations.
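The semantic-web style of representation amounts to stating scenario knowledge as subject-predicate-object triples and querying them with patterns. The property and class names below are invented for illustration and are not taken from the authors' ontology.

```python
# Minimal triple store modelling one training step (names are invented)
triples = [
    ("scenario1", "hasActivity", "openDisconnector"),
    ("openDisconnector", "requiresEquipment", "disconnector"),
    ("openDisconnector", "possibleMistake", "wrongSwitchingOrder"),
    ("disconnector", "possibleError", "stuckMechanism"),
]

def query(triples, subject=None, predicate=None, obj=None):
    """Match (s, p, o) triples; None acts as a wildcard, like a
    single SPARQL triple pattern."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]

print(query(triples, predicate="possibleMistake"))
```

Because scenarios live as data rather than code, a domain expert can add an activity or a mistake by editing triples through a form-based editor, without touching the VR implementation.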


2021 ◽  
pp. 343-353
Author(s):  
Sanat Bhargava ◽  
M. Ravi Kumar ◽  
Priya Mehta ◽  
Jithin Mathews ◽  
Sandeep Kumar ◽  
...  

Tax evasion refers to an entity indulging in illegal activities to avoid paying its actual tax liability. A tax return statement is a periodic report comprising information about income, expenditure, etc. One of the most basic forms of tax evasion is failing to file tax returns or delaying the filing of tax return statements. Taxpayers who do not file their returns, or fail to do so within the stipulated period, are called tax return defaulters. The Government bears the financial loss when a taxpayer defaults, and this loss varies for each taxpayer. Therefore, when designing any statistical model to predict potential return defaulters, we have to consider the real financial loss associated with the misclassification of each individual. This paper proposes a framework for an example-dependent cost-sensitive stacking classifier that uses cost-insensitive classifiers as base generalizers to make predictions on the input space. These predictions are used to train an example-dependent cost-sensitive meta generalizer. Based on the choice of meta generalizer, we propose four variant models used to predict potential return defaulters for the upcoming tax-filing period. These models have been developed for the Commercial Taxes Department, Government of Telangana, India. Applying our proposed variant models to GST data, we observe a significant increase in savings compared to conventional classifiers. Additionally, we present an empirical study showing that our approach is more adept at identifying potential tax return defaulters than existing example-dependent cost-sensitive classification algorithms.
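What makes the problem example-dependent is that each misclassification carries its own monetary cost. A minimal sketch of the resulting decision rule, with invented probabilities and amounts, compares the expected loss from missing a defaulter against the cost of following up on that taxpayer:

```python
def expected_cost_decision(p_default, loss_if_missed, follow_up_cost):
    """Flag a taxpayer when the expected loss from missing a defaulter
    (probability times that taxpayer's stake) exceeds the cost of a
    follow-up -- per-example costs are what a class-weighted
    classifier cannot express."""
    return p_default * loss_if_missed > follow_up_cost

taxpayers = [
    {"p": 0.30, "loss": 100_000, "cost": 5_000},  # low risk, huge stake
    {"p": 0.90, "loss": 1_000,   "cost": 5_000},  # high risk, tiny stake
]
flags = [expected_cost_decision(t["p"], t["loss"], t["cost"])
         for t in taxpayers]
print(flags)  # [True, False]
```

Note that the low-probability, high-stake taxpayer is flagged while the high-probability, low-stake one is not, which is exactly the behaviour a cost-insensitive classifier would invert; the stacked meta generalizer in the paper learns such trade-offs from base-model predictions.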

