Discovering Business Processes from Email Logs Using fastText and Process Mining

Author(s):  
Yaghoub Rashnavadi ◽  
Sina Behzadifard ◽  
Reza Farzadnia ◽  
Sina Zamani

Communication is indispensable to today's lifestyle, and technology lets millions of people communicate almost instantly. This shift has transformed organizations to the degree that they generate billions of emails daily to facilitate their operations. Behind this vast corpus of human-generated content lies implicit information that can be mined and used for the organizations' benefit. This paper addresses the opportunity that email logs offer organizations and proposes an approach to discover process models by combining supervised text classification and process mining. The framework consists of two main steps. First, emails are classified with supervised machine learning; second, the Fuzzy Miner is used to mine the processes. To investigate its applicability, we also applied the framework to a real-life dataset from a case study organization.
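The step between classification and mining can be illustrated with a minimal sketch. It assumes each email has already been assigned an activity label and that an email thread serves as the process case; both are assumptions, as are the sample data and the function names. The sketch only counts directly-follows pairs, a common starting point for process discovery, and is not the Fuzzy Miner itself.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical classified emails: (thread_id, timestamp, activity_label).
# The labels are given here; in practice they would come from a classifier.
emails = [
    ("T1", "2020-01-02 09:00", "request"),
    ("T1", "2020-01-03 08:15", "deliver"),
    ("T1", "2020-01-02 11:30", "approve"),
    ("T2", "2020-01-02 10:00", "request"),
    ("T2", "2020-01-04 16:45", "reject"),
]

def build_traces(rows):
    """Group events by case (thread) and order each case by timestamp."""
    cases = defaultdict(list)
    for case_id, ts, activity in rows:
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M")
        cases[case_id].append((when, activity))
    return {case: [a for _, a in sorted(events)] for case, events in cases.items()}

def directly_follows(traces):
    """Count how often one activity is immediately followed by another."""
    counts = Counter()
    for trace in traces.values():
        counts.update(zip(trace, trace[1:]))
    return counts

traces = build_traces(emails)
dfg = directly_follows(traces)
```

A discovery tool would take an event log of this shape (case, activity, timestamp) as input; the pair counts give a rough picture of which activities tend to follow one another.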

2020 ◽  
Author(s):  
Yaghoub Rashnavadi ◽  
Sina Behzadifard ◽  
Reza Farzadnia ◽  
Sina Zamani

Communication has never been more accessible than it is today. With instant messengers and email services, millions of people can transfer information with ease, and this trend has affected organizations as well. Billions of organizational emails are sent or received daily, mainly to facilitate the daily operation of organizations. Behind this vast corpus of human-generated content lies much implicit information that can be mined and used to improve or optimize organizations’ operations. Business processes are one such area of implicit knowledge that can be discovered from an organization's email logs, as most communication takes place via email. The purpose of this research is to propose an approach for discovering process models from an email log. The approach combines two tools: supervised machine learning and process mining. Using a supervised machine learning classifier, fastText, we classify the body text of emails into activity-related classes. The generated log is then mined with process mining techniques to find process models. We illustrate the approach with a case study of a company in the oil and gas sector.
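As a hedged illustration of the classification step, the sketch below prepares email bodies in the line format that fastText's supervised mode reads, where each line starts with a `__label__` prefix followed by the text. The activity labels, sample emails, helper name, and the lowercasing and whitespace normalization are assumptions made for illustration, not the authors' preprocessing.

```python
import re

def to_fasttext_line(label, body):
    """Format one email as a fastText supervised training line."""
    # Collapse whitespace so each email fits on a single line, then lowercase.
    text = re.sub(r"\s+", " ", body).strip().lower()
    # Labels must not contain spaces, so they are replaced with underscores.
    return f"__label__{label.replace(' ', '_')} {text}"

# Hypothetical (activity label, email body) pairs.
samples = [
    ("purchase request", "Please order\n20 valves for site B."),
    ("approval", "  Approved, go ahead. "),
]

lines = [to_fasttext_line(label, body) for label, body in samples]

# A file containing such lines could then be passed to
# fasttext.train_supervised(input=path); that call needs the fasttext
# package and is not executed in this sketch.
```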


Author(s):  
Yutika Amelia Effendi ◽  
Nania Nuzulita

Background: Nowadays, business processes are managed by enterprise computing, which has grown rapidly. This growth produces massive event logs. One type of event log is the double timestamp event log, which records the start time and the complete time of each activity executed in the business process. It is closely related to temporal causal relations, the patterns in an event log that arise from each activity performed in the process. Objective: This paper presents seven types of temporal causal relation between activities as an extension of the relations used in the double timestamp event log. Because the activities in an event log are not always executed sequentially, temporal causal relations were used to divide the event log into several small groups in order to determine the relations between activities and to mine the business process. Methods: In the experiments, the temporal causal relations, which are based on time intervals and presented in a Gantt chart, also determined whether each case could be classified as a sequential or a parallel relation. To obtain the business process, the temporal causal relations were then combined into one business process based on the timestamps of the activities in the event log. Results: The experiments, conducted on two real-life event logs, showed that business process models can be discovered using temporal causal relations and a double timestamp event log. Conclusion: Business process models and their sequential and parallel (AND, OR, XOR) relations can be discovered using temporal causal relations and a double timestamp event log. Keywords: Business Process, Process Discovery, Process Mining, Temporal Causal Relation, Double Timestamp Event Log
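The distinction between sequential and parallel relations in a double timestamp log can be sketched as an interval comparison. The example below is a simplification into two categories and does not reproduce the seven relation types of the paper; the function name and the rule that touching intervals count as sequential are assumptions.

```python
def relation(a, b):
    """Classify two (start, complete) intervals as 'sequential' or 'parallel'."""
    # Order the two activities by start time (ties broken by complete time).
    first, second = sorted([a, b])
    # Sequential: the earlier activity completes before the later one starts.
    # Parallel: the two execution intervals overlap in time.
    return "sequential" if first[1] <= second[0] else "parallel"

# Timestamps are plain numbers here; datetime objects compare the same way.
check_order = (1, 3)
pack_items = (3, 5)
send_invoice = (2, 6)

r1 = relation(check_order, pack_items)    # no overlap
r2 = relation(pack_items, send_invoice)   # overlapping intervals
```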


2015 ◽  
Vol 24 (01) ◽  
pp. 1550001 ◽  
Author(s):  
Viara Popova ◽  
Dirk Fahland ◽  
Marlon Dumas

Artifact-centric modeling is an approach for capturing business processes in terms of so-called business artifacts — key entities driving a company's operations and whose lifecycles and interactions define an overall business process. This approach has been shown to be especially suitable in the context of processes where one-to-many or many-to-many relations exist between the entities involved in the process. As a contribution towards building up a body of methods to support artifact-centric modeling, this article presents a method for automated discovery of artifact-centric process models starting from logs consisting of flat collections of event records. We decompose the problem in such a way that a wide range of existing (non-artifact-centric) automated process discovery methods can be reused in a flexible manner. The presented methods are implemented as a package for ProM, a generic open-source framework for process mining. The methods have been applied to reverse-engineer an artifact-centric process model starting from logs of a real-life business process.


Author(s):  
Evellin Cardoso ◽  
João Paulo A. Almeida ◽  
Renata S. S. Guizzardi ◽  
Giancarlo Guizzardi

While traditional approaches to business process modeling tend to focus on “how” business processes are performed (adopting a behavioral description in which business processes are described in terms of procedural aspects), goal-oriented business process modeling proposals strive to extend traditional business process methodologies by adding a dimension of intentionality to business processes. One of the key difficulties in modeling goal-oriented processes is the identification or elicitation of goals. This paper reports on a case study conducted in a Brazilian hospital, which produced several goal models represented in i*/Tropos, each of which corresponds to a business process also modeled within the scope of the study. NFR catalogues were helpful in goal elicitation, uncovering goals that had not come up during interviews held before the catalogues were used.


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Rafael Vega Vega ◽  
Héctor Quintián ◽  
Carlos Cambra ◽  
Nuño Basurto ◽  
Álvaro Herrero ◽  
...  

The present research proposes the application of unsupervised and supervised machine-learning techniques to characterize Android malware families. More precisely, a novel unsupervised neural-projection method for dimensionality reduction, namely Beta Hebbian Learning (BHL), is applied to visually analyze such malware. Additionally, well-known supervised Decision Trees (DTs) are applied for the first time in order to improve the characterization of such families and to compare the original features that are identified as the most important ones. The proposed techniques are validated on real-life Android malware data by means of the well-known and publicly available Malgenome dataset. The results obtained support the proposed approach, confirming the validity of BHL and DTs for gaining deep knowledge of Android malware.


2020 ◽  
Vol 25 (4) ◽  
pp. 174-189 ◽  
Author(s):  
Guillaume Palacios ◽  
Arnaud Noreña ◽  
Alain Londero

Introduction: Subjective tinnitus (ST) and hyperacusis (HA) are common auditory symptoms that may become incapacitating in a subgroup of patients, who therefore seek medical advice. Both conditions can result from many different mechanisms, and as a consequence, patients may report a vast repertoire of associated symptoms and comorbidities that can dramatically reduce quality of life and even lead to suicide attempts in the most severe cases. The present exploratory study aims to investigate patients’ symptoms and complaints through an in-depth statistical analysis of patients’ natural narratives in a real-life environment in which, thanks to the anonymization of contributions and peer-to-peer interaction, the wording is assumed to be free of self-limitation and self-censorship. Methods: We applied a purely statistical, unsupervised machine learning approach to the analysis of patients’ verbatim comments exchanged on an Internet forum. After automated data extraction, the dataset was preprocessed to make it suitable for statistical analysis. We used a variant of the Latent Dirichlet Allocation (LDA) algorithm to reveal clusters of symptoms and complaints of HA patients (topics). Each topic is uniquely characterized by its probability distribution over words. The log-likelihood of the LDA model converged after 2,000 iterations. Several statistical parameters were tested for topic modeling and for the word relevance factor within each topic. Results: Despite a rather small dataset, this exploratory study demonstrates that patients’ free speech available on the Internet constitutes valuable material for machine learning and statistical analysis aimed at categorizing ST/HA complaints. The LDA model with K = 15 topics seems to be the most relevant in terms of relative weights and correlations, and it is able to single out subgroups of patients displaying specific characteristics. The study of the relevance factor may be useful for unveiling weak but important signals present in patients’ narratives. Discussion/Conclusion: We claim that the unsupervised LDA approach makes it possible to gain knowledge of the patterns of ST- and HA-related complaints and of patient-centered domains of interest. The merits and limitations of the LDA algorithms are compared with other natural language processing methods and with more conventional methods of qualitative analysis of patients’ output. Future directions and research topics emerging from this innovative algorithmic analysis are proposed.
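The abstract does not define its word relevance factor. One widely used definition in LDA topic interpretation, popularized by the LDAvis tool, weights a word's probability within a topic against its lift over the word's overall probability. The sketch below assumes that definition; the function name, the default weight, and the sample probabilities are hypothetical, and whether the study used this exact formula is an assumption.

```python
import math

def relevance(phi_kw, p_w, lam=0.6):
    """Relevance of word w to topic k.

    phi_kw: probability of word w within topic k
    p_w:    overall probability of word w in the corpus
    lam:    weight between in-topic probability (1.0) and lift (0.0)
    """
    return lam * math.log(phi_kw) + (1 - lam) * math.log(phi_kw / p_w)

# A rare word that is concentrated in one topic versus a common word
# with the same in-topic probability (made-up numbers).
rare_word = relevance(phi_kw=0.02, p_w=0.002)
common_word = relevance(phi_kw=0.02, p_w=0.02)
```

With a weight below 1, the rare but topic-specific word scores higher than the common one, which is how such a factor can surface weak but distinctive signals.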

