A multi-dimensional quality assessment of state-of-the-art process discovery algorithms using real-life event logs

2012 ◽  
Vol 37 (7) ◽  
pp. 654-676 ◽  
Author(s):  
Jochen De Weerdt ◽  
Manu De Backer ◽  
Jan Vanthienen ◽  
Bart Baesens


2022 ◽  
Vol 183 (3-4) ◽  
pp. 293-317
Author(s):  
Anna Kalenkova ◽  
Josep Carmona ◽  
Artem Polyvyanyy ◽  
Marcello La Rosa

State-of-the-art process discovery methods construct free-choice process models from event logs. Consequently, the constructed models do not take into account indirect dependencies between events. Whenever the input behaviour is not free-choice, these methods fail to provide a precise model. In this paper, we propose a novel approach for enhancing free-choice process models by adding non-free-choice constructs discovered a posteriori via region-based techniques. This allows us to benefit from the performance of existing process discovery methods and the accuracy of the employed fundamental synthesis techniques. We prove that the proposed approach preserves fitness with respect to the event log while improving precision when indirect dependencies exist. The approach has been implemented and tested on both synthetic and real-life datasets. The results show its effectiveness in repairing models discovered from event logs.
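The indirect dependencies mentioned above can be illustrated with a small self-contained sketch (this is not the paper's region-based synthesis; the function names `directly_follows` and `indirect_dependency` and the toy log are illustrative assumptions):

```python
from collections import defaultdict

def directly_follows(log):
    """Count directly-follows pairs (x, y) over all traces."""
    df = defaultdict(int)
    for trace in log:
        for x, y in zip(trace, trace[1:]):
            df[(x, y)] += 1
    return dict(df)

def indirect_dependency(log, a, b):
    """Fraction of traces containing `a` in which `b` also occurs
    somewhere after it -- a crude signal of a long-distance
    (non-free-choice) dependency."""
    with_a = [t for t in log if a in t]
    if not with_a:
        return 0.0
    hits = sum(1 for t in with_a if b in t[t.index(a) + 1:])
    return hits / len(with_a)

log = [["a", "c", "d"], ["a", "c", "d"], ["b", "c", "e"], ["b", "c", "e"]]
# Directly-follows sees both d and e after c, so a free-choice model
# would also allow the trace a, c, e -- yet the log shows that the
# choice between d and e is fully determined by the earlier a/b choice:
print(indirect_dependency(log, "a", "d"))  # 1.0
print(indirect_dependency(log, "a", "e"))  # 0.0
```

A model that ignores this long-distance dependency remains fitting but loses precision, which is exactly the gap the a-posteriori repair targets.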


Author(s):  
Stephan A. Fahrenkrog-Petersen ◽  
Niek Tax ◽  
Irene Teinemaa ◽  
Marlon Dumas ◽  
Massimiliano de Leoni ◽  
...  

Abstract Predictive process monitoring is a family of techniques to analyze events produced during the execution of a business process in order to predict the future state or the final outcome of running process instances. Existing techniques in this field are able to predict, at each step of a process instance, the likelihood that it will lead to an undesired outcome. These techniques, however, focus on generating predictions and do not prescribe when and how process workers should intervene to decrease the cost of undesired outcomes. This paper proposes a framework for prescriptive process monitoring, which extends predictive monitoring with the ability to generate alarms that trigger interventions to prevent an undesired outcome or mitigate its effect. The framework incorporates a parameterized cost model to assess the cost–benefit trade-off of generating alarms. We show how to optimize the generation of alarms given an event log of past process executions and a set of cost model parameters. The proposed approaches are empirically evaluated using a range of real-life event logs. The experimental results show that the net cost of undesired outcomes can be minimized by changing the threshold for generating alarms, as the process instance progresses. Moreover, introducing delays for triggering alarms, instead of triggering them as soon as the probability of an undesired outcome exceeds a threshold, leads to lower net costs.
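The cost–benefit trade-off behind such an alarm policy can be sketched in a few lines (a minimal illustration, not the paper's parameterized cost model; `net_cost`, the default costs, and the toy cases are assumptions):

```python
def net_cost(cases, threshold, c_in=10.0, c_out=100.0, mitigation=0.8):
    """Total cost of an alarm policy that intervenes whenever the
    predicted probability of an undesired outcome exceeds `threshold`.

    cases: list of (predicted_prob, undesired) pairs.
    c_in: cost of one intervention.
    c_out: cost of an unmitigated undesired outcome.
    mitigation: fraction of c_out avoided by a timely intervention.
    """
    total = 0.0
    for prob, undesired in cases:
        if prob > threshold:
            total += c_in                       # we raise an alarm
            if undesired:
                total += c_out * (1 - mitigation)
        elif undesired:
            total += c_out                      # missed undesired outcome
    return total

cases = [(0.9, True), (0.8, True), (0.7, False), (0.3, False), (0.2, False)]
# Sweeping the threshold reveals the cost-optimal policy for this log;
# both alarming on everything and never alarming are more expensive:
best = min(range(10), key=lambda t: net_cost(cases, t / 10))
print(best / 10)  # 0.7
```

The sweep mirrors the paper's point that the net cost is minimized by tuning the alarm threshold rather than firing as soon as any risk is detected.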


2014 ◽  
Vol 23 (01) ◽  
pp. 1440001 ◽  
Author(s):  
J. C. A. M. Buijs ◽  
B. F. van Dongen ◽  
W. M. P. van der Aalst

Process discovery algorithms typically aim at discovering process models from event logs that best describe the recorded behavior. Often, the quality of a process discovery algorithm is measured by quantifying to what extent the resulting model can reproduce the behavior in the log, i.e., replay fitness. At the same time, there are other measures that compare a model with recorded behavior in terms of the precision of the model and the extent to which the model generalizes the behavior in the log. Furthermore, many measures exist to express the complexity of a model irrespective of the log.

In this paper, we first discuss several quality dimensions related to process discovery. We further show that existing process discovery algorithms typically consider at most two out of the four main quality dimensions: replay fitness, precision, generalization and simplicity. Moreover, existing approaches cannot steer the discovery process based on user-defined weights for the four quality dimensions.

This paper presents the ETM algorithm which allows the user to seamlessly steer the discovery process based on preferences with respect to the four quality dimensions. We show that all dimensions are important for process discovery. However, it only makes sense to consider precision, generalization and simplicity if the replay fitness is acceptable.
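Steering discovery by user-defined weights can be sketched as a single weighted score over the four dimensions (an illustrative sketch, not the actual ETM fitness function; the function name and default weights are assumptions):

```python
def weighted_quality(fitness, precision, generalization, simplicity,
                     weights=(10.0, 1.0, 1.0, 1.0)):
    """Collapse the four discovery quality dimensions (each in [0, 1])
    into one score using user-defined weights. Weighting replay fitness
    heavily reflects the observation that the other dimensions only
    matter once fitness is acceptable."""
    dims = (fitness, precision, generalization, simplicity)
    return sum(w * d for w, d in zip(weights, dims)) / sum(weights)

# A perfectly fitting but imprecise model still outscores a balanced
# model with poor replay fitness under these weights:
print(weighted_quality(1.0, 0.5, 0.8, 0.9) >
      weighted_quality(0.6, 1.0, 1.0, 1.0))  # True
```

In ETM this kind of score guides a search over candidate models, so changing the weights directly changes which trade-off the discovery converges to.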


2019 ◽  
Vol 19 (6) ◽  
pp. 1307-1343
Author(s):  
Ario Santoso ◽  
Michael Felderer

Abstract Predictive analysis in business process monitoring aims at forecasting the future information of a running business process. The prediction is typically made based on a model extracted from historical process execution logs (event logs). In practice, different business domains might require different kinds of predictions. Hence, it is important to have a means for properly specifying the desired prediction tasks, and a mechanism to deal with these various prediction tasks. Although there have been many studies in this area, they mostly focus on a specific prediction task. This work introduces a language for specifying the desired prediction tasks that allows us to express various kinds of them, and presents a mechanism for automatically creating the corresponding prediction model based on the given specification. Unlike previous studies, instead of focusing on a particular prediction task, we present an approach that deals with various prediction tasks based on the given specification. We also provide an implementation of the approach, which is used to conduct experiments using real-life event logs.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Riyanarto Sarno ◽  
Kelly Rossa Sungkono ◽  
Muhammad Taufiqulsa’di ◽  
Hendra Darmawan ◽  
Achmad Fahmi ◽  
...  

Abstract Process discovery helps companies automatically discover their existing business processes from the vast stored event logs. Process discovery algorithms have developed rapidly to discover several types of relations, e.g., choice relations and non-free-choice relations with invisible tasks. Invisible tasks in non-free choice, introduced by the α$ method, are a type of relationship that combines non-free choice and invisible tasks. α$ proposed rules over the ordering relations of two activities for determining invisible tasks in non-free choice. Since the event log records sequences of activities, the rules of α$ check combinations of invisible tasks within non-free choice; these checking processes are time-consuming and result in the high computing time of α$. This research proposes the Graph-based Invisible Task (GIT) method to efficiently discover invisible tasks in non-free choice. The GIT method represents sequences of business activities as graphs and defines rules over the relationships of these graphs to discover invisible tasks in non-free choice. Analysing the graph relationships with the rules of GIT is more efficient than α$'s iterative checking of combined activities. This research measures the time efficiency of storing the event log and discovering a process model to evaluate the GIT algorithm. The graph database has the highest computing time for storing batch event logs; however, it has a low computing time for storing streaming event logs. Furthermore, on an event log with 99 traces, the GIT algorithm discovers a process model 42 times faster than α++ and 43 times faster than α$. The GIT algorithm can also handle 981 traces, whereas α++ and α$ handle at most 99 traces. 
Discovering a process model with the GIT algorithm also has lower time complexity than with α$: GIT runs in O(n³), whereas α$ runs in O(n⁴). These evaluation results show a significant improvement of the GIT method in terms of time efficiency.
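The graph view of a log that GIT builds on can be sketched as follows (a toy illustration, not GIT's actual rules; `build_graph`, `skip_candidates`, and the example log are assumptions). An activity that is sometimes skipped between two others is a classic hint of an invisible (silent) task:

```python
from collections import defaultdict

def build_graph(log):
    """Directed graph of activities with directly-follows edges."""
    g = defaultdict(set)
    for trace in log:
        for x, y in zip(trace, trace[1:]):
            g[x].add(y)
    return g

def skip_candidates(g):
    """Triples (x, y, z) where z follows x both directly and via y,
    i.e. y is sometimes skipped -- hinting at an invisible task
    between x and z."""
    hits = []
    for x, succs in g.items():
        for y in succs:
            for z in g.get(y, set()):
                if z in succs:
                    hits.append((x, y, z))
    return hits

log = [["a", "b", "c"], ["a", "c"]]
print(skip_candidates(build_graph(log)))  # [('a', 'b', 'c')]
```

Scanning such graph relationships once is what lets a graph-based method avoid the combinatorial re-checking of activity pairs that drives up the computing time of the rule-based approach.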


2019 ◽  
Vol 9 (11) ◽  
pp. 2368 ◽  
Author(s):  
Hyun Ahn ◽  
Dinh-Lam Pham ◽  
Kwanghoon Pio Kim

A work transference network is a type of enterprise social network centered on the interactions among the performers participating in workflow processes. The work transference networks hidden in workflow enactment histories can capture not only the structure of the enterprise social network among performers but also the degrees of relevancy and intensity between them. The purpose of this paper is to devise a framework that can discover and analyze work transference networks from workflow enactment event logs. The framework includes a series of conceptual definitions to formally describe the overall procedure of the network discovery. To support this conceptual framework, we implement a system that provides functionalities for the discovery, analysis and visualization steps. As a sanity check for the framework, we carry out a mining experiment on a dataset of real-life event logs using the implemented system. The experiment results show that the framework is valid in discovering transference networks correctly and providing primitive knowledge pertaining to the discovered networks. Finally, we expect that work transference network analytics will facilitate assessing workflow fidelity in human resource planning and its observed performance, and eventually enhance the workflow process from the organizational perspective.
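The core discovery step can be sketched as counting handovers of work between consecutive events of the same case (a minimal illustration of the general idea, not the paper's formal framework; `transference_network` and the example performer sequences are assumptions):

```python
from collections import Counter

def transference_network(cases):
    """Weighted directed edges (p, q): performer q executed an event
    directly after one executed by performer p within the same case."""
    edges = Counter()
    for performers in cases:
        for p, q in zip(performers, performers[1:]):
            if p != q:                  # ignore work kept by one performer
                edges[(p, q)] += 1
    return edges

# Two cases; repeated events by the same performer do not create edges:
cases = [["alice", "bob", "carol"], ["alice", "bob", "bob", "carol"]]
net = transference_network(cases)
print(net[("alice", "bob")], net[("bob", "carol")])  # 2 2
```

The resulting edge weights are what the framework then analyzes and visualizes as relevancy and intensity between performers.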


2019 ◽  
Vol 25 (5) ◽  
pp. 995-1019 ◽  
Author(s):  
Anna Kalenkova ◽  
Andrea Burattin ◽  
Massimiliano de Leoni ◽  
Wil van der Aalst ◽  
Alessandro Sperduti

Purpose The purpose of this paper is to demonstrate that process mining techniques can help to discover process models from event logs, using conventional high-level process modeling languages, such as Business Process Model and Notation (BPMN), leveraging their representational bias.

Design/methodology/approach The integrated discovery approach presented in this work aims to mine control, data and resource perspectives within one process diagram and, if possible, construct a hierarchy of subprocesses improving the model readability. The proposed approach is defined as a sequence of steps, performed to discover a model containing various perspectives and presenting a holistic view of a process. The approach was implemented within an open-source process mining framework called ProM and proved its applicability for the analysis of real-life event logs.

Findings This paper shows that the proposed integrated approach can be applied to real-life event logs of information systems from different domains. The multi-perspective process diagrams obtained within the approach are of good quality and better than models discovered using a technique that does not consider hierarchy. Moreover, due to the decomposition methods applied, the proposed approach can deal with large event logs, which cannot be handled by methods that do not use decomposition.

Originality/value The paper consolidates various process mining techniques, which were never integrated before, and presents a novel approach for the discovery of multi-perspective hierarchical BPMN models. This approach bridges the gap between well-known process mining techniques and a wide range of BPMN-compliant tools.


2019 ◽  
Vol 11 (2) ◽  
pp. 106-118
Author(s):  
Michal Halaška ◽  
Roman Šperka

Abstract The simulation and modelling paradigms have shifted significantly in recent years under the influence of the Industry 4.0 concept. There is a requirement for a much higher level of detail and a lower level of abstraction in the simulation of a modelled system that continuously develops. Consequently, higher demands are placed on the automated construction of process models; such a possibility is provided by automated process discovery techniques. Thus, the paper aims to benchmark automated process discovery techniques on realistic simulation models within a controlled environment, more specifically the logistics process of a manufacturing company. The study is based on a hybrid simulation of logistics in a manufacturing company implemented in the AnyLogic framework. The hybrid simulation is modelled in BPMN notation using BIMP, a business process modelling tool, to acquire data in the form of event logs. Next, five chosen automated process discovery techniques are applied to the event logs, and the results are evaluated. Based on the evaluation of the benchmark results received using the chosen discovery algorithms, it is evident that the discovery algorithms have a better overall performance on more extensive event logs, in terms of both fitness and precision. Nevertheless, the discovery techniques perform better in the case of smaller data sets with less complex process models. 
While discovery techniques typically have to address scalability issues due to the high amount of data present in the logs, they can, as demonstrated, also encounter issues of the opposite nature: in companies with long delivery cycles, long processing times and parallel production, which is common in the industrial sector, they have to address incompleteness and a lack of information in the datasets. Business process management is becoming essential for companies to stay competitive through efficiency. The issues encountered within the simulation model will be amplified through both vertical and horizontal integration of the supply chain within Industry 4.0. The impact of vertical integration on the BPMN model and the chosen case identifier is demonstrated: without the assumption of smart manufacturing, it would be impossible to use a single case identifier throughout the entire simulation, and the entire process would have to be divided into several subprocesses.


Author(s):  
Ying Huang ◽  
Liyun Zhong ◽  
Yan Chen

The aim of process discovery is to discover process models from the process execution data stored in event logs. In the era of "Big Data," one of the key challenges is to analyze the large amounts of collected data in meaningful and scalable ways. Most process discovery algorithms assume that all the data in an event log fully comply with the process execution specification. However, real event logs contain large amounts of noise and data from irrelevant infrequent behavior, and this infrequent behavior or noise has a negative influence on the process discovery procedure. This article presents a technique to remove infrequent behavior from event logs by calculating the minimum expectation of the process event log. The method was evaluated in detail, and the results showed that its application in existing process discovery algorithms significantly improves the quality of the discovered process models and that it scales well to large datasets.
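The general idea of pre-filtering infrequent behavior can be sketched with a plain frequency threshold (a simplified stand-in for the article's minimum-expectation calculation; `filter_infrequent`, the threshold, and the toy log are assumptions):

```python
from collections import Counter

def filter_infrequent(log, min_support=0.05):
    """Keep only trace variants whose relative frequency in the log
    is at least `min_support`; everything rarer is treated as noise."""
    variants = Counter(tuple(t) for t in log)
    n = len(log)
    kept = {v for v, c in variants.items() if c / n >= min_support}
    return [t for t in log if tuple(t) in kept]

# 19 occurrences of the regular variant plus one noisy trace:
log = 19 * [["a", "b", "c"]] + [["a", "x", "c"]]
clean = filter_infrequent(log, min_support=0.1)
print(len(clean))  # 19 -- the 5%-frequency noisy variant is removed
```

Feeding the filtered log into any discovery algorithm then keeps the noise from distorting the discovered model, which is the effect the evaluation measures.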

