Recommendation of Process Discovery Algorithms Through Event Log Classification

Process discovery helps companies automatically discover their existing business processes from large stored event logs. Process discovery algorithms have developed rapidly to discover several types of relations, e.g., choice relations and non-free choice relations with invisible tasks. Invisible tasks in non-free choice, introduced by the $\alpha^{\$}$ method, are a type of relation that combines non-free choice and invisible tasks. $\alpha^{\$}$ proposes rules on the ordering relations of two activities to determine invisible tasks in non-free choice. Because the event log records sequences of activities, the rules of $\alpha^{\$}$ check combinations of invisible tasks within non-free choice. These checks are time-consuming and lead to high computing times for $\alpha^{\$}$. This research proposes the Graph-based Invisible Task (GIT) method to efficiently discover invisible tasks in non-free choice. The GIT method represents sequences of business activities as graphs and defines rules that discover invisible tasks in non-free choice from relationships in the graphs. Analyzing the graph relationships with the GIT rules is more efficient than the iterative checking of combined activities in $\alpha^{\$}$. To evaluate the GIT algorithm, this research measures the time efficiency of storing the event log and of discovering a process model. The graph database has the highest storage time for batch event logs but a low storage time for streaming event logs. Furthermore, on an event log with 99 traces, the GIT algorithm discovers a process model 42 times faster than $\alpha^{++}$ and 43 times faster than $\alpha^{\$}$. The GIT algorithm can also handle 981 traces, whereas $\alpha^{++}$ and $\alpha^{\$}$ handle at most 99 traces. Discovering a process model with the GIT algorithm also has lower time complexity than with $\alpha^{\$}$: GIT is $O(n^{3})$, whereas $\alpha^{\$}$ is $O(n^{4})$. These results show a significant improvement of the GIT method in terms of time efficiency.
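The idea of turning activity sequences into a graph can be illustrated with a directly-follows graph. This is only a minimal sketch of the graph representation, not the GIT rules themselves, which the abstract does not spell out; the function name `directly_follows_graph` and the sample log are assumptions made for illustration.

```python
from collections import defaultdict


def directly_follows_graph(traces):
    """Return {(a, b): count} for every pair where b directly follows a."""
    edges = defaultdict(int)
    for trace in traces:
        # Pair each activity with its immediate successor in the trace.
        for a, b in zip(trace, trace[1:]):
            edges[(a, b)] += 1
    return dict(edges)


# Hypothetical event log: each trace is one recorded sequence of activities.
log = [["A", "B", "D"], ["A", "C", "D"], ["A", "B", "D"]]
dfg = directly_follows_graph(log)
```

Rules over such a graph inspect edges once per pair of activities instead of re-scanning every trace for each candidate combination, which is the general kind of saving a graph-based approach aims for.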


Filtering Infrequent Behavior in Business Process Discovery by Using the Minimum Expectation

International Journal of Cognitive Informatics and Natural Intelligence, 2020, Vol 14 (2), pp. 1-15. DOI: 10.4018/ijcini.2020040101

Author(s): Ying Huang, Liyun Zhong, Yan Chen

Keyword(s): Negative Influence, Large Datasets, Process Models, Process Discovery, Event Logs, Event Log, Process Execution, Process Event, Discovery Algorithms

The aim of process discovery is to discover process models from the process execution data stored in event logs. In the era of “Big Data,” one of the key challenges is to analyze the large amounts of collected data in meaningful and scalable ways. Most process discovery algorithms assume that all the data in an event log fully comply with the process execution specification, and the process event logs are no exception. However, real event logs contain large amounts of noise and data from irrelevant infrequent behavior. The infrequent behavior or noise has a negative influence on the process discovery procedure. This article presents a technique to remove infrequent behavior from event logs by calculating the minimum expectation of the process event log. The method was evaluated in detail, and the results showed that its application in existing process discovery algorithms significantly improves the quality of the discovered process models and that it scales well to large datasets.
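As a rough illustration of filtering infrequent behavior before discovery, the sketch below drops trace variants whose share of the log falls under a threshold. It is a plain frequency filter, not the minimum-expectation calculation of the article, whose formula the abstract does not give; `filter_infrequent_variants` and `min_share` are hypothetical names.

```python
from collections import Counter


def filter_infrequent_variants(traces, min_share=0.1):
    """Keep only traces whose variant makes up at least min_share of the log."""
    variants = Counter(tuple(t) for t in traces)
    total = len(traces)
    keep = {v for v, n in variants.items() if n / total >= min_share}
    return [t for t in traces if tuple(t) in keep]


# Hypothetical log: nine regular traces and one rare ordering.
log = [["A", "B", "C"]] * 9 + [["A", "C", "B"]]
filtered = filter_infrequent_variants(log, min_share=0.2)
```

The filtered log can then be passed to any discovery algorithm in place of the raw log.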


Process Discovery Algorithms Using Numerical Abstract Domains

IEEE Transactions on Knowledge and Data Engineering, 2014, Vol 26 (12), pp. 3064-3076. DOI: 10.1109/tkde.2013.156. Cited by ~14

Author(s): Josep Carmona, Jordi Cortadella

Keyword(s): Process Discovery, Numerical Abstract Domains, Abstract Domains, Discovery Algorithms


Temporal Logics Over Finite Traces with Uncertainty

Proceedings of the AAAI Conference on Artificial Intelligence, 2020, Vol 34 (06), pp. 10218-10225. DOI: 10.1609/aaai.v34i06.6583. Cited by ~1

Author(s): Fabrizio M Maggi, Marco Montali, Rafael Peñaloza

Keyword(s): Decision Making, Temporal Logic, Business Process, Dynamic Systems, Real Life, Temporal Logics, Business Process Modelling, Process Discovery, Event Log, Computational Properties

Temporal logics over finite traces have recently seen wide application in a number of areas, from business process modelling, monitoring, and mining to planning and decision making. However, real-life dynamic systems contain a degree of uncertainty which cannot be handled with classical logics. We thus propose a new probabilistic temporal logic over finite traces using superposition semantics, where all possible evolutions are possible, until observed. We study the properties of the logic and provide automata-based mechanisms for deriving probabilistic inferences from its formulas. We then study a fragment of the logic with better computational properties. Notably, formulas in this fragment can be discovered from event log data using off-the-shelf existing declarative process discovery techniques.
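For readers unfamiliar with temporal constraints over finite traces, the sketch below checks a classical (non-probabilistic) response constraint, "every `a` is eventually followed by `b`", and measures how many traces of a log satisfy it. It does not implement the probabilistic logic proposed in the paper; the function names and the sample log are assumptions, and the check assumes `a` and `b` are different activities.

```python
def response(trace, a, b):
    """True if every occurrence of a is later followed by b (assumes a != b)."""
    pending = False
    for event in trace:
        if event == a:
            pending = True      # an `a` is waiting for a later `b`
        elif event == b:
            pending = False     # all earlier `a`s are now answered
    return not pending


def support(traces, a, b):
    """Fraction of traces in which the response constraint holds."""
    return sum(response(t, a, b) for t in traces) / len(traces)


# Hypothetical log with one violating trace out of four.
log = [["a", "c", "b"], ["c"], ["a", "b", "a"], ["a", "b"]]
share = support(log, "a", "b")
```

Declarative discovery techniques typically report constraints whose support in the log exceeds a chosen threshold.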


Role of Stochastic Petri Net (SPN) in Process Discovery for Modelling and Analysis

Mathematical Problems in Engineering, 2021, Vol 2021, pp. 1-7. DOI: 10.1155/2021/8699164

Author(s): Shabnam Shahzadi, Xianwen Fang, David Anekeya Alilah

Keyword(s): Petri Net, Business Processes, Time Perspective, Process Mining, Vital Role, Process Discovery, Stochastic Petri Net, Event Log, The Stability, Generalized Stochastic Petri Net

Process mining is used to extract process-related information from the event data stored in an event log. There are three basic types of process mining, distinguished by their input and output: process discovery, conformance checking, and enhancement. Process discovery is one of the most challenging process mining activities based on the event log. Business processes and system performance play a vital role in modelling, analysis, and prediction. Recently, memoryless models such as the exponentially distributed stochastic Petri net (SPN) have gained much attention in research and industry. This paper uses the time perspective for modelling and analysis and uses stochastic Petri nets to check the performance, evolution, stability, and reliability of the model. To assess the effect of time delay in firing a transition, a stochastic reward net (SRN) model is used. The SRN model can also be used to check the reliability of the model, whereas the generalized stochastic Petri net (GSPN) is used to evaluate and check its performance. The SPN is used to analyze the probability of state transitions and the stability from one state to another. In process mining, logs are used by linking log sequences with states; in this way, modelling can be done and its relation to the stability of the model can be established.
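The memoryless property mentioned above has a simple consequence that can be sketched in code: when several enabled transitions have independent exponential firing delays, the chance that a given transition fires first is its rate divided by the sum of all rates, and the mean time until any of them fires is the reciprocal of that sum. The transition names and rates below are assumptions for illustration, not values from the paper.

```python
def race_probabilities(rates):
    """Probability that each enabled transition fires first.

    rates maps a transition name to its exponential firing rate.
    """
    total = sum(rates.values())
    return {t: r / total for t, r in rates.items()}


def mean_sojourn_time(rates):
    """Expected time spent in the marking before any transition fires."""
    return 1.0 / sum(rates.values())


# Hypothetical marking with two enabled transitions.
rates = {"t1": 1.0, "t2": 3.0}
probs = race_probabilities(rates)
sojourn = mean_sojourn_time(rates)
```

These two quantities are the building blocks of the continuous-time Markov chain that underlies an SPN with exponential delays.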


Improving the performance of process discovery algorithms by instance selection

Computer Science and Information Systems, 2020, Vol 17 (3), pp. 927-958. DOI: 10.2298/csis200127028s

Author(s): Mohammadreza Sani, Sebastiaan van Zelst, Wil van der Aalst

Keyword(s): Process Model, Business Processes, Process Models, Instance Selection, Event Data, Process Discovery, Selection Strategies, Speed Up, The Right, Discovery Algorithms

Process discovery algorithms automatically discover process models based on event data that is captured during the execution of business processes. These algorithms tend to use all of the event data to discover a process model. When dealing with large event logs, this is no longer feasible on standard hardware in limited time. A straightforward approach to overcome this problem is to down-size the event data by means of sampling. However, little research has been conducted on selecting the right sample, given the available time and characteristics of event data. This paper evaluates various subset selection methods and their performance on real event data. The proposed methods have been implemented in both the ProM and the RapidProM platforms. Our experiments show that it is possible to considerably speed up discovery using instance selection strategies. Furthermore, the results show that biased selection of process instances, compared to random sampling, results in simpler process models of higher quality.
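The contrast between random and biased instance selection can be sketched as follows. The biased strategy shown, keeping one representative of each of the most frequent trace variants, is only one plausible example and is not claimed to be one of the strategies evaluated in the paper; both function names and the sample log are assumptions.

```python
import random
from collections import Counter


def random_sample(traces, k, seed=0):
    """Pick k traces uniformly at random (seeded for reproducibility)."""
    rng = random.Random(seed)
    return rng.sample(traces, k)


def top_variant_sample(traces, k):
    """Keep one representative of each of the k most frequent variants."""
    variants = Counter(tuple(t) for t in traces)
    return [list(v) for v, _ in variants.most_common(k)]


# Hypothetical log: one dominant variant plus rarer ones.
log = [["A", "B", "C"]] * 5 + [["A", "C", "B"]] * 2 + [["A", "C"]]
uniform = random_sample(log, 3)
biased = top_variant_sample(log, 2)
```

A discovery algorithm run on `biased` sees only the dominant behavior, which is one reason biased selection tends to yield simpler models than a uniform sample of the same size.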


Discovering Process Horizontal Boundaries to Facilitate Process Comprehension

International Journal of Operations Research and Information Systems, 2018, Vol 9 (2), pp. 1-31. DOI: 10.4018/ijoris.2018040101. Cited by ~2

Author(s): Pavlos Delias, Kleanthi Lakiotaki

Keyword(s): Process Model, Process Mining, A Priori, Quality Criteria, Large Set, Process Discovery, Event Log, Human Interpretation, Automated Discovery, Priori Information

Automated discovery of a process model is a major task of process mining: it aims to produce a process model from an event log without any a priori information. However, when an event log contains a large number of distinct activities, process discovery can be really challenging. The goal of this article is to facilitate process discovery in cases where a process is expected to contain a large set of unique activities. To this end, this article proposes a clustering approach that recommends horizontal boundaries for the process. The proposed approach ultimately partitions the event log in a way that decomposes the human interpretation effort. In addition, it makes automated discovery more efficient and more effective by simultaneously considering two quality criteria: informativeness and robustness of the derived groups of activities. The authors conducted several experiments to test the behavior of the algorithm under different settings and to compare it against other techniques. Finally, they provide a set of recommendations that may help process analysts during the process discovery endeavor.


PRETSA: Event Log Sanitization for Privacy-aware Process Discovery

Informatik-Spektrum, 2019, Vol 42 (5), pp. 352-353. DOI: 10.1007/s00287-019-01203-z

Author(s): Stephan A. Fahrenkrog-Petersen, Han van der Aa, Matthias Weidlich

Keyword(s): Process Discovery, Event Log


The Impact of Event Log Subset Selection on the Performance of Process Discovery Algorithms

Behavioural Similarity Measurement of Business Process Model to Compare Process Discovery Algorithms Performance in Dealing with Noisy Event Log

Improving efficiency for discovering business processes containing invisible tasks in non-free choice
