Improving efficiency for discovering business processes containing invisible tasks in non-free choice

2021 ◽ Vol 8 (1) ◽ Author(s): Riyanarto Sarno ◽ Kelly Rossa Sungkono ◽ Muhammad Taufiqulsa’di ◽ Hendra Darmawan ◽ Achmad Fahmi ◽ ...

Abstract: Process discovery helps companies automatically discover their existing business processes from the vast, stored event log. Process discovery algorithms have developed rapidly to discover several types of relations, e.g., choice relations and non-free choice relations with invisible tasks. Invisible tasks in non-free choice, introduced by the α$ method, are a type of relationship that combines non-free choice and invisible tasks. α$ proposed rules over the ordering relations of two activities for determining invisible tasks in non-free choice. Because the event log records sequences of activities, the rules of α$ check combinations of invisible tasks within non-free choices. These checking processes are time-consuming and result in high computing times for α$. This research proposes the Graph-based Invisible Task (GIT) method to discover invisible tasks in non-free choice efficiently. The GIT method represents sequences of business activities as graphs and defines rules for discovering invisible tasks in non-free choice based on the relationships of those graphs. Analysing graph relationships with the GIT rules is more efficient than the iterative checking of combined activities in α$. This research measures the time efficiency of storing the event log and discovering a process model to evaluate the GIT algorithm. The graph database has the highest storing computing time for batch event logs; however, it obtains a low storing computing time for streaming event logs. Furthermore, on an event log with 99 traces, the GIT algorithm discovers a process model 42 times faster than α++ and 43 times faster than α$. The GIT algorithm can also handle 981 traces, whereas α++ and α$ can handle at most 99 traces. Discovering a process model with the GIT algorithm has lower time complexity than with α$: GIT obtains O(n³), whereas α$ obtains O(n⁴). These evaluation results show a significant improvement of the GIT method in terms of time efficiency.
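As a rough illustration of the graph-based idea (not the GIT rule set itself), the sketch below builds a directly-follows graph from traces and flags "skip" edges, a simple signal that an invisible (silent) task may be needed; all names and the skip heuristic are assumptions of this example.

```python
# Illustrative sketch only: build a directly-follows graph from traces and
# flag "skip" edges (a -> c where a -> b -> c also exists). This is NOT the
# GIT rule set from the paper, just a toy approximation of the idea.
from collections import defaultdict

def directly_follows_graph(traces):
    """Return {activity: set of directly-following activities}."""
    graph = defaultdict(set)
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            graph[a].add(b)
    return graph

def skip_candidates(graph):
    """Yield (a, b, c) where a -> c skips the intermediate activity b."""
    for a, succs in graph.items():
        for b in succs:
            for c in graph.get(b, set()):
                if c in succs:          # a -> c exists directly as well
                    yield (a, b, c)

if __name__ == "__main__":
    traces = [
        ["register", "check", "pay", "ship"],
        ["register", "pay", "ship"],          # "check" is skipped here
    ]
    g = directly_follows_graph(traces)
    print(sorted(skip_candidates(g)))         # [('register', 'check', 'pay')]
```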

2020 ◽ Author(s): Riyanarto Sarno ◽ Kelly Rossa Sungkono ◽ Muhammad Taufiqulsa’di ◽ Hendra Darmawan ◽ Achmad Fahmi ◽ ...

Abstract: Process discovery helps companies automatically discover their existing business processes from the huge, stored event log. Process discovery algorithms have developed rapidly to discover several types of relations, e.g., choice relations and non-free choice relations with invisible tasks. Invisible tasks in non-free choice, introduced by the α$ method, are a type of relation that combines non-free choice and invisible tasks. α$ proposed rules over the ordering relations of two activities for determining invisible tasks in non-free choice. Because the event log records sequences of activities, the rules of α$ check combinations of invisible tasks within non-free choices. These checking processes are time-consuming and result in high computing times for α$. This research proposes the Graph-based Invisible Task (GIT) method to discover invisible tasks in non-free choice efficiently. The GIT method represents sequences of business activities as graphs and defines rules for discovering invisible tasks in non-free choice based on the relations of those graphs. Analysing graph relations with the GIT rules is more efficient than the iterative checking of combined activities in α$. This research measures the time efficiency of storing the event log and discovering a process model to evaluate the GIT algorithm. Storing a streaming event log in a graph database has a lower computing time than storing it in the other databases, i.e., SQL and MongoDB. Discovering a process model with the GIT algorithm has lower time complexity than with α$: GIT obtains O(n³), whereas α$ obtains O(n⁴). In terms of computing time, the GIT algorithm is 0.89 seconds faster on the batch event log and 0.85 seconds faster on the streaming event log than α$. These evaluation results show a significant improvement of the GIT method in terms of time efficiency.
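To show how the batch versus streaming storing comparison can be measured in principle, here is a minimal timing harness; a plain in-memory dictionary stands in for the graph, SQL, and MongoDB backends mentioned in the abstract, so the numbers it prints are purely illustrative.

```python
# Illustrative timing harness only: a dict stands in for any database backend,
# just to contrast batch insertion with event-by-event (streaming) insertion.
import time
from collections import defaultdict

def store_batch(events, store):
    """Group events per case first, then write each case's trace in one call."""
    grouped = defaultdict(list)
    for case_id, activity in events:
        grouped[case_id].append(activity)
    for case_id, trace in grouped.items():
        store[case_id].extend(trace)      # one bulk write per case

def store_streaming(events, store):
    """Write events one by one, in arrival order (streaming mode)."""
    for case_id, activity in events:
        store[case_id].append(activity)

events = [(f"case_{i % 100}", f"act_{i % 10}") for i in range(100_000)]

for mode in (store_batch, store_streaming):
    store = defaultdict(list)
    start = time.perf_counter()
    mode(events, store)
    print(f"{mode.__name__}: {time.perf_counter() - start:.4f} s")
```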


2020 ◽ Vol 17 (3) ◽ pp. 927-958 ◽ Author(s): Mohammadreza Fani Sani ◽ Sebastiaan van Zelst ◽ Wil M. P. van der Aalst

Process discovery algorithms automatically discover process models based on event data captured during the execution of business processes. These algorithms tend to use all of the event data to discover a process model. When dealing with large event logs, this is no longer feasible using standard hardware in a limited time. A straightforward approach to overcome this problem is to down-size the event data by means of sampling. However, little research has been conducted on selecting the right sample, given the available time and the characteristics of the event data. This paper evaluates various subset selection methods and their performance on real event data. The proposed methods have been implemented in both the ProM and RapidProM platforms. Our experiments show that it is possible to considerably speed up discovery using instance selection strategies. Furthermore, the results show that biased selection of process instances, compared to random sampling, results in simpler process models with higher quality.
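A minimal sketch of instance-level sampling, assuming traces are simply lists of activity labels; random sampling is shown as the baseline strategy, while the biased strategies studied in the paper are not reproduced here.

```python
# Sketch: down-size an event log before discovery by sampling whole process
# instances (traces). Random sampling only; the paper's biased selection
# strategies are not implemented here.
import random

def sample_traces(log, sample_size, seed=42):
    """Return a random subset of traces, keeping each trace intact."""
    rng = random.Random(seed)
    if sample_size >= len(log):
        return list(log)
    return rng.sample(log, sample_size)

log = [["a", "b", "c"], ["a", "c"], ["a", "b", "b", "c"]] * 1000
small_log = sample_traces(log, 100)
print(len(small_log))   # 100 traces instead of 3000
```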


2016 ◽ Vol 67 (2) ◽ pp. 111-123 ◽ Author(s): Julijana Lekić ◽ Dragan Milićev

Abstract: The α-algorithm can discover a large class of workflow (WF) nets based on the behaviour recorded in event logs, with the main limiting assumption that the event log is complete. Our research has aimed at finding ways of discovering business process models based on examples of traces, i.e., logs of workflow actions that do not meet the requirement of completeness. To this end, we have modified existing relations and introduced a new relation between activities recorded in the event log, which has led to a partial revision of the process model discovery technique, including the α-algorithm. We have also introduced the notion of causally complete logs, from which our modified algorithm can produce the same result as the α-algorithm produces from complete logs. The effect of these modifications on the efficiency of process model discovery is most evident for business processes in which many activities can be performed in parallel. The application of the modified method for discovering block-structured models of parallel business processes is presented in this paper.
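For context, the sketch below derives the classical α-algorithm ordering relations (causality, parallelism, unrelatedness) from a small log; the modified relation the authors introduce for causally complete logs is not reproduced, only the standard definitions it builds on.

```python
# Sketch of the standard alpha-algorithm ordering relations from a log:
# a -> b (causality), a || b (parallel), a # b (unrelated), all derived from
# the directly-follows pairs. The paper's modified relation is NOT shown.
from itertools import product

def alpha_relations(log):
    activities = {a for trace in log for a in trace}
    follows = {(a, b) for trace in log for a, b in zip(trace, trace[1:])}
    relations = {}
    for a, b in product(activities, repeat=2):
        if (a, b) in follows and (b, a) not in follows:
            relations[(a, b)] = "->"       # causality
        elif (a, b) in follows and (b, a) in follows:
            relations[(a, b)] = "||"       # parallel (both orders observed)
        else:
            relations[(a, b)] = "#"        # unrelated
    return relations

log = [["a", "b", "c", "d"], ["a", "c", "b", "d"]]
rel = alpha_relations(log)
print(rel[("a", "b")], rel[("b", "c")], rel[("b", "d")])   # -> || ->
```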


Author(s): Riyanarto Sarno ◽ Kelly Rossa Sungkono

Process discovery is a technique for obtaining a process model based on traces recorded in the event log. Nowadays, information systems produce streaming event logs to record their huge numbers of processes. The truncated streaming event log is a big issue in process discovery because it yields incomplete traces that cause process discovery to depict wrong processes in the process model. Earlier research suggested several methods for recovering the truncated streaming event log, and none of them utilized the Coupled Hidden Markov Model. This research proposes a method that combines a Coupled Hidden Markov Model with Double States and a Modified Viterbi–Backward method for recovering the truncated streaming event log. The first layer of states contains the transition probabilities of activities. The second layer of states uses patterns for detecting traces that have a low appearance in the event log. The experimental results showed that the proposed method appropriately recovered the truncated streaming event log. These results also show that the accuracies of the recovered traces obtained by the proposed method are higher than those obtained by the Hidden Markov Model and the Coupled Hidden Markov Model.
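As a much-simplified stand-in for trace recovery, the sketch below completes a truncated trace by following the most likely next activity learned from complete traces; this is a plain first-order Markov completion under that assumption, not the Coupled HMM with double states and the modified Viterbi–Backward method proposed in the paper.

```python
# Simplified sketch: complete a truncated trace with the most likely next
# activity learned from complete traces (first-order Markov completion).
from collections import Counter, defaultdict

END = "<END>"

def learn_transitions(complete_traces):
    """Count activity -> next-activity transitions, including an end marker."""
    counts = defaultdict(Counter)
    for trace in complete_traces:
        for a, b in zip(trace, trace[1:] + [END]):
            counts[a][b] += 1
    return counts

def complete_trace(truncated, counts, max_steps=20):
    """Append the most frequent successor until the end marker is reached."""
    trace = list(truncated)
    for _ in range(max_steps):
        nxt = counts[trace[-1]].most_common(1)
        if not nxt or nxt[0][0] == END:
            break
        trace.append(nxt[0][0])
    return trace

complete = [["register", "check", "pay", "ship"]] * 10
print(complete_trace(["register", "check"], learn_transitions(complete)))
# ['register', 'check', 'pay', 'ship']
```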


2020 ◽ Vol 21 (1) ◽ pp. 126-141 ◽ Author(s): Yutika Amelia Effendi ◽ Riyanarto Sarno

The many services in business processes lead information systems to build huge event logs that are difficult to observe. The event log is analysed using a process discovery technique to mine the process model by implementing well-known algorithms such as deterministic algorithms and heuristic algorithms. Each of these algorithms has its own benefits and limitations in analysing an event log and discovering a process model from it. This research proposes a new Time-based Alpha++ Miner that improves on the Alpha++ Miner and the Modified Time-based Alpha Miner algorithms. The proposed miner is able to consider noise traces, loops, and non-free choice when modelling a process, whereas neither of the original algorithms can handle those issues. The new Time-based Alpha++ Miner utilizes the Time Interval Pattern: it mines the process model using new rules defined by the time interval pattern over a double-timestamp event log and identifies sequence and parallel (AND, OR, and XOR) relations. The original miners are only able to discover sequence and parallel (AND and XOR) relations. To compare the original Alpha++ Miner and the new one, including the resulting process models and their relations, an evaluation using fitness and precision was carried out in this research. The results show that the process model obtained by the new Time-based Alpha++ Miner was better than that of the original Alpha++ Miner algorithm in terms of parallel OR relations, noise handling, fitness value, and precision value.
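One piece of the idea can be illustrated with a double-timestamp log: if each activity carries a start and a complete time, two activities whose execution intervals overlap are candidates for a parallel relation, otherwise for a sequence. The sketch below shows only this overlap check under that assumption; the full Time Interval Pattern rules of the Time-based Alpha++ Miner, including the AND/OR/XOR distinction, are not reproduced.

```python
# Illustrative sketch: classify two activities of one case as sequential or
# parallel by checking whether their (start, complete) intervals overlap.
def relation(interval_a, interval_b):
    """interval = (start, complete); 'parallel' if the intervals overlap."""
    (start_a, end_a), (start_b, end_b) = interval_a, interval_b
    if start_a < end_b and start_b < end_a:
        return "parallel"
    return "sequence"

# Toy case with numeric timestamps (e.g., minutes since the case started).
intervals = {"check_credit": (0, 10), "check_stock": (5, 12), "ship": (15, 20)}
print(relation(intervals["check_credit"], intervals["check_stock"]))  # parallel
print(relation(intervals["check_stock"], intervals["ship"]))          # sequence
```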


Author(s): Ying Huang ◽ Liyun Zhong ◽ Yan Chen

The aim of process discovery is to discover process models from the process execution data stored in event logs. In the era of "Big Data," one of the key challenges is to analyze the large amounts of collected data in meaningful and scalable ways. Most process discovery algorithms assume that all the data in an event log fully comply with the process execution specification; however, real event logs contain large amounts of noise and data from irrelevant, infrequent behavior. This infrequent behavior or noise has a negative influence on the process discovery procedure. This article presents a technique to remove infrequent behavior from event logs by calculating the minimum expectation of the process event log. The method was evaluated in detail, and the results showed that its application in existing process discovery algorithms significantly improves the quality of the discovered process models and that it scales well to large datasets.
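The general filtering idea can be sketched as follows; the paper's "minimum expectation" criterion is approximated here by a plain relative-frequency cut-off over directly-follows pairs, which is an assumption of this example and not the authors' method.

```python
# Sketch only: drop traces that contain a directly-follows pair occurring
# below a relative-frequency threshold (a stand-in for the paper's
# minimum-expectation criterion).
from collections import Counter

def filter_infrequent(log, min_rel_freq=0.05):
    pair_counts = Counter(
        (a, b) for trace in log for a, b in zip(trace, trace[1:])
    )
    total = sum(pair_counts.values())
    rare = {p for p, c in pair_counts.items() if c / total < min_rel_freq}
    return [
        trace for trace in log
        if not any((a, b) in rare for a, b in zip(trace, trace[1:]))
    ]

log = [["a", "b", "c"]] * 95 + [["a", "c", "b"]] * 5   # second variant is rare
clean = filter_infrequent(log, min_rel_freq=0.05)
print(len(log), len(clean))   # 100 95
```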


Computing ◽ 2021 ◽ Author(s): Mohammadreza Fani Sani ◽ Sebastiaan J. van Zelst ◽ Wil M. P. van der Aalst

Abstract: With process discovery algorithms, we discover process models based on event data captured during the execution of business processes. Process discovery algorithms tend to use the whole event data. When dealing with large event data, this is no longer feasible using standard hardware in a limited time. A straightforward approach to overcome this problem is to down-size the data using a random sampling method. However, little research has been conducted on selecting the right sample, given the available time and the characteristics of the event data. This paper systematically evaluates various biased sampling methods and their performance on different datasets using four different discovery techniques. Our experiments show that it is possible to considerably speed up discovery techniques using biased sampling without losing the quality of the resulting process model. Furthermore, due to the implicit filtering (removal of outliers) obtained by applying the sampling technique, the model quality may even improve.
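One possible biased strategy is sketched below: keep whole trace variants, most frequent first, until a trace budget is reached. The concrete biased sampling methods evaluated in the paper may differ; this only illustrates why such a frequency bias also acts as an implicit outlier filter.

```python
# Sketch of frequency-biased sampling: fill the sample with the most frequent
# trace variants first, so rare (often noisy) variants are implicitly filtered.
from collections import Counter

def biased_sample(log, budget):
    variants = Counter(tuple(trace) for trace in log)
    sample = []
    for variant, count in variants.most_common():
        if len(sample) >= budget:
            break
        take = min(count, budget - len(sample))
        sample.extend([list(variant)] * take)
    return sample

log = [["a", "b", "c"]] * 80 + [["a", "c"]] * 15 + [["a", "x", "c"]] * 5
print(len(biased_sample(log, 50)))   # 50 traces, all from the dominant variant
```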


2021 ◽ Vol 2021 ◽ pp. 1-7 ◽ Author(s): Shabnam Shahzadi ◽ Xianwen Fang ◽ David Anekeya Alilah

Process mining is used to exploit and extract event data from the event log that contain vital information related to the process. There are three basic types of process mining, distinguished by their input and output: process discovery, conformance checking, and enhancement. Process discovery is one of the most challenging process mining activities based on the event log. Business process and system performance play a vital role in modelling, analysis, and prediction. Recently, memoryless models such as the exponentially distributed stochastic Petri net (SPN) have gained much attention in research and industry. This paper uses the time perspective for modelling and analysis and uses stochastic Petri nets to check the performance, evolution, stability, and reliability of the model. To assess the effect of time delay in firing transitions, a stochastic reward net (SRN) model is used; this model can also be used to check the reliability of the model, whereas a generalized stochastic Petri net (GSPN) is used to evaluate and check the performance of the model. The SPN is used to analyze the probability of state transitions and the stability from one state to another. In process mining, logs are used by linking log sequences with states; in this way, the model can be built and its relation to the stability of the model can be established.
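A toy simulation of the memoryless firing behaviour mentioned above: with exponentially distributed delays, the enabled transition whose sampled delay is smallest fires first (race semantics). This is only an illustrative sketch, not a general SPN/GSPN/SRN engine, and the transition names and rates are assumptions.

```python
# Toy sketch of exponential (memoryless) race semantics between two enabled
# transitions of a stochastic Petri net; the one with the smaller sampled
# delay fires first.
import random

def fire_once(enabled_rates, rng):
    """enabled_rates: {transition_name: rate}; return (winner, sojourn_time)."""
    delays = {t: rng.expovariate(rate) for t, rate in enabled_rates.items()}
    winner = min(delays, key=delays.get)
    return winner, delays[winner]

rng = random.Random(0)
wins = {"approve": 0, "reject": 0}
for _ in range(10_000):
    winner, _ = fire_once({"approve": 2.0, "reject": 1.0}, rng)
    wins[winner] += 1
print(wins)   # roughly 2:1 in favour of "approve", matching the rate ratio
```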


2021 ◽ Vol 10 (9) ◽ pp. 144-147 ◽ Author(s): Huiling Li ◽ Xuan Su ◽ Shuaipeng Zhang

Massive amounts of business process event logs are collected and stored by modern information systems. Model discovery aims to discover a process model from such event logs; however, most existing approaches still suffer from low efficiency when facing large-scale event logs. Event log sampling techniques provide an effective way to improve the efficiency of process discovery, but existing techniques still cannot guarantee the quality of the mined model. Therefore, a sampling approach based on a set coverage algorithm, named the set coverage sampling approach, is proposed. The proposed sampling approach has been implemented in the open-source process mining toolkit ProM. Furthermore, experiments on a real event log data set, covering conformance checking and time performance analysis, show that the proposed event log sampling approach can greatly improve the efficiency of log sampling while ensuring the quality of model mining.
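The flavour of a set-coverage-based sample can be sketched as a greedy cover: pick trace variants until every directly-follows pair observed in the full log is covered. The paper's actual algorithm and coverage criterion are not specified here; using directly-follows pairs as the items to cover is an assumption of this example.

```python
# Sketch of greedy set-coverage sampling: select trace variants until all
# directly-follows pairs of the full log are covered.
def greedy_cover_sample(log):
    def pairs(trace):
        return set(zip(trace, trace[1:]))

    variants = {tuple(t) for t in log}
    uncovered = set().union(*(pairs(t) for t in variants)) if variants else set()
    sample = []
    while uncovered:
        best = max(variants, key=lambda v: len(pairs(v) & uncovered))
        gain = pairs(best) & uncovered
        if not gain:
            break
        sample.append(list(best))
        uncovered -= gain
    return sample

log = [["a", "b", "d"]] * 50 + [["a", "c", "d"]] * 50 + [["a", "b", "c", "d"]] * 2
print(greedy_cover_sample(log))   # a small set of variants covering all pairs
```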

