Novel Approach for Mining Patterns

2021 ◽  
Vol 12 (1) ◽  
pp. 27-42
Author(s):  
Ishak H. A. Meddah ◽  
Nour Elhouda Remil ◽  
Hadja Nebia Meddah

Process mining techniques allow information to be extracted from event logs. In general, process mining proceeds in two steps: correlation definition or discovery, and then process inference or composition. First, the work mines small patterns from log traces; these patterns represent the execution traces recorded in the log file of a business process. In this step, the authors use existing techniques. Each pattern is represented by a finite state automaton or its regular expression, and the final model is the combination of only two types of small patterns, represented by regular expressions. Second, the authors compute these patterns in parallel and then combine them using the MapReduce framework. The approach has two parts: the map step, in which patterns are mined from execution traces, and the reduce step, in which these small patterns are combined. The results are promising; they show that the approach is scalable, general, and precise, and that it reduces execution time through the use of the MapReduce framework.
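
The abstract gives no code; as a minimal, single-machine sketch of the two phases it describes (the traces, pattern names, and helper functions below are illustrative assumptions, not the authors' implementation), the map step can label each trace with the small patterns it matches and the reduce step can fold those labels into one model:

```python
import re
from functools import reduce

# Hypothetical traces: each string is one execution trace from the log.
TRACES = ["abab", "abbc", "ab", "abc", "abbbc"]

# The two small pattern types named in the abstract, as regular expressions.
PATTERNS = {"(ab)*": re.compile(r"(ab)*$"), "(ab*c)*": re.compile(r"(ab*c)*$")}

def map_step(trace):
    """Map: mine the small patterns that a single trace matches."""
    return [(name, trace) for name, rx in PATTERNS.items() if rx.match(trace)]

def reduce_step(acc, pair):
    """Reduce: combine mined patterns into one model (pattern -> traces)."""
    name, trace = pair
    acc.setdefault(name, []).append(trace)
    return acc

mined = [pair for trace in TRACES for pair in map_step(trace)]
model = reduce(reduce_step, mined, {})
print(model)  # {'(ab)*': ['abab', 'ab'], '(ab*c)*': ['abbc', 'abc', 'abbbc']}
```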

2016 ◽  
Vol 3 (4) ◽  
pp. 21-31 ◽  
Author(s):  
Ishak Meddah ◽  
Belkadi Khaled

Process mining provides an important bridge between data mining and business process analysis; its techniques allow information to be extracted from event logs. In general, process mining proceeds in two steps: correlation definition or discovery, and then process inference or composition. First, the authors mine small patterns from the log traces of two applications, SKYPE and VIBER; these patterns represent the execution traces of a business process. In this step, the authors use existing techniques. Each pattern is represented by a finite state automaton or its regular expression, and the final model is the combination of only two types of small patterns, represented by the regular expressions (ab)* and (ab*c)*. Second, the authors compute these patterns in parallel and then combine them using composition rules. The approach has two parts: the first mines patterns from execution traces, and the second combines these small patterns. Both the pattern mining and the composition are illustrated with existing automaton techniques. The execution traces are the different actions performed by users in SKYPE and VIBER. The results are general and precise; the approach reduces execution time and the loss of information.
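
For concreteness, a small pattern such as (ab)* can be written directly as a finite state automaton. The sketch below is an assumed encoding of its transition table as a dictionary, run over hypothetical trace strings:

```python
# DFA for the small pattern (ab)*, encoded as
# a transition table {state: {symbol: next_state}}.
AB_STAR = {
    "start": {"a": "after_a"},
    "after_a": {"b": "start"},
}
ACCEPTING = {"start"}  # zero or more complete "ab" repetitions

def accepts(dfa, accepting, trace):
    """Run the automaton over a trace; reject on a missing transition."""
    state = "start"
    for symbol in trace:
        state = dfa.get(state, {}).get(symbol)
        if state is None:
            return False
    return state in accepting

print(accepts(AB_STAR, ACCEPTING, "abab"))  # True
print(accepts(AB_STAR, ACCEPTING, "aba"))   # False
```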


Author(s):  
Ishak H. A. Meddah ◽  
Khaled Belkadi

MapReduce is a solution for processing large data sets: it allows data to be analyzed and processed by distributing the computation across a large set of machines. Process mining provides an important bridge between data mining and business process analysis; its techniques allow information to be extracted from event logs. First, the chapter mines small patterns from log traces; these patterns represent the execution traces of a business process. The authors use existing techniques: each pattern is represented by a finite state automaton, and the final model is the combination of only two types of patterns, represented by regular expressions. Second, the authors compute these patterns in parallel and then combine them using MapReduce. The approach has two parts: the map step, in which patterns are mined from execution traces, and the reduce step, in which these small patterns are combined. The results are promising; they show that the approach is scalable, general, and precise, and that it reduces execution time through the use of MapReduce.
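
One way to picture the mining of small patterns from a longer trace is to scan it for maximal segments matching each pattern shape. The sketch below is an illustrative assumption (the trace string is invented, and non-empty variants of the two patterns are used so that matches are meaningful):

```python
import re

# Hypothetical raw trace: a long sequence of logged activities.
TRACE = "ababcababbcab"

# Non-empty occurrences of the two small pattern types.
PATTERN_TYPES = {"(ab)+": r"(?:ab)+", "(ab*c)+": r"(?:ab*c)+"}

def mine_segments(trace):
    """Return, per pattern type, the maximal matching segments of the trace."""
    found = {}
    for name, rx in PATTERN_TYPES.items():
        found[name] = [m.group() for m in re.finditer(rx, trace)]
    return found

print(mine_segments(TRACE))
# {'(ab)+': ['abab', 'abab', 'ab'], '(ab*c)+': ['abc', 'abbc']}
```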


Author(s):  
Ishak H. A. Meddah ◽  
Khaled Belkadi

Process mining provides an important bridge between data mining and business process analysis; its techniques allow information to be extracted from event logs. In general, process mining proceeds in two steps: correlation definition or discovery, and then process inference or composition. First, the authors mine small patterns from the log traces of two applications; these patterns represent the execution traces of a business process. In this step, the authors use existing techniques. Each pattern is represented by a finite state automaton or its regular expression, and the final model is the combination of only two types of small patterns, represented by the regular expressions (ab)* and (ab*c)*. Second, the authors compute these patterns in parallel and then combine them using composition rules. The approach has two parts: the first discovers patterns from execution traces, and the second combines these small patterns. Both the pattern mining and the composition are illustrated with existing automaton techniques.
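
The abstract does not spell out the composition rules themselves; as an assumed illustration, two small patterns can be composed at the regular expression level, for example sequentially or as a repeated choice:

```python
import re

P1, P2 = r"(?:ab)*", r"(?:ab*c)*"

def compose_seq(r1, r2):
    """Sequential composition: the first pattern followed by the second."""
    return f"(?:{r1})(?:{r2})"

def compose_choice(r1, r2):
    """Choice composition: either pattern at each step, repeated."""
    return f"(?:(?:{r1})|(?:{r2}))*"

final_model = compose_seq(P1, P2)
print(re.fullmatch(final_model, "ababababbc") is not None)  # True
```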


Author(s):  
Ishak H. A. Meddah ◽  
Khaled Belkadi

Processing large volumes of data is proving difficult along several axes, but the arrival of the MapReduce framework offers a solution: it allows vast amounts of data to be analyzed and processed by distributing the computational work across a cluster of virtual servers running in a cloud, or across a large set of machines. Process mining, meanwhile, provides an important bridge between data mining and business process analysis; its techniques allow information to be extracted from event logs. In general, process mining proceeds in two steps: correlation definition or discovery, and process inference or composition. First, the authors mine small patterns from log traces; these patterns represent the execution traces recorded in the log file of a business process. In this step, they use existing techniques. Each pattern is represented by a finite state automaton or its regular expression, and the final model is the combination of only two types of small patterns, represented by the regular expressions (ab)* and (ab*c)*. Second, the authors compute these patterns in parallel and then combine them using the MapReduce framework. The approach has two parts: the map step, in which patterns are mined from execution traces, and the reduce step, in which these small patterns are combined. The authors' results are promising, showing that their approach is scalable, general, and precise, and that it reduces execution time through the use of the MapReduce framework.
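
A single-machine stand-in for the parallel computation, using Python's multiprocessing pool in place of a real MapReduce cluster (the trace data and function names are illustrative assumptions), could look like this:

```python
import re
from multiprocessing import Pool

PATTERNS = {"(ab)*": r"(?:ab)*", "(ab*c)*": r"(?:ab*c)*"}

def mine(trace):
    """Map phase: label one trace with every small pattern it matches."""
    return [(n, trace) for n, rx in PATTERNS.items() if re.fullmatch(rx, trace)]

def combine(results):
    """Reduce phase: merge per-trace results into one model."""
    model = {}
    for pairs in results:
        for name, trace in pairs:
            model.setdefault(name, []).append(trace)
    return model

if __name__ == "__main__":
    traces = ["abab", "abc", "abbc", "ab", "abbbcabc"]
    with Pool(processes=4) as pool:
        print(combine(pool.map(mine, traces)))
```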


Author(s):  
Ishak H.A. Meddah ◽  
Khaled Belkadi ◽  
Mohamed Amine Boudia

Hadoop MapReduce arrived to solve the problem of processing big data in parallel: with this framework the authors can analyze and process data at large scale. It is based on distributing the work, in two main steps (map and reduce), across a cluster or large set of machines. The authors apply the MapReduce framework to problems in the domain of process mining, which provides a bridge between data mining and business process analysis; this technique mines a wealth of information from process traces. In process mining there are two steps: correlation definition and process inference. The work first mines patterns, which are the workflows of the process, from execution traces; these patterns represent the work, or the history, of each part of the process. The small patterns are represented in this work by finite state automata or their regular expressions, and only two patterns are used in order to simplify the process; the general representation of the process is the combination of the small mined patterns. The patterns are represented by the regular expressions (ab)* and (ab*c)*. Second, the authors compute the patterns and combine them using the Hadoop MapReduce framework. This work has two general steps: the map step, in which small patterns or small models are mined from the business process, and the reduce step, in which the models are combined. The authors use the business processes of two web applications, SKYPE and VIBER. The general results show that the parallel distributed process using the Hadoop MapReduce framework is scalable and reduces execution time.
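
The execution traces here are sequences of user actions; as an assumed pre-processing step (the event records, field names, and action-to-symbol coding below are all hypothetical), raw log events can be grouped per session and encoded into the pattern alphabet before mining:

```python
from itertools import groupby

# Hypothetical event records: (session_id, timestamp, action).
EVENTS = [
    ("s1", 1, "open"), ("s1", 2, "call"), ("s2", 1, "open"),
    ("s1", 3, "open"), ("s1", 4, "call"), ("s2", 2, "message"), ("s2", 3, "close"),
]

# Assumed encoding of actions into the pattern alphabet.
CODE = {"open": "a", "call": "b", "message": "b", "close": "c"}

def build_traces(events):
    """Group events by session (in timestamp order) into encoded traces."""
    ordered = sorted(events, key=lambda e: (e[0], e[1]))
    return {
        sid: "".join(CODE[action] for _, _, action in group)
        for sid, group in groupby(ordered, key=lambda e: e[0])
    }

print(build_traces(EVENTS))  # {'s1': 'abab', 's2': 'abc'}
```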


2017 ◽  
Vol 9 (1) ◽  
pp. 49-60
Author(s):  
Ishak H.A. Meddah ◽  
Khaled Belkadi ◽  
Mohamed Amine Boudia

Hadoop MapReduce is one of the solutions for processing large and big data: with it the authors can analyze and process data by distributing the computation across a large set of machines. Process mining provides an important bridge between data mining and business process analysis; its techniques allow information to be mined from event logs. First, the work mines small patterns from log traces; these patterns are the workflows of the execution traces of a business process. The authors' work improves on existing techniques, which mine only one general workflow; here the workflows represent the general traces of two web applications. They use existing techniques: each pattern is represented by a finite state automaton, and the final model is the combination of only two types of patterns, represented by regular expressions. Second, the authors compute these patterns in parallel and then combine them using MapReduce. The approach has two parts: the map step, in which patterns are mined from execution traces, and the reduce step, in which these small patterns are combined. The results are promising; they show that the approach is scalable, general, and precise, and that it reduces execution time through the use of the Hadoop MapReduce framework.
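
On an actual Hadoop cluster the two phases can be run with Hadoop Streaming, where the mapper and reducer are stand-alone scripts that read stdin and emit tab-separated key-value lines. The pair below is an assumed, minimal illustration rather than the authors' code:

```python
# mapper.py -- reads one trace per line, emits "pattern<TAB>trace".
import re
import sys

PATTERNS = {"(ab)*": r"(?:ab)*", "(ab*c)*": r"(?:ab*c)*"}

for line in sys.stdin:
    trace = line.strip()
    for name, rx in PATTERNS.items():
        if trace and re.fullmatch(rx, trace):
            print(f"{name}\t{trace}")
```

```python
# reducer.py -- Hadoop sorts mapper output by key; group traces per pattern.
import sys

current, traces = None, []
for line in sys.stdin:
    key, trace = line.rstrip("\n").split("\t", 1)
    if key != current and current is not None:
        print(f"{current}\t{','.join(traces)}")
        traces = []
    current = key
    traces.append(trace)
if current is not None:
    print(f"{current}\t{','.join(traces)}")
```

With Hadoop Streaming, these two scripts would be wired together via the streaming jar's -mapper and -reducer options, with Hadoop performing the sort between the two phases.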


2021 ◽  
Vol 4 ◽  
Author(s):  
Rashid Zaman ◽  
Marwan Hassani ◽  
Boudewijn F. Van Dongen

In the context of process mining, event logs consist of process instances called cases. Conformance checking is a process mining task that inspects whether a log file conforms to an existing process model, additionally quantifying the conformance in an explainable manner. Online conformance checking processes streaming event logs, maintaining precise insights into the running cases and mitigating non-conformance, if any, in a timely manner. State-of-the-art online conformance checking approaches bound memory by either delimiting the storage of events per case or limiting the number of cases to a specific window width. The former technique still requires unbounded memory, as the number of cases to store is unlimited, while the latter forgets running, not yet concluded, cases in order to respect the limited window width. Consequently, the processing system may later encounter events that represent some intermediate activity as per the process model but whose relevant case has been forgotten; we refer to these as orphan events. The naïve approach to an orphan event is to either neglect its relevant case for conformance checking or treat it as an altogether new case. However, this might result in misleading process insights, for instance overestimated non-conformance. In order to bound memory yet effectively incorporate orphan events into processing, we propose a missing-prefix imputation approach for such orphan events. Our approach utilizes the existing process model to impute the missing prefix. Furthermore, we leverage case storage management to increase the accuracy of the prefix prediction: we propose a systematic forgetting mechanism that distinguishes and forgets the cases that can be reliably regenerated as a prefix upon receipt of their future orphan events. We evaluate the efficacy of our proposed approach through multiple experiments with synthetic and three real event logs while simulating a streaming setting. Our approach achieves considerably more realistic conformance statistics than the state of the art while requiring the same storage.
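
The imputation idea can be pictured on a simple automaton-style process model: when an orphan event arrives for a forgotten case, search the model for a shortest activity sequence leading from the initial state to a state that enables the orphan's activity, and use that sequence as the imputed prefix. The model, activity names, and breadth-first search below are an illustrative reconstruction, not the paper's implementation:

```python
from collections import deque

# Hypothetical process model as a DFA: state -> {activity: next_state}.
MODEL = {
    "start": {"register": "s1"},
    "s1": {"check": "s2", "skip": "s3"},
    "s2": {"approve": "s3"},
    "s3": {"notify": "end"},
}

def impute_prefix(model, initial, orphan_activity):
    """BFS for a shortest activity sequence after which the orphan
    activity is enabled; returns the imputed prefix or None."""
    queue = deque([(initial, [])])
    seen = {initial}
    while queue:
        state, prefix = queue.popleft()
        if orphan_activity in model.get(state, {}):
            return prefix
        for activity, nxt in model.get(state, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, prefix + [activity]))
    return None

print(impute_prefix(MODEL, "start", "notify"))  # ['register', 'skip']
```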


Author(s):  
Weidong Yang ◽  
Hao Zhu

Chapter 5 presents a novel approach for processing complex twig patterns with OR-predicates and AND-predicates over XML document streams, in which a twig pattern is represented as a query tree. Its OR-predicates and AND-predicates are represented as a separate abstract syntax tree associated with the branch node, and all the twig patterns are combined into a single prefix query tree that represents the queries by sharing their common prefixes. Consequently, all the twig patterns are evaluated in a single, document-order pass over the input document stream, avoiding the translation of the set of twig patterns into a finite state automaton. Section 1 introduces the background of this issue. Section 2 discusses the representation of a complex twig pattern as a query tree, how to combine a set of twig patterns into a single query tree, how to match multiple twig patterns over the incoming XML document, and possible optimizations of computing logical AND/OR predicates. Section 3 gives the architecture of an XML stream processing system named LeoXSQ. Section 4 presents the conducted experiments. Section 5 discusses related work. Section 6 summarizes the chapter.
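
The prefix-sharing idea can be sketched independently of the XML details: insert each query's steps into a single trie so that queries with a common prefix share nodes and are matched together in one pass. The path syntax and queries below are illustrative assumptions (real twig patterns also carry branch predicates, which this sketch omits):

```python
# Combine path queries into one prefix query tree (a trie keyed on steps).
QUERIES = ["/book/title", "/book/author/name", "/book/author/email"]

def build_prefix_tree(queries):
    root = {}
    for q in queries:
        node = root
        for step in q.strip("/").split("/"):
            node = node.setdefault(step, {})
        node.setdefault("$match", []).append(q)  # mark accepting node
    return root

tree = build_prefix_tree(QUERIES)
# The 'book' (and, below it, 'author') nodes are shared by the queries:
print(tree["book"].keys())  # dict_keys(['title', 'author'])
```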


2009 ◽  
Vol 20 (06) ◽  
pp. 1069-1086
Author(s):  
WIKUS COETSER ◽  
DERRICK G. KOURIE ◽  
BRUCE W. WATSON

The consequences of regular expression hashing as a means of finite state automaton reduction are explored, based on variations of Brzozowski's algorithm. In this approach, each hash collision results in the merging of the automaton's states, and it is subsequently shown that a super-automaton will always be constructed, regardless of the hash function used. Since a direct adaptation of the classical Brzozowski algorithm leads to a non-deterministic super-automaton, a new algorithm is put forward for constructing a deterministic FA. Approaches are proposed for measuring the quality of a hash function. These ideas are empirically tested on a large sample of relatively small regular expressions and their associated automata, as well as on a small sample of relatively large regular expressions. Differences in the quality of the tested hash functions are observed. Possible reasons for this are mentioned, but further empirical work is required to investigate the matter.
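
The construction can be sketched with Brzozowski derivatives: each automaton state is a derivative of the regular expression, and states are identified by a hash of the derivative rather than by equality, so collisions merge states. The sketch below is a simplified, assumed illustration; in particular it keeps one representative per hash bucket, whereas the paper's merging always yields a super-automaton:

```python
EMPTY, EPS = ("empty",), ("eps",)

def nullable(r):
    tag = r[0]
    if tag == "eps": return True
    if tag in ("empty", "chr"): return False
    if tag == "alt": return nullable(r[1]) or nullable(r[2])
    if tag == "cat": return nullable(r[1]) and nullable(r[2])
    return True  # star

def deriv(r, c):
    """Brzozowski derivative of regex r with respect to symbol c."""
    tag = r[0]
    if tag in ("empty", "eps"): return EMPTY
    if tag == "chr": return EPS if r[1] == c else EMPTY
    if tag == "alt": return ("alt", deriv(r[1], c), deriv(r[2], c))
    if tag == "cat":
        d = ("cat", deriv(r[1], c), r[2])
        return ("alt", d, deriv(r[2], c)) if nullable(r[1]) else d
    return ("cat", deriv(r[1], c), r)  # star

def build(r, alphabet, h):
    """Hash-collided derivative automaton: states are hash buckets."""
    start = h(r)
    states = {start: r}  # one representative regex per bucket
    trans, work = {}, [start]
    while work:
        s = work.pop()
        for c in alphabet:
            d = deriv(states[s], c)
            t = h(d)
            trans[(s, c)] = t
            if t not in states:
                states[t] = d
                work.append(t)
    return start, states, trans

# (ab)* as an AST; a deliberately coarse hash forces state merging.
AB_STAR = ("star", ("cat", ("chr", "a"), ("chr", "b")))
print(build(AB_STAR, "ab", lambda r: hash(r) % 4)[2])
```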


2015 ◽  
Vol 27 (5) ◽  
pp. 807-826
Author(s):  
GÉRARD HUET ◽  
BENOÎT RAZET

We propose a relational computing paradigm based on Eilenberg machines, an effective version of Eilenberg's X-machines suitable for general non-deterministic computation. An Eilenberg machine generalizes a finite-state automaton, seen as its control component, with a computation component over a data domain specified as a relational algebra, its actions being interpreted as binary relations over the data domain. We show various strategies for the sequential simulation of our relational machines, using variants of the reactive engine. In the particular case of finite machines, we show that bottom-up search yields an efficient complete simulator. Relational machines may be composed in a modular fashion, since the atomic actions of one machine can be mapped to the characteristic relations of other relational machines acting as its parameters. The control components of machines can be compiled from regular expressions; several such translations have been proposed in the literature, which we briefly survey.
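
An Eilenberg machine can be sketched as a finite control automaton whose action labels are interpreted as binary relations over a data domain; a breadth-first variant of the reactive engine then explores (control state, datum) configurations. Everything below, including the example relations and the step bound, is an assumed illustration:

```python
from collections import deque

# Actions interpreted as binary relations over the data domain (integers):
# each maps a datum to the set of data it relates to.
RELATIONS = {
    "incr": lambda x: {x + 1},
    "dup":  lambda x: {x, 2 * x},
}

# Control component: finite automaton with relation-labelled transitions.
CONTROL = {("q0", "incr"): "q1", ("q1", "dup"): "q0"}
FINAL = {"q0"}

def run(initial_datum, steps):
    """Reactive engine: BFS over (control state, datum) configurations."""
    results = set()
    queue = deque([("q0", initial_datum, 0)])
    while queue:
        state, datum, n = queue.popleft()
        if state in FINAL:
            results.add(datum)
        if n == steps:
            continue
        for (src, action), dst in CONTROL.items():
            if src == state:
                for nxt in RELATIONS[action](datum):
                    queue.append((dst, nxt, n + 1))
    return results

print(run(1, 4))  # data observable at q0 within four steps
```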

