Cleaning Antipatterns in an SQL Query Log

Author(s):  
Natalia Arzamasova ◽  
Martin Schaler ◽  
Klemens Bohm
2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Rajesh Kumar Dhanaraj ◽  
Vinothsaravanan Ramakrishnan ◽  
M. Poongodi ◽  
Lalitha Krishnasamy ◽  
Mounir Hamdi ◽  
...  

In the ongoing crisis, people rely heavily on mobile phones for most activities, but query analysis and mobile data security remain major issues. Several research efforts have addressed efficient antipattern detection to reduce the complexity of query analysis; however, detection accuracy needs more attention. In addition, earlier work applied a clustering process to group similar antipatterns and eliminate design errors. To address these issues and improve antipattern detection accuracy while minimizing time and false-positive rate, this work proposes the Random Forest Bagging X-means SQL Query Clustering (RFBXSQLQC) technique. Patterns (queries) are first gathered from the input SQL query log, and bootstrap samples are created. Then, for each pattern, several weak clusters are constructed via X-means clustering and serve as the weak learners. During this process, the input patterns are categorized into different clusters. A similarity measure based on the Bayesian information criterion evaluates the similarity between the patterns and the cluster weight. Based on the similarity value, each pattern is assigned to either the relevant or the irrelevant group. The weak learner results are then aggregated by majority vote to form strong clusters in minimal time. Experiments evaluate the RFBXSQLQC technique on the IIT Bombay dataset using antipattern detection accuracy, time complexity, false-positive rate, and computational overhead as metrics, for varying numbers of queries. The results show that RFBXSQLQC outperforms the existing algorithms: pattern detection accuracy improves by 19%, while time complexity, false-positive rate, and computational overhead are reduced by 34%, 64%, and 31%, respectively.
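The pipeline described in the abstract above (bootstrap samples, BIC-guided weak clusterings, majority-vote aggregation) can be illustrated with a minimal Python sketch. It is not the authors' RFBXSQLQC implementation, and several parts are assumptions. Queries are represented by hypothetical two-number feature vectors. X-means is approximated by running plain k-means for a few values of k and keeping the one with the lowest simplified BIC-style score. The "vote" links two queries when more than half of the weak learners put them in the same cluster. All function names are illustrative.

```python
import math
import random

def dist2(p, q):
    # Squared Euclidean distance between two feature vectors.
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centers):
    # Index of the center closest to p.
    return min(range(len(centers)), key=lambda i: dist2(p, centers[i]))

def kmeans(points, k, rng, iters=20):
    # Plain Lloyd iterations; an empty cluster keeps its previous center.
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def bic_score(points, centers):
    # Simplified BIC-style score (assumption): fit term plus a penalty
    # that grows with the number of centers. Lower is better.
    n, d = len(points), len(points[0])
    rss = sum(dist2(p, centers[nearest(p, centers)]) for p in points)
    return n * math.log(rss / n + 1e-9) + len(centers) * d * math.log(n)

def weak_clustering(sample, rng, max_k=3):
    # Stand-in for X-means: try k = 1..max_k and keep the best score.
    candidates = [kmeans(sample, k, rng) for k in range(1, max_k + 1)]
    return min(candidates, key=lambda c: bic_score(sample, c))

def bagged_clusters(points, n_learners=15, seed=0):
    rng = random.Random(seed)
    n = len(points)
    votes = [[0] * n for _ in range(n)]  # co-clustering counts per pair
    for _ in range(n_learners):
        sample = [rng.choice(points) for _ in range(n)]  # bootstrap sample
        centers = weak_clustering(sample, rng)
        labels = [nearest(p, centers) for p in points]
        for i in range(n):
            for j in range(n):
                votes[i][j] += labels[i] == labels[j]
    # Majority vote: link pairs that most learners grouped together,
    # then read the strong clusters off as connected components.
    final = [-1] * n
    next_label = 0
    for start in range(n):
        if final[start] != -1:
            continue
        final[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if final[j] == -1 and 2 * votes[i][j] > n_learners:
                    final[j] = next_label
                    stack.append(j)
        next_label += 1
    return final

# Hypothetical feature vectors for two well-separated groups of queries.
group_a = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (0.4, 0.4), (0.2, 0.1), (0.6, 0.5)]
group_b = [(10.0, 10.0), (10.4, 9.8), (9.7, 10.3), (10.2, 10.1), (9.9, 9.6), (10.5, 10.4)]
points = group_a + group_b
labels = bagged_clusters(points)
```

Voting on pairs rather than on raw cluster labels avoids having to match cluster indices across weak learners, since index 0 in one learner need not correspond to index 0 in another.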


2018 ◽  
Vol 30 (3) ◽  
pp. 421-434 ◽  
Author(s):  
Natalia Arzamasova ◽  
Martin Schaler ◽  
Klemens Bohm

2019 ◽  
Vol 48 (1) ◽  
pp. 6-13 ◽  
Author(s):  
Wim Martens ◽  
Tina Trautner

2021 ◽  
Vol 1873 (1) ◽  
pp. 012065
Author(s):  
Huiran Zhang ◽  
Cheng Zhang ◽  
Rui Hu ◽  
Xi Liu ◽  
Dongbo Dai

2009 ◽  
Vol 43 (2) ◽  
pp. 71-77 ◽  
Author(s):  
Paul Clough ◽  
Bettina Berendt
