pruning techniques
Recently Published Documents


TOTAL DOCUMENTS

123
(FIVE YEARS 32)

H-INDEX

13
(FIVE YEARS 1)

Author(s):  
Nesma Youssef ◽  
Hatem Abdulkader ◽  
Amira Abdelwahab

Sequential rule mining is one of the most common data mining techniques. It aims to find desired rules in large sequence databases, identifying the essential information that helps acquire knowledge from large search spaces and selecting interesting rules from sequence databases. The key challenge is to avoid wasting time, which is particularly difficult in large sequence databases. This paper studies mining rules from two representations of sequential patterns to obtain compact databases without affecting the final result. In addition, it executes a parallel approach that utilizes a multi-core processor architecture for mining non-redundant sequential rules, and it applies pruning techniques to enhance the efficiency of the generated rules. The proposed algorithm was evaluated by comparing it with another non-redundant sequential rule algorithm called Non-Redundant with Dynamic Bit Vector (NRD-DBV). Both algorithms were run on four real datasets with different characteristics. Our experiments show the performance of the proposed algorithm in terms of execution time and computational cost. It achieves the highest efficiency, especially for large datasets and low values of minimum support, taking approximately half the time consumed by the compared algorithm.


2021 ◽  
Author(s):  
Sebastian Schmidl ◽  
Thorsten Papenbrock

Abstract: Bidirectional order dependencies (bODs) capture order relationships between lists of attributes in a relational table. They can express that, for example, sorting books by publication date in ascending order also sorts them by age in descending order. The knowledge about order relationships is useful for many data management tasks, such as query optimization, data cleaning, or consistency checking. Because the bODs of a specific dataset are usually not explicitly given, they need to be discovered. The discovery of all minimal bODs (in set-based canonical form) is a task with exponential complexity in the number of attributes, though, which is why existing bOD discovery algorithms cannot process datasets of practically relevant size in a reasonable time. In this paper, we propose the distributed bOD discovery algorithm DISTOD, whose execution time scales with the available hardware. DISTOD is a scalable, robust, and elastic bOD discovery approach that combines efficient pruning techniques for bOD candidates in set-based canonical form with a novel, reactive, and distributed search strategy. Our evaluation on various datasets shows that DISTOD outperforms both single-threaded and distributed state-of-the-art bOD discovery algorithms by up to orders of magnitude; it can, in particular, process much larger datasets.
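The books example in this abstract can be made concrete with a small check. The following is a minimal sketch of validating one candidate dependency by brute force; it is not DISTOD and performs no discovery or pruning. The function name `holds_desc_order_dependency` and the sample table (ages computed relative to 2021) are assumptions made for illustration.

```python
from itertools import combinations


def holds_desc_order_dependency(rows, lhs, rhs):
    """Return True if sorting `rows` by `lhs` ascending also sorts them by `rhs` descending.

    Naive O(n^2) pairwise check of a single candidate (illustrative only).
    """
    for a, b in combinations(rows, 2):
        if a[lhs] == b[lhs]:
            # Rows tied on lhs must also be tied on rhs.
            if a[rhs] != b[rhs]:
                return False
        elif a[lhs] < b[lhs]:
            # A smaller lhs value must not come with a smaller rhs value.
            if a[rhs] < b[rhs]:
                return False
        else:
            # A larger lhs value must not come with a larger rhs value.
            if a[rhs] > b[rhs]:
                return False
    return True


# Hypothetical sample table.
books = [
    {"published": 1999, "age": 22, "pages": 300},
    {"published": 2010, "age": 11, "pages": 150},
    {"published": 2010, "age": 11, "pages": 400},
    {"published": 2020, "age": 1, "pages": 200},
]

print(holds_desc_order_dependency(books, "published", "age"))    # True
print(holds_desc_order_dependency(books, "published", "pages"))  # False
```

Checking every attribute-list pair this way is what becomes infeasible as the number of attributes grows, which is the cost the abstract says candidate pruning and distribution are meant to reduce.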


Author(s):  
Alexander Tuisov ◽  
Michael Katz

Heuristic search is among the best performing approaches to classical satisficing planning, with its performance heavily relying on informative and fast heuristics, as well as search-boosting and pruning techniques. While both heuristics and pruning techniques have gained much attention recently, search-boosting techniques in general, and preferred operators in particular, have received less attention in the last decade. Our work aims to bring attention back to preferred operators research by introducing a preferred operators pruning technique based on the concept of novelty. Continuing the research on novelty with respect to an underlying heuristic, we present the definition of preferred operators for such novelty heuristics. For that, we extend the previously defined concepts to operators, allowing us to reason about the novelty of the preferred operators. Our experimental evaluation shows the practical benefit of our suggested approach compared to the currently used methods.


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Hugo Masson ◽  
Amran Bhuiyan ◽  
Le Thanh Nguyen-Meidine ◽  
Mehrsan Javan ◽  
Parthipan Siva ◽  
...  

Abstract: Recent years have witnessed a substantial increase in the deep learning (DL) architectures proposed for visual recognition tasks like person re-identification, where individuals must be recognized over multiple distributed cameras. Although these architectures have greatly improved the state-of-the-art accuracy, the computational complexity of the convolutional neural networks (CNNs) commonly used for feature extraction remains an issue, hindering their deployment on platforms with limited resources, or in applications with real-time constraints. There is an obvious advantage to accelerating and compressing DL models without significantly decreasing their accuracy. However, the source (pruning) domain differs from operational (target) domains, and the domain shift between image data captured with different non-overlapping camera viewpoints leads to lower recognition accuracy. In this paper, we investigate the prunability of these architectures under different design scenarios. This paper first revisits pruning techniques that are suitable for reducing the computational complexity of deep CNNs applied to person re-identification. Then, these techniques are analyzed according to their pruning criteria and strategy, and according to different scenarios for exploiting pruning methods to fine-tune networks to target domains. Experimental results obtained using DL models with ResNet feature extractors and multiple benchmark re-identification datasets indicate that pruning can considerably reduce network complexity while maintaining a high level of accuracy. In scenarios where pruning is performed with large pretraining or fine-tuning datasets, the number of FLOPs required by ResNet architectures is reduced by half, while maintaining a comparable rank-1 accuracy (within 1% of the original model). Pruning while training larger CNNs can also provide significantly better performance than fine-tuning smaller ones.
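To illustrate what a pruning criterion can look like, here is a minimal sketch of one widely used criterion: ranking convolutional filters by the L1 norm of their weights and keeping the largest ones. The abstract does not say which criteria the paper evaluates, so the choice of L1 norm, the function name `prune_filters_by_l1`, and the toy weights are all assumptions. Filters are represented as flat lists of weights to keep the example dependency-free.

```python
def prune_filters_by_l1(filters, keep_ratio=0.5):
    """Keep the `keep_ratio` fraction of filters with the largest L1 norm.

    `filters` is a list of flat weight lists, one per filter (an assumed toy
    representation). Returns the kept indices and the kept filters.
    """
    n_keep = max(1, round(len(filters) * keep_ratio))
    # L1 norm of each filter: sum of absolute weight values.
    norms = [sum(abs(w) for w in f) for f in filters]
    # Rank filter indices from largest to smallest norm.
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    # Restore the original ordering among the filters that survive.
    kept = sorted(ranked[:n_keep])
    return kept, [filters[i] for i in kept]


# Hypothetical layer with four 3-weight filters.
layer = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],
    [0.5, -0.4, 0.3],
    [0.05, 0.0, -0.02],
]

kept_idx, kept_filters = prune_filters_by_l1(layer, keep_ratio=0.5)
print(kept_idx)  # [0, 2]
```

In a real network, removing a filter also removes the matching input channel of the next layer, and the pruned model is usually fine-tuned afterwards; this sketch covers only the selection step.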


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Mostafa El Habib Daho ◽  
Nesma Settouti ◽  
Mohammed El Amine Bechar ◽  
Amina Boublenza ◽  
Mohammed Amine Chikh

Purpose: Ensemble methods have been widely used in the field of pattern recognition due to the difficulty of finding a single classifier that performs well on a wide variety of problems. Despite the effectiveness of these techniques, studies have shown that ensemble methods generate a large number of hypotheses that, in most cases, contain redundant classifiers. Several state-of-the-art works attempt to reduce the set of hypotheses without affecting performance.

Design/methodology/approach: In this work, the authors propose a pruning method that takes into consideration the correlation between classifiers and classes, and between each classifier and the rest of the set. The authors used the random forest algorithm as a tree-based ensemble classifier, and the pruning was performed with a technique inspired by the CFS (correlation feature selection) algorithm.

Findings: The proposed method, CES (correlation-based Ensemble Selection), was evaluated on ten datasets from the UCI machine learning repository, and its performance was compared to six ensemble pruning techniques. The results showed that the proposed pruning method selects a small ensemble in less time while improving classification rates compared to the state-of-the-art methods.

Originality/value: CES is a new ordering-based method that uses the CFS algorithm. CES selects, in a short time, a small sub-ensemble that outperforms results obtained from the whole forest and the other state-of-the-art techniques used in this study.
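The idea of balancing classifier-class correlation against classifier-classifier correlation can be sketched with the standard CFS merit heuristic, merit = k·r_cl / sqrt(k + k(k−1)·r_cc), applied to classifier predictions instead of features. This is not the authors' CES implementation, whose details the abstract does not give; the use of Pearson correlation on binary predictions, the greedy forward search, and all names below are assumptions.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx and vy else 0.0


def merit(subset, preds, labels):
    """CFS-style merit: reward correlation with labels, penalize redundancy among members."""
    k = len(subset)
    r_cl = mean(abs(pearson(preds[i], labels)) for i in subset)
    pairs = [(i, j) for n, i in enumerate(subset) for j in subset[n + 1:]]
    r_cc = mean(abs(pearson(preds[i], preds[j])) for i, j in pairs) if pairs else 0.0
    return k * r_cl / sqrt(k + k * (k - 1) * r_cc)


def greedy_select(preds, labels):
    """Forward selection: add the classifier that most improves merit; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(preds)))
    while remaining:
        score, pick = max((merit(selected + [i], preds, labels), i) for i in remaining)
        if score <= best:
            break
        selected.append(pick)
        remaining.remove(pick)
        best = score
    return selected


# Hypothetical validation labels and per-tree predictions.
labels = [0, 1, 0, 1, 1, 0]
tree_preds = [
    [0, 1, 0, 1, 1, 1],  # tree 0: one mistake
    [0, 1, 0, 1, 1, 1],  # tree 1: exact duplicate of tree 0 (redundant)
    [1, 1, 0, 1, 1, 0],  # tree 2: one mistake, on a different sample
]

selected = greedy_select(tree_preds, labels)
print(sorted(selected))
```

On this toy data the search keeps tree 2 together with only one of the two identical trees, since adding the duplicate raises the inter-classifier correlation and lowers the merit.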


2021 ◽  
pp. 1-1
Author(s):  
Shinsuke Fujisawa ◽  
Fatih Yaman ◽  
Hussam G. Batshon ◽  
Masaaki Tanio ◽  
Naoto Ishii ◽  
...  
