KeyFinder: An Efficient Minimal Keys Finding Algorithm For Relational Databases

In relational databases, it is essential to know all minimal keys since the concept of database normalization is based on keys and functional dependencies of a relation schema. Existing algorithms for determining keys or computing the closure of arbitrary sets of attributes are generally time-consuming. In this paper we present an efficient algorithm, called KeyFinder, for solving the key-finding problem. We also propose a more direct method for computing the closure of a set of attributes. KeyFinder is based on a powerful proof procedure for finding keys called tableaux. Experimental results show that KeyFinder outperforms its predecessors in terms of search space and execution time.
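To make the two problems named in the abstract concrete, here is a minimal Python sketch of attribute closure and minimal-key enumeration. It is a textbook fixed-point closure plus a brute-force search, not the tableaux-based KeyFinder procedure, which the abstract does not describe in detail. The schema `R(A, B, C, D)`, the dependencies in `FDS`, and the function names are all hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the dependency if its left side is covered and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def minimal_keys(schema, fds):
    """Enumerate all minimal keys by increasing size (exponential brute force)."""
    schema = frozenset(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for cand in combinations(sorted(schema), size):
            cand = frozenset(cand)
            if any(k <= cand for k in keys):
                continue  # contains a smaller key, so it cannot be minimal
            if closure(cand, fds) == schema:
                keys.append(cand)
    return keys


# Hypothetical schema R(A, B, C, D) with A -> B, B -> C, CD -> A.
FDS = [
    (frozenset("A"), frozenset("B")),
    (frozenset("B"), frozenset("C")),
    (frozenset("CD"), frozenset("A")),
]

if __name__ == "__main__":
    print(sorted(closure({"A"}, FDS)))
    print([sorted(k) for k in minimal_keys("ABCD", FDS)])
```

The brute-force search examines up to 2^n attribute subsets, which illustrates why the abstract calls existing approaches time-consuming and why reducing the search space matters.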


A Method for Normalization of Relation Schema Based on Data to Abide by the Third Normal Form

WSEAS TRANSACTIONS ON MATHEMATICS ◽

10.37394/23206.2020.19.20 ◽

2020 ◽

Vol 19 ◽

Keyword(s):

Normal Form ◽

Relational Databases ◽

Normal Forms ◽

Applied Mathematics ◽

Database Management System ◽

Functional Dependencies ◽

Full Understanding ◽

Data Bases ◽

The Third ◽

Relation Schema

Databases play an important role in applied mathematics. Normalization of relational databases is important for avoiding the anomalies that arise in relations that are not in third normal form. Normalization can be difficult, however, because database designers may not fully understand the domain of each attribute in the relation schema, or may not fully understand the concept of normalization. This paper presents an efficient method that uses the data stored in relations to check whether further normalization may be needed, based on possible functional dependencies between the attributes of the relations. By checking possible functional dependencies, database designers can determine whether further normalization is needed and may improve the structure of the relation schemas. Experiments were performed on an example relational database found in the MySQL tutorial materials (MySQL being a representative database management system), and the experiments showed good results.
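The idea of checking possible functional dependencies against stored data can be sketched as follows. This is an illustrative check under simple assumptions (rows as Python dictionaries, single-attribute left-hand sides), not the method of the paper. The sample rows in `ROWS` and the function names are hypothetical.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every pair of rows agreeing on `lhs` also agrees on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value; any other value violates the FD.
        if seen.setdefault(key, val) != val:
            return False
    return True


def candidate_fds(rows, attributes):
    """List single-attribute dependencies X -> Y that the stored rows do not contradict."""
    return [
        (x, y)
        for x in attributes
        for y in attributes
        if x != y and fd_holds(rows, [x], [y])
    ]


# Hypothetical sample data.
ROWS = [
    {"order_id": 1, "customer": "ann", "city": "Oslo"},
    {"order_id": 2, "customer": "ann", "city": "Oslo"},
    {"order_id": 3, "customer": "bob", "city": "Rome"},
]

if __name__ == "__main__":
    for x, y in candidate_fds(ROWS, ["order_id", "customer", "city"]):
        print(f"{x} -> {y}")
```

A dependency that holds in the data is only a candidate, since future rows may contradict it. In this sample, `customer -> city` holds between two non-key attributes, which is the kind of possible transitive dependency that would prompt a designer to consider further normalization.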


Lightweight Blockchain Processing. Case Study: Scanned Document Tracking on Tezos Blockchain

Applied Sciences ◽

10.3390/app11157169 ◽

2021 ◽

Vol 11 (15) ◽

pp. 7169

Author(s):

Mohamed Allouche ◽

Tarek Frikha ◽

Mihai Mitrea ◽

Gérard Memmi ◽

Faten Chaabane

Keyword(s):

Load Balancing ◽

Relative Error ◽

Execution Time ◽

General Purpose ◽

Experimental Results ◽

Raspberry Pi ◽

Embedded Platform ◽

Memory Resources ◽

Processing Solution

To bridge the current gap between the Blockchain expectancies and their intensive computation constraints, the present paper advances a lightweight processing solution, based on a load-balancing architecture, compatible with the lightweight/embedding processing paradigms. In this way, the execution of complex operations is securely delegated to an off-chain general-purpose computing machine while the intimate Blockchain operations are kept on-chain. The illustrations correspond to an on-chain Tezos configuration and to a multiprocessor ARM embedded platform (integrated into a Raspberry Pi). The performances are assessed in terms of security, execution time, and CPU consumption when achieving a visual document fingerprint task. It is thus demonstrated that the advanced solution makes it possible for a computing intensive application to be deployed under severely constrained computation and memory resources, as set by a Raspberry Pi 3. The experimental results show that up to nine Tezos nodes can be deployed on a single Raspberry Pi 3 and that the limitation is not derived from the memory but from the computation resources. The execution time with a limited number of fingerprints is 40% higher than using a classical PC solution (value computed with 95% relative error lower than 5%).


Performance Analysis and Optimization Techniques for Oracle Relational Databases

Cybernetics and Information Technologies ◽

10.2478/cait-2019-0019 ◽

2019 ◽

Vol 19 (2) ◽

pp. 117-132

Author(s):

Fernando Almeida ◽

Pedro Silva ◽

Fernando Araújo

Keyword(s):

Execution Time ◽

Relational Databases ◽

Database System ◽

Optimization Techniques ◽

Management Systems ◽

System Response ◽

Database Access ◽

Data Manipulation ◽

System Response Time ◽

Analyze Data

Databases provide an efficient way to store, retrieve and analyze data. Oracle relational database is one of the most popular database management systems and is widely used across a variety of industries and businesses. Therefore, it is important to guarantee that database access and data manipulation are optimized to reduce database system response time. This paper analyzes the performance of the main optimization techniques (Forall, Returning, and Bulk Collect) that can be adopted for Oracle relational databases. The results show that adopting the Forall and Bulk Collect approaches brings significant benefits in terms of execution time. Furthermore, the growth rate of the average execution time is lower for Bulk Collect than for Forall. However, adopting the Returning approach does not bring statistically significant benefits.


Research on Data Mining Optimization and Security Based on MapReduce

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.631-632.1053 ◽

2014 ◽

Vol 631-632 ◽

pp. 1053-1056

Author(s):

Hui Xia

Keyword(s):

Data Mining ◽

Execution Time ◽

Cluster Computing ◽

Limited Resource ◽

Experimental Results ◽

Computing Environment ◽

Cluster Systems ◽

National Education ◽

Distributed Cluster ◽

Data Optimization

The paper addresses the issue of limited resources for data optimization with respect to the efficiency, reliability, scalability and security of data in distributed cluster systems with huge datasets. The study’s experimental results indicate that the MapReduce tool developed improves data optimization. The system exhibits poor speedup with smaller datasets, but reasonable speedup is achieved with datasets large enough to match the number of computing nodes, reducing the execution time by 30% compared to normal data mining and processing. The MapReduce tool is able to handle data growth, especially with a larger number of computing nodes. Scaleup grows gracefully as the data and the number of computing nodes increase. Security of data is guaranteed at all computing nodes since data is replicated at various nodes on the cluster system, which also makes it reliable. Our implementation of MapReduce runs on the distributed cluster computing environment of a national education web portal and is highly scalable.


Comparison of Linguistic Summaries and Fuzzy Functional Dependencies Related to Data Mining

Advances in Data Mining and Database Management - Biologically-Inspired Techniques for Knowledge Discovery and Data Mining ◽

10.4018/978-1-4666-6078-6.ch008 ◽

2014 ◽

pp. 174-203 ◽

Cited By ~ 4

Author(s):

Miroslav Hudec ◽

Miljan Vučetić ◽

Mirko Vujošević

Keyword(s):

Data Mining ◽

Fuzzy Logic ◽

Relational Databases ◽

Missing Values ◽

Expert Knowledge ◽

Real Data ◽

Research Area ◽

Functional Dependencies ◽

Useful Knowledge ◽

Important Research Area

Data mining methods based on fuzzy logic have been developed recently and have become an increasingly important research area. In this chapter, the authors examine possibilities for discovering potentially useful knowledge from relational databases by integrating fuzzy functional dependencies and linguistic summaries. Both methods use fuzzy logic tools for data analysis and for the acquisition and representation of expert knowledge. Fuzzy functional dependencies can detect whether a dependency between two examined attributes exists across the whole database. If a dependency exists only between parts of the examined attributes' domains, fuzzy functional dependencies cannot detect its character. Linguistic summaries are a convenient method for revealing this kind of dependency. Using fuzzy functional dependencies and linguistic summaries in a complementary way could mine valuable information from relational databases. Mining the intensities of dependencies between database attributes could support decision making, reduce the number of attributes in databases, and estimate missing values. The proposed approach is evaluated with case studies using real data from official statistics. Strengths and weaknesses of the described methods are discussed. At the end of the chapter, topics for further research activities are outlined.


Automatic Partitioning of Large Scale Simulation in Grid Computing for Run Time Reduction

Innovations in Information Systems for Business Functionality and Operations Management ◽

10.4018/978-1-4666-0933-4.ch014 ◽

2012 ◽

pp. 225-252

Author(s):

Nurcin Celik ◽

Esfandyar Mazhari ◽

John Canby ◽

Omid Kazemi ◽

Parag Sarfare ◽

...

Keyword(s):

Execution Time ◽

Large Scale ◽

Time Synchronization ◽

Computational Grid ◽

Experimental Results ◽

Time Interval ◽

Computational Power ◽

Large Scale Systems ◽

Large Scale Simulations ◽

Reduce Execution Time

Simulating large-scale systems usually requires extensive computational power and lengthy execution times. The goal of this research is to reduce the execution time of large-scale simulations without sacrificing their accuracy by automatically partitioning a monolithic model into multiple pieces and executing them in a distributed computing environment. While this partitioning allows the required computational power to be distributed across multiple computers, it creates a new challenge of synchronizing the partitioned models. In this article, a partitioning methodology based on a modified Prim’s algorithm is proposed to minimize the overall simulation execution time considering 1) internal computation in each of the partitioned models and 2) time synchronization between them. In addition, the authors seek the most advantageous number of partitioned models for the monolithic model by evaluating the tradeoff between reduced computation and increased time synchronization requirements. Epoch-based synchronization is employed to synchronize the logical times of the partitioned simulations, where an appropriate time interval is determined from off-line simulation analyses. A computational grid framework is employed to execute the simulations partitioned by the proposed methodology. The experimental results reveal that the proposed approach reduces simulation execution time significantly while maintaining accuracy compared with the monolithic simulation execution approach.


Improving the K-Means Clustering Algorithm Oriented to Big Data Environments

Handbook of Research on Natural Language Processing and Smart Service Systems - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-7998-4730-4.ch013 ◽

2021 ◽

pp. 289-308

Author(s):

Joaquín Pérez Ortega ◽

Nelva Nely Almanza Ortega ◽

Andrea Vega Villalobos ◽

Marco A. Aguirre L. ◽

Crispín Zavala Díaz ◽

...

Keyword(s):

Big Data ◽

Text Mining ◽

Large Volume ◽

Execution Time ◽

Clustering Algorithm ◽

Efficient Algorithms ◽

Experimental Results ◽

Digital Format ◽

Basic Approaches ◽

Previous Iteration

In recent years, the amount of natural-language text in digital format has increased impressively. To obtain useful information from a large volume of data, new specialized techniques and efficient algorithms are required. Text mining consists of extracting meaningful patterns from texts; one of the basic approaches is clustering. The most used clustering algorithm is k-means. This chapter proposes an improvement of the k-means algorithm in the convergence step: the process stops whenever the number of objects that change their assigned cluster in the current iteration is greater than the number that changed in the previous iteration. Experimental results showed a reduction in execution time of up to 93%. It is remarkable that, in general, better results are obtained as the volume of text increases, particularly for texts within big data environments.
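The stopping rule described in the abstract can be illustrated with a small, dependency-free Python sketch. It is a plain Lloyd-style k-means with the extra stop condition added, written under stated assumptions rather than taken from the chapter: the caller supplies the initial centroids, distance is squared Euclidean, and the loop also stops when no object changes cluster. The function name `kmeans_early_stop` and its parameters are hypothetical.

```python
def kmeans_early_stop(points, centroids, max_iter=100):
    """Lloyd-style k-means that also stops when more points change cluster
    in the current iteration than in the previous one."""
    centroids = [tuple(c) for c in centroids]
    labels = [None] * len(points)
    prev_changes = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        changes = sum(1 for old, new in zip(labels, new_labels) if old != new)
        labels = new_labels
        # Stop on ordinary convergence (assumed here), or when the number of
        # reassignments grows relative to the previous iteration.
        if changes == 0 or (prev_changes is not None and changes > prev_changes):
            break
        prev_changes = changes
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels, centroids


if __name__ == "__main__":
    pts = [(0.0,), (1.0,), (10.0,), (11.0,)]
    print(kmeans_early_stop(pts, [(0.0,), (1.0,)]))
```

Counting reassignments costs one comparison per object per iteration, so the extra test is cheap relative to the distance computations it may save.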


Incremental Discovery of Fuzzy Functional Dependencies

Handbook of Research on Fuzzy Information Processing in Databases ◽

10.4018/978-1-59904-853-6.ch024 ◽

2011 ◽

pp. 615-633 ◽

Cited By ~ 1

Author(s):

Shyue-Liang Wang ◽

Ju-Wen Shen ◽

Tuzng-Pei Hong

Keyword(s):

Data Mining ◽

Relational Databases ◽

Search Algorithm ◽

Current Data ◽

Research Interest ◽

Functional Dependencies ◽

Incremental Search ◽

Data Mining Techniques ◽

Analysis Technique ◽

Mining Algorithms

Mining functional dependencies (FDs) from databases has been identified as an important database analysis technique. It has received considerable research interest in recent years. However, most current data mining techniques for determining functional dependencies deal only with crisp databases. Although various forms of fuzzy functional dependencies (FFDs) have been proposed for fuzzy databases, they emphasized conceptual viewpoints and only a few mining algorithms are given. In this research, we propose methods to validate and incrementally search for FFDs from similarity-based fuzzy relational databases. For a given pair of attributes, the validation of FFDs is based on fuzzy projection and fuzzy selection operations. In addition, the property that FFDs are monotonic in the sense that r1 ⊆ r2 implies FDa(r1) ⊇ FDa(r2) is shown. An incremental search algorithm for FFDs based on this property is then presented. Experimental results showing the behavior of the search algorithm are discussed.


Fuzzy functional dependencies and linguistic interpretations employed in knowledge discovery tasks from relational databases

Engineering Applications of Artificial Intelligence ◽

10.1016/j.engappai.2019.103395 ◽

2020 ◽

Vol 88 ◽

pp. 103395

Author(s):

Miljan Vučetić ◽

Miroslav Hudec ◽

Boško Božilović

Keyword(s):

Knowledge Discovery ◽

Relational Databases ◽

Functional Dependencies


Based on Improved Genetic Algorithm for Task Scheduling of Heterogeneous Multi-Core Processor

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.1030-1032.1671 ◽

2014 ◽

Vol 1030-1032 ◽

pp. 1671-1675

Author(s):

Yue Qiu ◽

Jing Feng Zang

Keyword(s):

Genetic Algorithm ◽

Task Scheduling ◽

Execution Time ◽

Scheduling Algorithm ◽

Experimental Results ◽

Initial Population ◽

Improved Genetic Algorithm ◽

High Quality ◽

Multi Core Processor ◽

Better Than

This paper puts forward an improved genetic scheduling algorithm to improve the execution efficiency of task scheduling on heterogeneous multi-core processor systems and to make full use of their performance. The attribute values and the high value of tasks are introduced to construct the initial population, and a sorting method is selected at random with 50% probability to order the tasks of each individual in the population, which yields a high-quality initial population while ensuring its diversity. The experimental results show that the improved algorithm performs better than the traditional genetic algorithm and the HEFT algorithm, and that the execution time of tasks is reduced.
