Horizontal Data Partitioning

A Genetic Algorithm for Selecting Horizontal Fragments

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch142 ◽

2011 ◽

pp. 920-925

Author(s):

Ladjel Bellatreche

Keyword(s):

Database Systems ◽

Data Partitioning ◽

Materialized Views ◽

Access Path ◽

Multiple Dimensions ◽

Physical Database Design ◽

Join Queries ◽

Speed Up ◽

Disjoint Sets ◽

Horizontal Partitioning

Decision support applications require complex queries, e.g., multi-way joins defined on huge warehouses that are usually modelled using star schemas, i.e., a fact table and a set of dimension tables (Papadomanolakis & Ailamaki, 2004). Star schemas have an important property in terms of join operations: every join is between a dimension table and the fact table (i.e., the fact table contains a foreign key for each dimension), and there are no join operations between dimension tables. Joins in data warehouses (called star join queries) are particularly expensive because the fact table (by far the largest table in the warehouse) participates in every join and multiple dimensions are likely to participate in each join. To speed up star join queries, many optimization structures have been proposed: redundant structures (materialized views and advanced index schemes) and non-redundant structures (data partitioning and parallel processing). Data partitioning has recently been recognized as an important aspect of physical database design (Sanjay, Narasayya & Yang, 2004; Papadomanolakis & Ailamaki, 2004). Two types of data partitioning are available (Özsu & Valduriez, 1999): vertical and horizontal partitioning. Vertical partitioning allows tables to be decomposed into disjoint sets of columns. Horizontal partitioning allows tables, materialized views and indexes to be partitioned into disjoint sets of rows that are physically stored and usually accessed separately. Unlike redundant structures, data partitioning does not replicate data, thereby reducing storage requirements and minimizing maintenance overhead. In this paper, we concentrate only on horizontal data partitioning (HP).
HP may positively affect (1) query performance, through partition elimination: if a query includes a partition key as a predicate in the WHERE clause, the query optimizer automatically routes the query to only the relevant partitions; and (2) database manageability, for instance by allocating partitions to different machines or by splitting any access path (tables, materialized views, indexes, etc.). Most database systems offer three methods for performing HP with the PARTITION statement: RANGE, HASH and LIST (Sanjay, Narasayya & Yang, 2004). In range partitioning, an access path (table, view, or index) is split according to ranges of values of a given set of columns. Hash partitioning decomposes the data according to a hash function (provided by the system) applied to the values of the partitioning columns. List partitioning splits a table according to listed values of a column. These methods can be combined to generate composite partitioning. Oracle currently supports range-hash and range-list composite partitioning using the PARTITION - SUBPARTITION statement. The following SQL statement shows an example of fragmenting a table Student using range partitioning.
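
Independently of any SQL dialect, the three partitioning modes and partition elimination described above can be sketched in Python. This is only an illustration under stated assumptions, not the author's statement and not how any particular database implements partitioning: the Student rows, the column layout (student_id, age, city), the age boundaries, the city lists, and every function name are hypothetical.

```python
from bisect import bisect_right
from zlib import crc32

# Hypothetical Student rows: (student_id, age, city). Not taken from the source.
STUDENTS = [
    (1, 17, "Poitiers"),
    (2, 22, "Paris"),
    (3, 31, "Lyon"),
    (4, 19, "Paris"),
    (5, 45, "Poitiers"),
]

# RANGE: assumed boundaries -> p0: age < 18, p1: 18 <= age < 30, p2: age >= 30.
AGE_BOUNDS = [18, 30]


def range_partition(age):
    """Index of the range partition whose interval contains `age`."""
    return bisect_right(AGE_BOUNDS, age)


def hash_partition(student_id, n_partitions=4):
    """HASH: a deterministic hash of the partitioning column, modulo the partition count."""
    return crc32(str(student_id).encode()) % n_partitions


# LIST: each partition is defined by an explicit list of column values (assumed here).
CITY_LISTS = {"p_west": {"Poitiers"}, "p_other": {"Paris", "Lyon"}}


def list_partition(city):
    """Name of the list partition that enumerates `city`."""
    for name, cities in CITY_LISTS.items():
        if city in cities:
            return name
    raise ValueError(f"no partition lists the value {city!r}")


def build_range_partitions(rows):
    """Split rows into disjoint sets of rows, one per age range."""
    parts = {i: [] for i in range(len(AGE_BOUNDS) + 1)}
    for row in rows:
        parts[range_partition(row[1])].append(row)
    return parts


def query_age_between(parts, low, high):
    """Partition elimination: scan only partitions whose range can overlap [low, high]."""
    scanned = list(range(range_partition(low), range_partition(high) + 1))
    hits = [row for p in scanned for row in parts[p] if low <= row[1] <= high]
    return hits, scanned


if __name__ == "__main__":
    partitions = build_range_partitions(STUDENTS)
    rows, scanned = query_age_between(partitions, 18, 25)
    print("partitions scanned:", scanned)
    print("matching rows:", rows)
```

With these assumed boundaries, a predicate on age between 18 and 25 touches a single range partition out of three, which is the effect the abstract attributes to a partition key appearing in the WHERE clause.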

Bitmap Join Indexes vs. Data Partitioning

Database Technologies ◽

10.4018/978-1-60566-058-5.ch140 ◽

2009 ◽

pp. 2292-2300

Author(s):

Ladjel Bellatreche

Keyword(s):

Parallel Processing ◽

Data Partitioning ◽

Optimization Techniques ◽

Sloan Digital Sky Survey ◽

Materialized Views ◽

Binary Operations ◽

Vertical Partitioning ◽

Speed Up ◽

Redundant Structure ◽

Horizontal Partitioning

Scientific databases and data warehouses store large amounts of data with several tables and attributes. For instance, the Sloan Digital Sky Survey (SDSS) astronomical database contains a large number of tables with hundreds of attributes, which can be queried in various combinations (Papadomanolakis & Ailamaki, 2004). These queries involve many tables and use binary operations such as joins. To speed up these queries, many optimization structures have been proposed, which can be divided into two main categories: redundant structures, such as materialized views, advanced indexing schemes (bitmap, bitmap join indexes, etc.) (Sanjay, Chaudhuri & Narasayya, 2000) and vertical partitioning (Sanjay, Narasayya & Yang, 2004); and non-redundant structures, such as horizontal partitioning (Sanjay, Narasayya & Yang, 2004; Bellatreche, Boukhalfa & Mohania, 2007) and parallel processing (Datta, Moon, & Thomas, 2000; Stöhr, Märtens & Rahm, 2000). These optimization techniques are used either sequentially or in combination. The combinations are made within a category: materialized views with indexes for redundant structures, and partitioning with parallel processing for non-redundant structures. Materialized views and indexes compete for the same resource, namely storage, and incur maintenance overhead in the presence of updates (Sanjay, Chaudhuri & Narasayya, 2000). No existing work addresses the problem of selecting combined optimization structures. In this paper, we propose two approaches: one that combines a non-redundant structure (horizontal partitioning) with a redundant structure (bitmap indexes) in order to reduce both query processing cost and maintenance overhead, and another that exploits vertical partitioning algorithms to generate bitmap join indexes. To facilitate the understanding of our approaches, we first review these techniques in detail.

Bitmap Join Indexes vs. Data Partitioning

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch028 ◽

2011 ◽

pp. 171-177

Author(s):

Ladjel Bellatreche

Keyword(s):

Parallel Processing ◽

Data Partitioning ◽

Optimization Techniques ◽

Sloan Digital Sky Survey ◽

Materialized Views ◽

Binary Operations ◽

Vertical Partitioning ◽

Speed Up ◽

Redundant Structure ◽

Horizontal Partitioning

Scientific databases and data warehouses store large amounts of data with several tables and attributes. For instance, the Sloan Digital Sky Survey (SDSS) astronomical database contains a large number of tables with hundreds of attributes, which can be queried in various combinations (Papadomanolakis & Ailamaki, 2004). These queries involve many tables and use binary operations such as joins. To speed up these queries, many optimization structures have been proposed, which can be divided into two main categories: redundant structures, such as materialized views, advanced indexing schemes (bitmap, bitmap join indexes, etc.) (Sanjay, Chaudhuri & Narasayya, 2000) and vertical partitioning (Sanjay, Narasayya & Yang, 2004); and non-redundant structures, such as horizontal partitioning (Sanjay, Narasayya & Yang, 2004; Bellatreche, Boukhalfa & Mohania, 2007) and parallel processing (Datta, Moon, & Thomas, 2000; Stöhr, Märtens & Rahm, 2000). These optimization techniques are used either sequentially or in combination. The combinations are made within a category: materialized views with indexes for redundant structures, and partitioning with parallel processing for non-redundant structures. Materialized views and indexes compete for the same resource, namely storage, and incur maintenance overhead in the presence of updates (Sanjay, Chaudhuri & Narasayya, 2000). No existing work addresses the problem of selecting combined optimization structures. In this paper, we propose two approaches: one that combines a non-redundant structure (horizontal partitioning) with a redundant structure (bitmap indexes) in order to reduce both query processing cost and maintenance overhead, and another that exploits vertical partitioning algorithms to generate bitmap join indexes. To facilitate the understanding of our approaches, we first review these techniques in detail.

Hybrid Approach to Speed-Up the Privacy Preserving Kernel K-means Clustering and its Application in Social Distributed Environment

Journal of Network and Systems Management ◽

10.1007/s10922-019-09511-1 ◽

2020 ◽

Vol 28 (2) ◽

pp. 398-422

Author(s):

P. L. Lekshmy ◽

M. Abdul Rahiman

Keyword(s):

Hybrid Approach ◽

Privacy Preserving ◽

Distributed Environment ◽

Speed Up

A data partitioning approach to speed up the fuzzy ARTMAP algorithm using the Hilbert space-filling curve

2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541) ◽

10.1109/ijcnn.2004.1380997 ◽

2005 ◽

Author(s):

J. Castro ◽

M. Georgiopoulos ◽

R. Demara

Keyword(s):

Hilbert Space ◽

Data Partitioning ◽

Fuzzy Artmap ◽

Space Filling ◽

Space Filling Curve ◽

Speed Up ◽

Filling Curve

Prototyping DBS3, a shared-memory parallel database system

[1991] Proceedings of the First International Conference on Parallel and Distributed Information Systems ◽

10.1109/pdis.1991.183107 ◽

2002 ◽

Cited By ~ 24

Author(s):

B. Bergsten ◽

M. Couprie ◽

P. Valduriez

Keyword(s):

Shared Memory ◽

Database System ◽

Parallel Database ◽

Parallel Database System

BLOG AS A PLATFORM FOR LEARNER-CENTERED APPROACH IN TEACHING BILINGUALS

Problems of Education in the 21st Century ◽

10.33225/pec/12.41.130 ◽

2012 ◽

Vol 41 (1) ◽

pp. 130-139

Author(s):

Anastassia Rezepova ◽

Natalia Tshuikina

Keyword(s):

Foreign Language ◽

Key Words ◽

Educational Institutions ◽

Bilingual Students ◽

Vast Number ◽

Reading And Writing ◽

Individual Approach ◽

Learner Centered ◽

Text Production ◽

In The Beginning

The article presents the grounds for introducing a learner-oriented course for bilinguals studying in modern Estonian schools. The need for such a course arises from changes in the State Curriculum. The second premise is the need to teach Russian to bilinguals not as a foreign language but as one of their native languages, which the curricula of Estonian-speaking educational institutions do not provide for, although in many of them more than 10% of students come from Russian-speaking families. As a result, bilingual students speak Russian fluently and without an accent but experience difficulties in reading and writing texts. Implementing a learner-centered approach for bilinguals via web blogs solves a vast number of organizational problems, from timetabling for students from different classes to the individual assessment of students' personal achievements. The article also describes the course's structure, which is organized in four cycles of eight lessons each; contact classes are held at the beginning of the course and between the cycles to review interim results, as well as at the end of the course to draw conclusions. Key words: bilingualism, bilinguals, learner-centered approach, competence of text production, web blog.

Efficiency in reading of Cyrillic and Latin text

Psihologija ◽

10.2298/psi0404495p ◽

2004 ◽

Vol 37 (4) ◽

pp. 495-505

Author(s):

Milena Pasic

Keyword(s):

Age Groups ◽

The Other ◽

Latin Text ◽

Reading And Writing ◽

Phonological Structure ◽

Primary Schoolchildren ◽

Writing Skill ◽

In The Beginning

Readability of visual units involves more than the identification of single letters. Success in reading a text results from the interaction of different factors, one of the most important being the phonological structure of the sequence. The research was conducted on a sample of 395 primary schoolchildren, divided into three age groups. For most of the variables that measured reading success, there are no differences between reading Cyrillic and Latin text. Readers who do not have major difficulties in reading use the Cyrillic alphabet as well as the Latin one. Moreover, greater practice in using one alphabet is not a deciding factor. Differences between the use of one alphabet and the other are noted among inferior readers and are more pronounced in the early phases of the development of reading and writing skills.
