Partitioning Strategy
Recently Published Documents


TOTAL DOCUMENTS: 91 (five years: 15)
H-INDEX: 10 (five years: 2)

Big data is traditionally associated with distributed systems, and this is understandable given that the volume dimension of big data appears to be best accommodated by the continuous addition of resources over a distributed network rather than the continuous upgrade of a central storage resource. Based on this implementation context, non-distributed relational database models are considered volume-inefficient, and a departure from their usage has been contemplated by the database community. Distributed systems depend on data partitioning to determine chunks of related data and where in storage they can be accommodated. In existing Database Management Systems (DBMS), data partitioning is automated, which in the opinion of this paper does not give the best results, since partitioning is an NP-hard problem in terms of algorithmic time complexity. This paper shows that the burden of this complexity can be reduced by a partitioning strategy that relies on the discretion of the programmer, which is more effective and flexible, though it requires extra coding effort; NP-hard problems are handled more effectively by a combination of programmer discretion and automation than by full automation alone. In this paper, the partitioning process is reviewed and a programmer-based partitioning strategy is implemented for an application with a relational DBMS backend. By doing this, the relational DBMS is made adaptive in the volume dimension of big data, and the ACID properties (atomicity, consistency, isolation, and durability) of the relational database model, a major attraction especially for applications that process transactions, are thus harnessed. On a more general note, the results of this research suggest that databases can be made adaptive in the areas of their weaknesses, as a one-size-fits-all database management system may no longer be feasible.
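As a minimal sketch of the idea (not the paper's actual implementation), the Python snippet below shows one form programmer-directed horizontal partitioning can take: the application, rather than the DBMS, hashes a row key to decide which partition table a row lives in. The partition count, table-naming scheme, and `orders` schema are hypothetical:

```python
import hashlib

N_PARTITIONS = 4  # hypothetical: chosen by the programmer, not the DBMS

def partition_for(key: str, n_partitions: int = N_PARTITIONS) -> str:
    """Map a row key to a partition table via a stable hash.

    Keeping this mapping in application code (rather than relying on the
    DBMS's automatic partitioner) lets the programmer tune it to the
    application's access patterns, at the cost of extra coding effort.
    """
    bucket = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % n_partitions
    return f"orders_p{bucket}"  # hypothetical table-naming scheme

def insert_order(cursor, customer_id: str, amount: float) -> None:
    """Route an INSERT to the partition chosen for this key (DB-API cursor)."""
    table = partition_for(customer_id)
    cursor.execute(
        f"INSERT INTO {table} (customer_id, amount) VALUES (?, ?)",
        (customer_id, amount),
    )
```

Because the same stable hash is used on every read and write, lookups by key touch exactly one partition, and new partitions can be added by changing the mapping under the programmer's control.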


2021 · Vol 13 (2) · pp. 324 · Author(s): Lei Qu, Xingliang Zhu, Jiannan Zheng, Liang Zou

Convolutional neural networks have been highly successful in hyperspectral image (HSI) classification owing to their strong feature-expression ability. However, the traditional data partitioning strategy, in tandem with patch-wise classification, may lead to information leakage and result in overoptimistic experimental conclusions. In this paper, we propose a novel data partitioning scheme and a triple-attention parallel network (TAP-Net) to enhance the performance of HSI classification without information leakage. The dataset partitioning strategy is simple yet effective at avoiding overfitting, and it allows fair comparison of various algorithms, particularly when annotated data are limited. In contrast to classical encoder–decoder models, the proposed TAP-Net utilizes parallel subnetworks with the same spatial resolution and repeatedly reuses the high-level feature maps of preceding subnetworks to refine the segmentation map. In addition, a channel–spectral–spatial attention module is proposed to optimize the information transmission between subnetworks. Experiments were conducted on three benchmark hyperspectral datasets, and the results demonstrate that the proposed method outperforms state-of-the-art methods, with overall accuracies of 90.31%, 91.64%, and 81.35% and average accuracies of 93.18%, 87.45%, and 78.85% on the Salinas Valley, Pavia University, and Indian Pines datasets, respectively. These results illustrate that the proposed TAP-Net effectively exploits spatial–spectral information to ensure high performance.
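The abstract does not spell out the partitioning scheme, so the following is only a sketch of the general principle behind leakage-free patch-wise evaluation: assign whole, spatially disjoint blocks of pixels to the test set, so that no training patch overlaps a test pixel. The block size, test fraction, and function names here are assumptions:

```python
import numpy as np

def spatially_disjoint_split(height: int, width: int, block: int = 32,
                             test_frac: float = 0.3, seed: int = 0) -> np.ndarray:
    """Mark whole spatial blocks as test pixels (True) so that train/test
    regions are disjoint at the block level, not the pixel level."""
    blocks = [(r, c) for r in range(0, height, block)
              for c in range(0, width, block)]
    order = np.random.default_rng(seed).permutation(len(blocks))
    n_test = int(len(blocks) * test_frac)
    test_mask = np.zeros((height, width), dtype=bool)
    for i in order[:n_test]:
        r, c = blocks[i]
        test_mask[r:r + block, c:c + block] = True
    return test_mask

def patch_is_clean(test_mask: np.ndarray, y: int, x: int, w: int) -> bool:
    """A training patch of half-width w centred at (y, x) is leakage-free
    only if it contains no test pixel at all."""
    return not test_mask[max(0, y - w):y + w + 1, max(0, x - w):x + w + 1].any()
```

Discarding any training patch that fails `patch_is_clean` is what prevents the information leakage that makes naive pixel-level splits overoptimistic.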


Author(s): Benoit Gallet, Michael Gowanlock

Given two datasets (or tables) $A$ and $B$ and a search distance $\epsilon$, the distance similarity join, denoted $A \ltimes_\epsilon B$, finds the pairs of points $(p_a, p_b)$, where $p_a \in A$ and $p_b \in B$, such that the distance between $p_a$ and $p_b$ is $\le \epsilon$. If $A = B$, the similarity join is equivalent to a similarity self-join, denoted $A \bowtie_\epsilon A$. In this paper we propose Heterogeneous Epsilon Grid Joins (HEGJoin), a heterogeneous CPU–GPU distance similarity join algorithm. Efficiently partitioning the work between the CPU and the GPU is a challenge: the work partitioning strategy needs to consider the different characteristics and computational throughput of the two processors, as well as the data-dependent factors of the similarity join that affect the overall execution time (e.g., the number of queries, their distribution, the dimensionality, etc.). In addition to HEGJoin, we design one dynamic and two static work partitioning strategies, and we propose a performance model for each static strategy to distribute the work between the processors. We evaluate all three partitioning methods using the execution time and the load imbalance between the CPU and GPU as performance metrics. HEGJoin achieves speedups of up to $5.46\times$ ($3.97\times$) over the GPU-only (CPU-only) algorithm on our first test platform, and up to $1.97\times$ ($12.07\times$) on our second.
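HEGJoin's actual performance models are not given in the abstract; as a minimal, hedged illustration of the static idea, the sketch below splits the query set between the two processors in proportion to their measured throughputs, so that both should finish at roughly the same time. The rates and function name are assumptions:

```python
def static_split(n_queries: int, gpu_rate: float, cpu_rate: float) -> int:
    """Number of queries to assign to the GPU under a simple
    throughput-proportional model: each processor receives work in
    proportion to its measured rate (queries/second), so both finish
    at about the same time and load imbalance stays low."""
    return round(n_queries * gpu_rate / (gpu_rate + cpu_rate))

# Hypothetical rates: a GPU ~4x faster than the CPU on this workload
# receives ~80% of the queries.
n_gpu = static_split(1_000_000, gpu_rate=4.0e6, cpu_rate=1.0e6)
print(n_gpu, 1_000_000 - n_gpu)  # -> 800000 200000
```

A dynamic strategy would instead let both processors pull batches from a shared queue at runtime, trading the modeling effort of a static split for scheduling overhead.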


2020 · Vol 6 (2) · pp. 204-230 · Author(s): Michelle Ann Hurst, Marisa Massaro, Sara Cordes

Fraction notation conveys both part-whole (3/4 is 3 out of 4) and magnitude (3/4 = 0.75) information, yet evidence suggests that both children and adults find accessing magnitude information from fractions particularly difficult. Recent research suggests that using number lines to teach children about fractions can help emphasize fraction magnitude. In three experiments with adults and 9- to 12-year-old children, we compare the benefits of number lines and pie charts for thinking about rational numbers. In Experiment 1, we first investigate how adults spontaneously visualize symbolic fractions. Then, in two further experiments, we explore whether priming children to use pie charts vs. number lines impacts performance on a subsequent symbolic magnitude task, and whether children differentially rely on a partitioning strategy to map rational numbers to number lines vs. pie charts. Our data reveal that adults only rarely visualize fractions along a number line spontaneously and, contrary to other findings, that practice mapping rational numbers to number lines did not improve performance on a subsequent symbolic magnitude comparison task relative to practice mapping the same magnitudes to pie charts. However, children were more likely to use overt partitioning strategies when working with pie charts than with number lines, suggesting that these representations lend themselves to different working strategies. We discuss the interpretations and implications of these findings for future research and education. All materials and data are provided as Supplementary Materials.


2020 · Vol 12 (7) · pp. 1104 · Author(s): Jiansi Ren, Ruoxiang Wang, Gang Liu, Ruyi Feng, Yuanni Wang, ...

The classification of hyperspectral remote sensing images is difficult due to the curse of dimensionality, so it is necessary to find an effective way to reduce the dimensionality of such images. The Relief-F method has been introduced for supervised dimensionality reduction, but the band subset obtained by this method contains a large number of contiguous bands, which reduces classification accuracy. In this paper, an improved method, called Partitioned Relief-F, is presented to mitigate the influence of contiguous bands on classification accuracy while retaining important information. Firstly, the importance score of each band is obtained using the original Relief-F method. Secondly, the whole band interval is divided, in band order, into sub-intervals using a partitioning strategy based on the correlation between bands. Finally, the band with the highest importance score is selected in each sub-interval. To verify the effectiveness of the proposed Partitioned Relief-F method, classification experiments are performed on three publicly available data sets. The dimensionality-reduction methods Principal Component Analysis (PCA) and the original Relief-F are selected for comparison, and K-Means and Balanced Iterative Reducing and Clustering Using Hierarchies (BIRCH) are selected for comparison in terms of the partitioning strategy. The paper measures the effectiveness of each method indirectly, through the overall accuracy of the final classification. The experimental results indicate that the addition of the proposed partitioning strategy increases the overall accuracy on the three data sets by 1.55%, 3.14%, and 0.83%, respectively. In general, the proposed Partitioned Relief-F method achieves significantly superior dimensionality-reduction effects.
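The selection step lends itself to a short sketch. Assuming per-band Relief-F importance scores and sub-interval boundaries already derived from the band-correlation analysis (both inputs are hypothetical here, not the paper's data), picking the top-scoring band in each sub-interval looks like this:

```python
import numpy as np

def partitioned_band_selection(scores, boundaries):
    """Select one band per sub-interval: the band with the highest
    Relief-F importance score. `boundaries` are the ordered indices
    where the band axis is split (e.g., from a correlation analysis),
    so contiguous, highly correlated bands cannot all be selected."""
    selected = []
    edges = [0, *boundaries, len(scores)]
    for lo, hi in zip(edges[:-1], edges[1:]):
        selected.append(lo + int(np.argmax(scores[lo:hi])))
    return selected

# Hypothetical example: 10 bands split into 3 correlated groups.
scores = np.array([0.2, 0.9, 0.4, 0.8, 0.1, 0.3, 0.7, 0.6, 0.95, 0.5])
print(partitioned_band_selection(scores, boundaries=[3, 6]))  # -> [1, 3, 8]
```

Plain Relief-F would rank bands 8, 1, 3, 7, ... and could return several adjacent bands from one correlated group; the partition forces the selection to spread across the spectrum.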

