DETERMINISTIC SAMPLE SORT FOR GPUS

2012 ◽  
Vol 22 (03) ◽  
pp. 1250008 ◽  
Author(s):  
FRANK DEHNE ◽  
HAMIDREZA ZABOLI

We demonstrate that parallel deterministic sample sort for many-core GPUs (GPU BUCKET SORT) is not only considerably faster than the best comparison-based sorting algorithm for GPUs (THRUST MERGE [Satish et al., Proc. IPDPS 2009]) but also as fast as randomized sample sort for GPUs (GPU SAMPLE SORT [Leischner et al., Proc. IPDPS 2010]). However, deterministic sample sort has the advantage that bucket sizes are guaranteed, and therefore its running time does not exhibit the input-data-dependent fluctuations that can occur for randomized sample sort.
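To illustrate why deterministic (regular) sampling guarantees bucket sizes, here is a minimal sequential CPU sketch of deterministic sample sort. This is not the authors' GPU BUCKET SORT implementation; the function name and the choice of p = 4 blocks are for illustration only.

```python
import bisect

# Illustrative CPU sketch of deterministic (regular) sample sort.
# Regular sampling from locally sorted blocks bounds the size of
# every bucket, which is the property the abstract contrasts with
# randomized sample sort.

def deterministic_sample_sort(data, p=4):
    """Sort `data` by partitioning around regularly sampled splitters."""
    n = len(data)
    if n <= p * p:
        return sorted(data)
    # Split the input into p blocks and sort each block locally.
    blocks = [sorted(data[i * n // p:(i + 1) * n // p]) for i in range(p)]
    # Take p regularly spaced samples from each sorted block.
    samples = sorted(s for b in blocks for s in b[::max(1, len(b) // p)][:p])
    # Choose p-1 splitters at regular positions in the sorted sample.
    splitters = [samples[(i + 1) * len(samples) // p] for i in range(p - 1)]
    # Distribute elements into buckets delimited by the splitters.
    buckets = [[] for _ in range(p)]
    for x in data:
        buckets[bisect.bisect_right(splitters, x)].append(x)
    # Sort buckets independently (in parallel on a GPU) and concatenate.
    return [x for b in buckets for x in sorted(b)]
```

On a GPU, the per-block sorts and per-bucket sorts would run in parallel; the sketch only shows the splitter-selection logic.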

Data sorting has many advantages and applications in software and web development. Search engines use sorting techniques to sort results before they are presented to the user. The words in a dictionary are kept in sorted order so that they can be found easily. Many sorting algorithms are used across domains to perform operations and obtain the desired output, but some take a long time to sort the data, and this cost can make the operation impractical. Every sorting algorithm uses a different technique to sort the given data. Stooge sort is a sorting algorithm that sorts the data recursively and takes comparatively more time than many other sorting algorithms. Whereas Stooge sort works recursively to sort the data elements, the Optimized Stooge sort does not use recursion. In this paper, we propose Optimized Stooge sort to reduce the time complexity of Stooge sort. The running time of Optimized Stooge sort is greatly reduced compared to the Stooge sort algorithm. The existing research focuses on reducing the running time of Stooge sort. Our results show that Optimized Stooge sort is faster than the Stooge sort algorithm.
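For reference, here is the classic recursive Stooge sort that the paper takes as its baseline. The abstract does not specify the Optimized variant, so only the recursive original is sketched here.

```python
# Classic recursive Stooge sort, the baseline the paper optimizes.
# Its running time is O(n^2.709), which is why it is slower than
# most common sorting algorithms.

def stooge_sort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    # Swap the endpoints if they are out of order.
    if a[lo] > a[hi]:
        a[lo], a[hi] = a[hi], a[lo]
    if hi - lo + 1 > 2:
        t = (hi - lo + 1) // 3
        stooge_sort(a, lo, hi - t)   # sort the first two thirds
        stooge_sort(a, lo + t, hi)   # sort the last two thirds
        stooge_sort(a, lo, hi - t)   # sort the first two thirds again
    return a
```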


2015 ◽  
Vol 5 (1) ◽  
pp. 1-12
Author(s):  
Chris Gorman ◽  
Clint Rogers ◽  
Iren Valova

Abstract
Self-organizing maps are extremely useful in the field of pattern recognition. They become less useful, however, when neurons fail to activate during training. This phenomenon occurs when neurons are initialized in areas of non-input and are far enough away from the input data to never move toward the input. These neurons effectively misrepresent the data set. This results in, among other things, patterns becoming unrecognizable. We introduce an algorithm called No Neuron Left Behind (NNLB) to solve this problem. We show that our algorithm produces a more accurate topological representation of the input space. We also show that no neuron clusters form in areas of non-input and that the mapping quality of the SOM increases drastically when our algorithm is implemented. Finally, the running time of NNLB is better than or comparable to the classic SOM without it.
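The abstract does not describe the NNLB algorithm itself, but the dead-neuron problem it addresses can be made concrete with a deliberately minimal SOM training loop (winner-only updates, no neighborhood function) that records which neurons ever activate. All names here are illustrative.

```python
# Minimal SOM-style training loop (2-D inputs, winner-only updates)
# that records which neurons ever "win". A neuron initialized far
# from the data never wins and never moves -- the dead-neuron
# problem NNLB is designed to fix. NNLB itself is not shown here.

def train_som(inputs, neurons, epochs=20, lr=0.5):
    wins = [0] * len(neurons)
    for _ in range(epochs):
        for x in inputs:
            # Best-matching unit: the neuron closest to the input.
            bmu = min(range(len(neurons)),
                      key=lambda i: (neurons[i][0] - x[0]) ** 2
                                    + (neurons[i][1] - x[1]) ** 2)
            wins[bmu] += 1
            # Move only the winner toward the input.
            neurons[bmu] = [neurons[bmu][0] + lr * (x[0] - neurons[bmu][0]),
                            neurons[bmu][1] + lr * (x[1] - neurons[bmu][1])]
    dead = [i for i, w in enumerate(wins) if w == 0]
    return neurons, dead
```

With one neuron near the data and one far away, the far neuron ends up in the `dead` list, exactly the failure mode the paper describes.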


2021 ◽  
pp. 3-10
Author(s):  
Naoki Katoh ◽  
Hiro Ito

Abstract
This chapter introduces the “sublinear computation paradigm.” A sublinear-time algorithm is an algorithm that runs in time sublinear in the size of the instance (input data). In other words, the running time is o(n), where n is the size of the instance. This century marks the start of the era of big data. In order to manage big data, polynomial-time algorithms, which are considered to be efficient, may sometimes be inadequate because they may require too much time or computational resources. In such cases, sublinear-time algorithms are expected to work well. We call this idea the “sublinear computation paradigm.” A research project named “Foundations on Innovative Algorithms for Big Data (ABD),” in which this paradigm is the central concept, was started under the CREST program of the Japan Science and Technology Agency (JST) in October 2014 and concluded in September 2021. This book mainly introduces the results of this project.
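A standard textbook illustration of the paradigm (not taken from the chapter) is estimating the fraction of elements with some property by sampling a constant number of elements, so the running time is independent of n rather than O(n).

```python
import random

# Illustrative sublinear-time sketch: estimate the fraction of
# elements satisfying a predicate by random sampling. The number of
# samples depends only on the desired accuracy, not on len(data),
# so the running time is o(n) -- in fact O(1) in n.

def estimate_fraction(data, predicate, samples=1000):
    hits = sum(predicate(random.choice(data)) for _ in range(samples))
    return hits / samples
```

By a standard Chernoff-bound argument, O(1/ε²) samples suffice for additive error ε with high probability, however large the input is.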


2004 ◽  
Vol 11 (27) ◽  
Author(s):  
Gerth Stølting Brodal ◽  
Rolf Fagerberg ◽  
Gabriel Moruz

Quicksort was first introduced in 1961 by Hoare. Many variants have been developed, the best of which are among the fastest generic sorting algorithms available, as testified by the choice of Quicksort as the default sorting algorithm in most programming libraries. Some sorting algorithms are adaptive, i.e. they have a complexity analysis which is better for inputs which are nearly sorted, according to some specified measure of presortedness. Quicksort is not among these, as it uses Omega(n log n) comparisons even when the input is already sorted. However, in this paper we demonstrate empirically that the actual running time of Quicksort <em>is</em> adaptive with respect to the presortedness measure Inv. Differences close to a factor of two are observed between instances with low and high Inv value. We then show that for the randomized version of Quicksort, the number of element <em>swaps</em> performed is <em>provably</em> adaptive with respect to the measure Inv. More precisely, we prove that randomized Quicksort performs expected O(n (1 + log (1 + Inv/n))) element swaps, where Inv denotes the number of inversions in the input sequence. This result provides a theoretical explanation for the observed behavior, and gives new insights on the behavior of the Quicksort algorithm. We also give some empirical results on the adaptive behavior of Heapsort and Mergesort.
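The paper's swap-count claim can be probed with a randomized quicksort instrumented to count element swaps. This is an illustrative sketch (Lomuto partitioning, counting only the partition-loop swaps), not the paper's experimental setup: a fully sorted input (Inv = 0) triggers far fewer swaps than a shuffled one.

```python
import random

# Randomized quicksort (Lomuto partition) instrumented to count
# element swaps in the partition loop, illustrating the paper's
# claim that swap counts adapt to the number of inversions (Inv).

def quicksort_swaps(a):
    swaps = 0
    def qs(lo, hi):
        nonlocal swaps
        if lo >= hi:
            return
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]          # move a random pivot to the end
        pivot, i = a[hi], lo
        for j in range(lo, hi):
            if a[j] < pivot:
                if i != j:                 # count only real element moves
                    a[i], a[j] = a[j], a[i]
                    swaps += 1
                i += 1
        a[i], a[hi] = a[hi], a[i]          # place the pivot
        qs(lo, i - 1)
        qs(i + 1, hi)
    qs(0, len(a) - 1)
    return swaps
```

On an already sorted input this counter stays at zero, while a shuffled input of the same size incurs on the order of n log n swaps, consistent with the O(n(1 + log(1 + Inv/n))) bound.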


In the era of new technology we have huge amounts of data to deal with, and arranging such data has remained a big challenge. This research paper covers two sorting algorithms, Heap Sort and Insertion Sort, and analyzes their performance on the basis of running time along with their complexity. The paper includes the algorithms and their implementation in the Java programming language. For the results of this study, the two sorting algorithms are compared on different sizes of input data at running time: Large, Average, and Small. For Large data, 100 integers are passed in the array; for Average data, 50 integers; and for Small data, 10 integers. This determines which sorting technique is more efficient for the given input data. To identify the efficiency of these algorithms on this data, three cases are used: Best, Average, and Worst Case. The results of the analysis are presented as graphs showing how much time each algorithm takes to produce the desired output.
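The comparison described above can be sketched as follows. This is in Python rather than the paper's Java, using the paper's three input sizes (Small = 10, Average = 50, Large = 100 integers); it is an illustrative benchmark, not the authors' code.

```python
import heapq
import random
import time

# Sketch of the paper's comparison: heap sort (via a binary min-heap)
# vs. insertion sort, timed on Small (10), Average (50), and
# Large (100) integer arrays.

def heap_sort(a):
    h = list(a)
    heapq.heapify(h)                       # build a min-heap in O(n)
    return [heapq.heappop(h) for _ in range(len(h))]

def insertion_sort(a):
    a = list(a)
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:       # shift larger elements right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

for size in (10, 50, 100):                 # Small, Average, Large
    data = [random.randint(0, 1000) for _ in range(size)]
    for sort in (heap_sort, insertion_sort):
        t0 = time.perf_counter()
        sort(data)
        print(f"{sort.__name__:14s} n={size:3d}: {time.perf_counter() - t0:.6f}s")
```

At these small sizes insertion sort's low constant factors often win despite its O(n²) worst case; heap sort's O(n log n) advantage shows up as n grows.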


Author(s):  
Chun-Yuan Lin ◽  
Wei Sheng Lee ◽  
Chuan Yi Tang

Sorting is a classic algorithmic problem, and its importance has led to the design and implementation of various sorting algorithms on many-core graphics processing units (GPUs). CUDPP Radix sort is the most efficient sort on GPUs, and GPU Sample sort is the best comparison-based sort. Although the implementations of these algorithms are efficient, they either need extra space for data rearrangement or rely on atomic operations for acceleration. Sorting applications usually deal with a large amount of data, so memory utilization is an important consideration. Furthermore, on GPUs without atomic operation support, these sorting algorithms can suffer performance degradation or fail to work. In this paper, an efficient implementation of a parallel shellsort algorithm, CUDA shellsort, is proposed for many-core GPUs with CUDA. Experimental results show that, on average, CUDA shellsort is nearly twice as fast as GPU quicksort and 37% faster than Thrust mergesort under uniform distribution. Moreover, its performance matches GPU sample sort up to 32 million data elements while needing only constant space. CUDA shellsort is also robust over various data distributions and could be suitable for other many-core architectures.
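For context, here is a sequential shellsort sketch. The paper's CUDA shellsort runs the per-gap insertion passes in parallel across GPU threads, which is not reproduced here; this CPU version only shows the gap-sequence structure and the in-place, constant-extra-space property the paper exploits.

```python
# Sequential shellsort. Each pass performs a gapped insertion sort,
# leaving the array gap-sorted; halving the gap until it reaches 1
# yields a fully sorted array, in place with O(1) extra space.

def shellsort(a):
    n = len(a)
    gap = n // 2
    while gap > 0:
        for i in range(gap, n):
            key, j = a[i], i
            while j >= gap and a[j - gap] > key:
                a[j] = a[j - gap]          # shift gap-distant elements right
                j -= gap
            a[j] = key
        gap //= 2
    return a
```

The independence of the insertion chains within one gap pass is what makes the algorithm attractive for GPU parallelization.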


Author(s):  
R.A. Ploc ◽  
G.H. Keech

An unambiguous analysis of transmission electron diffraction effects requires two samplings of the reciprocal lattice (RL). However, extracting definitive information from the patterns is difficult even for a general orthorhombic case. The usual procedure has been to deduce the approximate variables controlling the formation of the patterns from qualitative observations. Our present purpose is to illustrate two applications of a computer programme written for the analysis of transmission, selected area diffraction (SAD) patterns: the studies of RL spot shapes and epitaxy. When a specimen contains fine structure, the RL spots become complex shapes with extensions in one or more directions. If the number and directions of these extensions can be estimated from an SAD pattern, the exact spot shape can be determined by a series of refinements of the computer input data.

