DETERMINISTIC SAMPLE SORT FOR GPUS

2012 ◽  
Vol 22 (03) ◽  
pp. 1250008 ◽  
Author(s):  
FRANK DEHNE ◽  
HAMIDREZA ZABOLI

We demonstrate that parallel deterministic sample sort for many-core GPUs (GPU BUCKET SORT) is not only considerably faster than the best comparison-based sorting algorithm for GPUs (THRUST MERGE [Satish et al., Proc. IPDPS 2009]) but also as fast as randomized sample sort for GPUs (GPU SAMPLE SORT [Leischner et al., Proc. IPDPS 2010]). However, deterministic sample sort has the advantage that bucket sizes are guaranteed, and therefore its running time does not exhibit the input-data-dependent fluctuations that can occur for randomized sample sort.
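To illustrate why deterministic (regular) sampling guarantees bucket sizes, here is a minimal sequential CPU sketch of deterministic sample sort. This is not the authors' GPU BUCKET SORT implementation; the function name and the choice of p = 4 blocks are for illustration only.

```python
import bisect

# Illustrative CPU sketch of deterministic (regular) sample sort.
# Regular sampling from locally sorted blocks bounds the size of
# every bucket, which is the property the abstract contrasts with
# randomized sample sort.

def deterministic_sample_sort(data, p=4):
    """Sort `data` by partitioning around regularly sampled splitters."""
    n = len(data)
    if n <= p * p:
        return sorted(data)
    # Split the input into p blocks and sort each block locally.
    blocks = [sorted(data[i * n // p:(i + 1) * n // p]) for i in range(p)]
    # Take p regularly spaced samples from each sorted block.
    samples = sorted(s for b in blocks for s in b[::max(1, len(b) // p)][:p])
    # Choose p-1 splitters at regular positions in the sorted sample.
    splitters = [samples[(i + 1) * len(samples) // p] for i in range(p - 1)]
    # Distribute elements into buckets delimited by the splitters.
    buckets = [[] for _ in range(p)]
    for x in data:
        buckets[bisect.bisect_right(splitters, x)].append(x)
    # Sort buckets independently (in parallel on a GPU) and concatenate.
    return [x for b in buckets for x in sorted(b)]
```

On a GPU, the per-block sorts and per-bucket sorts would run in parallel; the sketch only shows the splitter-selection logic.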

Data sorting has many advantages and applications in software and web development. Search engines use sorting techniques to sort results before they are presented to the user. The words in a dictionary are kept in sorted order so that they can be found easily. Many sorting algorithms are used across domains to perform operations and obtain the desired output, but some take a long time to sort the data, and this cost can make the operation impractical. Every sorting algorithm uses a different technique to sort the given data. Stooge sort is a sorting algorithm that sorts the data recursively and takes comparatively more time than many other sorting algorithms. Whereas Stooge sort works recursively to sort the data elements, the Optimized Stooge sort does not use recursion. In this paper, we propose Optimized Stooge sort to reduce the time complexity of Stooge sort. The running time of Optimized Stooge sort is greatly reduced compared to the Stooge sort algorithm. The existing research focuses on reducing the running time of Stooge sort. Our results show that Optimized Stooge sort is faster than the Stooge sort algorithm.
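For reference, here is the classic recursive Stooge sort that the paper takes as its baseline. The abstract does not specify the Optimized variant, so only the recursive original is sketched here.

```python
# Classic recursive Stooge sort, the baseline the paper optimizes.
# Its running time is O(n^2.709), which is why it is slower than
# most common sorting algorithms.

def stooge_sort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    # Swap the endpoints if they are out of order.
    if a[lo] > a[hi]:
        a[lo], a[hi] = a[hi], a[lo]
    if hi - lo + 1 > 2:
        t = (hi - lo + 1) // 3
        stooge_sort(a, lo, hi - t)   # sort the first two thirds
        stooge_sort(a, lo + t, hi)   # sort the last two thirds
        stooge_sort(a, lo, hi - t)   # sort the first two thirds again
    return a
```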


2015 ◽  
Vol 5 (1) ◽  
pp. 1-12
Author(s):  
Chris Gorman ◽  
Clint Rogers ◽  
Iren Valova

Abstract
Self-organizing maps are extremely useful in the field of pattern recognition. They become less useful, however, when neurons fail to activate during training. This phenomenon occurs when neurons are initialized in areas of non-input and are far enough away from the input data to never move toward the input. These neurons effectively misrepresent the data set. This results in, among other things, patterns becoming unrecognizable. We introduce an algorithm called No Neuron Left Behind (NNLB) to solve this problem. We show that our algorithm produces a more accurate topological representation of the input space. We also show that no neuron clusters form in areas of non-input and that the mapping quality of the SOM increases drastically when our algorithm is implemented. Finally, the running time of NNLB is better than or comparable to the classic SOM without it.
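The abstract does not describe the NNLB algorithm itself, but the dead-neuron problem it addresses can be made concrete with a deliberately minimal SOM training loop (winner-only updates, no neighborhood function) that records which neurons ever activate. All names here are illustrative.

```python
# Minimal SOM-style training loop (2-D inputs, winner-only updates)
# that records which neurons ever "win". A neuron initialized far
# from the data never wins and never moves -- the dead-neuron
# problem NNLB is designed to fix. NNLB itself is not shown here.

def train_som(inputs, neurons, epochs=20, lr=0.5):
    wins = [0] * len(neurons)
    for _ in range(epochs):
        for x in inputs:
            # Best-matching unit: the neuron closest to the input.
            bmu = min(range(len(neurons)),
                      key=lambda i: (neurons[i][0] - x[0]) ** 2
                                    + (neurons[i][1] - x[1]) ** 2)
            wins[bmu] += 1
            # Move only the winner toward the input.
            neurons[bmu] = [neurons[bmu][0] + lr * (x[0] - neurons[bmu][0]),
                            neurons[bmu][1] + lr * (x[1] - neurons[bmu][1])]
    dead = [i for i, w in enumerate(wins) if w == 0]
    return neurons, dead
```

With one neuron near the data and one far away, the far neuron ends up in the `dead` list, exactly the failure mode the paper describes.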


2021 ◽  
pp. 3-10
Author(s):  
Naoki Katoh ◽  
Hiro Ito

Abstract
This chapter introduces the “sublinear computation paradigm.” A sublinear-time algorithm is an algorithm that runs in time sublinear in the size of the instance (input data). In other words, the running time is o(n), where n is the size of the instance. This century marks the start of the era of big data. In order to manage big data, polynomial-time algorithms, which are considered to be efficient, may sometimes be inadequate because they may require too much time or computational resources. In such cases, sublinear-time algorithms are expected to work well. We call this idea the “sublinear computation paradigm.” A research project named “Foundations on Innovative Algorithms for Big Data (ABD),” in which this paradigm is the central concept, was started under the CREST program of the Japan Science and Technology Agency (JST) in October 2014 and concluded in September 2021. This book mainly introduces the results of this project.
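A standard textbook illustration of the paradigm (not taken from the chapter) is estimating the fraction of elements with some property by sampling a constant number of elements, so the running time is independent of n rather than O(n).

```python
import random

# Illustrative sublinear-time sketch: estimate the fraction of
# elements satisfying a predicate by random sampling. The number of
# samples depends only on the desired accuracy, not on len(data),
# so the running time is o(n) -- in fact O(1) in n.

def estimate_fraction(data, predicate, samples=1000):
    hits = sum(predicate(random.choice(data)) for _ in range(samples))
    return hits / samples
```

By a standard Chernoff-bound argument, O(1/ε²) samples suffice for additive error ε with high probability, however large the input is.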


2004 ◽  
Vol 11 (27) ◽  
Author(s):  
Gerth Stølting Brodal ◽  
Rolf Fagerberg ◽  
Gabriel Moruz

Quicksort was first introduced in 1961 by Hoare. Many variants have been developed, the best of which are among the fastest generic sorting algorithms available, as testified by the choice of Quicksort as the default sorting algorithm in most programming libraries. Some sorting algorithms are adaptive, i.e. they have a complexity analysis which is better for inputs which are nearly sorted, according to some specified measure of presortedness. Quicksort is not among these, as it uses Omega(n log n) comparisons even when the input is already sorted. However, in this paper we demonstrate empirically that the actual running time of Quicksort <em>is</em> adaptive with respect to the presortedness measure Inv. Differences close to a factor of two are observed between instances with low and high Inv value. We then show that for the randomized version of Quicksort, the number of element <em>swaps</em> performed is <em>provably</em> adaptive with respect to the measure Inv. More precisely, we prove that randomized Quicksort performs expected O(n (1 + log (1 + Inv/n))) element swaps, where Inv denotes the number of inversions in the input sequence. This result provides a theoretical explanation for the observed behavior, and gives new insights on the behavior of the Quicksort algorithm. We also give some empirical results on the adaptive behavior of Heapsort and Mergesort.
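The paper's swap-count claim can be probed with a randomized quicksort instrumented to count element swaps. This is an illustrative sketch (Lomuto partitioning, counting only the partition-loop swaps), not the paper's experimental setup: a fully sorted input (Inv = 0) triggers far fewer swaps than a shuffled one.

```python
import random

# Randomized quicksort (Lomuto partition) instrumented to count
# element swaps in the partition loop, illustrating the paper's
# claim that swap counts adapt to the number of inversions (Inv).

def quicksort_swaps(a):
    swaps = 0
    def qs(lo, hi):
        nonlocal swaps
        if lo >= hi:
            return
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]          # move a random pivot to the end
        pivot, i = a[hi], lo
        for j in range(lo, hi):
            if a[j] < pivot:
                if i != j:                 # count only real element moves
                    a[i], a[j] = a[j], a[i]
                    swaps += 1
                i += 1
        a[i], a[hi] = a[hi], a[i]          # place the pivot
        qs(lo, i - 1)
        qs(i + 1, hi)
    qs(0, len(a) - 1)
    return swaps
```

On an already sorted input this counter stays at zero, while a shuffled input of the same size incurs on the order of n log n swaps, consistent with the O(n(1 + log(1 + Inv/n))) bound.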


In the era of new technology we have huge amounts of data to deal with, and arranging such data has remained a big challenge. This research paper covers two sorting algorithms, Heap Sort and Insertion Sort, and analyzes their performance on the basis of running time along with their complexity. The paper includes the algorithms and their implementation in the Java programming language. For the results of this study, the two sorting algorithms are compared on different sizes of input data at running time: Large, Average, and Small. For Large data, 100 integers are passed in the array; for Average data, 50 integers; and for Small data, 10 integers. This determines which sorting technique is more efficient for the given input data. To identify the efficiency of these algorithms on this data, three cases are used: Best, Average, and Worst Case. The results of the analysis are presented as graphs showing how much time each algorithm takes to produce the desired output.
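The comparison described above can be sketched as follows. This is in Python rather than the paper's Java, using the paper's three input sizes (Small = 10, Average = 50, Large = 100 integers); it is an illustrative benchmark, not the authors' code.

```python
import heapq
import random
import time

# Sketch of the paper's comparison: heap sort (via a binary min-heap)
# vs. insertion sort, timed on Small (10), Average (50), and
# Large (100) integer arrays.

def heap_sort(a):
    h = list(a)
    heapq.heapify(h)                       # build a min-heap in O(n)
    return [heapq.heappop(h) for _ in range(len(h))]

def insertion_sort(a):
    a = list(a)
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:       # shift larger elements right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

for size in (10, 50, 100):                 # Small, Average, Large
    data = [random.randint(0, 1000) for _ in range(size)]
    for sort in (heap_sort, insertion_sort):
        t0 = time.perf_counter()
        sort(data)
        print(f"{sort.__name__:14s} n={size:3d}: {time.perf_counter() - t0:.6f}s")
```

At these small sizes insertion sort's low constant factors often win despite its O(n²) worst case; heap sort's O(n log n) advantage shows up as n grows.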


Author(s):  
Chun-Yuan Lin ◽  
Wei Sheng Lee ◽  
Chuan Yi Tang

Sorting is a classic algorithmic problem, and its importance has led to the design and implementation of various sorting algorithms on many-core graphics processing units (GPUs). CUDPP Radix sort is the most efficient sort on GPUs, and GPU Sample sort is the best comparison-based sort. Although the implementations of these algorithms are efficient, they either need extra space for data rearrangement or rely on atomic operations for acceleration. Sorting applications usually deal with a large amount of data, so memory utilization is an important consideration. Furthermore, on GPUs without atomic operation support, these sorting algorithms can suffer performance degradation or fail to work. In this paper, an efficient implementation of a parallel shellsort algorithm, CUDA shellsort, is proposed for many-core GPUs with CUDA. Experimental results show that, on average, CUDA shellsort is nearly twice as fast as GPU quicksort and 37% faster than Thrust mergesort under uniform distribution. Moreover, its performance matches GPU sample sort up to 32 million data elements while needing only constant space. CUDA shellsort is also robust over various data distributions and could be suitable for other many-core architectures.
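For context, here is a sequential shellsort sketch. The paper's CUDA shellsort runs the per-gap insertion passes in parallel across GPU threads, which is not reproduced here; this CPU version only shows the gap-sequence structure and the in-place, constant-extra-space property the paper exploits.

```python
# Sequential shellsort. Each pass performs a gapped insertion sort,
# leaving the array gap-sorted; halving the gap until it reaches 1
# yields a fully sorted array, in place with O(1) extra space.

def shellsort(a):
    n = len(a)
    gap = n // 2
    while gap > 0:
        for i in range(gap, n):
            key, j = a[i], i
            while j >= gap and a[j - gap] > key:
                a[j] = a[j - gap]          # shift gap-distant elements right
                j -= gap
            a[j] = key
        gap //= 2
    return a
```

The independence of the insertion chains within one gap pass is what makes the algorithm attractive for GPU parallelization.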


Author(s):  
R.A. Ploc ◽  
G.H. Keech

An unambiguous analysis of transmission electron diffraction effects requires two samplings of the reciprocal lattice (RL). However, extracting definitive information from the patterns is difficult even for a general orthorhombic case. The usual procedure has been to deduce the approximate variables controlling the formation of the patterns from qualitative observations. Our present purpose is to illustrate two applications of a computer programme written for the analysis of transmission, selected area diffraction (SAD) patterns: the studies of RL spot shapes and epitaxy. When a specimen contains fine structure, the RL spots become complex shapes with extensions in one or more directions. If the number and directions of these extensions can be estimated from an SAD pattern, the exact spot shape can be determined by a series of refinements of the computer input data.

