Parallel Clustering Algorithms for Image Processing on Multi-core CPUs

Author(s):  
Honggang Wang ◽  
Jide Zhao ◽  
Hongguang Li ◽  
Jianguo Wang
2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Wen Xiao ◽  
Juan Hu

Clustering is one of the most important unsupervised machine learning tasks and is widely used in information retrieval, social network analysis, image processing, and other fields. With the explosive growth of data, classical clustering algorithms cannot meet the requirements of clustering big data. Spark is one of the most popular parallel processing platforms for big data, and researchers have proposed many parallel clustering algorithms based on it. In this paper, the existing Spark-based parallel clustering algorithms are classified and summarized, the parallel design framework of each kind of algorithm is discussed, and, after comparing the different kinds of algorithms, directions for future research are outlined.


2020 ◽  
Vol 2020 ◽  
pp. 1-7
Author(s):  
Abdelilah Et-taleby ◽  
Mohammed Boussetta ◽  
Mohamed Benslimane

Clustering, or grouping, is among the most important image processing methods; it aims to split an image into different groups. Many clustering algorithms have been proposed in the literature, and the K-means algorithm is considered among the simplest and most widely used for partitioning an image into regions. In this context, the main objective of this work is to detect and precisely locate the damaged area in photovoltaic (PV) fields by clustering a thermal image with the K-means algorithm. The clustering quality depends on the number of clusters chosen; hence, the elbow method, the average silhouette method, and the NbClust R package are used to find the optimal number K. The simulations show that the K-means algorithm detects faults in PV panels precisely. The best result is obtained with three clusters, as suggested by the elbow method.
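The elbow method mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes grayscale intensities treated as scalar values, a deterministic quantile-based initialization chosen here for reproducibility, and the hypothetical helper names `kmeans_1d` and `elbow_curve`.

```python
def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalar intensities; returns (centroids, wcss)."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: deterministic init at evenly spaced quantiles of the data.
    centroids = [float(ordered[(2 * i + 1) * n // (2 * k)]) for i in range(k)]
    for _ in range(iters):
        # Assignment step: put each value in the group of its nearest centroid.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: (v - centroids[j]) ** 2)
            groups[nearest].append(v)
        # Update step: move each centroid to the mean of its group.
        updated = [sum(g) / len(g) if g else centroids[j]
                   for j, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    # Within-cluster sum of squares for the final centroids.
    wcss = sum(min((v - c) ** 2 for c in centroids) for v in values)
    return centroids, wcss


def elbow_curve(values, k_max):
    """Map each K in 1..k_max to its within-cluster sum of squares."""
    return {k: kmeans_1d(values, k)[1] for k in range(1, k_max + 1)}


# Toy "thermal image" with three intensity groups (illustrative data only).
pixels = [20, 22, 25, 21, 120, 118, 122, 125, 240, 238, 242, 245]
curve = elbow_curve(pixels, 5)
for k in sorted(curve):
    print(k, round(curve[k], 2))
```

On this toy data the within-cluster sum of squares falls sharply up to K = 3 and only slightly afterwards; the bend in the curve is what the elbow method reads as the suggested number of clusters.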


2011 ◽  
Vol 301-303 ◽  
pp. 1133-1138
Author(s):  
Yan Xiang Fu ◽  
Wei Zhong Zhao ◽  
Hui Fang Ma

Data clustering has received considerable attention in many applications, such as data mining, document retrieval, image segmentation, and pattern classification. The growing volume of information produced by technological progress makes clustering very large-scale data a challenging task. To deal with this problem, more researchers are trying to design efficient parallel clustering algorithms. In this paper, we propose a parallel DBSCAN clustering algorithm based on Hadoop, a simple yet powerful parallel programming platform. The experimental results demonstrate that the proposed algorithm scales well and efficiently processes large datasets on commodity hardware.
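For readers unfamiliar with DBSCAN, the following is a minimal sequential sketch of the base algorithm on 2-D points. It does not reproduce the Hadoop-based parallel design described in the abstract; the function name `dbscan` and the brute-force neighbor search are assumptions made for illustration.

```python
from collections import deque

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each 2-D point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Brute-force region query; includes the point itself.
        xi, yi = points[i]
        return [j for j, (x, y) in enumerate(points)
                if (x - xi) ** 2 + (y - yi) ** 2 <= eps * eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:
                queue.extend(reach)  # j is a core point: keep expanding
        cluster += 1
    return labels


pts = [(0, 0), (0, 1), (1, 0), (1, 1),
       (10, 10), (10, 11), (11, 10), (11, 11),
       (50, 50)]
print(dbscan(pts, eps=1.5, min_pts=3))
```

The quadratic neighbor search is the part that becomes expensive on large datasets, which is one reason parallel formulations such as the one in this abstract are of interest.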


Author(s):  
Anurag Sinha ◽  
Harsh Soni

Human beatboxing is a vocal art that uses the speech organs to produce vocal drum sounds and imitate musical instruments. Beatbox sound classification is a current challenge that can be used for automatic database annotation and music information retrieval. In this study, a large-vocabulary human beatbox sound recognition system was developed by adapting the Kaldi toolbox, a widely used tool for automatic speech recognition. The corpus consisted of eighty boxemes, which were recorded repeatedly by two beatboxers. The sounds were annotated and transcribed for the system by means of a beatbox-specific morphographic writing system (Vocal Grammatics). Image processing techniques play a vital role in image acquisition, image pre-processing, clustering, segmentation, and classification for different kinds of images, such as fruit, medical, vehicle, and digital text images. In this study, unwanted noise is removed from the various images and enhancement techniques are applied, such as contrast-limited adaptive histogram equalization, Laplacian and Haar filtering, unsharp masking, sharpening, high-boost filtering, and color models. Clustering algorithms are then used to organize the data logically and to support pattern analysis, grouping, decision-making, and machine-learning techniques, and the regions are segmented using binary, K-means, and Otsu segmentation algorithms. The images are classified with SVM and K-nearest neighbour (KNN) classifiers, which produce good results for those images.
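Of the segmentation methods named in this abstract, Otsu's method is compact enough to sketch. The version below is an illustrative pure-Python implementation, not the authors' code; it assumes 8-bit integer intensities and the hypothetical function name `otsu_threshold`.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.

    Pixels with intensity <= t form the background class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]           # background pixel count
        sum_bg += t * hist[t]     # background intensity sum
        w_fg = total - w_bg       # foreground pixel count
        if w_bg == 0:
            continue              # no background pixels yet
        if w_fg == 0:
            break                 # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Toy bimodal image: 50 dark pixels and 50 bright pixels.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]  # True marks foreground
```

Binary segmentation then reduces to comparing each pixel with `t`, as in the `mask` line above.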


Sensors ◽  
2019 ◽  
Vol 19 (15) ◽  
pp. 3438 ◽  
Author(s):  
Xia ◽  
Huang ◽  
Li ◽  
Zhou ◽  
Zhang

Remote sensing big data (RSBD) is generally characterized by huge volumes, diversity, and high dimensionality. Mining hidden information from RSBD for different applications imposes significant computational challenges. Clustering is an important data mining technique widely used in processing and analyzing remote sensing imagery. However, conventional clustering algorithms are designed for relatively small datasets. When applied to problems with RSBD, they are, in general, too slow or inefficient for practical use. In this paper, we propose a parallel subsampling-based clustering (PARSUC) method for improving the performance of RSBD clustering in terms of both efficiency and accuracy. PARSUC leverages a novel subsampling-based data partitioning (SubDP) method to realize three-step parallel clustering, effectively removing a notable performance bottleneck of existing parallel clustering algorithms: they must perform numerous repeated calculations to reach a reasonable result. Furthermore, we propose a centroid filtering algorithm (CFA) to eliminate subsampling errors and to guarantee the accuracy of the clustering results. PARSUC was implemented on a Hadoop platform using the MapReduce parallel model. Experiments conducted on massive remote sensing images of different sizes showed that PARSUC (1) provided much better accuracy than conventional remote sensing clustering algorithms in handling larger image data; (2) achieved notable scalability as computing nodes were added; and (3) spent much less time than the existing parallel clustering algorithm in handling RSBD.
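The general idea of clustering subsamples in parallel and then merging the local results can be sketched as follows. This is a generic subsample, cluster, and merge pattern, not PARSUC itself: the abstract does not specify SubDP or CFA in enough detail to reproduce them, so the strided subsampling, the merge step, and the names `kmeans_1d` and `subsample_cluster` are all assumptions. Threads stand in for Hadoop workers; for CPU-bound work in Python, `ProcessPoolExecutor` offers the same interface.

```python
from concurrent.futures import ThreadPoolExecutor


def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalars; returns the sorted centroids."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: deterministic init at evenly spaced quantiles.
    centroids = [float(ordered[(2 * i + 1) * n // (2 * k)]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            groups[nearest].append(v)
        updated = [sum(g) / len(g) if g else centroids[j]
                   for j, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def subsample_cluster(values, k, parts=4):
    """Cluster strided subsamples concurrently, then merge local centroids."""
    # Step 1: each strided slice is a systematic subsample of the data.
    subsamples = [values[i::parts] for i in range(parts)]
    # Step 2: cluster every subsample independently.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        local = list(pool.map(lambda s: kmeans_1d(s, k), subsamples))
    # Step 3: merge by clustering the pooled local centroids.
    pooled = [c for centroids in local for c in centroids]
    return kmeans_1d(pooled, k)


# Illustrative data: three interleaved intensity groups near 10, 100, and 200.
values = [v for i in range(40) for v in (10 + i % 5, 100 + i % 5, 200 + i % 5)]
print(subsample_cluster(values, k=3))
```

The merge step works on only `parts * k` values, so its cost is negligible next to the per-subsample clustering, which is where the parallelism pays off.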


Author(s):  
Luminita Moraru ◽  
Simona Moldovanu ◽  
Anjan Biswas

Today, medical image processing and analysis are highly active research fields, boosted by rapid technical developments in the medical imaging field. This chapter describes common procedures such as thresholding methods and clustering algorithms (both non-hierarchical and hierarchical approaches) used for digital image processing, with specific reference to brain magnetic resonance images. These techniques are starting points for more sophisticated methods such as segmentation and classification. The results of these methods are used for the classification of neurodegenerative diseases such as Alzheimer's, Pick's, Huntington's, or cerebral calcinosis. A number of applications, together with code listings, are provided with the aim of making the subject accessible and practical. The MATLAB software will help readers identify and choose the best solution for a particular problem.


Author(s):  
Dariusz Malyszko ◽  
Jaroslaw Stepaniuk

Clustering, understood as a data grouping technique, is a fundamental procedure in image processing. The present chapter is concerned with combining the concepts of rough sets and entropy measures in the area of image segmentation. In this context, comprehensive investigations into rough set entropy based clustering techniques for image segmentation have been performed. Segmentation comprises low-level image transformation routines concerned with partitioning an image into distinct, disjoint, and homogeneous regions. Among segmentation routines, threshold-based algorithms and clustering algorithms are most often applied in practical solutions when there is a pressing need for simplicity and robustness. Rough entropy threshold based segmentation algorithms simultaneously combine optimal threshold determination with rough region approximations and region entropy measures. In the present chapter, new algorithmic schemes, RECA, in the area of rough entropy based partitioning routines are proposed. Rough entropy clustering incorporates the notion of rough entropy into clustering models, which makes it possible to deal with some degree of uncertainty in the analyzed data. The RECA algorithmic schemes usually performed as robustly as standard k-means algorithms. At the same time, in many runs they yielded slightly better performance, which makes future implementation in clustering applications possible.


2019 ◽  
Vol 29 (3) ◽  
pp. 150 ◽  
Author(s):  
Elham Jasim Mohammad

Nanotechnology is one of the many applications in which image processing is used. For optimal nanoparticle visualization and characterization, the high-resolution scanning electron microscope (SEM) and the atomic force microscope (AFM) are used. Image segmentation is one of the critical steps in nanoscale processing. There are different ways to perform segmentation, including statistical approximations. In this study, we used the K-means method to determine the optimal threshold using a statistical approximation. This technique is studied thoroughly for an SEM image of a silver nanostructure. Note that the image obtained by SEM is good enough to analyze more recent images. The analysis is being used in the field of nanotechnology. The K-means algorithm partitions the given data set into k groups based on a distance measure. K-means is the most widely used of all clustering algorithms. It is one of the common techniques used in statistical data analysis, image analysis, neural networks, classification analysis, and biometric information. K-means is one of the fastest clustering algorithms and can easily be used in image segmentation. The results showed that K-means is highly sensitive to small data sets, on which performance can degrade at any time. When exposed to a huge data set, such as 100,000, the performance increases significantly. The algorithm also works well when the number of clusters is small. This technique has helped to provide a well-performing algorithm for the state of the image being tested.

