Machine-part grouping and cluster analysis: similarities, distances and grouping criteria

2009 ◽  
Vol 57 (3) ◽  
pp. 217-228 ◽  
Author(s):  
J. Owsiński

The paper considers the machine-part grouping problem as equivalent to partitioning the set of machines and operations into subsets, which corresponds to block diagonalisation with constraints. Attempts to solve the problem with clustering methods are outlined. The difficulties encountered are presented; they relate to (i) ambiguity of formulations, (ii) selection of criteria, and (iii) lack of effective algorithms. These are illustrated in more detail with a limited survey of similarity and distance definitions and of the criteria used, which constitutes the main body of the paper. A return to the basic paradigm of cluster analysis is proposed, since it provides simple and fast algorithms which, even if they do not yield optimal solutions, can be controlled in a simple manner and whose solutions can be improved.
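
To make the notion of a similarity definition concrete, here is a minimal sketch of one coefficient commonly used for machine-part grouping: the Jaccard similarity between machine rows of a binary machine-part incidence matrix. The abstract does not say which definitions the survey covers, so the choice of Jaccard, the function name `jaccard_similarity`, and the small `incidence` matrix are all assumptions made for illustration.

```python
def jaccard_similarity(row_a, row_b):
    """Jaccard similarity between two binary machine rows.

    Counts parts processed by both machines, divided by parts
    processed by at least one of them.
    """
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    # Two machines that process no parts at all are treated as dissimilar.
    return both / either if either else 0.0


# Hypothetical incidence matrix: rows are machines, columns are parts.
incidence = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]

# Pairwise machine-machine similarity matrix.
similarity = [[jaccard_similarity(a, b) for b in incidence] for a in incidence]
```

A matrix of this kind could then be passed to any standard clustering procedure; machines 0 and 1 share two of three parts and would tend to be grouped together, while machine 2 shares none with machine 0.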

Author(s):  
Jiaxiong Pi ◽  
Yong Shi ◽  
Zhengxin Chen

Image content analysis plays an important role in adaptive multimedia retrieval. In this chapter, the authors present their work on using a spatial data structure, the R*-tree, for similarity analysis and cluster analysis of image contents. First, they describe an R*-tree based similarity analysis tool for similarity retrieval of images. They then discuss R*-tree based clustering methods for images, which has been a tricky issue: although objects stored in the same R*-tree leaf node enjoy spatial proximity, it is well known that R*-trees cannot be used directly for cluster analysis. Nevertheless, the R*-tree's indexing feature can be used to assist existing cluster analysis methods and thus improve clustering quality. The authors report their progress in using R*-trees to improve the well-known K-means and hierarchical clustering methods. Based on the R*-tree's feature of indexing Minimum Bounding Boxes (MBBs) according to spatial proximity, the authors extend the R*-tree's application to cluster analysis of image data. Two improved algorithms, KMeans-R and Hierarchy-R, are proposed. Experiments have shown that KMeans-R and Hierarchy-R achieve better clustering quality.


2020 ◽  
Vol 31 (5) ◽  
pp. 565-574
Author(s):  
Alicja Ewa Gudanowska ◽  
Anna Kononiuk ◽  
Katarzyna Dębkowska

Trends and megatrends affecting the labour market are changing rapidly. Inevitable changes create a permanent need to redefine employees' competences in order to meet employers' expectations. The scientific problem raised by the authors of the article is defining a methodology for identifying the competences of future-oriented entrepreneurs. The aim of the article is to present the potential of cluster analysis for the selection of key competences of future-oriented entrepreneurs in the context of foresight research. The main research methods applied in this study were literature review and cluster analysis. The literature review covered global literature, domestic literature, higher education offers, commercial foresight courses, and case studies. The extensive literature review and the analysis of business practices together made it possible to identify more than one thousand six hundred competences of a future-oriented entrepreneur. This large set of competences was then subjected to a preliminary assessment, which resulted in a list of 39 items. The application of cluster analysis made it possible to reduce the number of competences further. Finally, seven competences to be mastered by future-oriented entrepreneurs could be recommended, including but not limited to: the ability to find and interpret weak signals of change and disruptions (wild cards and abnormal phenomena); the ability to act proactively; the ability to manage change and uncertainty; the ability to run strategic foresight within an organization; the ability to create an organizational vision (both collective and individual); and seeing the big picture.


OENO One ◽  
1995 ◽  
Vol 29 (4) ◽  
pp. 183
Author(s):  
Lucinio Júdez ◽  
Javier Litago ◽  
Jesús Yuste ◽  
Antonio Soldevilla ◽  
F. Martinez

In this study, we put forward a procedure, based on principal component analysis and cluster analysis, to guide the first stages of the clonal selection of the "Tinta del País" variety. The material used is a set of vines from the region of Vega Sicilia, situated in the Duero valley in the Province of Valladolid, Spain. On these vines, we originally observed eleven variables associated with their production. This study shows that three of these characters (weight of grapes per plant, total acidity and weight of pruned wood) summarize a great part of the information that can be drawn from the set of variables. In addition, the graphs and tables produced by the statistical treatment of the data provide an excellent synthesis for identifying the most important characteristics of the vines studied. This procedure can be very useful to the breeder, as it considerably reduces the number of variables to analyse when evaluating the vines in the first stages of selection. In particular, it is found that the least advantageous vines are those of either low yield or high total acidity.


2017 ◽  
Vol 37 (3) ◽  
pp. 300-320 ◽  
Author(s):  
Michael J. Brusco ◽  
Renu Singh ◽  
J. Dennis Cradit ◽  
Douglas Steinley

Purpose: The purpose of this paper is twofold. First, the authors provide a survey of operations management (OM) research applications of traditional hierarchical and nonhierarchical clustering methods with respect to key decisions that are central to a valid analysis. Second, the authors offer recommendations for practice with respect to these decisions. Design/methodology/approach: A coding study was conducted for 97 cluster analyses reported in six OM journals during the period spanning 1994-2015. Data were collected with respect to: variable selection, variable standardization, method, selection of the number of clusters, consistency/stability of the clustering solution, and profiling of the clusters based on exogenous variables. Recommended practices for validation of clustering solutions are provided within the context of this framework. Findings: There is considerable variability across clustering applications with respect to the components of validation, as well as a mix of productive and undesirable practices. This justifies the importance of the authors’ provision of a schema for conducting a cluster analysis. Research limitations/implications: Certain aspects of the coding study required some degree of subjectivity with respect to interpretation or classification. However, in light of the sheer magnitude of the coding study (97 articles), the authors are confident that an accurate picture of empirical OM clustering applications has been presented. Practical implications: The paper provides a critique and synthesis of the practice of cluster analysis in OM research. The coding study provides a thorough foundation for how the key decisions of a cluster analysis have been previously handled in the literature. Both researchers and practitioners are provided with guidelines for performing a valid cluster analysis. Originality/value: To the best of the authors’ knowledge, no study of this type has been reported in the OM literature. The authors’ recommendations for cluster validation draw from recent studies in other disciplines that are apt to be unfamiliar to many OM researchers.


T-Comm ◽  
2021 ◽  
Vol 15 (6) ◽  
pp. 40-47
Author(s):  
Oleg I. Sheluhin ◽  
Dmitry I. Rakovsky ◽  

The paper considers the process of labeling multi-attribute experimental data for subsequent use by data mining tools in problems of detecting and classifying rare anomalous events in computer systems (CS). The labeling is carried out using three methods: manual preprocessing, statistical analysis and cluster analysis. Among the attributes of the metric type, the authors identify two macrogroups: “integral attributes” and “impulse attributes”. It is shown that combining statistical and cluster analysis methods increases the accuracy of detecting anomalous events in the CS and also allows attributes to be selected according to their information significance. The expediency of manually preprocessing data before clustering is shown by the example of dividing attributes into macrogroups, analyzing the density distribution using violin plots, and removing the trend component using the difference-stationary (DS) series method. Violin plots constructed for an attribute of the “integral” macrogroup show the distribution of CS states. It is shown that removing the trend component by the DS-series method, followed by normalization and reduction to absolute values, allows more accurate labeling of anomalous outliers, although this is not always acceptable. The interpretation of the clustering results obtained for each normalized attribute shows that the normal values of all attributes are concentrated around zero. The result of labeling the experimental data is attribute-labeled data, in which each attribute at the current time is assigned one of two states: abnormal or normal.
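
The preprocessing chain named in this abstract (trend removal by differencing, normalization, reduction to absolute values, then a two-state label per point) can be sketched as follows. This is an illustration under stated assumptions and not the authors' procedure: first-order differencing stands in for the DS-series method, z-score normalization is assumed, and a fixed cut-off (the hypothetical `threshold` parameter) replaces the cluster analysis the authors use for the final labeling.

```python
from statistics import mean, pstdev


def label_anomalies(series, threshold=2.0):
    """Label each step of a single attribute as 'normal' or 'abnormal'.

    Assumptions (not taken from the paper): first differences remove the
    trend, z-scores normalize, and a fixed threshold replaces clustering.
    """
    # First-order differencing removes a linear trend component.
    diffs = [b - a for a, b in zip(series, series[1:])]
    mu, sigma = mean(diffs), pstdev(diffs)
    if sigma == 0:
        # A constant difference series has no outliers.
        return ["normal"] * len(diffs)
    # Normalize and reduce to absolute values, so normal points sit near zero.
    scores = [abs((d - mu) / sigma) for d in diffs]
    return ["abnormal" if s > threshold else "normal" for s in scores]


# Hypothetical attribute with a steady trend and one jump.
series = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 31]
labels = label_anomalies(series)
```

In this toy series only the step from 9 to 30 exceeds the cut-off; all other steps have scores close to zero, which mirrors the abstract's remark that normal values concentrate around zero after normalization.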


Author(s):  
Georgios N. Aretoulis ◽  
Christophoros H. Triantafyllidis ◽  
Jason Papathanasiou ◽  
Ioannis K. Anagnostopoulos

1980 ◽  
Vol 46 (1) ◽  
pp. 131-134 ◽  
Author(s):  
Mark Chignell ◽  
Barrie G. Stacey

Comparative evaluation of a variety of clustering methods on real and simulated data indicates that the appropriate method for a given set of data must be determined empirically. Selection of an appropriate method generally requires several preliminary analyses. With larger data sets, preliminary analyses on the whole set may not be possible. As an alternative, one may adopt an interactive strategy and break a large set into manageable subsets.

