Cluster analysis for multidimensional objects in fuzzy data conditions

Author(s):  
Yuriy Zack

This article surveys the many areas of practical application of multivariate cluster analysis under fuzzy initial data that are described in the literature. New algorithms and formulas are proposed for combining multidimensional objects, whose parameters are given by fuzzy sets, into clusters, and for calculating the coordinates of the centroids of their membership functions. Several types of clustering criteria are formulated: minimizing the weighted average and the sum of the distances between the centroids of objects and clusters, expressed in different metrics, and maximizing the distances between the centroids of different clusters. Formulations and mathematical models of three NP-hard problems of multidimensional clustering under fuzzy data are proposed; any of the considered optimality criteria can be used in solving them. Heuristic algorithms for the approximate solution of two of the formulated problems have been developed. The algorithm for solving the first problem is illustrated with a numerical example. The results obtained can serve as a direction for further research and have wide practical applications.
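As a minimal sketch of the kind of criterion described above, the following Python code computes the centroid of each object's membership functions and evaluates the sum of distances between object centroids and cluster centroids. It assumes triangular membership functions, Euclidean distance, and a cluster centroid taken as the plain mean of the object centroids; none of these choices is stated in the abstract, and all function names are hypothetical.

```python
import math

def tri_centroid(a, b, c):
    """Centroid (abscissa) of a triangular membership function
    with support [a, c] and peak at b (assumed shape)."""
    return (a + b + c) / 3.0

def object_centroid(obj):
    """Map a fuzzy object (one triangular number per dimension)
    to the crisp point formed by the centroids of its coordinates."""
    return [tri_centroid(*t) for t in obj]

def cluster_centroid(points):
    """Plain arithmetic mean of crisp points (an assumption of this sketch)."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def euclidean(p, q):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def total_distance(clusters):
    """Criterion: sum over clusters of the distances between each
    object centroid and the centroid of its cluster."""
    total = 0.0
    for objs in clusters:
        pts = [object_centroid(o) for o in objs]
        centre = cluster_centroid(pts)
        total += sum(euclidean(p, centre) for p in pts)
    return total

# Hypothetical two-dimensional fuzzy objects.
objects = [
    [(0, 1, 2), (0, 1, 2)],
    [(2, 3, 4), (0, 1, 2)],
    [(8, 9, 10), (8, 9, 10)],
]
good = [[objects[0], objects[1]], [objects[2]]]
bad = [[objects[0]], [objects[1], objects[2]]]
print(total_distance(good), total_distance(bad))
```

A partition that groups the two nearby objects together yields a smaller criterion value than one that pairs a nearby object with the distant one, which is the behaviour a minimization criterion of this type relies on.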

1989 ◽  
Vol 21 (8-9) ◽  
pp. 1057-1064 ◽  
Author(s):  
Vijay Joshi ◽  
Prasad Modak

Waste load allocation for rivers has been a topic of growing interest. Dynamic programming based algorithms are particularly attractive in this context and are widely reported in the literature. Codes developed for dynamic programming are, however, complex, require substantial computer resources and, importantly, do not allow user interaction. Further, there is always resistance to utilizing mathematical programming based algorithms for practical applications. There has therefore always been a gap between theory and practice in systems analysis in water quality management. This paper presents various heuristic algorithms to bridge this gap, with supporting comparisons with dynamic programming based algorithms. These heuristics make good use of the insight gained into the system's behaviour through experience, a process akin to the one adopted by field personnel, and can therefore readily be understood by a user familiar with the system. They also allow user preferences in decision making via on-line interaction. Experience has shown that these heuristics are indeed well founded and compare very favourably with the sophisticated dynamic programming algorithms. Two examples are included which demonstrate the success of the heuristic algorithms.


2020 ◽  
Vol 17 (4) ◽  
pp. 73-80 ◽  
Author(s):  
Vera Snezhko ◽  
Dmitrii Benin ◽  
Artem Lukyanets ◽  
Larisa Kondratenko

Considering the features of the hydrological conditions of the hydro-chemical system, this paper analyses the hydro-ecological status of the Kuban river basin. The results of the study on water chemical composition depending on the distance from the source are presented. By comparing the results with the reference values of water quality, increased aluminium, zinc, and copper content was established. The dendrograms obtained from the hydro-ecological analysis of the Kuban River and its tributaries are presented. The findings are significant at p < 0.0005, and the correlation coefficient ranges from 0.935 to 1. The results of multivariate cluster analysis showed that the Kuban basin has an increased content of particular heavy metals such as aluminium, copper, and zinc.


Author(s):  
N. P. Szabó ◽  
B. A. Braun ◽  
M. M. G. Abdelrahman ◽  
M. Dobróka

The identification of lithology, fluid types, and total organic carbon content is of great priority in the exploration of unconventional hydrocarbons. As a new alternative, a further developed K-means type clustering method is suggested for the evaluation of shale gas formations. The traditional approach of cluster analysis is mainly based on the use of the Euclidean distance for grouping the objects of multivariate observations into different clusters. The high sensitivity of the L2 norm to non-Gaussian distributed measurement noise is well known; it can be reduced by selecting a more suitable norm as the distance metric. To suppress the harmful effect of non-systematic errors and outlying data, the Most Frequent Value method, a robust statistical estimator, is combined with the K-means clustering algorithm. The Cauchy-Steiner weights calculated by the Most Frequent Value procedure are applied to measure the weighted distance between the objects, which improves the performance of cluster analysis compared to the Euclidean norm. At the same time, the centroids are also calculated as a weighted average (using the Most Frequent Value method) instead of the arithmetic mean. The suggested statistical method is tested using synthetic datasets as well as observed wireline logs, mud-logging data and core samples collected from the Barnett Shale Formation, USA. The synthetic experiment using extremely noisy well logs demonstrates that the newly developed robust clustering procedure is able to separate the geological-lithological units in hydrocarbon formations and provide additional information to standard well log analysis. It is also shown that the Cauchy-Steiner weighted cluster analysis is less affected by outliers, which allows more efficient processing of poor-quality wireline logs and an improved evaluation of shale gas reservoirs.
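The idea of replacing the arithmetic mean with a weighted centroid can be illustrated with a short Python sketch. It is not the authors' implementation: it assumes one-dimensional data, a Cauchy-type weight of the form eps² / (eps² + (x - center)²), and a fixed scale parameter `eps` chosen by hand (how the scale is actually determined is not stated in the abstract). The function name is hypothetical.

```python
def cauchy_weighted_center(values, eps=1.0, iterations=50):
    """Iteratively reweighted average with Cauchy-type weights.

    Points far from the current center receive weights close to zero,
    so outliers contribute little. `eps` is a fixed scale parameter,
    which is a simplifying assumption of this sketch.
    """
    # Start from a median-like value so the first weights are sensible.
    center = sorted(values)[len(values) // 2]
    for _ in range(iterations):
        weights = [eps ** 2 / (eps ** 2 + (v - center) ** 2) for v in values]
        center = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return center

# Hypothetical noisy measurements with one gross outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]
arithmetic_mean = sum(data) / len(data)
robust_center = cauchy_weighted_center(data)
print(arithmetic_mean, robust_center)
```

In this example the arithmetic mean is pulled well away from the bulk of the data by the single outlier, while the weighted center stays close to 10. A robust K-means variant would use such a weighted center in the centroid update step of each cluster.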


Geosciences ◽  
2018 ◽  
Vol 8 (10) ◽  
pp. 379 ◽  
Author(s):  
Matthew Kendall ◽  
Ken Buja ◽  
Charles Menza ◽  
Tim Battista

Globally, there is a lack of resources to survey the vast seafloor areas in need of basic mapping data. Consequently, smaller areas must be prioritized to address the most urgent needs. We developed a systematic, quantitative approach and on-line application to gather mapping suggestions from diverse stakeholders. Participants are each provided with 100 virtual coins to place throughout a region of interest to convey their mapping priorities. Inputs are standardized into a spatial framework using a grid and pull-down menus. These enabled participants to convey the types of mapping products that they need, the rationale used to justify their needs, and the locations that they prioritize for mapping. This system was implemented in a proposed National Marine Sanctuary encompassing 2784 km² of Lake Michigan, Wisconsin. We demonstrate key analyses of the outputs, including coin counts, cell ranking, and multivariate cluster analysis for isolating high priority topics and locations. These techniques partition the priorities among the disciplines of the respondents, their selected justifications, and types of desired map products. The results enable respondents to identify potential collaborations to achieve common goals and more effectively invest limited mapping funds. The approach can be scaled to accommodate larger geographic areas and numbers of participants and is not limited to seafloor mapping.
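The coin counts and cell ranking mentioned above can be sketched in a few lines of Python. The cell identifiers, participant names, and the rule that a participant may place at most 100 coins are assumptions made for illustration; the abstract does not describe the data layout of the actual application.

```python
from collections import Counter

COIN_BUDGET = 100  # coins given to each participant, as stated in the abstract

def rank_cells(allocations):
    """Sum the coins placed in each grid cell over all participants and
    return (cell, total) pairs ordered from highest to lowest total.
    Ties are broken alphabetically by cell identifier."""
    totals = Counter()
    for participant, placements in allocations.items():
        if sum(placements.values()) > COIN_BUDGET:
            raise ValueError(f"{participant} placed more than {COIN_BUDGET} coins")
        for cell, coins in placements.items():
            totals[cell] += coins
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical input: participant -> {grid cell: coins placed}.
allocations = {
    "participant_1": {"A1": 60, "B2": 40},
    "participant_2": {"B2": 70, "C3": 30},
    "participant_3": {"A1": 20, "B2": 30, "C3": 50},
}
ranking = rank_cells(allocations)
print(ranking)
```

The per-cell totals produced this way could then serve as input to a multivariate cluster analysis, for example by keeping separate totals per respondent discipline or per justification.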


2016 ◽  
Vol 21 (2) ◽  
pp. 101 ◽  
Author(s):  
Bima Anjasmoro ◽  
Suharyanto Suharyanto ◽  
Sri Sangkawati

A feasibility study of potential small dams in Semarang District identified 8 (eight) urgent potential small dams. These dams have to be constructed within 5 (five) years in order to overcome the problem of water shortage in the district. However, the government has limited funding, so it is necessary to select the most urgent small dams to be constructed within the limited budget. The purpose of this research is to determine the priority of small dam construction in Semarang District. The methods used to determine the priority are cluster analysis, AHP, and the weighted average method. The criteria used consist of: vegetation in the inundated area, volume of embankment, land acquisition area, useful storage, reservoir lifetime, water cost/m³, access road to the dam site, land status at the abutment and inundated area, construction cost, operation and maintenance cost, irrigation service area, and raw water benefit. Based on the results of cluster analysis, AHP, and the weighted average method, it can be concluded that the priority of small dam construction is 1) Mluweh Small Dam (0.165), 2) Pakis Small Dam (0.142), 3) Lebak Small Dam (0.134), 4) Dadapayam Small Dam (0.128), 5) Gogodalem Small Dam (0.119), 6) Kandangan Small Dam (0.114), 7) Ngrawan Small Dam (0.102) and 8) Jatikurung Small Dam (0.096). A comparison of the priority order given by the 3 (three) methods showed that the AHP method is more detailed than the cluster analysis method and the weighted average method, because the result of the AHP method is closer to the conditions of each dam in the field.
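For readers unfamiliar with AHP (the analytic hierarchy process), the Python sketch below shows how a priority vector that sums to 1, like the dam scores listed above, can be derived from a pairwise comparison matrix. It uses the row geometric mean approximation, which is one common way to obtain AHP priorities; the abstract does not say which variant the authors used, and the 3 x 3 matrix is invented for illustration rather than taken from the study.

```python
import math

def ahp_priorities(pairwise):
    """Approximate the AHP priority vector with the row geometric mean
    method: take the geometric mean of each row of the pairwise
    comparison matrix, then normalise so the priorities sum to 1."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

# Hypothetical comparison of three alternatives: entry [i][j] states how
# strongly alternative i is preferred over alternative j.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
priorities = ahp_priorities(pairwise)

# Rank alternatives from highest to lowest priority.
order = sorted(range(len(priorities)), key=lambda i: -priorities[i])
print(priorities, order)
```

Because this example matrix is perfectly consistent, the resulting priorities are in the ratio 4 : 2 : 1. Real comparison matrices are rarely consistent, and a full AHP study would also check a consistency ratio, which this sketch omits.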


2014 ◽  
Vol 919-921 ◽  
pp. 1630-1633
Author(s):  
Xiu Feng Ma

In the middle of the 20th century, geography experienced a number of "revolutions". Statistics occupies an increasingly important position in urban geography research. This paper analyzes the application of statistics in the development of urban geography and, by means of a case study, illustrates specific applications of multivariate cluster analysis, regression analysis and other statistical methods in the study of urban geography.


2003 ◽  
Vol 140 (3) ◽  
pp. 297-304 ◽  
Author(s):  
W. ADUGNA ◽  
M. T. LABUSCHAGNE

Multivariate cluster and canonical variate analyses were undertaken for 10 genotypes of linseed (Linum usitatissimum L.) that were tested in a four-times replicated randomized block design across 18 environments (six localities by 3 years) of Ethiopia. The main aims of this study were to determine the similarities and differences of the genotypes and their testing environments, and to compare applicability of the two statistical methods. Cluster analysis grouped the genotypes into five classes in accordance with their original sources. The six locations and 18 environments were stratified into four and seven clusters, respectively. Three sites (Bekoji, Kulumsa and Sinana) were separately stratified, while three other ones (Holetta, Asasa and Adet) showed closer similarity. Canonical variate analysis indicated that ‘D33C’ and ‘D24C’ were distinguished from the other genotypes by their high oil contents. ‘N10D’ and ‘Norlin’ had closer values and were thus preferred for their good seed yield and earliness. Days to flowering and maturity, oil contents and lodging per cent played major roles in discriminating the genotypes. Comparison of the two methods showed clearer differentiation by cluster analysis than canonical variate analysis. Canonical variate analysis also contributed information on how each variable discriminated the genotypes and their test environments. Thus, both methods complement each other in providing useful information for more efficient variety development programmes.

