SOFTWARE ARCHITECTURE DECOMPOSITION USING ATTRIBUTES

Author(s):  
CHUNG-HORNG LUNG ◽  
XIA XU ◽  
MARZIA ZAMAN

Software architectural design has an enormous effect on downstream software artifacts. Decomposing the functions of the final system is one of the critical steps in software architectural design. Designers typically carry out this decomposition based on intuition and past experience, which is not always robust. This paper presents a study of applying clustering techniques to support system decomposition based on requirements and their attributes. The approach can support the architectural design process by grouping closely related requirements to form a subsystem or module. We report experiments that apply the approach to an industrial communication protocol software system and compare several clustering algorithms. The decomposition obtained with WPGMA (weighted pair-group method using arithmetic averages) resembles the one developed by the designer more closely than those produced by the other clustering methods.
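As a rough illustration of grouping requirements by their attributes, the sketch below runs WPGMA on four requirements. The requirement names, the attribute sets, and the choice of Jaccard distance are all assumptions made for illustration; the abstract does not say which attributes or distance measure the authors used.

```python
from itertools import combinations

def jaccard_distance(x, y):
    """1 - |x & y| / |x | y| for two attribute sets (an assumed measure)."""
    return 1 - len(x & y) / len(x | y)

def wpgma(items):
    """Agglomerative clustering with WPGMA linkage.

    items: dict mapping a label to its set of attributes.
    Returns a list of merges as (members_a, members_b, height).
    """
    clusters = {name: (name,) for name in items}
    d = {frozenset(p): jaccard_distance(items[p[0]], items[p[1]])
         for p in combinations(items, 2)}
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        height = d[frozenset((a, b))]
        merged = a + "+" + b
        # WPGMA update: plain average of the two old distances,
        # regardless of how many members each cluster has.
        for k in clusters:
            if k not in (a, b):
                d[frozenset((merged, k))] = (
                    d[frozenset((a, k))] + d[frozenset((b, k))]
                ) / 2
        merges.append((clusters[a], clusters[b], height))
        clusters[merged] = clusters.pop(a) + clusters.pop(b)
    return merges

# Hypothetical requirements and attributes, for illustration only.
requirements = {
    "R1": {"timer", "session"},
    "R2": {"timer", "session", "retry"},
    "R3": {"routing", "table"},
    "R4": {"routing", "table", "retry"},
}
merges = wpgma(requirements)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

Cutting the resulting tree below the last merge leaves two groups, which would correspond to two candidate modules. UPGMA differs only in the update step, where the average is weighted by cluster size.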

2016 ◽  
Author(s):  
Matthew J Vavrek

Cluster analysis is one of the most commonly used methods in palaeoecological studies, particularly in studies investigating biogeographic patterns. Although a number of different clustering methods are widely used, the approach and underlying assumptions of many of these methods are quite different. For example, methods may be hierarchical or non-hierarchical in their approaches, and may use Euclidean distance or non-Euclidean indices to cluster the data. In order to assess the effectiveness of the different clustering methods as compared to one another, a simulation was designed that could assess each method over a range of both cluster distinctiveness and sampling intensity. Additionally, a non-hierarchical, non-Euclidean, iterative clustering method implemented in the R Statistical Language is described. This method, Non-Euclidean Relational Clustering (NERC), creates distinct clusters by dividing the data set in order to maximize the average similarity within each cluster, identifying clusters in which each data point is on average more similar to those within its own group than to those in any other group. While all the methods performed well with clearly differentiated and well-sampled datasets, when data are less than ideal the linkage methods perform poorly compared to non-Euclidean based k-means and the NERC method. Based on this analysis, Unweighted Pair Group Method with Arithmetic Mean and neighbor joining methods are less reliable with incomplete datasets like those found in palaeobiological analyses, and the k-means and NERC methods should be used in their place.
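The membership criterion described above (each point is on average more similar to its own group than to any other group) can be checked for a given partition with a few lines of code. This is only a sketch of that criterion, not the NERC implementation, which the abstract says is written in R; the function names, the handling of singleton clusters, and the similarity matrix are assumptions.

```python
def mean_similarity(i, group, sim):
    """Average similarity of point i to the other members of group."""
    others = [j for j in group if j != i]
    # Assumption: a point alone in its cluster gets an own-group score of 0.
    return sum(sim[i][j] for j in others) / len(others) if others else 0.0

def misplaced_points(clusters, sim):
    """Points that are on average more similar to another cluster than to their own."""
    bad = []
    for own in clusters:
        for i in own:
            own_score = mean_similarity(i, own, sim)
            if any(mean_similarity(i, other, sim) > own_score
                   for other in clusters if other is not own):
                bad.append(i)
    return bad

# Hypothetical pairwise similarity matrix for four sites.
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
good_partition = misplaced_points([[0, 1], [2, 3]], sim)
poor_partition = misplaced_points([[0, 2], [1, 3]], sim)
print(good_partition, poor_partition)
```

An iterative method in this spirit would presumably move misplaced points until none remain, but the abstract does not give those details.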


Entropy ◽  
2019 ◽  
Vol 21 (10) ◽  
pp. 951
Author(s):  
Jérémie Sublime ◽  
Guénaël Cabanes ◽  
Basarab Matei

The aim of collaborative clustering is to enhance the performance of clustering algorithms by enabling them to work together and exchange information to tackle difficult data sets. The fundamental concept of collaboration is that clustering algorithms operate locally but collaborate by exchanging information about the local structures found by each algorithm. This kind of collaborative learning can benefit a wide range of tasks, including multi-view clustering, clustering of distributed data with privacy constraints, multi-expert clustering and multi-scale analysis. Within this context, the main difficulty of collaborative clustering is to determine how to weight the influence of the different clustering methods so as to maximize the quality of the final results and minimize the risk of negative collaborations, where the results are worse after collaboration than before. In this paper, we study how the quality and diversity of the different collaborators, as well as the stability of the partitions, can influence the final results. We propose both a theoretical analysis based on mathematical optimization and a second study based on empirical results. Our findings show that, on the one hand, in the absence of a clear criterion to optimize, a low-diversity pool of solutions with high stability is the best option to ensure good performance. On the other hand, if there is a known criterion to maximize, it is best to rely on a higher-diversity pool of solutions with high quality on that criterion. While our approach focuses on entropy-based collaborative clustering, we believe that most of our results could be extended to other collaborative algorithms.
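One common way to quantify the diversity between two collaborators is one minus the normalized mutual information of their partitions. The sketch below uses that measure as an assumption; the abstract does not state which diversity or entropy measures the authors actually use, and the function names are hypothetical.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (natural log) of a partition given as a label list."""
    n = len(labels)
    return -sum(c / n * log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two partitions of the same points."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(c / n * log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def diversity(a, b):
    """1 - NMI: 0 for identical partitions, 1 for independent ones."""
    ha, hb = entropy(a), entropy(b)
    if ha == 0 or hb == 0:
        # Assumption: two trivial partitions count as identical.
        return 0.0 if ha == hb else 1.0
    return 1 - mutual_information(a, b) / (ha * hb) ** 0.5

p1 = [0, 0, 1, 1]
p2 = [1, 1, 0, 0]   # same grouping as p1 with the labels swapped
p3 = [0, 1, 0, 1]   # grouping unrelated to p1
print(diversity(p1, p2), diversity(p1, p3))
```

Because the measure depends only on co-membership, relabeling the clusters (p1 versus p2) leaves the diversity at zero.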


PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e1720 ◽  
Author(s):  
Matthew J. Vavrek

Cluster analysis is one of the most commonly used methods in palaeoecological studies, particularly in studies investigating biogeographic patterns. Although a number of different clustering methods are widely used, the approach and underlying assumptions of many of these methods are quite different. For example, methods may be hierarchical or non-hierarchical in their approaches, and may use Euclidean distance or non-Euclidean indices to cluster the data. In order to assess the effectiveness of the different clustering methods as compared to one another, a simulation was designed that could assess each method over a range of both cluster distinctiveness and sampling intensity. Additionally, a non-hierarchical, non-Euclidean, iterative clustering method implemented in the R Statistical Language is described. This method, Non-Euclidean Relational Clustering (NERC), creates distinct clusters by dividing the data set in order to maximize the average similarity within each cluster, identifying clusters in which each data point is on average more similar to those within its own group than to those in any other group. While all the methods performed well with clearly differentiated and well-sampled datasets, when data are less than ideal the linkage methods perform poorly compared to non-Euclidean based k-means and the NERC method. Based on this analysis, Unweighted Pair Group Method with Arithmetic Mean and neighbor joining methods are less reliable with incomplete datasets like those found in palaeobiological analyses, and the k-means and NERC methods should be used in their place.


2007 ◽  
Vol 132 (3) ◽  
pp. 387-395 ◽  
Author(s):  
Guillermo Padilla ◽  
María Elena Cartea ◽  
Amando Ordás

Four clustering methods were compared for classification of a collection of 148 kale landraces (Brassica oleracea L. acephala group) from northwestern Spain based on morphologic characters: the unweighted pair group method using arithmetic averages (UPGMA) and the Ward method, hierarchical cluster algorithms, and the modified location model (MLM) applied to both the UPGMA and the Ward method (UPGMA-MLM and Ward-MLM, respectively). Comparisons were based on five criteria and on subjective considerations about the structure of each method and the characteristics of the material evaluated. Although the UPGMA-MLM was superior according to the objective criteria, its slight advantage with respect to the Ward-MLM strategy did not overcome the fact that the initial UPGMA cluster generated a classification with little value. The Ward-MLM strategy generated five homogeneous groups with defined morphologic characteristics. Moreover, the Ward-MLM strategy allowed the identification of redundant landraces, which would permit the number of accessions in further critical trials to be reduced.


2017 ◽  
Vol 35 (3) ◽  
pp. 285-292
Author(s):  
Cintia Graciele Da Silva ◽  
Edneia Zullian Dalbosco ◽  
Petterson Baptista Da Luz ◽  
Willian Krause ◽  
Vivian Loges ◽  
...  

The purpose of this study was to describe morphological traits and estimate genetic divergence and parameters between accessions of the genus Heliconia sp. from different municipalities in the state of Mato Grosso, Brazil. A set of 25 traits, 15 quantitative and 10 qualitative, was evaluated. The genetic divergence was estimated based on Mahalanobis' distance, with the clustering method known as Unweighted Pair Group Method using Arithmetic Averages (UPGMA). Genetic variability was observed for all assessed quantitative traits and the accessions were grouped in different classes. The traits with the highest relative contribution to variability were longevity of flower stems and inflorescence length. The results indicated the existence of genetic variability among accessions of the Heliconia sp. germplasm bank, which can be used in breeding programs.


1989 ◽  
Vol 4 (4) ◽  
pp. 241-244
Author(s):  
P. Lemoine

Summary: It is difficult to undertake field studies with non-marketed psychotropic drugs because of two apparently contradictory conditions: on the one hand, the methodology has to be rigorously controlled, and on the other hand, such studies have to be carried out in their future environment by general practitioners (GPs). Bearing in mind the lack of training and experience regarding this kind of approach, the author adopted a discussion group method according to the techniques developed by M. Balint. The study group comprised five GPs, a clinical pharmacology expert and a doctor from the pharmaceutical laboratory which had developed the test drug. These persons met on a monthly basis over a one-year period. In the present paper, the author indicates the benefits of such a methodology, based on six years’ experience and several trials, with special emphasis placed on the pedagogical aspects.


2021 ◽  
Vol 13 (12) ◽  
pp. 6830
Author(s):  
Murat Guney ◽  
Salih Kafkas ◽  
Hakan Keles ◽  
Mozhgan Zarifikhosroshahi ◽  
Muhammet Ali Gundesli ◽  
...  

The food needs of a growing population, climatic changes, urbanization and industrialization, along with the destruction of forests, are the main challenges of modern life. It is therefore very important to evaluate plant genetic resources in order to cope with these problems. In this study, a set of ninety-one walnut (Juglans regia L.) accessions from the Central Anatolia region, composed of seventy-four accessions and eight commercial cultivars from Turkey, and nine international reference cultivars, was analyzed using 45 SSR (Simple Sequence Repeats) markers to reveal the genetic diversity. SSR analysis identified 390 alleles for 91 accessions. The number of alleles per locus ranged from 3 to 19, with a mean value of 9 alleles per locus. Genetic dissimilarity coefficients ranged from 0.03 to 0.68. The highest number of alleles was obtained from the CUJRA212 locus (Na = 19). The values of polymorphism information content (PIC) ranged from 0.42 (JRHR222528) to 0.86 (CUJRA212) with a mean PIC value of 0.68. Genetic relationships among the accessions were analyzed using UPGMA (Unweighted Pair Group Method with Arithmetic Average), Principal Coordinates Analysis (PCoA), and Structure-based clustering. The UPGMA and Structure clustering of the accessions depicted five major clusters, supporting the PCoA results. The dendrogram revealed the similarities and dissimilarities among the accessions by identifying five major clusters. Based on this study, SSR analyses indicate that Yozgat province has an important genetic diversity pool and rich genetic variance of walnuts.
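For readers unfamiliar with PIC, the sketch below computes it from the allele frequencies at one locus. It assumes the widely used definition PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j²; the abstract does not state which formula the authors applied, and the example frequencies are made up.

```python
def pic(freqs):
    """Polymorphism information content for one locus.

    freqs: allele frequencies at the locus, expected to sum to 1.
    """
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1 - homozygosity - cross

# Hypothetical loci: two equally frequent alleles, and a monomorphic locus.
pic_two_alleles = pic([0.5, 0.5])
pic_one_allele = pic([1.0])
print(pic_two_alleles, pic_one_allele)
```

Under this definition a locus with more alleles at similar frequencies yields a higher PIC, which is why PIC tends to rise with the number of alleles per locus.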


Plants ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 890
Author(s):  
Zifeng Ouyang ◽  
Yimeng Wang ◽  
Tiantian Ma ◽  
Gisele Kanzana ◽  
Fan Wu ◽  
...  

Melilotus is an important genus of legumes with industrial and medicinal value, partly due to the production of coumarin. To explore the genetic diversity and population structure of Melilotus, 40 accessions were analyzed using long terminal repeat (LTR) retrotransposon-based markers. A total of 585,894,349 bp of LTR retrotransposon sequences, accounting for 55.28% of the Melilotus genome, were identified using bioinformatics tools. A total of 181,040 LTR retrotransposons were identified and classified as Gypsy, Copia, or another type. A total of 350 pairs of primers were designed for assessing polymorphisms in 15 Melilotus albus accessions. Overall, 47 polymorphic primer pairs were screened for their availability and transferability in 18 Melilotus species. All the primer pairs were transferable, and 292 alleles were detected at 47 LTR retrotransposon loci. The average polymorphism information content (PIC) value was 0.66, which indicated that these markers were highly informative. Based on unweighted pair group method with arithmetic mean (UPGMA) dendrogram cluster analysis, the 18 Melilotus species were classified into three clusters. This study provides important data for future breeding programs and for implementing genetic improvements in the Melilotus genus.


2011 ◽  
Vol 46 (9) ◽  
pp. 1035-1044 ◽  
Author(s):  
Patrícia Coelho de Souza Leão ◽  
Sérgio Yoshimitsu Motoike

The objective of this work was to analyze the genetic diversity of 47 table grape accessions from the grapevine germplasm bank of Embrapa Semiárido, using 20 RAPD and seven microsatellite markers. Genetic distances between pairs of accessions were obtained based on Jaccard's similarity index for RAPD data and on the arithmetic complement of the weighted index for microsatellite data. The groups were formed using Tocher's cluster analysis and the unweighted pair-group method with arithmetic mean (UPGMA). The microsatellite markers were more efficient than the RAPD ones in the identification of genetic relationships. Information on the genetic distance, based on molecular characteristics and coupled with the cultivar agronomic performance, allowed for the recommendation of parents for crossings, in order to obtain superior hybrids in segregating populations for the table grape breeding program of Embrapa Semiárido.
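As a minimal sketch of the Jaccard step for dominant markers such as RAPD, each accession can be coded as a vector of band presence (1) or absence (0). The band profiles below are invented for illustration, and taking the distance as one minus the similarity is an assumption about how the index was converted.

```python
def jaccard_similarity(x, y):
    """Jaccard index for two equal-length presence/absence vectors.

    Bands absent in both accessions are ignored.
    """
    shared = sum(1 for p, q in zip(x, y) if p and q)
    mismatched = sum(1 for p, q in zip(x, y) if bool(p) != bool(q))
    total = shared + mismatched
    return shared / total if total else 0.0

# Hypothetical band profiles for two accessions across five RAPD bands.
acc_a = [1, 1, 0, 1, 0]
acc_b = [1, 0, 0, 1, 1]
similarity = jaccard_similarity(acc_a, acc_b)
distance = 1 - similarity
print(similarity, distance)
```

A full pairwise distance matrix built this way is the usual input to UPGMA or Tocher-style grouping.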

